This article is part of the Woehammer Data Literacy series, which focuses on how to read statistics.
Our aim is to explain what the data can reasonably tell us and what its limits are. We publish results once they become interesting, but we interpret them only when there is enough evidence to trust them. Statistics are a tool.
If you’re looking for instant tier lists, this series won’t give you those. If you want a clearer understanding of how to interpret the numbers, you’re in the right place.
Why Early Win Rates Lie
There is a familiar story to every Age of Sigmar rules cycle. A new battletome or battlescroll lands, and a handful of events are played. Someone posts a win rate chart and, within days, the community has reached a conclusion. Sometimes within hours.
“This army is broken.”
“The army is dead.”
“GW didn’t test this.”
These reactions are as certain as death and taxes, and the data almost never deserves that level of confidence.
This article is the first in a new Woehammer Data Literacy series. It isn’t about defending or attacking any particular faction (though I know you want us to). It’s about how we read statistics, not just in Age of Sigmar but for the Old World and 40k as well, and how easily we mistake early signals for final answers.
Because the biggest problem with data isn’t necessarily misinformation. It’s impatience.
Comfort in Numbers
Early win rates are enticing because they feel objective. A percentage carries an authority that anecdotes never do. “Lumineth is on 60%” sounds strong, while “I keep losing with this army” does not.
The trouble is that early data is always fragile. At the start of a battlescroll, sample sizes are small. The meta hasn’t had time to adapt, and counter-play often hasn’t been identified yet. A single weekend of results can reshape the picture. But that doesn’t make the data wrong; it just makes it provisional.
Pilot Effect
There is a pattern that appears again and again in early wargame statistics: the pilot effect.
A strong player picks up an army and brings a sharp or unusual list. They go 5–0. Sometimes the format of the event favours them: a team event, an online TTS tournament, or a friendly local meta. Suddenly, the faction’s win rate spikes, screenshots circulate, and everyone jumps to conclusions.
Nothing out of the ordinary happened. This is simply how small datasets behave. A single strong pilot can bend win rates out of shape, not because the army is dominant, but because skill differences matter when only a few games have been played.
The mistake comes later, when those results are treated as universally replicable. Copying a list does not copy the decisions the player made on the way to 5–0. Early win rates can’t tell us whether we are seeing a powerful army, a powerful player, or a favourable run of matchups. They flatten everything into one number, and we over-interpret it.
What a 5–0 Does to a Small Dataset
It’s easier to see the issue with an example.
Imagine a faction has 37 wins out of 68 games; that’s a win rate of 54.4%. Now imagine the next event happens, and a strong player takes that faction and goes 5–0.
The new record becomes 42 wins out of 73 games, and the win rate jumps to around 57.5%.
No rules or points changed, and the army didn’t get better overnight. One player simply added five wins to a small pool, and the conversation shifted from “healthy” to “dominant” in a single weekend.
Reverse the situation and the effect is just as dramatic. A new player goes 0–5, and the same faction suddenly looks mediocre or struggling. The only difference is a single player.
It’s not a flaw with the statistics; they’re just behaving as they should.
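The swings above are easy to verify. Here’s a small Python sketch using the same numbers from the example, including the reverse 0–5 case (the helper name is mine):

```python
def win_rate(wins, games):
    """Return the win rate as a percentage."""
    return 100 * wins / games

# Starting pool from the example: 37 wins in 68 games.
wins, games = 37, 68
print(f"Before:    {win_rate(wins, games):.1f}%")            # 54.4%

# One strong pilot goes 5-0 at the next event.
print(f"After 5-0: {win_rate(wins + 5, games + 5):.1f}%")    # 57.5%

# The reverse: one new player goes 0-5 instead.
print(f"After 0-5: {win_rate(wins, games + 5):.1f}%")        # 50.7%
```

Five games in either direction moves the headline number by three to four percentage points, which is exactly the kind of shift that fuels a weekend of hot takes.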
Decent Sample Sizes Aren’t Immune
Now take a sample size that feels more reassuring.
Imagine a faction with 80 wins out of 166 games, a win rate of 48.2%.
This looks stable. Then our strong player comes along (let’s call them Warson Chitlock) and adds another five wins to that total: 85 wins out of 171 games, and the win rate rises to 49.7%.
The shift is smaller than before, but it can still be meaningful. That could be enough to move a faction from “slightly underperforming” to “okay”. Once again, nothing really changed; only five more wins were added.
This is what scale does. Larger sample sizes don’t remove variance, but they dampen it. Until you reach a point where individual players can’t move a number in a significant way, early conclusions are risky.
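The damping effect can be made concrete with the same arithmetic. In this sketch, the two smaller pools mirror the article’s examples (scaled to an even 50% so the shifts are comparable); the two larger pools are illustrative, showing the swing shrinking as games accumulate:

```python
def swing_from_5_0(wins, games):
    """Percentage-point shift in win rate after one player adds a 5-0."""
    before = 100 * wins / games
    after = 100 * (wins + 5) / (games + 5)
    return after - before

# Pools at roughly 50% win rate, increasing in size.
for wins, games in [(34, 68), (83, 166), (250, 500), (1000, 2000)]:
    print(f"{games:>5} games: +{swing_from_5_0(wins, games):.2f} points")
    # 68 games: +3.42 points ... 2000 games: +0.12 points
```

The same 5–0 that rattles a 68-game pool barely registers at 2,000 games. That is the whole argument for patience, in four lines of arithmetic.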
New Armies Don’t Start on a Level Playing Field
Another thing early data struggles to capture is who’s playing the faction.
Brand new armies like Helsmiths of Hashut attract new players to the hobby, and they attract people experimenting with unfamiliar mechanics. That often means a lower than average level of experience in the early days, combined with perhaps a handful of very skilled players pushing results up at the other end of the win rate scale. The result is polarisation.
Some players dominate and feel like the stats back their performance up. Others struggle and feel as though the stats are telling them their experience doesn’t count. Both experiences are real, and early win rates compress that complexity into a single percentage.
This is why early battlescroll debates so often feel like people talking past each other. They are describing different views of the same picture.
When Data Becomes a Conversation Killer
The most damaging misuse of early win rates is dismissal. Using a small, early dataset to tell someone that their frustration with a faction isn’t valid doesn’t help them improve and doesn’t help explain why they’re losing games. It simply shuts a conversation down.
Win rates are useful for spotting long term balance problems and identifying outliers. They are far less useful for explaining why someone went 1–4 at their local event, or why a new army feels punishing to learn.
A Note From Woehammer
It would be wrong of me to write an article like this without talking about our own history.
In the past, Woehammer has also published early win rates. We were keen to report what was happening in the first couple of weeks of a battlescroll. My intention was never to mislead, but I recognise now how easily those early numbers can be misinterpreted.
I’ve learnt from my experiences, and while I still publish early results, I flag them clearly. On our win rate charts, factions that I feel do not yet have enough data are highlighted in bold italics to signal that the data set is small and should be treated with caution.
For me, a meaningful data set on a faction does not really begin until there are at least 100 GT games in the database. That’s a judgement call, based on watching early spikes. Below that threshold, results can still swing widely, and pilot skill will skew conclusions. Sometimes the honest answer is simply: we don’t know yet.
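One way to put a number on “we don’t know yet” is a confidence interval. To be clear, this is not how Woehammer flags its charts; it’s just a standard statistical sketch (a 95% Wilson score interval) showing how wide the plausible range around an observed win rate is at different sample sizes:

```python
import math

def wilson_interval(wins, games, z=1.96):
    """95% Wilson score interval for a win rate, returned as percentages."""
    p = wins / games
    denom = 1 + z**2 / games
    centre = (p + z**2 / (2 * games)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2))
    return 100 * (centre - half), 100 * (centre + half)

# A 57.5% observed win rate looks very different at 40 games than at 400.
for games in (40, 100, 400):
    wins = round(0.575 * games)
    lo, hi = wilson_interval(wins, games)
    print(f"{games:>3} games: {lo:.1f}% to {hi:.1f}%")
    # At 40 games, the plausible range spans roughly 42% to 71%.
```

At 40 games, a “dominant” 57.5% is statistically consistent with an army that is actually below 50%. By a few hundred games the interval has tightened enough that the headline number starts to mean something.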
A Calmer Take
Early win rates are not lies, but they are incomplete and easy to misinterpret. They should prompt questions, not panic, and we should meet them with patience rather than treating them as certainty.
Don’t let two weeks’ worth of data convince you that the verdict is already in.