7b. Check sample size, data confidence, and relevance
Reliable benchmarks depend on having enough data points for the role, level, and location you are benchmarking.
So make sure to review the compensation market data for sample size and data confidence level. These indicators reflect how robust the underlying data is, based on:
- Number of contributing companies
- Number of employees per role/level
- Variability across those data points
If the sample isn’t large enough to hit required thresholds, some providers may model the range based on neighbouring roles, levels, and market patterns – rather than publish benchmarks with weak statistical confidence.
Ravio, for instance, has a high threshold for releasing new benchmarks. Where we have enough data to model a reliable benchmark but not enough to meet our rigorous sample size requirements, we offer Ravio IQ.
The best providers also adjust thresholds based on role volatility. Salary data for junior administrative roles, for instance, tends to be consistent across companies, meaning fewer data points are needed for a credible benchmark. Salaries for emerging roles like AI Engineer, on the other hand, vary wildly, so more data is needed to create a benchmark that accounts for market uncertainty. Blanket thresholds can't distinguish between these scenarios, leading some providers to either over-index on stable roles or release volatile benchmarks prematurely.
Ravio adjusts sample size requirements by role volatility and discloses data confidence in buckets of 'very strong,' 'strong,' 'good,' and 'moderate'. Each benchmark shares the number of companies and employees that contribute to it, and the confidence rating accounts for variance in the underlying data rather than just stating volume.
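To see why more volatile roles need more data, here is a minimal sketch using a standard textbook sample-size formula for estimating an average: n = (z × CV / margin)², where CV is the coefficient of variation (standard deviation divided by the mean). This is not Ravio's methodology or any provider's actual threshold. The function name, the 5% margin, and the variability figures are assumptions chosen purely for illustration, and real benchmarks are typically built on percentiles rather than averages.

```python
import math


def required_sample_size(cv: float, margin: float = 0.05, z: float = 1.96) -> int:
    """Smallest sample size that estimates an average salary to within
    +/- `margin` (as a fraction of the average) at the confidence level
    implied by `z` (1.96 is roughly 95%).

    `cv` is the coefficient of variation of salaries for the role.
    Hypothetical helper for illustration only.
    """
    if cv < 0 or margin <= 0:
        raise ValueError("cv must be >= 0 and margin must be > 0")
    return math.ceil((z * cv / margin) ** 2)


# Assumed variability figures, not real market data.
stable = required_sample_size(cv=0.10)    # e.g. a consistent, well-established role
volatile = required_sample_size(cv=0.35)  # e.g. an emerging, fast-moving role

print(stable, volatile)  # 16 189
```

Under these assumptions, the volatile role needs roughly twelve times as many data points as the stable one for the same precision, which is why a single blanket threshold fits neither case well.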
It’s also worth remembering that sample size alone doesn't tell the full story: source and verification matter just as much as volume. 1,000 unverified Glassdoor salary submissions aren't equivalent to 1,000 HRIS-validated salaries. Sample size is useful for comparing roles within the same provider, but comparing across providers requires understanding where the data comes from (see 7a) and how it's validated (see 7c).