Accuracy of Bird's Permissioned Panel Data
Bird
Aug 21, 2021
1 min read

Key Takeaways
Bird’s Permissioned Email Panel has historically been hard to validate due to lack of “ground truth” inbox placement data from mailbox providers.
A major mailbox provider now licenses inbox placement data, enabling direct comparison across more than 20,000 sending domains.
The analysis shows extremely high accuracy between Bird’s panel-based inbox rate estimates and the true inbox rate.
Accuracy improves as more distinct panelists receive the email stream—strong even at low signal, excellent at higher volumes.
RMSE (root mean square error) is used to measure deviation between panel predictions and ground-truth inbox rates.
Senders using top ESPs show materially better correlation—likely due to stricter compliance practices and more consistent inboxing.
With only 10 daily panelists, error rates remain below 10%.
With 50+ panelists, error drops significantly and becomes very tight.
Error rate drops quickly to about 4% and eventually approaches ~2% as panel size grows, indicating ~98% accuracy in predicting inbox placement.
This level of accuracy is excellent for diagnosing deliverability issues across a sender’s full mail stream.
Panel data remains critical because major providers like Google and Microsoft do not supply inbox placement metrics.
With proven correlation, senders can confidently rely on Bird’s panel data to understand inboxing where no ground truth exists.
Q&A Highlights
What problem was historically difficult to solve regarding inbox placement?
There was no reliable “ground truth” to validate how accurately a permissioned panel predicted inbox placement at scale.
What changed that enabled proper measurement?
A major mailbox provider began licensing real inbox placement data, allowing Bird to compare its panel predictions against actual results.
How large was the analysis dataset?
More than 20,000 sending domains—ranging from small senders to very large enterprise senders.
What metric was used to evaluate accuracy?
RMSE (root mean square error), a standard way to measure deviation between predicted and actual values.
How accurate is the panel with a very small number of daily panelists?
Even with only 10 distinct panelists, error rates stay under 10%, which is already strong for deliverability diagnostics.
What happens when more panelists see the email stream?
Accuracy increases rapidly—at 50+ panelists, correlation becomes extremely strong, and error drops sharply.
What is the best-case accuracy observed?
Error approaches ~2%, meaning Bird’s panel data can be up to 98% accurate compared to true inbox placement.
Why do top ESPs show better correlation?
Likely due to stricter compliance standards, which lead to more stable inboxing patterns and less variance in deliverability behavior.
Is the accuracy sufficient for diagnosing deliverability issues?
Absolutely—error rates below 5–10% provide more than enough precision to spot deliverability anomalies and trends.
Why is panel data still necessary if one mailbox provider offers ground truth?
Because major mailbox providers (Google, Microsoft, etc.) do not provide inbox placement reporting—panel data fills this visibility gap.
What does the analysis prove about Bird’s panel model overall?
That it is statistically reliable across a wide range of domains and sending behaviors, even with low sample sizes.
What is the practical outcome for senders?
They can trust Bird’s panel data to guide deliverability decisions, especially in ecosystems where no other inbox placement data exists.
One of the questions we regularly receive about our Permissioned Email Panel is how accurate it is in terms of forecasting inbox rates. Historically, this has been a difficult question to answer with any authority as there was no source of ground truth to measure against, and so opinions (and general faith in sample statistics) ruled the discussion.
Now, though, with a major mailbox provider licensing inbox placement data for its platform, a real analysis is possible. We ran one over some 20,000 distinct sending domains, belonging to senders large and small, both on our sending platform and on other providers.
The results are exciting. The permissioned panel is highly accurate, even with relatively low signal, and becomes extremely accurate as the number of distinct panelists seen in a send increases. Using common statistical methods, we look at the root mean square error (RMSE, an analogue of the standard deviation) between the inbox rate as seen at the major provider and the inbox rate our panel sees.
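For readers who want to see what the RMSE measures, here is a minimal Python sketch. It is not the code used in our analysis. The function name `rmse` and the sample inbox rates are hypothetical, chosen only to illustrate the calculation: take the difference between the panel rate and the true rate for each sending domain, square it, average the squares, and take the square root.

```python
import math


def rmse(panel_rates, true_rates):
    """Root mean square error between panel and ground-truth inbox rates.

    Both arguments are equal-length sequences of inbox rates expressed as
    fractions (0.0 to 1.0), one entry per sending domain.
    """
    if len(panel_rates) != len(true_rates) or not panel_rates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(panel_rates, true_rates)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical inbox rates for four sending domains (illustrative only).
panel = [0.95, 0.80, 0.60, 0.99]   # inbox rate estimated from the panel
truth = [0.93, 0.85, 0.58, 0.97]   # inbox rate reported by the provider

error = rmse(panel, truth)
print(f"RMSE: {error:.4f}")  # about 0.03, i.e. roughly 3 percentage points
```

Because the errors are squared before averaging, one domain that is far off raises the RMSE more than several domains that are each slightly off.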
In our analysis, we noted that senders who send their mail through top email service providers see a materially better correlation between the panel inbox rate and true inbox rate. The mechanism for this isn’t known, but we postulate that the compliance standards that large service providers hold their customers to generally result in more consistent inboxing rates across their audience and so are less prone to skew. We can see this if we restrict our plot to senders only on top ESPs, which also reduces the RMSE by about 30%.
Even when a small number (10) of daily panelists see the mail stream, we see a very strong correlation between the inbox rate as seen by the panel and the ground truth.
If we consider only streams where 50 or more panelists are seen daily, the correlation becomes even tighter.
If we look at how this error rate varies with the number of panelists, we see a few things:
Even at extremely small numbers of unique panelists receiving the mail, the error rate is under 10%.
It quickly drops to 4% as the number of panelists increases.
It eventually approaches 2% – showing that panel data is 98% accurate.
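The points above describe the error as a function of how many distinct panelists see a mail stream. A rough Python sketch of that kind of breakdown follows, assuming each record holds a daily panelist count, the panel inbox rate, and the ground-truth inbox rate. The function name `rmse_by_min_panelists`, the record layout, the thresholds, and the sample values are all hypothetical and are not taken from our dataset.

```python
import math


def rmse_by_min_panelists(records, thresholds):
    """RMSE of panel vs. true inbox rate for streams at or above each threshold.

    records: iterable of (panelists, panel_rate, true_rate) tuples.
    thresholds: minimum daily panelist counts to evaluate.
    Returns {threshold: rmse}, with None where no stream qualifies.
    """
    results = {}
    for threshold in thresholds:
        squared_errors = [
            (panel_rate - true_rate) ** 2
            for panelists, panel_rate, true_rate in records
            if panelists >= threshold
        ]
        if squared_errors:
            results[threshold] = math.sqrt(sum(squared_errors) / len(squared_errors))
        else:
            results[threshold] = None
    return results


# Hypothetical (panelists, panel inbox rate, true inbox rate) records.
records = [
    (10, 0.80, 0.90),
    (12, 0.95, 0.87),
    (60, 0.90, 0.93),
    (75, 0.70, 0.72),
    (300, 0.96, 0.95),
]

by_threshold = rmse_by_min_panelists(records, [10, 50, 200, 1000])
for threshold, value in by_threshold.items():
    label = "no streams" if value is None else f"{value:.4f}"
    print(f"{threshold}+ panelists: {label}")
```

With these made-up numbers the error shrinks as the threshold rises, which mirrors the shape of the trend described above, not its actual values.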
For the purposes of identifying mail stream deliverability issues, this accuracy is fantastic.
So you might ask: with a major provider offering ground truth numbers, what’s the utility of having panel data as well, even if it is highly correlated? The majority of mailbox providers – including titans like Google and Microsoft – do not offer inbox placement data, and so for messages delivered there, you still need a source like panel data to understand inboxing rates.
And now we can all be confident in its accuracy for those cases.


