High-quality data about students yield the best outcomes for students. This premise underlies nearly all education data quality campaigns. Yet as education data systems mature across the nation, thanks to investments from the Department of Education's Statewide Longitudinal Data Systems (SLDS) grants, states have struggled to find more innovative ways to ensure that the data feeding into these systems is high quality.

Assuming that high data quality is an important goal for grantee states, why do they continue to rely on tired professional development workshops, giant posters, and endless PowerPoint decks devoted to awareness campaigns to identify and solve data quality issues? If we're going to rely on more modern data systems, we also need to modernize our thinking about diagnosing the root causes of data quality problems.

We think we’ve found a way to do just that.

These new systems generate mountains of rich data, fine-grained enough that we were able to develop a series of indicators testing various aspects of data quality, such as timeliness, uniqueness, and accuracy. What we found truly exciting was what happened when we combined these indicators with survey data about district technology and capacity: the combination revealed stark contrasts between high- and low-data-quality districts. For example, districts at the high end of the data quality scale averaged four full-time data staff and spent $20,000 to $30,000 per year in fixed system costs, while districts at the low end averaged two full-time data staff and spent $10,000 to $15,000 per year.
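To make the indicator idea concrete, here is a minimal sketch of how record-level checks like these might be computed. The sample data, column names (student_id, submitted_at, due_date, birth_date), and thresholds are hypothetical illustrations, not the actual indicator definitions behind our analysis.

```python
# A minimal sketch of timeliness, uniqueness, and accuracy indicators
# computed over a hypothetical student-enrollment extract. All column
# names, sample values, and thresholds below are illustrative assumptions.
import pandas as pd

records = pd.DataFrame({
    "student_id":   ["S001", "S002", "S002", "S003", "S004"],
    "submitted_at": pd.to_datetime(
        ["2024-09-01", "2024-09-15", "2024-09-15", "2024-10-20", "2024-09-03"]),
    "due_date":     pd.to_datetime(["2024-10-01"] * 5),
    "birth_date":   pd.to_datetime(
        ["2010-04-12", "2011-07-30", "2011-07-30", "1890-01-01", "2009-11-02"]),
})

# Timeliness: share of records submitted on or before the due date.
timeliness = (records["submitted_at"] <= records["due_date"]).mean()

# Uniqueness: share of records that are not duplicate student IDs.
uniqueness = 1 - records["student_id"].duplicated().mean()

# Accuracy (proxy): share of birth dates falling in a plausible range
# for currently enrolled students.
accuracy = records["birth_date"].between("1995-01-01", "2020-12-31").mean()

print(f"timeliness={timeliness:.2f}  uniqueness={uniqueness:.2f}  accuracy={accuracy:.2f}")
```

Indicators like these can be rolled up by district and set alongside survey measures of staffing and system spending, which is the kind of pairing that surfaced the contrasts described above.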

Using this approach, we can now give districts a real “pulse” on their data quality while also providing more information about the quality-control levers at their disposal. If more districts adopted this approach, we would stand a much better chance of improving outcomes for students.