From our friends at Catalist:
Political startups have had an incredible impact in the past several cycles, but to help the best technology spread across our field, we need to ensure that new programs and organizations have ready access to the highest-quality data and models.
Catalist built the largest and longest-running voter file outside the two major political parties and works exclusively with progressive organizations and Democratic campaigns. Over the past fifteen years, Catalist has curated and refined its data into a civic engagement asset that covers billions of civic interactions generated across the larger progressive ecosystem, including 4 billion data points of reported, observable behavior. We also use dozens of models, including models of vote propensity and vote choice, that inform millions of interactions with voters every campaign cycle.
This is the sort of infrastructure startups shouldn’t have to replicate, since it depends on having reliable data about voters over many cycles and is resource-intensive to maintain.
Specifically, it takes two things. First, we need people, relationships and processes in place across all 50-plus states, sometimes down to the township level, to collect public data about people’s voting records. Second, we have to understand how that data differs across jurisdictions so we can accurately match the millions of Americans who move every year to their voting history.
That record linkage relies heavily on data science, but it also requires a more subtle understanding of how these records differ, whether that’s based on quirks with local election administration or changes to election law. States have different standards and processes when it comes to reporting party registration, race, and even age. But with enough data—including commercial data—we can reliably assess people’s voting history over time, helping organizations identify non-voters, infrequent voters, and super-reliable voters and activists.
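As a rough illustration of the voting-history segments mentioned above, here is a minimal Python sketch. It assumes each voter's history has already been linked into a simple list of booleans, one per election the person was eligible for. The function name, the "unknown" fallback and the 0.8 cutoff are hypothetical choices made for this example, not Catalist's actual method or thresholds.

```python
from typing import Iterable


def classify_voter(voted: Iterable[bool], reliable_cutoff: float = 0.8) -> str:
    """Bucket a voter by the share of eligible elections in which they voted.

    `voted` holds one boolean per eligible election (True = ballot cast).
    `reliable_cutoff` is an illustrative threshold, not a real standard.
    """
    history = list(voted)
    if not history:
        # No linked history at all, so no segment can be assigned.
        return "unknown"

    turnout_rate = sum(history) / len(history)
    if turnout_rate == 0:
        return "non-voter"
    if turnout_rate >= reliable_cutoff:
        return "super-reliable voter"
    # Everything in between is lumped together here; a real segmentation
    # would likely be finer-grained and weight recent elections differently.
    return "infrequent voter"


if __name__ == "__main__":
    print(classify_voter([True, False, False, False]))
    print(classify_voter([True, True, True, True, True]))
```

The point of the sketch is that the segmentation itself is simple once the history is reliable; the hard part is the record linkage that produces an accurate list of elections for each person.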
There’s no shortcut to good infrastructure. But once it’s in place, we need to ensure it’s accessible to tech startups. To that end, we recently launched an innovation pilot, which gives startups access to a portion of our database. This lets people with new applications and programming test their products using a class-leading voter file so they know they can scale up quickly with new investments or a growing user base.
Good infrastructure lets everyone learn from each other, too. Last cycle, we worked with a broad range of campaigns, organizations and other data providers to pivot to phones as we backed off canvassing during the pandemic. On their own, our clients might have had to spend millions of dollars buying the same phone records separately. But thanks to our data trust model, we were able to invest in nearly 60 million new phone numbers and match them to our existing database.
As voters have abandoned landlines, we also rapidly updated our modeling to help campaigns anticipate how likely a phone number was to connect, as well as how likely it was to be correctly assigned to a voter. Because we’re operating with a very large data set, we’re also able to hold back records for randomized controlled trials, assess the quality of phones from different vendors, and report test results back to partners, which allows the next generation of testing and programming to benefit from everyone’s experiences.
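One common way to hold back records for a controlled trial is to assign each record to treatment or control by hashing a stable identifier, so the assignment is effectively random but reproducible. The sketch below shows that general technique in Python. The function name, the salt string and the 10 percent holdout rate are assumptions made for illustration; the source does not describe how Catalist actually assigns control groups.

```python
import hashlib


def in_control_group(record_id: str,
                     holdout_rate: float = 0.1,
                     salt: str = "phone-test") -> bool:
    """Return True if this record should be held back as a control.

    Hashing (salt + record_id) gives a stable pseudo-random number in [0, 1),
    so the same record always lands in the same group for a given test.
    Changing the salt reshuffles the groups for a different test.
    """
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and scale it into [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < holdout_rate


if __name__ == "__main__":
    ids = [f"voter-{i}" for i in range(10_000)]  # hypothetical record IDs
    held_back = sum(in_control_group(i) for i in ids)
    print(f"{held_back} of {len(ids)} records held back as controls")
```

Because the assignment depends only on the identifier and the salt, different partners can reproduce the same split without sharing a list of who was held back.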
Of course, not all programs or startups are successful. But every time someone reaches out to a voter through the Catalist database, we’re able to retain the results of that interaction so they can inform future work. The amount of data our campaigns are generating has grown exponentially over the past several campaign cycles. Retaining and analyzing that data is important; making it accessible to innovators is another way we can put it into action and keep our competitive edge in close elections.