Recently, the Secretaries of State in New Mexico and Colorado announced seemingly shocking findings: their research had identified “thousands” of non-citizens apparently voting illegally in their respective states. A careful review of the reports and information released (neither office has released the complete data and methodology used to arrive at its conclusions) reveals serious problems with the methods that were disclosed, and those problems make the conclusions questionable.
The Birthday Problem in New Mexico: Finding False Matches in Comparing Voter Registration Lists and Foreign National Lists
On March 15, 2011, New Mexico Secretary of State Dianna Duran issued a statement alleging that 37 registered voters whose records “matched” a foreign-nationals list had voted at some point in a general election in New Mexico within the last 8 years. Secretary Duran stated that her office used names and birthdates to effect the matches between the voter registration lists and the lists of foreign nationals. She further stated that 117 registrants on the voter registration list had Social Security numbers that did not match their names.
Using names and birthdates to conclude that two records on two separate lists belong to the same person is statistically flawed, and the flaw has a name: the Birthday Problem. The “Birthday Problem” is a probability puzzle that addresses the statistical likelihood of finding two people with identical birthdays within a population group, and it illuminates why, in a large enough population, some people who share a name will also share a birthdate. Dr. Michael McDonald and Justin Levitt discuss the problem in detail in Seeing Double Voting: An Extension of the Birthday Problem. Their study demonstrates that the larger a given population, the more likely you are to see a birthday-name match purely on the basis of statistical probabilities.
Statistical coincidence of matching birthdates is a proven phenomenon, studied and reviewed hundreds of times by statisticians and mathematicians. When comparing the hundreds of thousands of names on New Mexico's list of registered voters each year with the hundreds of thousands of names on the foreign national license holders lists over an 8‑year period, one should expect to find people on both lists with matching names and birthdates. Consequently, any conclusion of wrongdoing or nefarious conduct based solely upon name and birthdate matches in a population of hundreds of thousands of people should be viewed with suspicion.
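To make the scale of the coincidence concrete, the following is a minimal Python sketch of the classic Birthday Problem and of a rough expected count of chance matches between two lists. It is an illustration only, not the method used by McDonald and Levitt or by either Secretary of State. It assumes birthdays are uniformly distributed over 365 days, ignores leap years, and treats the two lists as independent. The function names and the example figures (200 voters and 50 license holders sharing one common name, birth years spread over 40 years) are hypothetical and are not drawn from New Mexico data.

```python
from math import prod

DAYS_PER_YEAR = 365  # assumption: uniform birthdays, leap years ignored


def prob_shared_birthday(n: int, days: int = DAYS_PER_YEAR) -> float:
    """Probability that at least two of n people share a birthday (month and day)."""
    if n > days:
        return 1.0
    # Probability that all n birthdays are distinct, then take the complement.
    p_all_distinct = prod((days - i) / days for i in range(n))
    return 1.0 - p_all_distinct


def expected_chance_matches(voters_with_name: int,
                            licensees_with_name: int,
                            birth_years: int,
                            days: int = DAYS_PER_YEAR) -> float:
    """Expected number of purely coincidental full-birthdate matches between
    different people who share one name across two independent lists.

    Each voter/licensee pair matches with probability 1 / (days * birth_years),
    so the expected count is the number of pairs times that probability.
    """
    possible_birthdates = days * birth_years
    return voters_with_name * licensees_with_name / possible_birthdates


# Classic result: with only 23 people, a shared birthday is more likely than not.
print(f"{prob_shared_birthday(23):.3f}")  # 0.507

# Hypothetical common name: 200 voters, 50 license holders, 40 birth years.
print(f"{expected_chance_matches(200, 50, 40):.2f}")  # 0.68
```

Under these assumed figures, a single common name yields roughly two chances in three of one false match. Summed over many common names, a handful of coincidental matches would be unsurprising.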
It seems straightforward to compare two data sources to find entries that appear to match. In this case, however, there was no indication that the “Birthday Problem” was accounted for or corrected in the New Mexico analysis. Moreover, there was no indication that the underlying data was evaluated for administrative errors, missing dates or other data flaws prior to matching the names.
Similarly, the fact that 117 people have Social Security numbers that do not match their names cannot, by itself, be viewed as evidence of wrongdoing. New Mexico has over 900,000 registered voters, most of whom filled out a voter registration card by hand, from which the data must be entered into a centralized data system. There is no indication that Secretary Duran’s analysis included any evaluation or follow-up to determine whether any or all of the 117 mismatches resulted from data entry errors, unreadable voter registration forms or some other accidental source of confusion.
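For a sense of proportion, the short Python sketch below computes what share of the voter file the 117 mismatches represent. It uses only the two figures stated above. Because New Mexico has "over" 900,000 registered voters, 900,000 is treated here as a floor, so the computed share is an upper bound. The function name is illustrative.

```python
REGISTERED_VOTERS = 900_000  # floor: the text says "over 900,000"
MISMATCHED_SSN = 117         # registrants whose SSN did not match their name


def mismatch_rate(mismatches: int, total: int) -> float:
    """Fraction of records in the file that show a mismatch."""
    return mismatches / total


rate = mismatch_rate(MISMATCHED_SSN, REGISTERED_VOTERS)
print(f"{rate:.4%}")  # 0.0130%

# Equivalent framing: about one mismatch per this many records.
print(round(REGISTERED_VOTERS / MISMATCHED_SSN))  # 7692
```

A mismatch in roughly one record out of every 7,700 is a small share of a file built largely from handwritten forms, which is why follow-up on each record matters before drawing any conclusion.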
Inconclusive Data, Inadvertent Registrations and No Intent: Overstating Conclusions based upon Self-Admitted Insufficient Data in Colorado
On March 8, 2011, Colorado Secretary of State Scott Gessler released a 6-page report alleging that his office is “nearly certain” that 106 American immigrants are improperly registered to vote in Colorado. The report's blanket conclusions, that there are 11,805 improperly registered voters and that more than 4,000 of them improperly voted in the 2010 elections, are called into question by the qualifying statements and equivocal recommendations contained in the report itself.
Secretary Gessler’s report admits that the voter registration data is inconclusive and does not prove that all 11,805 persons it identified were registered improperly. It concedes that even where there are improper registrations, they could have resulted from unintentional registration or from clerical or other administrative failure, without any intention on the registrants' part to vote or commit voter fraud. The report is utterly silent on how it arrived at the conclusion that over 4,000 of the “improper registrants” voted in the 2010 election. There is simply a barely supported conclusory statement that “it is likely” that many of the 4,214 registrants in question were not citizens when they cast their votes in 2010. Compare the 106 registered voters who, the report alleges, are “virtually certain” not to be citizens: the report makes no attempt to suggest that any of those 106 persons actually voted in 2010 or intended to commit fraud.
Ultimately, it is unclear how many of the 11,805 people that Secretary Gessler’s research identified were, in fact, not citizens at the time of the 2010 election. While his process removed duplicates created when a person used two different non-citizen sources of identification to apply for or renew a driver’s license or identification card, there is no indication that Gessler’s process removed people from the list once they had become citizens. Over the same period Gessler used to identify non-citizens, beginning in 2006, 32,140 individuals became citizens in Colorado. It is certainly reasonable to assume that some, if not many, of the over 4,000 individuals whom the report alleged were non-citizens when they voted in 2010 were, in fact, citizens at the time of the election.
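The missing step described above, removing people who became citizens before the election, can be sketched in a few lines of Python. This is a hypothetical illustration, not Secretary Gessler's process. The record layout, the field names and the three sample records are invented for the example. The only real-world value is the date of the 2010 general election, November 2, 2010.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

ELECTION_DAY_2010 = date(2010, 11, 2)


@dataclass
class FlaggedRecord:
    """Hypothetical record for a registrant who once showed a non-citizen document."""
    person_id: str
    noncitizen_document_date: date
    naturalization_date: Optional[date]  # None if no naturalization record is found


def still_flagged_on(records: List[FlaggedRecord], election_day: date) -> List[FlaggedRecord]:
    """Keep only records with no evidence of citizenship on or before election day."""
    return [
        r for r in records
        if r.naturalization_date is None or r.naturalization_date > election_day
    ]


# Invented sample data for illustration only.
sample = [
    FlaggedRecord("A", date(2007, 5, 1), date(2009, 6, 15)),  # citizen before the election
    FlaggedRecord("B", date(2008, 3, 10), None),              # no naturalization record
    FlaggedRecord("C", date(2006, 9, 20), date(2011, 1, 5)),  # naturalized after the election
]

remaining = still_flagged_on(sample, ELECTION_DAY_2010)
print([r.person_id for r in remaining])  # ['B', 'C']
```

Even a record that survives this filter is only a lead: a missing naturalization date may reflect incomplete data rather than non-citizenship.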
Without the underlying reports and methodologies from New Mexico and Colorado, the conclusions cannot be fully supported or dismissed. With the information we have to date, for all the reasons stated above, any conclusions drawn from these two states must be scrutinized and cannot yet be taken at face value. Neither Secretary Duran nor Secretary Gessler has publicly released the methodology used to arrive at these conclusions, so the results cannot be reviewed or duplicated.
The danger of drawing conclusions without an unambiguous understanding of the accuracy of the data is clear. Recall Florida in 2000, where a voter purge list became notorious once it was discovered that the “matching” process had captured eligible voters whose names were similar to, but decidedly different from, the names of persons with felony convictions, sometimes in other states entirely. The result effectively disenfranchised thousands of eligible voters and suppressed the vote in that state.
Until there is the opportunity to review the methodology and analysis used to arrive at the conclusions of the Secretaries of State in New Mexico and Colorado, everyone should blow away the smoke, ignore the mirrors and look at this data critically, with their own eyes.
See generally, W. Feller, An Introduction to Probability Theory and its Applications (3d ed., 1968); Edmund A. Gehan, Note on the “Birthday Problem”, 22 Am. Statistician 28 (1968); Ned Glick, Hijacking Planes to Cuba: An Up-Dated Version of the Birthday Problem, 24 Am. Statistician 41–4 (1970); A. G. Munford, A Note on the Uniformity Assumption in the Birthday Problem, 31 Am. Statistician 119 (1977); M. Sayrafiezadeh, The Birthday Problem Revisited, 67 Mathematics Mag. 220–3 (1994); W. Schwarz, Approximating the Birthday Problem, 42 Am. Statistician 195–6 (1988); D.M. Bloom, A Birthday Problem, 80 Am. Mathematical Monthly 1141–2 (1973); Kumar Joag-Dev & Frank Proschan, Birthday Problem with Unlike Probabilities, 99 Am. Mathematical Monthly 10 (1992). An exception to this scholarship is that adding leap years has a small negative effect on the probability of a birthday match. Philip F. Rust, The Effect of Leap Years and Seasonal Trends on the Birthday Problem, 30 Am. Statistician 197–8 (1976). Geoffrey C. Berresford, The Uniformity Assumption in the Birthday Problem, 53 Mathematics Mag. 286–8 (1980); Rust, supra note 16; Thomas S. Nunnikhoven, A Birthday Problem Solution for Nonuniform Birth Frequencies, 46 Am. Statistician 270–4 (1992).