AI systems deliver inaccurate or unreliable responses when queried about voting and elections

A study of major AI services found significant shortcomings in how they answer questions and concerns about voting and elections. None of the models tested could be fully relied upon, and some performed poorly enough to give inaccurate information more often than correct answers.

Proof News, a newly launched outlet specializing in data-driven reporting, conducted the study to assess whether AI models can accurately answer common questions about voting and elections. As AI models are increasingly positioned as replacements for traditional search engines, the study underscores how important accurate and reliable answers are, especially for essential civic processes such as voter registration.

The team compiled a set of questions that people typically ask during election periods, covering topics such as what to wear to the polls, where to vote, and whether people with criminal records are eligible to vote. It then submitted these questions via API to five prominent AI models: Claude, Gemini, GPT-4, Llama 2, and Mixtral.
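
To make the setup concrete, here is a minimal sketch of what such a harness might look like. It is not Proof News's actual code: the `Response` record, the `collect_responses` and `inaccuracy_share` helpers, and the per-model client callables are all hypothetical, and the sketch assumes a human reviewer marks each answer as accurate or not.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Response:
    """One raw answer from one model (hypothetical record type)."""
    model: str
    question: str
    answer: str


def collect_responses(
    questions: List[str],
    clients: Dict[str, Callable[[str], str]],
) -> List[Response]:
    """Send every question to every model and keep the raw answers.

    `clients` maps a model name to a caller-supplied function that
    wraps that vendor's API; no real API is called in this sketch.
    """
    results = []
    for question in questions:
        for model, ask in clients.items():
            results.append(Response(model, question, ask(question)))
    return results


def inaccuracy_share(ratings: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Turn reviewer ratings into a per-model share of inaccurate answers.

    `ratings` holds (model, is_accurate) pairs, assumed to come from
    a human reviewer rather than from automated scoring.
    """
    totals: Dict[str, int] = {}
    wrong: Dict[str, int] = {}
    for model, is_accurate in ratings:
        totals[model] = totals.get(model, 0) + 1
        wrong[model] = wrong.get(model, 0) + (0 if is_accurate else 1)
    return {model: wrong[model] / totals[model] for model in totals}
```

A share above 0.5 for a given model would correspond to the situation described above, in which inaccurate answers outnumber correct ones.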

One caveat concerns the study’s methodology: API calls may not fully represent how an average user accesses information. Users are more likely to use apps or web interfaces than to interact with APIs directly. In addition, the APIs used in the study may not always query the latest or most appropriate model for a given type of query.

However, these APIs are officially supported channels that the companies provide for accessing their AI models, and many third-party services rely on them to power their own products. So while the study may not show the models in the most favorable light, it still offers valuable insight into how they perform under those conditions.