USA Today's Jessica Davis says newsrooms must measure AI performance, not rely on intuition. She shares a framework for evaluating AI tools and explains why human oversight alone cannot scale. Her approach aims to build trust and speed up production.
Media organizations experimenting with AI tools face a critical challenge: how to trust and scale these systems without slowing down editorial workflows. Jessica Davis, Vice President of AI Product at USA Today, argues that newsrooms need structured evaluations, rather than gut instinct or basic feedback, to safely integrate AI into production.
Speaking at the World News Media Congress in Marseille, Davis outlined a practical framework based on her capstone research at the City University of New York. She described how most newsrooms currently use assistive AI, which generates text or surfaces information but leaves journalists responsible for final decisions. While useful, this model puts the cognitive burden on staff and cannot scale as AI systems become more autonomous.
Davis explained that as AI evolves from assistive to agentic, moving from responding to prompts to taking independent action, the traditional human-in-the-loop approach becomes a bottleneck. She noted that requiring journalists to review every AI output leads to burnout and slows down production, especially when AI systems can make subtle but critical errors that are easy to miss in a manual review.
To illustrate the stakes, Davis shared a case study from USA Today’s newsroom. Her team built an AI agent to help journalists navigate public records requests, a process complicated by varying state laws and strict requirements. Despite months of development, the agent repeatedly made small mistakes that caused requests to fail. The breakthrough came when the team introduced structured evaluations, defining clear success criteria and measuring performance against them. This shift allowed them to move from months of testing to shipping production-ready features within days.
Davis emphasized that newsrooms do not need a full data science team to start evaluating AI. Instead, they need specific, measurable definitions of success. She found that simple thumbs-up or thumbs-down feedback from journalists was insufficient, and detailed forms were impractical. The solution was to collaborate with journalists to develop structured criteria that reflected real newsroom needs, then build those requirements into the evaluation process.
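As a rough illustration of what "specific, measurable definitions of success" could look like in practice, the Python sketch below scores AI-drafted public records requests against a short list of named checks and reports a pass rate per criterion. It is not USA Today's system: the criteria names, the keyword-based checks, and the sample drafts are all hypothetical stand-ins for requirements a newsroom would define with its own journalists.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One named, yes/no definition of success for an AI output."""
    name: str
    check: Callable[[str], bool]


# Hypothetical criteria for a drafted records request. Real checks would
# encode the requirements journalists actually care about.
CRITERIA = [
    Criterion("cites_a_statute", lambda text: "statute" in text.lower()),
    Criterion("names_the_agency", lambda text: "agency:" in text.lower()),
    Criterion("states_a_date_range", lambda text: "between" in text.lower()),
]


def evaluate(output: str) -> dict[str, bool]:
    """Return pass/fail for every criterion on a single AI output."""
    return {c.name: c.check(output) for c in CRITERIA}


def pass_rate(outputs: list[str]) -> dict[str, float]:
    """Return the share of outputs that pass each criterion."""
    if not outputs:
        raise ValueError("need at least one output to evaluate")
    results = [evaluate(o) for o in outputs]
    return {
        c.name: sum(r[c.name] for r in results) / len(results)
        for c in CRITERIA
    }


# Invented sample drafts, used only to exercise the functions above.
SAMPLE_DRAFTS = [
    "Agency: City Clerk. Under the state statute, I request all "
    "contracts signed between January 2020 and December 2021.",
    "Please send me all emails.",
]

if __name__ == "__main__":
    for name, rate in pass_rate(SAMPLE_DRAFTS).items():
        print(f"{name}: {rate:.0%}")
```

Compared with a single thumbs-up or thumbs-down, per-criterion results show which requirement an agent keeps missing, so a team can track that one number across versions.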
She described this approach as a new governance model for AI in newsrooms: dynamic, transparent, and focused on continuous monitoring. By making AI systems less of a black box, Davis said, journalists and product managers gain clarity on when to trust the technology and when to intervene. This enables faster, more strategic integration of AI into editorial workflows.
Other publishers are also exploring AI evaluation methods. For example, India Today has tested an AI tool to predict audience engagement before publishing, using past data to guide editorial decisions and improve accuracy over manual methods.