AI tools promise to speed up headline writing, but using them with unpublished drafts could expose journalists to legal and security risks. Here’s what newsroom leaders need to know.
As AI-powered writing tools become more common in newsrooms, journalists are weighing the benefits of faster headline generation against the risks of exposing unpublished work. Many see large language models like ChatGPT and Claude as ideal for crafting headlines, given their training on vast amounts of published news. But entering unpublished drafts into these systems can create serious legal and security concerns for media professionals.
When journalists paste their work into public AI chatbots, they often have little control over how that data is used. Industry investigations have shown that major AI companies built their models by scraping news content without compensation or consent. According to a Washington Post analysis, half of the top ten sites used for chatbot training were news outlets. Some companies, such as Anthropic, reportedly spent tens of millions of dollars to scan millions of books, raising further questions about intellectual property rights. The publisher of the New York Times has described this practice as intellectual property theft on an unprecedented scale.
As AI companies exhaust available human-written content, they increasingly rely on "synthetic data" (AI-generated text) to train new models. This approach risks amplifying existing biases and inaccuracies. More recently, AI firms have begun using the text that users enter into chatbots as a fresh source of high-quality data. A 2025 Stanford University study found that Amazon, Anthropic, Google, Meta, Microsoft, and OpenAI all use user chat data to improve their models by default, with some retaining this information indefinitely. Even platforms marketed to writers, like Lex, disclose that they may collect both user inputs and AI outputs for platform enhancement.
Given these practices, newsroom leaders are setting stricter policies. The Markup, for example, prohibits staff from submitting unpublished drafts to public generative AI tools. While enterprise-level, paid AI models may offer more security, their data use ultimately remains at the discretion of the provider. The California State University system’s $16.9 million agreement with OpenAI reportedly includes terms that prevent the company from training its models on university data. Still, news organizations such as the Associated Press, which signed a licensing deal with OpenAI in 2023, continue to urge staff not to enter confidential or sensitive information into AI tools.
The safest option for journalists is to use local large language models that run on newsroom-owned servers, giving organizations full control over data retention and deletion. However, such solutions are rare. Researchers have proposed a "Newsroom Tooling Alliance" to develop safer alternatives, and tools like Lumo offer encrypted, locally stored chats. Until these options become widespread, journalists must carefully review privacy policies, negotiate data use terms, and proceed with caution when using AI for editorial tasks.
These concerns echo broader challenges facing the media industry, as highlighted in recent reports on declining news trust and shifting audience behaviors. For example, the latest research on news consumption trends shows how rapidly changing technology is reshaping newsroom practices and public expectations.
This article was produced with support from the Craig Newmark Center for Journalism Ethics and Security. Anika Collier Navaroli, an award-winning writer, lawyer, and researcher, leads the center at Columbia Journalism School and has developed technology policies for organizations including Twitter and Twitch.