OpenAI Ordered To Turn Over 20 Million ChatGPT Logs in New York Times Copyright Fight – eWeek


Source: Solen Feyissa/Unsplash
OpenAI has been ordered to turn over 20 million de-identified ChatGPT conversation logs to a coalition of news publishers, including The New York Times, in a closely watched copyright battle over generative AI.
A US magistrate judge in Manhattan rejected OpenAI’s effort to keep the logs out of discovery, finding that the anonymized records are relevant to the case and protected by multiple privacy safeguards. The decision raises the stakes for both OpenAI and the publishers pressing claims that ChatGPT unlawfully used and reproduced their work.
US Magistrate Judge Ona Wang of the Southern District of New York denied OpenAI’s motion to reconsider an earlier order directing the company to produce a sample of 20 million consumer ChatGPT output logs for discovery in the consolidated copyright litigation involving the Times and other publishers.
The publishers argued that the logs are critical to determining whether ChatGPT reproduced their copyrighted articles and to testing OpenAI’s defenses, including fair use and substantial non-infringing uses.
OpenAI opposed the request, saying that turning over the logs would risk exposing confidential user information and that “99.99%” of the transcripts were irrelevant to the plaintiffs’ claims. Judge Wang rejected that characterization, noting that the 20 million logs represent only a small fraction of the “tens of billions” of consumer ChatGPT logs that OpenAI retains, and that the sample is relevant to issues including alleged reproductions, damages, and fair use.
The court stressed that there are “multiple layers of protection” for user privacy, including OpenAI’s de-identification of the logs, an existing protective order, and an “attorneys’ eyes only” designation for the data.
Publishers have been seeking output log data for more than a year to understand how ChatGPT interacts with their content. Early discovery requests swept in consumer, enterprise, and API logs, but the parties eventually narrowed the focus to a consumer log sample for merits discovery.
By mid-2025, the plaintiffs asked for a sample of 120 million logs spanning a two-year period. OpenAI countered with a proposal for 20 million logs, arguing that a smaller sample would be easier to de-identify and still useful for statistical analysis. The plaintiffs agreed to proceed on that basis.
After OpenAI finished, or nearly finished, de-identifying the logs, it told the publishers that it would not produce the full sample and instead suggested using keyword searches to narrow the set. The publishers moved to compel, and Wang granted the motion. OpenAI then sought reconsideration and also appealed the order to the presiding district judge.
Wang wrote that reconsideration is an “extraordinary remedy” and found that OpenAI had not pointed to any controlling law or facts that the court had previously overlooked.
News publishers have described the dispute in sharp terms. MediaNews Group executive editor Frank Pine said OpenAI’s leadership was “hallucinating when they thought they could get away with withholding evidence about how their business model relies on stealing from hardworking journalists,” according to reporting by Reuters.
OpenAI has tried to frame its position around privacy and security concerns. A company spokesperson pointed to a blog post by Chief Information Security Officer Dane Stuckey, saying that the Times’ demand for chat logs “disregards long-standing privacy protections” and “breaks with common-sense security practices.”
In court, OpenAI argued that handing over the logs would compromise user confidentiality despite de-identification and the protective order. Judge Wang was not persuaded, noting that existing privacy protections were adequate.
The opinion also raised questions about OpenAI’s litigation strategy. Wang observed that if OpenAI never intended to produce all 20 million logs, it was unclear why the company invested time and money in de-identifying the entire sample. She suggested that either OpenAI changed its mind after initially planning to produce the data, or it de-identified the full set as a tactic or for some other reason that it did not disclose.
The Times first sued in 2023, alleging that OpenAI and, in related cases, other technology companies used copyrighted material to train AI models without permission. Those suits have since been consolidated, and the case is emerging as a test of how existing copyright doctrines apply to AI training and outputs.
For publishers, the ordered log production could provide rare visibility into how LLMs actually handle news content: whether they reproduce it, paraphrase it, or avoid it. For AI developers, the ruling underscores that courts may not accept generalized privacy and burden arguments when faced with a limited, de-identified dataset that is central to the claims and defenses at issue.
Enterprise IT and legal teams will be watching the case for discovery standards as much as for the outcome. The way this court balances privacy, proportionality, and transparency could influence what regulators, plaintiffs, and partners can demand of AI systems that remain largely opaque to outside scrutiny.
Property of TechnologyAdvice. © 2025 TechnologyAdvice. All Rights Reserved


Jesse
https://playwithchatgtp.com