
OpenAI is retaining all ChatGPT logs "indefinitely." Here's who's affected

gnabgib

Related: OpenAI slams court order to save all ChatGPT logs, including deleted chats (1101 points, 2 days ago, 906 comments) https://news.ycombinator.com/item?id=44185913

ViktorRay

Here is the direct link to OpenAI's official response:

https://openai.com/index/response-to-nyt-data-demands/

That official response was discussed on this website yesterday. Here is a link to the discussion:

https://news.ycombinator.com/item?id=44196850

JumpCrisscross

“This does not impact ChatGPT Enterprise or ChatGPT Edu customers.

This does not impact API customers who are using Zero Data Retention endpoints under our ZDR amendment.”

The court order seems reasonable. OpenAI must retain everything it can, and has not promised not to, retain.

senko

I cannot find it now because OpenAI blocks archive.org (oh the irony), but previously their API privacy policy said no data retention beyond 30 (or 90, I can't recall) day period for abuse/monitoring. I know because I was researching this for an (EU) customer.

Now, the promise is still to not train on API inputs/outputs, but the retention promise is nowhere to be found, unless you're an Enterprise customer (ZDR).

Moreover, at least in my understanding of the order, the court ordered them to keep ALL, not "all unless for those you promised not to keep":

> OpenAI is NOW DIRECTED to preserve and segregate all output log data that would otherwise be deleted on a going forward basis until further order of the Court (in essence, the output log data that OpenAI has been destroying), whether such data might be deleted at a user’s request or because of “numerous privacy laws and regulations” that might require OpenAI to do so.

So in effect, the statement you quoted is false, and OpenAI is actually in breach of their privacy policies.

I asked ChatGPT to parse this in case I misunderstood it, and its interpretation was:

> The company is ordered to preserve all output log data that would otherwise be deleted, going forward, regardless of the reason for deletion (user request, privacy law, default retention policy, etc.).

> In other words: They must stop deleting any output log data, even if it would normally be deleted by default or at a user's request.

> This is more than just keeping data they would normally retain — it's a preservation order for data they would otherwise destroy.

As a result, OpenAI is now unusable for serious business use in Europe. Since the competition (Anthropic, Google) is not affected by this court order, the only loser here is OpenAI.

JumpCrisscross

> in effect, the statement you quoted is false, and OpenAI is actually in breach of their privacy policies

The statement is from OpenAI’s own press release. I still wouldn’t argue that their lies somehow make the court order unreasonable.

> OpenAI is now unusable for serious business use in Europe

This is entirely in OpenAI’s control. They could convert everyone in the EU to ZDR. They choose not to, and that’s their right. (As it is the right of the EU to deem that noncompliance.)

throwaway314155

From bio: > Trade private equity

> The court order seems reasonable.

Checks out.

JumpCrisscross

Fair enough, I’m by default deferential to our courts.

In this case, however, there is a simple solution if OpenAI doesn’t want to save data: don’t retain it in the first place. If OpenAI committed to privacy as a value, their protests would have merit. But in this case it sounds like they want the ability to retain the data as well as delete it despite being in the midst of litigation. That’s simply not a privilege anyone is afforded by the courts. (We’re also talking about the Times versus a $100+ bn tech giant whose CEO has direct lines to heads of state. There is no David to default sympathy to.)

thegrim33

Could also just look at their submission history; all they do is post political content every single day of their life. Such a person is not going to have overly impartial/intelligent takes on issues. I wish there was a plugin that could do sentiment analysis and auto-filter out such users from submissions/comments shown to me. Or just a button I could manually click to never see a person's posts again.

kleiba

> Late Thursday, OpenAI confronted user panic over a sweeping court order requiring widespread chat log retention—including users' deleted chats—after moving to appeal the order that allegedly impacts the privacy of hundreds of millions of ChatGPT users globally.

When "delete" actually means "hide from my view", you can only hope that you live in a country with strong privacy and data protection laws.

sixothree

Even when companies are honest about how "delete" works, they will use weasel language such as "delete from history" or "delete from inbox" instead of actually doing the thing the user intends.

bombcar

Part of that is the company doesn't even know what they do internally - sometimes it purges instantly, other times it gets marked deleted in a database table that never gets purged until the fifteen billion row table takes down the service.

jajko

A company that wants to know, knows. The more central this data is to its reason for existing, the better and more current that knowledge is. Don't give these aholes a pass they never deserved; they have no moral high ground or moral 'credits' to burn. OpenAI is no different.

fizx

At a tech company of >1k engineers, this gets audited regularly by a dedicated data retention team.

xeromal

This is half the time due to people wanting to recover from accidentally deleting stuff.

IMO hard deleting things is generally a bad practice until the user wants to delete their entire account or some other very explicit action is executed.
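
The pattern described above is usually called a soft delete. A minimal sketch in SQLite, with illustrative table and column names (not any vendor's actual schema): user-facing "delete" just sets a timestamp flag, queries filter on it, and a separate purge job is what actually removes rows past the retention window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        body TEXT NOT NULL,
        deleted_at TEXT  -- NULL means the row is live
    )
""")
conn.execute("INSERT INTO messages (body) VALUES ('hello'), ('secret')")

# "Delete" from the user's point of view: mark the row, don't remove it.
conn.execute("UPDATE messages SET deleted_at = datetime('now') WHERE id = 2")

# The app only ever queries live rows, so the data looks gone to the user...
live = conn.execute(
    "SELECT body FROM messages WHERE deleted_at IS NULL"
).fetchall()

# ...but it still exists until a purge job hard-deletes flagged rows
# (here the retention window is zero, so anything flagged goes).
conn.execute(
    "DELETE FROM messages WHERE deleted_at IS NOT NULL "
    "AND deleted_at <= datetime('now')"
)
remaining = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

If the purge job never runs (or never existed), you get exactly the situation bombcar describes: rows marked deleted accumulating forever.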

josefritzishere

It is still the process at most large firms to delete data after certain intervals of time. The failure to have and follow a data deletion policy is a huge legal risk... even if that interval is 10 years. Forever is the one definitively wrong answer.

yreg

I have no experience with this, but I imagine actually deleting files from some giant collection of data that needs to be safely backed up is borderline impossible, no?

I expect that the big tech companies have all kinds of cold storage backups and no one is going to actually go spelunking in those archives to physically delete my data when I delete an email. It's more likely that they will delete the keys to decrypt it, but even the keys must be safely stored somewhere, so it's the same problem just with less data.
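
The "delete the keys to decrypt it" approach is commonly called crypto-shredding. A toy sketch of the idea: encrypt each user's data under a per-user key, so destroying that one key renders every backed-up copy of the ciphertext unreadable. The keystream here (HMAC-SHA256 in counter mode) is purely for illustration; a real system would use a vetted AEAD such as AES-GCM.

```python
import hashlib
import hmac
import os

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data against an HMAC-SHA256 counter keystream (toy cipher)."""
    out = bytearray()
    for block in range((len(data) + 31) // 32):
        pad = hmac.new(key, block.to_bytes(8, "big"), hashlib.sha256).digest()
        chunk = data[block * 32:(block + 1) * 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)

# Key store: the only place the per-user key lives.
keys = {"alice": os.urandom(32)}

# "Backups" hold only ciphertext, so they can be replicated freely.
backup = {"alice": keystream_xor(keys["alice"], b"draft email to delete")}

# While the key exists, the data is recoverable (XOR is its own inverse).
plaintext = keystream_xor(keys["alice"], backup["alice"])

# Crypto-shred: drop the key; every backed-up copy is now unreadable.
del keys["alice"]
```

As the comment notes, this just shifts the problem: the key store itself must be durable and backed up, so you still need a way to reliably destroy a small secret, just far less data than before.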

LadyCailin

> When "delete" actually means "hide from my view", you can only hope that you live in a country with strong privacy and data protection laws.

I do, but presumably that doesn’t matter, as the US thinks its legal code outweighs the legal code for Europeans living in Europe. Joke's on Europeans for allowing Americans to dominate the world stage for too long, I suppose.

longnguyen

Shameless plug: I built Chat Vault to help you import all your chat data from ChatGPT, Claude, Grok, and Le Chat (Mistral) into a native Mac app.

You can search, browse and continue your chats 100% offline.

It’s free while in beta https://testflight.apple.com/join/RJx6sP6t

subarctic

Is this something I could do with openwebui?

accrual

Open WebUI does support import/export from a JSON file, but may need a translation for ChatGPT data.
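
A hedged sketch of what that translation might look like: flattening one conversation from a ChatGPT data export into a simple role/content list, the shape most chat importers expect. The field names here ("mapping", "author", "content", "parts") follow the export format as I understand it; verify them against a real conversations.json, and check Open WebUI's actual import schema, before relying on this.

```python
import json

def flatten_conversation(conv: dict) -> list[dict]:
    """Pull non-empty text messages out of one exported conversation."""
    turns = []
    for node in conv.get("mapping", {}).values():
        msg = node.get("message")
        if not msg:
            continue  # root/system nodes can have no message
        parts = msg.get("content", {}).get("parts") or []
        text = "\n".join(p for p in parts if isinstance(p, str)).strip()
        if text:
            turns.append({"role": msg["author"]["role"], "content": text})
    return turns

# Synthetic example in the export's rough shape:
sample = {
    "title": "demo",
    "mapping": {
        "a": {"message": {"author": {"role": "user"},
                          "content": {"parts": ["Hi"]}}},
        "b": {"message": {"author": {"role": "assistant"},
                          "content": {"parts": ["Hello!"]}}},
        "c": {"message": None},
    },
}
flat = flatten_conversation(sample)
print(json.dumps(flat))
```

Note that real exports link messages into a tree via parent/children IDs rather than guaranteeing dict order, so a faithful translator should walk that tree to get the turns in conversation order.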

pier25

> The order impacts users of ChatGPT Free, Plus, and Pro, as well as users of OpenAI’s application programming interface (API)

So how does this work with services using the API like Copilot or Cursor?

Is OpenAI now storing all the code sent to the API?

dawnerd

I think it's only safe to assume the answer is yes.

tintor

Depends if Cursor / Copilot are using Zero Data Retention API or not.

triceratops

It would be a bit much if this linked to an NYT article.

creaturemachine

With the requisite paywall-bypass link as the first comment.

Dachande663

Question for the crowd: if using the OpenAI service in Azure, is that included in the retention? OpenAI say API access but don’t specify if that’s just their endpoints or anyone running their models.

filmgirlcw

You’d have to check with Microsoft. OpenAI says that this doesn’t apply to customers with a Zero Data Retention endpoint policy, but my recollection is that Azure OpenAI doesn’t fall into that category unless it’s something that is explicitly paid for. That said, OpenAI also says that ChatGPT Enterprise customers aren’t impacted (aside from their standard policies around how long it takes to delete data, which they say is within 30 days), but only Microsoft would know if their API usage counts as “enterprise” or not.

puppycodes

I'd be very surprised if they weren't already doing this; the major change, however, might be attribution of those queries.

If you think this is scary you should see what google has been doing for a decade.

Trasmatta

I seriously doubt they were already doing this. What would the benefit have been? The vast majority of users will never delete their chats, so it's not like they lose a ton of data by hard deleting conversations.

puppycodes

The benefit is you train models on all the data you receive and also you want some audit trail of how that data got there. This is just a hunch though!

Trasmatta

What I mean is that the vast majority of conversations never get deleted by users anyway. So why risk breaking privacy laws (and their own privacy policy) for the percentage that do?

gmuslera

Not only ChatGPT users are affected: so are users of other hosted LLMs, most of which can get the same kind of orders from the jurisdictions they operate in.

Beijinger

Don't ask: How do I murder my wife and get away with it?

pxc

> Magistrate Judge Ona Wang granted the order within one day of the NYT's request. She agreed with news plaintiffs that it seemed likely that ChatGPT users may be spooked by the lawsuit and possibly set their chats to delete when using the chatbot to skirt NYT paywalls.

Are users who deliberately skirt paywalls ever shy about it? Since when?

paulddraper

Since lawsuits I guess.

pxc

Are they going after individual readers like those old RIAA and MPAA campaigns? Or does this just refer to the lawsuit between these two companies?