How we’re responding to The NYT’s data demands in order to protect user privacy

molf

It would help tremendously if OpenAI would make it possible to apply for zero data retention (ZDR). For many business needs there is no reason to store or log any request at all.

In theory it is possible to apply (it's mentioned in multiple locations in the documentation), but in practice requests are just being ignored. I get that approval needs to be given, and that there are barriers to entry. But it seems to me they mention zero data retention only for marketing purposes.

We have applied multiple times and have yet to receive ANY response. Reading through the forums this seems very common.

miles

> I get that approval needs to be given, and that there are barriers to entry.

Why is approval necessary, and what specific barriers (before the latest ruling) prevent privacy and no logging from being the default?

OpenAI’s assurances have long been met with skepticism by many, with the assumption that inputs are retained, analyzed, and potentially shared. For those concerned with genuine privacy, local LLMs remain essential.

AlecSchueler

> what specific barriers (before the latest ruling) prevent privacy and no logging from being the default?

Product development?

ArnoVW

My understanding is that they log for 30 days by default, for handling of bugs, and that you can request 0 days. This is from their documentation.

pclmulqdq

The missing ingredient is money.

jewelry

Not just money. How are you going to handle a client's support ticket if there are no logs at all?

ethbr1

Don't. "We're unable to provide support for your request, because you disabled retention." Easy.

lmm

> In theory it is possible to apply (it's mentioned in multiple locations in the documentation), but in practice requests are just being ignored. I get that approval needs to be given, and that there are barriers to entry. But it seems to me they mention zero data retention only for marketing purposes.

What's the betting that they just write it on the website and never actually implemented it?

sigmoid10

Tbf the approach seems pretty standard. Azure also only offers zero retention to vetted customers and otherwise retains data for up to 30 days to monitor and detect abuse. Since the possibilities for abuse are so high with these models, it would make sense that they don't simply give that kind of privilege to everyone - if only to cover their own legal position.

belter

If this stands I don't think they can operate in the EU.

bunderbunder

I highly doubt this court order affects people using OpenAI services from the EU, as long as they're connecting to EU-based servers.

glookler

>> Does this court order violate GDPR or my rights under European or other privacy laws?

>> We are taking steps to comply at this time because we must follow the law, but The New York Times’ demand does not align with our privacy standards. That is why we’re challenging it.

_jab

> How will you store my data and who can access it?

> The content covered by the court order is stored separately in a secure system. It’s protected under legal hold, meaning it can’t be accessed or used for purposes other than meeting legal obligations.

> Only a small, audited OpenAI legal and security team would be able to access this data as necessary to comply with our legal obligations.

So, by OpenAI's own admission, they are taking abundant and presumably effective steps to protect user privacy here? In the unlikely event that this data did somehow leak, I'd personally be blaming OpenAI, not the NYT.

Some of the other language in this post, like repeatedly calling the lawsuit "baseless", really makes this just read like an unconvincing attempt at a spin piece. Nothing to see here.

tptacek

No, there is a whole news cycle about how chats you delete aren't actually being deleted because of a lawsuit, they essentially have to respond. It's not an attempt to spin the lawsuit; it's about reassuring their customers.

VanTheBrand

The part where they go out of their way to call the lawsuit baseless is spin though, and mixing that with this messaging presents a mixed message. The NYT lawsuit is objectively not baseless. OpenAI did train on the Times, and ChatGPT does output information from that training. That's the basis of the lawsuit. NYT may lose; this could end up being considered fair use; it might ultimately be a flimsy basis for a lawsuit. But to say it's baseless (and with nothing to back that up) is spin and makes this message less reassuring.

tptacek

No, it's not. It's absolutely standard corporate communications. If they're fighting the lawsuit, that is essentially the only thing they can say about it. Ford Motor Company would say the same thing (well, they'd probably say "meritless and frivolous").

adamsb6

I’m typing these words from a brain that has absorbed copyrighted works.

mhitza

My understanding is that they have to keep chats based on an order, *as a result of their previous accidental deletion of potential evidence in the case*[0].

And per their own terms they likely only delete messages "when they want to" given the big catch-alls. "What happens when you delete a chat? -> It is scheduled for permanent deletion from OpenAI's systems within 30 days, unless: It has already been de-identified and disassociated from your account"[1]

[0] https://techcrunch.com/2024/11/22/openai-accidentally-delete...

[1] https://help.openai.com/en/articles/8809935-how-to-delete-an...

ofjcihen

They should include the part where the order is a result of them deleting things they shouldn’t have then. You know, if this isn’t spin.

Then again I'm starting to think OpenAI is gathering a cult-leader-like following, where any negative comment will result in devoted followers, or those with something to gain, immediately jumping to its defense no matter how flimsy the ground.

gruez

>They should include the part where the order is a result of them deleting things they shouldn’t have then. You know, if this isn’t spin.

From what I can tell from the court filings, prior to the judge's order to retain everything, the request to retain everything was coming from the plaintiff, with OpenAI objecting to the request and refusing to comply in the meantime. If so, it's a bit misleading to characterize this as "deleting things they shouldn't have", because what they "should have" done wasn't even settled. That's a bit rich coming from someone accusing OpenAI of "spin".

mmooss

> It's not an attempt to spin the lawsuit; it's about reassuring their customers.

It can be both. It clearly spins the lawsuit - it doesn't present the NYT's side at all.

roywiggins

It would be extremely unusual (and likely very stupid) for the defendant in a lawsuit to post publicly that the plaintiff maybe has a point.

fallingknife

Why does OpenAI have any obligation to present the NYT's side?

conartist6

It's hard to reassure your customers if you can't address the elephant in the room. OpenAI brought this on themselves by flouting copyright law and assuring everyone else that such aggressive and probably-illegal action would be retroactively acceptable once they were too big to fail.

lxgr

If the stored data is found to be relevant to the lawsuit during discovery, it becomes available to at least both parties involved and the court, as far as I understand.

sashank_1509

Obviously OpenAI's point of view will be their point of view. They are going to call this lawsuit baseless; otherwise they would not be fighting it.

ivape

To me it's pretty clear how this will play out. You will need to buy additional credits or subscriptions through these LLMs that feed payments back to things like NYT and book publishers. It's all stolen. I don't even want to hear it. This company doesn't want to pay up and is willing to let users' privacy hang in the balance to draw the case out until they get sure footing with their device launches or the like (or additional markets like enterprise, etc.).

Workaccount2

> It's all stolen.

LLMs are not massive archives of data. The big models are a few TB in size. No one is forgoing a NYT subscription because they can ask ChatGPT to print out NYT news stories.
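A back-of-envelope sketch of the sizes involved (the 1-trillion-parameter figure below is an illustrative assumption, not any specific model):

    # Weights-only storage scales with parameter count and precision,
    # not with the volume of text the model was trained on.
    params = 1_000_000_000_000   # hypothetical 1T-parameter model (assumed)
    bytes_per_param = 2          # 16-bit (fp16/bf16) weights
    size_tb = params * bytes_per_param / 1e12
    print(f"{size_tb:.1f} TB")   # prints "2.0 TB": weights, not an archive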

fallingknife

Copyright is pretty narrowly tailored to verbatim reproduction of content so I doubt they will have to pay anything.

pritambarhate

Maybe because you are not an OpenAI user. I am. I find it useful and I pay for it. I don't want my data to be retained beyond what's promised in the Terms of Use and Privacy Policy.

I don't think the judge is equipped to handle this case if they don't understand how their order jeopardizes the privacy of millions of users worldwide who don't even care about NYT's content or bypassing their paywalls.

conartist6

You live on a pirate ship. You have no right to ignore the ethics and law of that just because you could be hurt in a conflict related to piracy.

DrillShopper

The OpenAI Privacy Policy specifically allows them to keep data as required by law.

mmooss

> who don't even care about NYT's content or bypassing their paywalls.

Whether or not you care is not relevant, and that is usually the case for customers. If a drug company resold an expensive cancer drug without IP, you might say 'their order jeopardizes the health of millions of users worldwide who don't even care about Drug Co's IP.'

If the NYT is right - I can only guess - then you are benefitting from the NYT IP. Why should you get that without their consent and for free - because you don't care?

> jeopardizes

... is a strong word. I don't see much risk - the NYT isn't going to de-anonymize users and report on them, or sell the data (which probably would be illegal). They want to see if their content is being used.

hiddencost

> So, by OpenAI's own admission, they are taking abundant and presumably effective steps to protect user privacy here? In the unlikely event that this data did somehow leak, I'd personally be blaming OpenAI, not the NYT.

I am not an Open AI stan, but this needs to be responded to.

The first principle of information security is that all systems can be compromised and the only way to secure data is to not retain it.

This is like saying "well, I know they didn't want to go skydiving, but we forced them to go skydiving and they died because they had a stroke mid-air, so it's their fault they died."

Anyone who makes promises about data security is at best incompetent and at worst dishonest.

JohnKemeny

> Anyone who makes promises about data security is at best incompetent and at worst dishonest.

Shouldn't that be "at best dishonest and at worst incompetent"?

I mean, would you rather be a competent person telling a lie or an incompetent person believing you're competent?

HPsquared

An incompetent but honest person is more likely to accept correction and respond to feedback generally.

hombre_fatal

You know how it's always been a meme that you'd be mortally embarrassed if your browser history ever leaked?

Imagine how much worse it is for your LLM chat history to leak.

It's even worse than your private comms with humans because it's a raw look at how you are when you think you're alone, untempered by social expectations.

vitaflo

WTF are you asking LLMs and why would you expect any of it to be private?

threecheese

This product is positioned as a personal copilot, and future iterations (based on leaked plans, may or may not be true) as a wholly integrated life assistant.

Why would a customer expect this not to be private? How can one even know how it could be used against them, when they don't even know what's being collected or gleaned from collected data?

I am following these issues closely, as I am terrified that my “assistant” will some day prevent me from obtaining employment, insurance, medical care, etc. And I'm just a non-law-breaking normie.

A current-day example would be TX state authorities using third-party social/ad data to identify potentially pregnant women, along with ALPR data purchased from a third party, to identify any who attempt to have an out-of-state abortion, so they can be prosecuted. Whatever you think about that law, it is terrifying that a shift in it could find arbitrary digital signals being used against you in this way.

hombre_fatal

It's not that the convos are necessarily icky.

It's that it's like watching how someone might treat a slave when they think they're alone. And how you might talk down to or up to something that looks like another person. And how pathetic you might act when it's not doing what you want. And what level of questions you outsource to an LLM. And what things you refuse to do yourself. And how petty the tasks might be, like workshopping a stupid twitter comment before you post it. And how you copied that long text from your distraught girlfriend and asked it for some response ideas. etc. etc. etc.

At the very least, I'd wager that it reveals that bit of true helpless patheticness inherent in all of us that we try so hard to hide.

Show me your LLM chat history and I will learn a lot about your personality. Nothing else compares.

alec_irl

> how you copied that long text from your distraught girlfriend and asked it for some response ideas

good lord, if tech were ethical then there would be mandatory reporting when someone consults an LLM to tell them how they should be responding to their intimate partner. are your skills of expression already that hobbled by chat bots?

Jackpillar

Might have to re-emphasize his question, but: what questions are you asking your LLM? Why are you responding to it and/or "treating" it differently than you would a calculator or search engine?

ofjcihen

“Write a song in the style of Slipknot about my dumb inbred dogs. I love them very much but they are…reaaaaally dumb.”

To be fair the song was intense.

supriyo-biswas

I wonder whether OpenAI legal can make the case for storing fuzzy hashes of the content, in the form of ssdeep[1] hashes or content-defined chunks[2] of said data, instead of the actual conversations themselves.

After all, since the NYT has a very limited corpus of information, and supposedly people are generating infringing content using their APIs, said hashes could be used to check whether such content has been generated.

I'd rather have them store nothing, but given the overly broad court order I think this may be the best middle ground. Of course, I haven't read the lawsuit documents and don't know if NYT is requesting far more, or alleging some indirect form of infringement which would invalidate my proposal.

[1] https://ssdeep-project.github.io/ssdeep/index.html

[2] https://joshleeb.com/posts/content-defined-chunking.html
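Roughly what I have in mind, as a minimal sketch (assuming the Python ssdeep bindings; the article ID, file name, and similarity threshold are made up for illustration):

    import ssdeep  # fuzzy-hashing bindings: pip install ssdeep

    # One-time step: fuzzy-hash each article in the reference corpus.
    corpus_hashes = {
        "nyt-example-article": ssdeep.hash(open("article.txt").read()),
    }

    def log_output(model_output: str) -> str:
        # Retain only the fuzzy hash, never the conversation text itself.
        # (Caveat: ssdeep is unreliable on very short inputs.)
        return ssdeep.hash(model_output)

    def similar_articles(output_hash: str, threshold: int = 60) -> list[str]:
        # ssdeep.compare scores two hashes from 0 (unrelated) to 100 (identical).
        return [aid for aid, h in corpus_hashes.items()
                if ssdeep.compare(output_hash, h) >= threshold]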

paxys

Yeah, try explaining any of these words to a lawyer or judge.

sthatipamala

The judges in these technical cases can be quite sophisticated and absolutely do learn terms of art. See Oracle v. Google (Java API case)

anshumankmr

I looked up the judge for this one (https://en.wikipedia.org/wiki/William_Alsup), who was a hobbyist BASIC programmer. If that is the bar, one would need a judge who coded MNIST as a pastime hobby.

fc417fc802

I thought that's what GPT was for.

m463

"you are a helpful law assistant."

landl0rd

"You are a long-suffering clerk speaking to a judge who's sat the same federal bench for two decades and who believes 'everything is computer' constitutes a deep technical insight."

LandoCalrissian

Trying to actively circumvent the intention of a judge's order is a pretty bad idea.

Aeolun

That’s not circumvention though. The intent of the order is to be able to prove that ChatGPT regurgitates NYT content, not to read the personal communications of all ChatGPT users.

girvo

Deeply, deeply so. In fact so much so that people who suggest them show they've (luckily) not had to interact with the legal system much. Judges take an incredibly dim view of that kind of thing haha

delusional

I haven't been able to find any of the supporting documents, but the court order makes it seem like OpenAI has been unhelpful in producing any alternative during the conversation.

For example, the judge seems to have asked if it would be possible to segregate data that the users wanted deleted from other data, but OpenAI has failed to answer. Not just denied the request, but simply ignored it.

I think it's quite likely that OpenAI has taken the PR route instead of seriously engaging with any way to constructively honor the request for retention of data.

bigyabai

All of that does fit on a real spiffy whitepaper. Let's not fool around though, every ChatGPT session is sent directly into an S3 bucket that some three-letter spook backs up onto their tapes every month. It's a database of candid, timestamped text interactions from a bunch of rubes that logged in with their Google account - you couldn't ask for a juicier target unless you reinvented email. Of course it's backdoored, you can't even begin to try proving me wrong.

Maybe I'm alone, but a pinkie-promise from Sam Altman does not confer any assurances about my data to me. It's about equally as reassuring as a singing telegram from Mark Zuckerberg dancing to a song about how secure WhatsApp is.

landl0rd

Of course I can't even begin trying to prove you wrong. You're making an unfalsifiable statement. You're pointing to the Russell's teapot of sigint.

It's well-established that the American IC, primarily NSA, collects a lot of metadata about internet traffic. There are some justifications for this and it's less bad in the age of ubiquitous TLS, but it generally sucks. However, legal protections against directly spying on the actual decrypted content of Americans are at least in theory stronger.

Snowden's leaks mentioned the NSA tapping inter-DC links of Google and Yahoo, so I doubt if they had to tap links that there's a ton of voluntary cooperation.

I'd also point out that trying to parse the unabridged prodigious output of the SlopGenerator9000 is a really hard task unless you also use LLMs to do it.

tdeck

> Snowden's leaks mentioned the NSA tapping inter-DC links of Google and Yahoo, so I doubt if they had to tap links that there's a ton of voluntary cooperation.

The laws have changed since then and it's not for the better:

https://www.aclu.org/press-releases/congress-passing-bill-th...

dmurray

> You're pointing to the Russell's teapot of sigint.

If there were multiple agencies with billion dollar budgets and a belief that they had an absolute national security mandate to get a teapot into solar orbit, and to lie about it, I would believe there was enough porcelain up there to make a second asteroid belt.

cwillu

> I'd also point out that trying to parse the unabridged prodigious output of the SlopGenerator9000 is a really hard task unless you also use LLMs to do it.

The input is what's interesting.

Yizahi

Metadata is spying (c) Bruce Schneier

If a CIA spook is stalking you everywhere, documenting your every visible move or interaction, you probably would call that spying. Same applies to digital.

Also, the teapot argument can be applied in reverse. We have all these documented open digital network systems everywhere, and you want to say that one of the most unprofitable and certainly the most expensive-to-run systems is somehow protecting all user data? That belief is based on what? At least selling data is based on the evidence of the industry and on the actual ToSes of other similar corpos.

rl3

>However, legal protections against directly spying on the actual decrypted content of Americans are at least in theory stronger.

Yeah, because the definition of collection was redefined to mean accessing the full content already stored on their systems, post-interception. It wasn't considered collected until an analyst viewed it. Metadata was a laughable dog-and-pony show that was part of the same legal shell games at the time, over a decade ago now.

That said, from an outsider's perspective it sounded like the IC did collectively erect robust guard rails such that access to information was generally controlled and audited. I felt like this broke down a bit once sharing 702 data with other federal agencies was expanded around the same time period.

These days, those guard rails might be the only thing standing in the way of democracy as we know it ending in the US. AI processing applied to full-take collection is terrifying, just ask the Chinese.

Workaccount2

My pet conspiracy theory is that the three-letter agencies actively encourage these omnipresent, all-knowing conspiracy theories because they ultimately play into their hands. Sorta like a Santa Claus for citizens.

zer00eyz

> However, legal protections against directly spying on the actual decrypted content of Americans are at least in theory stronger.

This was the point of a lot of the Five Eyes programs. It's not legal for the US to spy on its own citizens, but it isn't against the law for us to do it to the Australians... who are all too happy to reciprocate.

> Snowden's leaks mentioned the NSA tapping inter-DC links of Google and Yahoo...

Snowden's info wasn't really news for many of us who were paying attention in the aftermath of 9/11: https://en.wikipedia.org/wiki/Room_641A (This was huge on slashdot at the time... )

komali2

There's no way to know, but it's safer to assume.

7speter

Maybe I'm wrong, and maybe this was discussed previously, but of course OpenAI keeps our data: they use it for training!

nl

As the linked page points out you can turn this off in settings if you are an end user or choose zero retention if you are an API user.

rl3

>Of course it's backdoored, you can't even begin to try proving me wrong.

On the contrary.

>Maybe I'm alone, but a pinkie-promise from Sam Altman does not confer any assurances about my data to me.

I think you're being unduly paranoid. /s

https://www.theverge.com/2024/6/13/24178079/openai-board-pau...

https://www.wsj.com/tech/ai/the-real-story-behind-sam-altman...

farts_mckensy

Think of all the complete garbage interactions you'd have to sift through to find anything useful from a national security standpoint. The data is practically obfuscated by virtue of its banality.

artursapek

I’ve done my part cluttering it with my requests for the same banana bread recipe like 5 separate times.

brigandish

Search engines have been doing this since the mid-90s and have only improved; to think that any data is obfuscated by being part of some huge volume of other data is a fallacy at best.

bigyabai

"We kill people based on metadata." - National Security Agency Gen. Michael Hayden

Raw data with time-series significance is their absolute favorite. You might argue something like Google Maps data is "obfuscated by virtue of its banality" until you catch the right person in the wrong place. ChatGPT sessions are the same way, and it's going to be fed into aggregate surveillance systems in the way modern telecom and advertiser data is.

tomhow

Related discussion:

OpenAI slams court order to save all ChatGPT logs, including deleted chats - https://news.ycombinator.com/item?id=44185913 - June 2025 (878 comments)

sega_sai

Strange smear against NYT. If NYT has a case, and the court approves that, it's bizarre to use the court order to smear NYT. If there is no case, "Open"AI will have a chance to prove its case in court.

lxgr

The NYT is, in my view, exploiting a systematic weakness of the US legal system here, i.e. extremely wide-reaching discovery laws with almost no regard for the privacy of parties not involved in a given dispute, or for aspects of their lives not relevant to the dispute at hand.

Of course it's out of self-serving interests, but I find it hard to disagree with OpenAI on this one.

JumpCrisscross

> with almost no regard for the privacy of parties not involved in a given dispute

Third-party privacy and relevance is a constant point of contention in discovery. Exhibit A: this article.

thinkingtoilet

The privacy onus is entirely on the company. If Open AI is concerned about user privacy then don't collect that data. End of story.

acheron

…the whole point of this story is that the court is forcing them to collect the data.

Arainach

What right to privacy? There is no right to have your interactions with a company (1) remain private, nor should there be. Even if there were, you agree to let OpenAI do essentially whatever they want with your data, including handing it over to the courts in response to a subpoena.

(1) With limited, well-scoped exclusions for lawyers, medical records, etc.

ChadNauseam

Given how many important interactions people have with companies in our modern age, saying "There is no right to have your interactions with a company remain private" is essentially equivalent to saying "there is no right to privacy at all". When I talk to my friends over facetime or imessage, that interaction is being mediated by Apple, as well as by my internet service provider and (I assume) many other parties.

fc417fc802

> There is no right to have your interactions with a company (1) remain private, nor should there be.

Why should two entities not be able to have a confidential interaction if that is what they both want? Certainly a court order could supersede such a right just as it could most others provided sufficient evidence. However I would expect such things to be both highly justified and narrowly targeted.

This specific case isn't so much about a right to privacy as it is a more general freedom to enter into contracts with others and expect those to be honored.

bionhoward

It's also a matter of competition… there are other AI services available today with privacy policies ranging from no training by default to the ability to opt out of training, turn off data retention, or use e2e encryption. A lot of workloads (cough, working on private git repos) logically require private AI to make sense.

levocardia

But there's a very big difference between "no company is legally required to keep your data private" and "a company that explicitly and publicly wants to protect your privacy is being legally coerced into not keeping your data private".

1shooner

>(1) With limited well scoped exclusions for lawyers, medical records, erc.

Is this referring to some actual legal precedent, or just your personal opinion?

lxgr

That may be your or your jurisdiction's view, but such privacy rights definitely exist in many countries.

You might have heard of the GDPR, but even before that, several countries had "privacy by default" laws on the books.

davedx

Hello. I live in the EU. Have you heard of GDPR?

Imustaskforhelp

But if both parties agree, then there should be the freedom to stay private.

Your comment is dystopian given how some people treat AI as their "friend": imagine that no matter what encrypted messaging app or whatever else they use, the govt still snoops.

visarga

NYT wants it both ways. When they were the ones putting freelancer articles into a database to rent, they argued against enforcing copyright and for supporting the new industry, and that it was too hard to revert their original assumptions. Now they absolutely love copyright.

https://harvardlawreview.org/blog/2024/04/nyt-v-openai-the-t...

moefh

Another way of looking at it is that they lost that case over 20 years ago, and have been building their business model for 20 years accordingly.

In other words, they want everyone to be forced to follow the same rules they were forced to follow 20 years ago.

tptacek

They're a party to the case! Saying it's baseless isn't a "smear". There is literally nothing else they can say (other than something synonymous with "baseless", like "without merit").

lucianbr

Oh they definitely can say other things. It's just that it would be inconvenient. They might lose money.

I wonder if the laws and legal procedures are written with the general assumption that a party to a lawsuit will naturally lie if it is in their interest. And then I read articles and comments about a "trust-based society"...

tptacek

I'm not taking one side or the other in the case itself, but it's lazy and superficial to suggest that the defendant in a civil suit would say anything other than that the suit has no merit. In the version of this statement where they generously interpret anything the NYT (I subscribe) says, they might as well just surrender.

I'm not sticking up for OpenAI so much as just for decent, interesting threads here.

fastball

This is the nature of the civil court system – it exists for when parties disagree.

Why would a defendant who agrees a case has merit go to court at all? Much easier (and generally less expensive) to make the other party whole, assuming the parties agree on what "whole" is. And if they don't agree on what "whole" is, we are back to square one and of course you'd maintain that the other side's suit is baseless.

wilg

> They might lose money.

I expect it's more about them losing the _case_. Silly to expect someone fighting a lawsuit not to try to win it.

mmooss

They could say nothing about the merits of the case.

eviks

And if NYT has no case, but the court approves it, is that still bizarre?

tootie

It's PR. OpenAI stole mountains of copyrighted content and are trying to make NYT look like bad guys. OpenAI would not be in the position of defending a lawsuit if they hadn't done something that is very likely illegal. OpenAI can also end this requirement right now by offering a settlement.

wyager

Lots of people abuse the legal system in various ways. They don't get a free pass just because their abuse is technically legal itself.

energy123

> Consumer customers: You control whether your chats are used to help improve ChatGPT within settings, and this order doesn’t change that either.

Within "settings"? Is this referring to the dark pattern of providing users with a toggle "Improve model for everyone" that doesn't actually do anything? Instead users must submit a request manually on a hard-to-discover off-app portal, but this dark pattern has deceived them into thinking they don't need to look for it.

sib301

Can you please elaborate?

energy123

To opt-out of your data being trained on, you need to go to https://privacy.openai.com and click the button "Make a Privacy Request".

alextheparrot

in the app: Settings ~> Data Controls ~> Improve the model for everyone

curtisblaine

Yes, could you please explain why toggling "Improve model for everyone" off doesn't do anything, and provide a link to this off-app portal that you mention?

atleastoptimal

I've always assumed that anything sent to any company's hosted API will be logged forever. To assume otherwise always seemed naive, like thinking that apps aren't tracking your web activity.

lxgr

Assuming the worst is wise, settling for the worst case outcome without any fight seems foolish.

fragmede

Privacy nihilism is a decision all on its own.

morsch

I'd only call it nihilism if you are in agreement with the grandparent and then do it anyway. Other choices are pretending it's not true (denialism), or just not thinking about it (ignorance). Or you complicate your life by not uploading your private info.

Barrin92

Not really; it's basically just being antifragile. Consider any corporate entity that interacts with you to be an Eldritch horror from outer space that wants to siphon your soul, because that's effectively what it is, and keep your business with them to a minimum.

It's just realism. Protect your private data yourself; relying on companies or governments to do it for you is, as the saying goes, letting a tiger devour you up to the neck and then asking it to stop at the head.

nraynaud

Isn't Altman collecting millions of eye scans? Since when did he care about privacy?

yoaviram

>Trust and privacy are at the core of our products. We give you tools to control your data—including easy opt-outs and permanent removal of deleted ChatGPT chats and API content from OpenAI’s systems within 30 days.

No you don't. You charge extra for privacy and list it as a feature on your enterprise plan. Not even paying Pro customers get "privacy". Also, you refuse to delete personal data included in your models and training data following numerous data protection requests.

that_was_good

Except all users can opt out. Am I missing something?

It says here:

> If you are on a ChatGPT Plus, ChatGPT Pro or ChatGPT Free plan on a personal workspace, data sharing is enabled for you by default, however, you can opt out of using the data for training.

Enterprise is just opt-out by default...

https://help.openai.com/en/articles/8983130-what-if-i-want-t...

bartvk

Indeed. Click your profile in the top right, click on the settings icon. In Settings, select "Data Controls" (not "privacy") and then there's a setting called "Improve the model for everyone" (not "privacy" or "data sharing") and turn it off.

bugtodiffer

So they technically kind of follow the law, but make it as hard as possible?

atoav

Not sharing your data with other users does not mean the data of a deleted chat is gone; those are very likely two completely different mechanisms.

And whether and how they use your data for their own purposes isn't touched by that either.

agos

What about all the rest of the data they use for training? There's no opt-out from that.

baxtr

This is a typical "corporate speak" / "trustwashing" statement. It's usually super vague, filled with feel-good buzzwords, with a couple of empty value statements sprinkled on top.

paxys

> Does this court order violate GDPR or my rights under European or other privacy laws?

> We are taking steps to comply at this time because we must follow the law, but The New York Times’ demand does not align with our privacy standards. That is why we’re challenging it.

That's a lot of words to say "yes, we are violating GDPR".

3836293648

No, they're not, because the GDPR has an explicit exception for when a court orders that a company keeps data for discovery. It'd only be a GDPR violation if it's kept after this case is over.

lompad

This is not correct.

> Any judgment of a court or tribunal and any decision of an administrative authority of a third country requiring a controller or processor to transfer or disclose personal data may only be recognised or enforceable in any manner if based on an international agreement, such as a mutual legal assistance treaty, in force between the requesting third country and the Union or a Member State, without prejudice to other grounds for transfer pursuant to this Chapter.

So if, and only if, an agreement between the US and the EU allows it explicitly, it is legal. Otherwise it is not.

dragonwriter

That's what they are trying to suggest, because they are still trying to use the GDPR as part of their argument challenging the US court order. (Kind of a longshot to get a US court to agree that the obligation of a US party to preserve evidence related to a suit in US courts under US law filed by another US party is mitigated by European regulations in any case, even if their argument that such preservation would violate obligations the EU had imposed on them were accepted.)

kelvinjps

Maybe the will ot store the chats of the European users?

esafak

Could a European court not have ordered the same thing? Is there an exception for lawsuits?

lxgr

There is, but I highly doubt a European court would have given such an order (or if they did, it would probably be axed by a higher court pretty quickly).

There's decades of legal disputes in some European countries on whether it's even legitimate for the government to mandate your ISP or phone company to collect metadata on you for after-the-fact law enforcement searches.

Looking at the actual data seems much more invasive than that and, in my (non-legally trained) estimate doesn't seem like it would stand a chance at least in higher courts.

dragonwriter

> There's decades of legal disputes in some European countries on whether it's even legitimate for the government to mandate your ISP or phone company to collect metadata on you for after-the-fact law enforcement searches.

> Looking at the actual data seems much more invasive than that

Looking at the data isn't involved in the current order, which requires OpenAI to preserve and segregate the data that would otherwise have been deleted. The reason for segregation is that any challenges OpenAI has to providing that data in discovery will be heard before anyone other than OpenAI is ordered to have access to the data.

This is, in fact, less invasive than the government mandating collection for speculative future uses, since it applies only to not destroying evidence already collected by OpenAI in the course of operating their business, and only for potential use, subject to other challenges by OpenAI, in the present case.

amluto

It appears that the “Zero Data Retention” APIs they mention are something that customers need to request access to, and that it’s really quite hard to get this access. I’d be more impressed if any API user could use those APIs.

JimDabell

I believe Apple’s agreement includes this, at least when a user isn’t signed into an OpenAI account:

> OpenAI must process your request solely for the purpose of fulfilling it and not store your request or any responses it provides unless required under applicable laws. OpenAI also must not use your request to improve or train its models.

https://www.apple.com/legal/privacy/data/en/chatgpt-extensio...

I wonder if we’ll end up seeing Apple dragged into this lawsuit. I’m sure after telling their users it’s private, they won’t be happy about everything getting logged, even if they do have that caveat in there about complying with laws.

fc417fc802

> I’m sure after telling their users it’s private, they won’t be happy about everything getting logged,

The ZDR APIs are not and will not be logged. The linked page is clear about that.

singron

If OpenAI cared about our privacy, ZDR would be a setting anyone could turn on.