
The journalists training AI models for Meta and OpenAI

bix6

“Most of Partika’s work on Outlier takes place in 30-minute blocks and requires reviewing real, anonymized chat histories from products like Meta AI or ChatGPT.”

How anonymized can this really be? People will inevitably input PII or other details that could de-anonymize them. Is Scale AI first having its employees screen for PII? Do Meta and OpenAI do a first pass?

kleiba

> “Most of Partika’s work on Outlier takes place in 30-minute blocks and requires reviewing real, anonymized chat histories from products like Meta AI or ChatGPT.”

This isn't exactly "training AI models" in the sense in which we normally use that expression.

drsim

It is RLHF if I understand correctly.

grumpopotamus

Well, HF.

alphabetting

The fact they are reaching out to journos to read the logs seems really misguided. Like from a customer standpoint that's the worst industry you could reach out to for checking private messages lol

nick486

Yes. There was just recently a post about a person whose life was saved by ChatGPT reading his blood results and saying "ER. NOW." Would that medical result PDF be anonymized here? Stripping PII from random PDFs sounds like a very nontrivial problem.

What if, instead of a random internet person, some celebrity asks ChatGPT about some spicy medical results? Would the journalist reviewing the logs resist the temptation of "accidentally finding the test results in a garbage bin"?

What I read here is "don't discuss with ChatGPT anything you wouldn't be comfortable becoming public knowledge."

diggan

> What I read here is "don't discuss with ChatGPT anything you wouldn't be comfortable becoming public knowledge."

For the last two decades, I've lived by a similar mantra: Don't send anything over the internet you aren't comfortable becoming public knowledge.

Make the mantra broad enough and you don't have to care about specific services; they all have a chance of leaking what is supposed to be "secret".

diggan

> Like from a customer standpoint that's the worst industry

I guess it depends on the country, but generally journalists are some of the more principled workers when it comes to protecting the privacy of the people they interact with. It's probably the industry where Signal has the highest usage, if I had to guess.

But again, really depends on the country. My perspective is probably biased by growing up in Sweden.

dannyw

A journalist wilfully breaking a legally binding confidentiality agreement is actually a terrible sign for them.

Media conglomerates will deeply worry about a journo leaking their dirty internal secrets if they morally disagree. Disney, Comcast, Fox, or Bezos don’t want them.

Sources will worry about confidentiality. If a journo confirms something is off the record, it’s off the record. No buts. This is treated very seriously: it ruins the entire publication’s reputation and ability to talk to sources.

If a naive journo tries, it’ll be killed by their editor, if not the editor-in-chief, probably under the veneer of legal and/or ethical grounds.

Of course, a journo can talk to someone else who chooses to disclose whatever, be protected, etc, and that’s how it’s done. But the oldest adage in journalism is: “don’t be the story”.

It’s probably one of the best professions, tbh, as paradoxical as it sounds.

Remember that the journalism industry, as a whole, is not the idealised dream you think it is.

pastage

I agree with you. Though it took me a few read-throughs before I understood you liked journalists for this job. I find it interesting that it is so hard to understand people in.

dullcrisp

Why are you saying it like it’s a veneer of legal or ethical grounds? Publishing something that was said off the record would be a violation of professional ethics, whether you personally agree with those ethics or not.

danielscrubs

”The skills the recruiter alluded to were her journalism experience — her professional writing, research, and fact-checking abilities”

Anyone want to tell them?

jgalt212

Google killed journalism years ago. It was once a well compensated, high prestige job with nice perks. The recruits should get the money while they can.

I've been very disappointed by the MSM over the last 15 years, but I think a good portion of this disappointment can be attributed to the talent pool drying up.

Kuinox

The 3D-printed figure has some serious layer shifting.