
Anthropic judge rejects $1.5B AI copyright settlement

jawns

I'm an author, and I've confirmed that 3 of my books are in the 500K dataset.

Thus, I stand to receive about $9,000 as a result of this settlement.

I think that's fair, considering that two of those books received advances under $20K and never earned out. Also, while I'm sure that Anthropic has benefited from training its models on this dataset, that doesn't necessarily mean that those models are a lasting asset.

shermozle

It's far from fair given that if _I_ breach copyright and get caught, I go to jail, not just pay a fine.

dragonwriter

> It's far from fair given that if _I_ breach copyright and get caught, I go to jail, not just pay a fine.

This settlement has nothing to do with any criminal liability Anthropic might have, only tort liability (and it involves damages, not fines).

stingraycharles

Also, you can’t put a business in jail.

mcv

Yeah, but this is a corporation. They don't go to jail. They're only people when it's beneficial to them.

weird-eye-issue

No you wouldn't

singpolyma3

You don't though

DyslexicAtheist

Hasn't the US been trying to extradite Kim Dotcom for years now? (Or at least it was in the past.)

stevage

What? Who goes to jail over copyright infringement?

hmmokidk

…Aaron Swartz?!

kg

> Penalties to be applied in cases of criminal copyright infringement (i.e., violations of 17 U.S.C. § 506(a)), are set forth at 18 U.S.C. § 2319. Congress has increased these penalties substantially in recent years, and has broadened the scope of behaviors to which they can apply. See this Manual at 1847.

> Statutory penalties are found at 18 U.S.C. § 2319. A defendant, convicted for the first time of violating 17 U.S.C. § 506(a) by the unauthorized reproduction or distribution, during any 180-day period, of at least 10 copies or phonorecords, or 1 or more copyrighted works, with a retail value of more than $2,500 can be imprisoned for up to 5 years and fined up to $250,000, or both. 18 U.S.C. §§ 2319(b), 3571(b)(3).

If you broaden it to include DMCA violations you could spend a lot of time in jail. It's even worse in some other countries.

decremental

You don't go to jail for copyright infringement lol

dragonwriter

You can, but criminal copyright infringement has a narrower scope as well as a more stringent standard of proof compared to civil copyright infringement.

koolala

Even if you don't pay the exorbitant fines?

jonplackett

Will you actually get the money or will your publisher finally earn out the advances?

Unai

As I understand, this case is not about training but about illegitimately sourcing the books, so unless you sell your books at $3k per copy, I don't see how it is fair.

tartoran

> I think that's fair, considering that two of those books received advances under $20K and never earned out.

It may be fair to you but how about other authors? Maybe it's not fair at all to them.

terminalshort

Do they sell their books for more than $3000 per copy? In that case it isn't fair. Otherwise they are getting a windfall because of Anthropic's stupidity in not buying the books.

paulryanrogers

Some judgements are punitive, to deter future abuse. Otherwise why pay for anything when you can just always steal and pay only what's owed whenever you're caught?

godelski

  | Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.[0]
Please don't be disingenuous. You know that none of the authors were selling their books for $3k apiece, so obviously this is about something more.

  > because of Anthropic's stupidity in not buying the books.
And what about OpenAI, who did the same thing?

What about Meta, who did the same thing?

What about Google, who did the same thing?

What about Nvidia, who did the same thing?

Clearly something should be done because it's not like these companies can't afford the cost of the books. I mean Meta recently hired people giving out >$100m packages and bought a data company for $15bn. Do you think they can't afford to buy the books, videos, or even the porn? We're talking about trillion dollar companies.

It's been what, a year since Eric Schmidt said to steal everything and let the lawyers figure it out if you become successful?[1] Personally, I'm not a big fan of "the ends justify the means" arguments. It's led to a lot of unrest, theft, wars, and death.

Do you really not think it's possible to make useful products ethically?

[0] https://news.ycombinator.com/newsguidelines.html

[1] https://www.theverge.com/2024/8/14/24220658/google-eric-schm...

giveita

If I copy your book and sell a million bootleg copies that compete directly with your book is that worth the $30 cover price?

This is what generative AI essentially is.

Maybe the payment should be $500/h (say $5k a page) to cover the cost of preparing a human verified dataset for anthropic.

jawns

Then they can opt out of the class.

gowld

Or the judge can reject the settlement as insufficient, which is what TFA is about.

thayne

How much of that $9000 will go to your publisher?

jawns

Remains to be seen, but generally the holder of copyright is the author not the publisher.

jonathanstrange

That depends on the publishers and your standing with them. Many publishers want a copyright transfer agreement whereas others are fine with exclusive licensing rights. You can't transfer copyright in some countries (e.g. Germany) but you can in the US.

franze

Where can I check if my book was in it?

nextworddev

Fair for you maybe

echelon

> that doesn't necessarily mean that those models are a lasting asset.

It remains to be seen, but typically this forms a moat. Other companies can't bring together the investment resources to duplicate the effort and they die.

The only reasons why this wouldn't be a moat:

1. Too many investment dollars and companies chasing the same goal, and none of them consolidate. (Non-consolidation feels impractical.)

2. Open source / commoditize-my-complement offerings that devalue foundation models. We have a few of these, but the best still require H100s and they're not building product.

I think there's a moat. I think Anthropic is well positioned to capitalize on this.

anp

Comments so far seem to be focusing on the rejection without considering the stated reasons for rejection. AFAICT Alsup is saying that the problems are procedural (how do payouts happen, does the agreement indemnify Anthropic from civil “double jeopardy”, etc), not that he’s rejecting the negotiated payout. Definitely not a lawyer but it seems to me like the negotiators could address the rejection without changing any dollar numbers.

rideontime

Direct link to Judge Alsup's order: https://www.bloomberglaw.com/public/desktop/document/Bartzet...

Name should sound familiar to those who follow tech law; he presided over Oracle v Google, along with Anthony Levandowski's criminal case for stealing Waymo tech for Uber.

wrsh07

As someone who has had a passing interest in most of these cases, I've actually come to like Alsup and am impressed by his technical understanding.

His orders and opinions are, imo, a success story of the US judicial system. I think this is true even if you disagree with them

darkwizard42

He actually does understand most of what he is ruling on which is a welcome surprise. Not just legal jargon but also the technical spirit of what is at stake.

lxe

Good. Approving this would have set a concerning precedent.

Edit: My stance on information freedom and copyright hasn't changed since Aaron Swartz's death in 2013. Intellectual property laws, patents, copyright, and similar protections feel outdated and serve mainly to protect established interests. Despite widespread piracy making virtually all media available immediately upon release, content creators and media companies continue to grow and profit. Why should publishers rely on century-old laws to restrict access?

tene80i

Because whenever anyone argues that all creative and knowledge works should be freely available, accessible without compensating the creators, they conveniently leave out software and the people who make it.

Moreover, IP law protects plenty of people who aren’t “established interests”. You just, perhaps, don’t know them.

lxe

I make the software. I use free software and I contribute to free software. I wish all the software were free from all sorts of restrictions.

tene80i

That’s great. But not exclusively, right? What about your salary, assuming you’re a professional in software? Do you still want that? I would argue you deserve it, but I also believe authors and other creators should be compensated. Too many people here argue for the software professional compensation only, conveniently.

alok-g

Is that saying that all software should be free? And extending beyond software, that all books, art, movies, etc., should be free to copy? Likewise, would it be fine for anyone to use any other company's logo?

gabriel666smith

Would it actually set any kind of legal precedent, or just establish a sort of cultural vibe baseline? I know Anthropic doesn't have to admit fault, and I don't know if that establishes anything in either direction. But I'm not from the US, so I wouldn't want to pretend to have intimate knowledge of its system.

The number of bizarre, contradictory inferences this settlement asks you to make - no matter your stance on the wider question - is wild.

stingraycharles

A settlement means that no legal precedent is set, so I can only assume a cultural precedent.

Sometimes these companies specifically seek out a settlement to avoid setting a legal precedent in case they feel like they will lose.

lxe

Hmm, my huge concern was that if the settlement were to get approved, it would set a legal precedent for other "settlement approvals" like this one, setting back AI research in the US and paving the way for China to win the race.

cleandreams

The judge IIRC found that training models using copyrighted materials was fair use. I disagree. Furthermore this will be a problem for anyone who generates text for a living. Eventually LLMs will undercut web, news, and book publishing because LLMs capture the value and don't pay for it. The ecosystem will be harmed.

The only problem the judge found here was training on pirated texts.

jokoon

That title is weird, what is an "Anthropic judge"?

rideontime

The judge for the Anthropic lawsuit, obviously.

alok-g

Indeed. While I could sense what was implied, I also thought of some newly-launched 'AI Judge' by Anthropic making the said claim. :-)

phaedryx

It sounds like the judge works for Anthropic

giveita

A human judge. Make the most of it, times are changing.

paddw

Anthropic should drop the deal and take the battle up the court system, they'll probably win

AuthError

They did; the judge told the authors to get better representation to prove harm. I think it's dicey, because if Anthropic loses it could be catastrophic (i.e., if the judge or jury thinks the award should be 5x what the proposal is, they would need to raise a new round).

3np

Anthropic having to raise a new round doesn't sound "catastrophic"...

pier25

It's an indisputable fact they downloaded like 7M books illegally.

bhelkey

From the article:

> Alsup gave the parties a Sept. 15 deadline to submit a final list of works, which currently stands around 465,000.

rvz

> they'll probably win

They are settling because the risk of losing will cost their entire business.

Anthropic knows that they would lose if they were brought to trial.

wrsh07

Yeah settling seems good for investors imo. It's variance reduction.

alok-g

Indeed.

I know little but perhaps the harm felt on future valuation is more than the settlement amount.

firesteelrain

How do any of these AI companies protect authors from users uploading the full PDF or even plaintext of anything? Aren’t the same piracy concerns real even if they train on what users are providing?

jahbrewski

If you’re vacuuming, shouldn’t you be responsible for what you’re vacuuming?

robryan

Training aside, an LLM reading a PDF as part of a prompt feels similar to, say, Dropbox storing a PDF for you.

terminalshort

It's not similar at all because you can't get the book back out of the LLM like you can out of Dropbox. Copyright law is concerned with outputs, not with inputs. If you could make a machine that could create full exact copies of books without ever training on or copying those books, that would still be infringement.

gowld

If this is detected as leading to copyright violation, then that can be the subject of a lawsuit.

Since the violation is detected via model output, it doesn't matter what the input method is.

SamoyedFurFluff

For folks unable to access the full article, does the judge say why?

jorams

Two paragraphs from the article that I think sum it up pretty nicely:

> Judge William Alsup at the hearing said the motion to approve the deal was denied without prejudice, but in a minute order after the hearing said approval is postponed pending submission of further clarifying information.

> Alsup said class members “get the shaft” in many class actions once the monetary relief is established and attorneys stop caring. He told the parties that “very good notice” must be given to class members to ensure they have the opportunity to opt in or out, and protect Anthropic from potential claimants coming out of the woodwork later.

Essentially he has concerns about missing details in two directions:

1. How class members are going to get notified, submit claims, and get paid out, what works are even included, and the involvement of an army of lawyers that shouldn't be paid from the settlement.

2. How this deal is going to prevent Anthropic getting sued for cases that should have been covered.

stingraycharles

Almost feels like the judge is siding with Anthropic here. But he’s right that in these types of cases, the lawyers stop caring once a settlement is reached because that’s the massive pay day they were after.

HardCodedBias

Outputs not inputs needs to become law.

throwmeaway222

[flagged]

jawns

In class action lawsuits (Rule 23, Federal Rules of Civil Procedure in the U.S.) any settlement must be reviewed and approved by the judge, who has a duty to ensure it is "fair, reasonable, and adequate" for all members of the class. Judges often reject settlements if they appear unfair, too favorable to one side (often the defendant), or if attorney fees seem excessive.

SilverElfin

They may not have a choice. Fighting these legal battles is very expensive and exhausting. And books are a low margin business. Anthropic has access to funding. Authors? Not so much. Losing your book to AI for a one time $3000 settlement feels like a bad deal to me.

pier25

> Authors? Not so much. Losing your book to AI for a one time $3000 settlement feels like a bad deal to me.

AFAIK it's even worse as this settlement is only about downloading pirated copies of the books. IIRC the training itself was deemed fair use.

Dylan16807

Well things are worse in the sense that a different ruling feels really bad for them.

But it means that this case is getting them thousands of dollars instead of one or two purchases, which is a pretty good outcome.

visarga

> Losing your book to AI for a one time $3000 settlement feels like a bad deal to me.

I'm wondering how lost the book would be. What would be the difference in sales?

program_whiz

It isn't about sales in the short term. Same with code. Your project might even be OSS (so not expecting profit). It's about a system that exploits that for one party's profit, while putting the other out of business. They reduced the writer's ability to ever be employed again, or to make royalties at the same level as before (not necessarily reduced the sales of existing titles, though that may occur as well).

Same with code, AI hoovering all the code doesn't mean people won't use libCurl, but it does mean jobs are disappearing and people may not be around to write the next libCurl.

NelsonMinar

Incorrect information from a throwaway account. This is simply not true.

dboreham

Very wrong.

ceejayoz

Gotta love the confidence, though!