Name: 169: OpenAI admits they steal copyrighted material
Uploaded: 2024-01-10T13:16:55+00:00
Duration: 3 min 27 s

Podcasts Publishing media AI copyright

My friend Charles Benaiah takes apart the ongoing conflict between The New York Times and OpenAI. Bo Sacks picked it up the other day, and I’ll provide a link below.

It pains me to say it – honestly it does – but The New York Times is doing God’s work on this topic. This is, of course, the only topic where it’s possible to say that.

Here’s the basic problem. Publishers put their content online for “free” – supported by ads – because that was their path to discovery in the search engines. They didn’t have the foresight to put restrictive terms on access to that content. They should have made it clear that the content was only available under certain terms.

I mean – why the heck do we hire lawyers?

OpenAI took advantage of the ambiguity. They slurped up New York Times content and used it to create a service to compete with The New York Times.

That’s disgusting and bad form and all that, but is it technically illegal? That’s what we need to find out.

OpenAI claims this is “fair use” – which is an exception to copyright protection. I’m no lawyer, but I’ve been in and around copyright questions my whole career, and I think this is transparently stupid.

That doesn’t mean it won’t win in court.

The argument seems to be centering on whether or not AI is quoting copyrighted material verbatim. That is an incredibly short-sighted approach, because if the court rules that AI can’t quote verbatim, the AI companies will just put in a subroutine to make sure it doesn’t do that, and the fundamental problem will persist.

The fundamental problem is that OpenAI is using copyrighted content in a way that the copyright owner never approved. Unfortunately, publishers didn’t have the foresight to specifically disclaim this use of their content.

In a submission to a House of Lords committee in the UK, OpenAI has apparently admitted that it’s impossible to train their pet dragon without eating lots of young people.

They said "[l]imiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today's citizens."

This is a classic “the ends justify any means” argument.

I love ChatGPT. Just last night I had a great conversation with it about recipes to make fortified wines. It’s an amazing service.

And the neighborhoods controlled by the mafia were pretty safe.

Links

New York Times: All is fair in love and AI
https://mediamakersmeet.com/new-york-times-all-is-fair-in-love-and-ai/

OpenAI admits it's impossible to train generative AI without copyrighted materials
https://www.engadget.com/openai-admits-its-impossible-to-train-generative-ai-without-copyrighted-materials-103311496.html
