Training "AI" On Public Data Is Totally Fine And Not Stealing.

31337@sh.itjust.works · 1 month ago

Zagorath@aussie.zone · 1 month ago

They have indeed made a statement of fact. But to the best of my knowledge it’s not one that’s got any definite controlling precedent in law.

> You are still not permitted to, for example, repost it elsewhere without the copyright holder’s permission

That’s the thing. It’s not clear that an LLM does “repost it elsewhere”. As the OP said, the model itself is basically just a mathematical construct that can’t really be turned back into the original work. That’s possibly a sign that it’s a transformative work rather than a derivative one, and transformative uses are much more likely to be given Fair Use protection. Though Fair Use is always a question mark, and you never really know whether a use is Fair without going to court.

You could be right here. Or OP could. As far as I’m concerned anyone claiming to know either way is talking out of their arse.