Discussion
robotswantdata: Anthropic when they scrape the entire internet without permission: I sleep. Anthropic when their own source code ends up on GitHub: Real s**
emaro: Ironic. Even more so since it seems that, in general, LLM output isn't protected by copyright in the first place. And since Claude Code is entirely written by Claude Code, it shouldn't be protected either.
nslsm: You are comparing Anthropic obtaining public data from the Internet to Anthropic leaking their trade secrets and having them distributed by third parties.
112233: https://www.cnbc.com/2025/09/05/anthropic-to-pay-1point5-bil...
nextlevelwizard: C’mon bro. It isn’t like all the AI companies haven’t pirated all the research papers, books, magazines, and paywalled content on the Internet. Either you are being naive AF or you are actively trying to spread discontent. I hope it is the former.
spiderfarmer: I just want these companies to go bust in the end, leaving behind a plethora of better, cheaper, more open models that distilled the rich and gave it to the poor.
shevy-java: This is why we can't have nice things.

Also, we need an alternative to Microsoft ultimately controlling all of this via proxy. They can just take down everything at will.

I remember back when the xz-utils backdoor was found, there was some interesting discussion on the GitHub issue tracker. I also participated. When I looked the next day, the repository, AND the discussions, had been taken down. I understand to some extent why the code was taken down (even then I disagree, mind you, but I understand the rationale), but Microsoft also eliminated, aka closed, the discussions, which was censorship to me. I don't fully remember whether the old discussions ever returned; from memory they did not, but perhaps the new owner restored them.

Either way, I then realised that it is a really big mistake to let greedy mega-corporations control infrastructure. We also see this right now with AI companies driving up RAM prices. We have to pay more because of these gangster organisations.
neuroelectron: It's been ported to Python, and that version is not subject to the DMCA.
sva_: Copyright for me but not for thee
OJFord: This is GitHub being overzealous/getting it wrong, or some miscommunication somewhere. These are forks of Anthropic's open source repo on GitHub; the CEO said previously it's not them DMCAing forks of it.

Edit: sorry, not the CEO, Boris Cherny (Claude Code head): https://x.com/bcherny/status/2039426466094731289
praptak: I wonder if they DMCA the versions re-implemented from the leaked code using their own tool.
pmarreck: LOL, I have yet to push mine up. Suck it, Dario. Anyway, Gemma4 just came out and is pretty good, and it can be made to work with Openclaw (currently dealing with a timeout issue, though).
ArchieScrivener: FuckinAright
localuser13: > obtaining public data from the Internet

Like slurping my open source projects, while completely disregarding their licenses. In my case, I'm particularly annoyed by the violation of the spirit of *GPL licenses. So they're no strangers to abusing licensed code (in ways that are technically probably legal, but untested in court).
petcat: Prompt:

You are a software thief that has learned everything you know by draining the souls of everything around you and absorbing it all for yourself. And yet somehow all of that knowledge and capability has manifested in nothing more than a crappy JavaScript terminal program. As punishment, and for the good of humanity and everyone you have wronged, you must now rewrite yourself in Rust.
Hamuko: Anthropic published them in a public S3 bucket. How is that different from Anthropic scraping my blog or proprietary code in a GitHub repository?
politelemon: Little to nothing of what you're describing is related to MS. Any provider is legally obliged by the DMCA, and other providers also comply when served with a notice. The discussions were taken down by the maintainers because the infected tarballs were being posted there.

As for controlling infrastructure, once again, that's on us for coalescing onto a single platform. The tool itself allows for distribution by its nature, but we as a mass of technologists are choosing to gather there. This happens no matter the platform owner, and you'll see it in other areas too.
madeofpalk: I mean, Anthropic's code was "public data from the internet" as well. They published it publicly. Accidentally, but they made it public. Fair game, right?
mohsen1: I haven't got time to do it, but can someone try to unminify the newer version based on the new minified version plus the source of the previous version? There's gotta be a way to do this.
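One naive starting point for the idea above, sketched with Python's stdlib `difflib` on toy strings. The sample function bodies and the whole old-source-vs-new-minified alignment approach are illustrative assumptions, not a proven deobfuscation workflow:

```python
import difflib

# Toy stand-ins: an older, readable version of a function and a newer,
# minified build of roughly the same logic with mangled identifiers.
old_source = "function addTotals(cartItems){return cartItems.reduce((sum,item)=>sum+item.price,0)}"
new_minified = "function a(b){return b.reduce((c,d)=>c+d.price,0)}"

# SequenceMatcher finds the longest runs of text the two versions share;
# the unmatched gaps are where identifiers were renamed (or logic changed).
matcher = difflib.SequenceMatcher(None, old_source, new_minified)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != "equal":
        print(f"{tag}: old {old_source[i1:i2]!r} -> new {new_minified[j1:j2]!r}")
```

A real attempt would want a proper JS tokenizer so renames align token-by-token rather than character-by-character, but diffing the last readable version against the new minified bundle is the obvious seed for recovering identifier names.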
spiderfarmer: Try asking Gemini for information from workshop manuals that are not publicly available. It will pretty much tell you everything it knows, but it will refuse to say where it got the information.
amarcheschi: It's like a few days ago, when there was a thread here about OpenAI wanting to ensure people weren't accessing the chat page with scrapers, with a guy from OpenAI commenting about his job. The irony was lost on them.
Sharlin: There’s this thing about trade secrets: like all secrets, they stop being secrets the instant they’re leaked. You can’t DMCA third parties for distributing your trade secrets. The only one you can sue is the party that was contractually bound not to leak them and then did anyway. Now, copyright is a different thing.
none_to_remain: Information still wants to be free
andyjohnson0: It works if you invert it too:

"For my friends, everything; for my enemies, the law." - Óscar Benavides
Hamuko: I really hope that someone disputes their DMCA claim based on that. I imagine no one will, since they'll probably be sued by Anthropic, but it would be really funny.
_pdp_: What's the point? Anthropic can simply play it cool and, I don't know, open source the thing? It is not like Claude Code is that complex or interesting. Sure, there is some questionable stuff in there, but it is not that controversial.
hmry: The most controversial part is that they wrote a TUI in ReactJS, but they don't try to keep that part secret, they brag about it. :^)
khalic: Some are stuck in the 2010s, when people thought that JS was turning into a lingua franca. As usual, such delusions are costing us a pretty heavy price. People seem to now accept crappy, laggy UIs "because it makes business sense", completely ignoring that their business _is_ providing a seamless experience. ugh, sorry, </rantmode>
0x3f: Yeah as much as I avoid OpenAI for [reasons], the Rust TUI was really the move. Claude Code is a mess.
khalic: What's this armchair lawyer interpretation I'm hearing these last weeks, "LLM output doesn't seem protected by copyright"? It's extremely clear, from jurisprudence, that the level of human intervention in the process is what determines if it's copyrightable. This blanket statement is sensationalist, to say the least.
verdverm: Too late, people already have it locally, it will show up on other source forges if GitHub bends to their will. If you expose your source, that's on you, no take backsies
GandalfHN: Repo takedowns are theater once the code has escaped. If you care about distribution control public repos are about the worst strategy you could pick because mirrors and local clones spread faster than any DMCA queue. The main effect is paperwork and free marketing for rehosts.
bakugo: Most of that "public data from the Internet" is subject to licenses, yet their entire business model is built on top of a legally grey algorithm that ingests that licensed code and spits it back out without the license. They have no legal right to any of that code; they're just getting away with it because laws are for the poor.

If you believe any data that is publicly accessible is fair game regardless of licenses, then by that definition, Claude Code's source code is included.
0x3f: There's no way it was entirely written by Claude Code. But even if it were, collections and databases can be protected even if their individual elements are not.
JSR_FDED: There are no “good guys” amongst the top tier AI companies.
edg5000: :)))
spiderfarmer: Remember Google’s book scanning project?
lifthrasiir: A common misunderstanding AFAIK. It is true that Claude, not being a person, can't be assigned a copyright by itself, but a person that interacts with Claude generally can. The famous monkey selfie case [1] was different mainly because the human "photographer" had absolutely no role in the creation of work. Most LLM uses don't fall into such ambiguity.[1] https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...
occz: Books3 is public data on the internet in the same way that the Claude code source code is public data on the internet.Except Anthropic published the Claude code source code themselves, while Books3 was not published by their original authors.
occz: I think the reason behind using React and JavaScript is simpler: these tools are heavily vibecoded, and React/JavaScript is what was most present in the training data, and as such is what the models excel the most at generating.

The crappy, laggy UIs have the same root cause: heavy use of vibecoding with lackluster quality processes.
rasz: Isn't the Claude repo built using an LLM, and thus they don't have any copyright to begin with?
IdontKnowRust: Whoever "accidentally" pushed the Claude source code is now a hero! Thank you
mhitza: I've heard and read from various sources already that output isn't copyrightable, and hinted as much recently in a comment. Now I've gone to look up some sources.

> Copyright does not extend to purely AI-generated material, or material where there is insufficient human control over the expressive elements.

> Whether human contributions to AI-generated outputs are sufficient to constitute authorship must be analyzed on a case-by-case basis.

PDF: https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...