Discussion
Vibe-coded ext4 for OpenBSD
bitwizeshift: Paywalled article on something vibe-coded? That seems like a bold strategy.
nurettin: It is amusing to see that the only concern seems to be about a confusion around licensing, not the validity or maintainability of the code itself.
LeFantome: Vibe coding and OpenBSD. The perfect combination.
throwatdem12311: Can someone just copyright wash Windows already.
greyface-: ReactOS did this without any need for an LLM.
dana321: click to continue
g0xA52A2A: Is it worth the effort to review until such implications are understood?
nurettin: No of course not, bike shedding licenses is where it is at.
kvuj: How is that different than a human writing the code? Whether an AI or a human wrote it, I would expect the same bar of validity/maintainability.
charcircuit: > incorporate knowledge carrying an illiberal license
Copyright prevents copying. It doesn't prevent using knowledge.
wongarsu: The Windows 2000 and Windows XP sources are readily available and must have made it into the training data. But most software has dropped XP support. You really need at least some of the Win 8 and Win 10 APIs to claim compatibility with modern software, and I doubt Claude has seen those from the inside
scuff3d: Because humans make design decisions, AI just bangs its head against the problem until it gets something that "works".
FeepingCreature: > So as of today, the Copyright system does not have a way for the output of a non-human produced set of files to contain the grant of permissions which the OpenBSD project needs to perform combination and redistribution.
This seems extremely confused. The copyright system does not have a way to grant these permissions because the material is not covered under copyright! You can distribute it at will, not due to any sort of legal grant but simply because you have the ability and the law says nothing to stop you.
jagged-chisel: Eh … the argument will likely be that things created by Thing at the behest of Author are owned by the Author. It’ll take a few cases going through the courts, or an Act of Congress, to solidify this stuff.
wongarsu: Just like we settled on photographers having copyright on the works created by their camera. The same arguments seem to apply.
The US Copyright Office has published a piece that argues otherwise, but a) unless they pass regulation their opinion doesn't really matter, and b) there is way too much money resting on the assumption that code can be copyrighted.
g0xA52A2A: Wow, that thread just kept going. Whilst the LWN article covered most of the "highlights", I think this reply from Theo is pretty succinct on the topic at large [1].
[1] https://marc.info/?l=openbsd-tech&m=177425035627562&w=2
bt1a: > Lacking Copyright (or similarily a Public Domain declaration by a human), we don't receive sufficient rights grants which would permit us to include it into the aggregate body of source code, without that aggregate body becoming less free than it is now.
That's awesome lmao
raggi: That's not a statement from a lawyer, and it's confused. There is one true thing in there, which is that, at least under US considerations, the LLM output may not be copyrightable due to insufficient human involvement; but the rest of the implications are poorly extrapolated.
There are lots of portions of code today, prior to AI authorship, that are already not copyrightable due to the way they are produced. The existence of such code does not decimate the copyright of an overall collective work.
ethin: > Lacking Copyright (or similarily a Public Domain declaration by a human), we don't receive sufficient rights grants which would permit us to include it into the aggregate body of source code, without that aggregate body becoming less free than it is now.
Can someone explain this to me? I was under the impression that if a work of authorship was not copyrightable because it was AI generated and not authored by a human, it was in the public domain and therefore you could do whatever you wanted with it. Normal copyright restrictions would not apply here.
hypeatei: > This obsession with copyrights between different free software ecosystems - who put the lawyers in charge?
This comment on the article is spot on. I don't vibe code or care about AI really, but it's so exhausting to see people playing lawyer in threads about LLM-generated code. No one knows, a ton of people are using LLMs, the companies behind these models torrented content themselves, and why would you spend your time defending copyright / use it as a tool to spread FUD? Copyright is a made up concept that exists to kill competition and protect those who suck at executing on ideas.
longislandguido: ~20 years ago, the Linux camp accused OpenBSD of importing GPL'd code (a wireless driver IIRC) and cried foul. The code was removed.
Fast forward to 2026: Theo says no to vibe-coded slop, prove to me your magic oracle LLM didn't ingest gobs of GPL code before spitting out an answer.
People are big mad of course, but you want me to believe Theo is the bad guy here for playing it conservatively?
whalesalad: I vibe-configured an Edgerouter 4 as a hot-drop box that would establish a secure tunnel and create a fake WAN for some servers that had to be temporarily pulled from service but remain operational in someones home garage. I overnight shipped it to them with two of the ports labeled, they plugged in home internet on one port, the rack on the other port, and it secure tunneled to a Linode VPS to get a public IP, circumventing all the Verizon home internet crap. I used OpenBSD. Claude did most of the work.
CodeWriter23: Well, this is ironic: GPL advocate(s) declaring a clean implementation based on specifications to be infringing because someone/something read specs provided under license. Didn't Oracle lose that argument in court as it pertains to Android's implementation of the Java libraries?
corbet: I'm not sure what you're reading; there is a distinct lack of GPL advocates in that conversation.
fragmede: It's not settled. The monkey selfie copyright dispute ruled that a monkey that pressed the button to take a selfie does not and cannot own the copyright to that photo, and neither does the photographer whose camera it was. How that extends to AI generated code is for the courts to decide, but there are some parallels to that case.
https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...
charcircuit: This filesystem driver was made by a human using AI, not a monkey.
michaelmrose: The law is whatever it needs to be to satisfy monied interests, with the degree of acceptable adaptation being a function of the unity of those interests and the political ascendancy of those in favor.
Overwhelmingly this favors treating AI as a tool like Photoshop. Even those against AI disagree on different matters and will overwhelmingly want a cut, not a different interpretation.
ksherlock: The history is a bit backwards but the point is good. OpenBSD Atheros wireless code was imported into Linux, the BSD attributions were removed, and it was re-declared as GPL. That was later changed back.
Joel_Mckay: Data theft or piracy from the web and "AI" users' content are used in the model training sets, and once codified, the statistical saliency is significant if popular content is present.
For example, when an LLM does a vector search, there is a high probability of pirated-content bleed-through and isomorphic plagiarism in the high-dimensional vector space results. Thus, when you coincidentally type in "name a cartoon mouse", there is a higher probability that Disney's "Mickey Mouse" will pop out in the output rather than "Mighty Mouse". Note that trademarks never expire if the fees are paid, and Disney can still technically sue anyone that messes with their mouse.
Much like em dashes "--", telling the current set of models to stop using them inappropriately often fails. Also, activation capping is used to improve the model's behavioral vector, and has nothing to do with the Anthropic CEO developing political ethics.
LLMs are useful for context search, but can't function properly without constantly stealing from actual humans. Thus, they will often violate copyright, trademark, and patents. In a commercial context it is legally irrelevant how the output has misappropriated IP, and you can bet your wallet the lawyers won't care either. No, IP is not public domain for a long time (17 to 78 years) regardless of people's delusions, even if some kid in a place like India (no software patents) thinks it is.
This channel offers several simplified explanations of the work being done with models, and Anthropic posts detailed research papers on its website.
https://www.youtube.com/watch?v=YDdKiQNw80c
https://www.youtube.com/watch?v=Xx4Tpsk_fnM
https://www.youtube.com/watch?v=JAcwtV_bFp4
Many YC bots are poisoning discourse -- so this thread will likely get negative karma. Some LLM users seem to develop emotional or delusional relationships with the algorithms. The internet is already >52% generated nonsense and growing. =3
kgeist: Binaries are copyrightable in both the US and the EU, and they are not technically produced by a human either; they're produced by a computer program. I honestly don't understand why this isn't extended to AI-generated code. Isn't it the same thing? One could argue that compilers merely transform source code into binaries "as is," while AI models have some "knowledge" baked in that they extract and paste as code. But there are compilers that also generate binaries by selecting ready-to-use binary patches authored by compiler developers and combining them into a program. One could also argue that, in the case of compilers, at least the input source code is authored by a human. But why can't we treat prompts as "source code in natural language" too? Where is the line between authorship and non-authorship, and how is it defined? "Your prompt was too basic to constitute authorship" doesn't sound like an objective criterion.
Maybe for lawyers, AI is some kind of magical thing on its own. But having successfully created a working inference engine for Qwen3, and seeing how the core loop is just ~50 lines of very simple matrix multiplication code, I can't see LLMs as anything more than pretty simple interpreters that process "neural network bytecode," which can output code from pre-existing templates just like some compilers. And I'm not sure how this is different from transpilers or autogenerated code (like server generators based on an OpenAPI schema).
Sure, if an LLM was trained on GPL code, it's possible it may output GPL-licensed code verbatim, but that's a different matter from the question of whether AI-generated code is copyrightable in principle.
Interestingly, I found an opinion here [0] that binaries technically shouldn't be copyrightable, and that they currently are because the copyright office listened to software publishers, who wanted binaries protected by copyright so they could sell them that way.
[0] https://freesoftwaremagazine.com/articles/what_if_copyright_...
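kgeist's "the core loop is just simple matrix multiplication" claim can be sketched in a few lines. The toy below is not Qwen3's actual architecture (it has no attention, no real transformer layers, and the tiny hand-written weights and 4-token vocabulary are invented purely for illustration); it only shows the shape of such a decode loop: embed the token, multiply by weights, project back to logits, pick the next token.

```python
# Toy sketch of an LLM inference "core loop": greedy decoding where each
# step is essentially a matrix multiplication.  All weights here are
# invented for illustration; a real model just has far bigger matrices.

def matmul_vec(v, m):
    """Multiply row vector v by matrix m (both plain Python lists)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

# Hypothetical 4-token vocabulary with 2-dimensional embeddings.
EMBED = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
W     = [[0.0, 1.0], [1.0, 0.0]]   # stand-in for the transformer layers

def step(token_id):
    h = EMBED[token_id]                                        # embed token
    h = matmul_vec(h, W)                                       # "the model": one matmul
    logits = [sum(h[d] * EMBED[t][d] for d in range(2))        # project to vocab
              for t in range(4)]
    return max(range(4), key=lambda t: logits[t])              # greedy argmax

def generate(start, n):
    """Autoregressive loop: feed each predicted token back in."""
    out = [start]
    for _ in range(n):
        out.append(step(out[-1]))
    return out

print(generate(0, 4))  # → [0, 1, 0, 1, 0] with these toy weights
```

With real weights the same loop is repeated over stacked layers plus attention over the context window, but the structure (embed, matmul, argmax/sample, feed back) is the whole of the "interpreter" kgeist describes.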
wahern: That linked opinion overstates the case. In the real world, two different programs performing any non-trivial but functionally identical task will look substantially dissimilar in their source code, and that dissimilarity will carry over to the compiled binary, meaning what was expressive (if anything) is largely preserved. To the extent two different programs do end up with identical code, that aspect was likely primarily functional and non-copyrightable.
IMO, your intuition regarding AI is right--it's not a magic copyright laundering machine, and AFAIU courts have very quickly agreed that infringement is occurring. But in copyright law, establishing infringement (or the possibility of infringement) is the easy, straightforward part. Copyright infringement liability is a much more complex question. Transformative uses in particular are a Fair Use, and Fair Use is technically treated as an affirmative defense to infringement.[1] If something is Fair Use, infringement is effectively presumed.
[1] There's a scholarly pedantic debate about whether Fair Use is properly a "defense" rather than an "exception" to infringement, but it walks and talks like a defense in the sense that the defendant has the burden of proving Fair Use after the plaintiff has established infringement. There's a similarly pedantic (though slightly more substantive) debate in criminal law regarding affirmative defenses. But the very term "affirmative defense" was coined to recognize and avoid these pedantic debates.