Discussion
Clean code in the age of coding agents.
Insensitivity: > LLMs are pretty good at picking up the style in your repo. So keeping it clean and organized already helps.

At least in my experience, they are good at imitating a "visually" similar style, but they'll hide a lot of coupling that is easy to miss, since they don't understand the concepts they're imitating. They think "Clean Code" means splitting into tiny functions, rather than cohesive functions. The Uncle Bob style of "Clean Code" is horrifying.

They're also very trigger-happy to add methods to interfaces (or contracts) that leak implementation detail, or add them for testing, which means they are testing implementation rather than behavior.
fooker: I'll go against the prevailing wisdom and bet that clean code does not matter any more.

No more than the exact order of items being placed in main memory matters now. That used to be a pretty significant consideration in software engineering until the early 1990s, and it is almost completely irrelevant now that we have 'unlimited' memory.

Similarly, generating code, refactoring, and implementing large changes are easy to the point that you can just rewrite stuff later. If you are not happy with how something is designed, a two-sentence prompt fixes it in a million-line codebase in thirty minutes.
raincole: In the past ~15 years, only two new languages have gone from "shiny new niche toy" to "mainstream" status: Rust and Go. This fact alone suggests that the idea of having unlimited memory or unlimited CPU cycles is just wrong.
DrBazza: If you work in finance, you've probably just bankrupted your company. Nanoseconds matter.

Clean code tends to equal simple code, which tends to equal fast code. The order of items in memory does matter, as does cache locality: 32 KB fits in L1 cache.

If, of course, you're talking about web apps, then that has always been the Wild West.
devin: Funny you should mention that. I just used a two-sentence prompt to do something straightforward. It took careful human consideration and 3 rounds of "two sentence" prompts to arrive at the _correct_ transformation.

I think you're missing the cost of screwing up design-level decision-making. If you fundamentally need to rethink how you're doing data storage, have a production system with other dependent systems, have public-facing APIs, and so on and so forth, you are definitely not talking about "two sentence prompts". You are playing a dangerous game with risk if you are not paying some of it off, or at the very least accounting for it as you go.
luc_: I don't think they are, I think they're not talking about that... "It's all about the spec."
nlh: I'm guessing a lot of similar debates were had in the 1970s when we first started compiling C to Assembly, and I wonder if the outcome will be the same. (BTW: I was not around then, so if I'm guessing wrong here please correct me!)

Over time compilers have gotten better, and we're now at the point where we trust them enough that we don't need to review the Assembly or machine code for cleanliness, optimization, etc. In fact, we've even moved at least one abstraction layer up.

Are there mission-critical inner loops in systems these days that DO need hand-written C or Assembly? Sure. Does that matter for 99% of software projects? Negative.

I'm extrapolating that AI-generated code will follow the same path.
mbesto: I've seen enough dirty code (900+ tech diligences over the last 12 years) to know that many businesses are successful in spite of having bad code.
zer00eyz: It never started that way. Time, feature changes, bugs, and the emergent needs of the system all drive these sorts of changes. No amount of "clean code" is going to eliminate these problems in the long term.

All AI is doing is speed-running your codebase into a legacy system (like the one you describe).
CharlieDigital: Clean code still matters.

If it's easier for a human to read and grasp, it will use less context and be less error-prone for the LLM. If the entities are better isolated, you also save context and time when making changes, since the blast radius (AoE) of a change is contained.

Clean code matters because it saves cycles and tokens. If you're going to generate the code anyway, why not generate "pristine" code? Why would you want the agent to generate shitty code?
WillAdams: Yes, but the problem is that the advocate for it, and the text on it, arrived at the correct conclusion using a bad implementation/set of standards. Cf.:

https://github.com/johnousterhout/aposd-vs-clean-code

And instead of cleaning your code, design it:

https://www.goodreads.com/en/book/show/39996759-a-philosophy...
iterateoften: Garbage in, garbage out. The LLM is forced to eat its own output: if the output is garbage, its inputs will be garbage in future passes. How the code is structured changes how the LLM implements new features.
aspenmartin: Why would “messy” code be garbage? Also LLMs do a great job even today at assessing what code is trying to do and/or asking you for more context. I think the article is well balanced though: it’s probably worth it for the next few months to try to help the agent out a bit with code quality and high level guidance on coding practices. But as OP says this is clearly temporary.
SpicyLemonZest: I'm dealing with a situation right now where a critical mass of "messy" code means that nobody, human or LLM, can understand what it is trying to do or how a straightforward user-specified update should be applied to the underlying domain objects. Multiple proposed semantics have failed so far.
embedding-shape: 4. Iterate on an AGENTS.md (or any other similar file you can reuse) that you keep updating every time the agent does something wrong. Don't make an LLM write this file; write it in your own words. Iterate on it whenever the agent does something wrong, then retry the same prompt to verify it actually steers the agent correctly. Eventually you'll build up a relatively concise file with your personal "coding guidelines" that the agent can stick to with relative ease.
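A minimal sketch of what such a file can converge to; the specific rules here are hypothetical examples, not from this thread, and each one should be earned by an actual agent mistake:

```
# AGENTS.md — coding guidelines for agents

- Prefer cohesive functions over many tiny ones; do not extract a helper
  with exactly one caller unless it removes real duplication.
- Never add a method to a public interface just to make something
  testable; test through existing public behavior instead.
- Never modify a failing test to make it pass; report the failure.
- Match the existing module layout; do not add new top-level folders.
```

The retry step is the important part: a rule that doesn't change the agent's behavior on the same prompt is dead weight in the context window.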
ramesh31: Call me crazy, but I think the entire book is about to be rewritten. Almost none of the old rules still apply. We're moving to a new world where SWE will be a discipline of knowing how and when and where to dig under the hood if you really, finally, absolutely have to, and even then using agents to fix it; not one where people are writing code anymore, or where ideals like "clean" matter at all. Maybe hand-crafted, organic, farm-to-table manual coding will live on in some particular niches, but the days of corporate enterprise development as we knew it are coming to a close.
esilverberg2: Our company makes extensive use of architectural linters -- Konsist for Android and Harmonize for Swift. At this point we have hundreds of architectural lint rules that the AI will violate, read, and correct. We also describe our architecture in a variety of skills files. I can't imagine relying solely on markdown files to keep consistency throughout our codebase, the AI still makes too many mistakes or shortcuts.
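The flavor of rule such architectural linters encode can be sketched language-agnostically. This is a hypothetical Python check, not Konsist or Harmonize, run over in-memory file contents for illustration: UI code must not import directly from the persistence layer.

```python
import re

def forbidden_imports(files: dict[str, str], src_prefix: str,
                      banned_prefix: str) -> list[str]:
    """Return paths under src_prefix that import anything under banned_prefix.

    `files` maps a path like 'ui/login.py' to its source text.
    """
    pattern = re.compile(
        rf"^\s*(?:from|import)\s+{re.escape(banned_prefix)}", re.MULTILINE
    )
    return sorted(
        path for path, source in files.items()
        if path.startswith(src_prefix) and pattern.search(source)
    )

# Hypothetical repo: one violation, one compliant UI file, one allowed use.
repo = {
    "ui/login.py": "from db.users import UserTable\n",   # violation
    "ui/theme.py": "import colors\n",                    # fine
    "service/auth.py": "from db.users import UserTable\n",  # services may touch db
}
print(forbidden_imports(repo, "ui/", "db"))  # ['ui/login.py']
```

The point of the pattern is that the rule is executable: when the AI violates it, the failure message names the file, and the fix goes in the code, not the rule.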
williamdclt: Amongst other reasons, one reason for clean code is that it avoids bugs. AIs producing dirty code produce more bugs, like humans. AIs iterating on dirty code produce more bugs, like humans.
groos: The high-level language -> assembly analogy seems apt for LLMs, but I would argue it is only a weak one. The reason is that, previously, both the high-level language and the assembly language had well-defined semantics and the transform was deterministic, whereas now you are using English or another human language, with ambiguities and without well-defined semantics. Math symbolisms were invented in the first place because human language did not have the required unambiguous precision, and if we keep hitting hurdles with LLMs, we may need to reinvent this once more.
zamalek: The last two weeks with Claude have been a nightmare for code quality; it outright ignores standards (in CLAUDE.md). Just yesterday I was reviewing a PR from a coworker where it undid some compliant code, and then proceeded to struggle with exactly what the standards were designed to address.

I threw in the towel last night and switched to Codex, which has actually been following instructions.
SeanDav: Humans are quite capable of bankrupting financial companies with coding issues. Knight Capital Group introduced a bug into their system while using high frequency trading software. 45 minutes later, they were effectively bankrupt.
mbesto: > All AI is doing is speed running your code base into a legacy system

Are you implying legacy systems stop growing? Because I didn't mean to imply those companies stop growing.
Verdex: Here's the thing about clean code. Is it really good? Or is it just something that people get familiar with, and familiarity is actually all that matters?

You can't really run the experiment, because to do it you would have to isolate a bunch of software engineers and carefully measure them as they go through parallel test careers. I mean, I guess you could measure it, but it's expensive and time-consuming and likely to have massive experimental issues.

Now, though, you can sort of run the experiment with an LLM. Clean code vs. unclean code. Redefine clean code to mean some other thing. Rerun everything from a blank slate and then give it identical inputs. Evaluate on tokens used, time spent, propensity for unit tests to fail, and rework.

The history of science and technology is people coming up with simple but wrong untestable theories which topple over once someone invents a thingamajig that allows tests to be run.
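The scoring side of that experiment is cheap to sketch. A minimal outline in Python, with hypothetical run data standing in for real agent sessions against a clean and an unclean variant of the same codebase:

```python
from dataclasses import dataclass

@dataclass
class Run:
    variant: str          # "clean" or "unclean" codebase, same task list
    tokens_used: int
    minutes_spent: float
    tests_failed: int
    rework_commits: int   # follow-up commits needed to fix the agent's work

def compare(runs: list[Run]) -> dict[str, dict[str, float]]:
    """Average each metric per variant so the two conditions can be compared."""
    out: dict[str, dict[str, float]] = {}
    for variant in {r.variant for r in runs}:
        group = [r for r in runs if r.variant == variant]
        n = len(group)
        out[variant] = {
            "tokens": sum(r.tokens_used for r in group) / n,
            "minutes": sum(r.minutes_spent for r in group) / n,
            "test_failures": sum(r.tests_failed for r in group) / n,
            "rework": sum(r.rework_commits for r in group) / n,
        }
    return out

# Entirely made-up numbers, purely to show the shape of the comparison.
runs = [
    Run("clean", 12_000, 8.0, 1, 0),
    Run("clean", 14_000, 9.5, 0, 1),
    Run("unclean", 30_000, 22.0, 4, 3),
    Run("unclean", 26_000, 18.0, 3, 2),
]
print(compare(runs))
```

The hard part is not this arithmetic but holding everything else fixed: same model, same prompts, same task order; only the codebase style varies.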
tracker1: I'm with you... personally, I always found Clean/Onion architecture more annoying than helpful in practice: you're working on a feature or section of an application, only now you have to work across disconnected, mirrored trees of structure in order to work on a given feature.

I tend to prefer a feature-centric layout or even vertical slices, where related work sits closer together based on what is being worked on, as opposed to the type of work being done. I find that far more discoverable in practice, while simpler and easier to maintain over time; no need to add unnecessary complexity at all. In general, you don't need a lot of the patterns introduced by Clean or Onion structures, because you aren't shipping multiple production implementations of the interfaces, and you don't need that kind of inheritance for testing.

Just my own take... which, of course, has meant swimming upstream, having done a lot of work in the .NET space.
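The contrast, with hypothetical module names in a .NET-flavored project, is easiest to see as directory trees: the layered style smears one feature across parallel trees, while a vertical slice keeps it in one place.

```
# Layered / Onion: one feature spread across mirrored trees
Controllers/OrderController.cs
Services/OrderService.cs
Repositories/OrderRepository.cs
Models/Order.cs

# Feature-centric / vertical slice: the same feature in one folder
Features/Orders/OrderController.cs
Features/Orders/OrderService.cs
Features/Orders/OrderRepository.cs
Features/Orders/Order.cs
```

For an agent working with a limited context window, the slice layout also means "everything relevant to orders" is one directory listing away rather than four.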
embedding-shape: > a two sentence prompt fixes it in a million line codebase in thirty minutes.Could you please create a verifiable and reproducible example of this? In my experience, agents get slower the larger a repository is. Maybe I'm just very strict with my prompts, but while initial changes in a greenfield project might take 5-10 minutes for each change, unless you deeply care about the design and architecture, you'll reach 30 minute change cycles way before you reach a million lines of code.
fooker: Your observation was valid in 2025.This is largely a solved problem now with better harnesses and 1M context windows.
tracker1: On the plus side... AI is pretty good at creating (often excessive) tests around a given codebase in order to (re)implement a utility using different backends or structures. The one thing to look out for is that the agent does NOT try to change a failing test where the test is valid but the code isn't.
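That guardrail is the classic characterization-test move: pin the current behavior before any rewrite, then treat the expected values as the contract. A sketch with a hypothetical utility (the slugifier and its behavior are invented for illustration):

```python
def legacy_slugify(title: str) -> str:
    """Stand-in for the utility being reimplemented (hypothetical behavior)."""
    return "-".join(title.lower().split())

# Characterization tests: recorded from the legacy implementation before the
# rewrite. If one fails after the agent's change, the fix belongs in the code
# under test -- the agent must not edit these expected values.
CASES = {
    "Hello World": "hello-world",
    "  spaced   out  ": "spaced-out",
    "Already-Slugged": "already-slugged",
}

def check_characterization(slugify=legacy_slugify) -> None:
    for given, expected in CASES.items():
        assert slugify(given) == expected, (given, slugify(given), expected)

check_characterization()  # passes against the legacy implementation
```

A reimplementation is then dropped in as a different `slugify` argument; any diff from the recorded cases is a behavior change to be reviewed by a human, not papered over by the agent.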
namar0x0309: You haven't worked on or serviced any engineering systems, I can tell.

There are fundamental truths about complex systems that go beyond "coding". The same patterns appear in nature, where engineering principles and "prevailing wisdom" are truer than ever.

I suggest you take some time to study the systems that power critical infrastructure. You'll see and read about the grizzled veterans who keep them alive, and how they are even more religious about clean engineering principles and how "prevailing wisdom" is very much needed and always will be.

That said, there are a lot of spaces where not following wisdom works temporarily. But at scale, it crashes and crumbles. Web apps are a good example of this.
fooker: > You haven't worked or serviced any engineering systems, I can tell.I have worked on compilers and databases the entire world runs on, the code quality (even before AI) is absolutely garbage.Real systems built by hundreds of engineers over twenty years do not have clean code.
stickfigure: I actively use AI to refactor a poorly structured two million line Java codebase. A two-sentence prompt does not work. At all.I think the OP is right; the problem is context. If you have a nicely modularized codebase where the LLM can neatly process one module at a time, you're in good shape. But two million lines of spaghetti requires too much context. The AI companies may advertise million-token windows, but response quality drops off long before you hit the end.You still need discipline. Personally I think the biggest gains in my company will not come from smarter AIs, but from getting the codebase modularized enough that LLMs can comfortably digest it. AI is helping in that effort but it's still mostly human driven - and not for lack of trying.
fooker: Have you tried this in the last few months with an expensive model? (Claude 4.6 opus high, for example)You might be pleasantly surprised if you haven’t yet.