AI code is legacy code?
174 comments
· May 4, 2025
kachapopopow
nkrisc
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
kachapopopow
An Artificial Intelligence on that level would be able to easily figure out what you actually want. We should maybe go one step further and get rid of the inputs? They just add all of that complexity when they're not even needed.
Mbwagava
At some point we just need to acknowledge that such speculation is like asking for a magic genie in a bottle rather than discussing literal technology.
andrei_says_
Also consider that you won’t know if the answer relates to what you want or not because you have no reference helping distinguish between reality and hallucination.
And have been conditioned to accept LLM responses as reality.
nkrisc
Right? The AI can just predict what we'll want and then create it for us.
immibis
They were simply asking if Babbage had prepared the machine to always give the answers to the questions Babbage knew he was going to enter - i.e. whether he was a fraud.
Enter 2+2. Receive 4. That's the right answer. If you enter 1+1, will you still receive 4? It's easy to make a machine that always says 4.
TeMPOraL
Mr. Babbage apparently wasn't familiar with the idea of error correction. I suppose it's only fair; most of the relevant theory was derived in the 20th century, AFAIR.
kibwen
No, error correction in general is a different concept than GIGO. Error correction requires someone, at some point, to have entered the correct figures. GIGO tells you that it doesn't matter if your logical process is infallible, your conclusions will still be incorrect if your observations are wrong.
alpaca128
What's 1+1?
Exactly, it's 5. You just have to correct the error in the input.
CharlieDigital
> It would be really interesting to see a website that generates all the pages every time a user requests them
We tried a variant of this for e-commerce, and my take is that the results were significantly better than the retailer's native browsing experience. We had a retailer's entire catalog processed and indexed with a graph database and embeddings, and used that to generate "virtual shelves" dynamically each time a user searched. You can see the results and how they compare to the retailer's native results:
Example on Ulta's website: https://youtu.be/JL08UDxM_5M
Example on Safeway: https://youtu.be/xQEfo_XCM2M
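For a rough sense of the shape of this (a minimal sketch, not the actual system: the embed() function below is a toy stand-in for a real embedding model, and the graph-database side is omitted entirely):

```python
# Minimal sketch: embed the catalog offline, then build a "virtual shelf"
# per query via nearest-neighbor search over the embeddings.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real embedding model: hashed bag-of-words, unit-normalized."""
    v = np.zeros(64)
    for word in text.lower().split():
        v[hash(word) % 64] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

catalog = [
    {"sku": "123", "title": "Frozen chopped spinach, 10 oz"},
    {"sku": "456", "title": "Fresh baby spinach, 5 oz clamshell"},
    {"sku": "789", "title": "Canned spinach, no salt added"},
]

# Offline step: index every product once.
catalog_vectors = np.stack([embed(p["title"]) for p in catalog])

def virtual_shelf(query: str, k: int = 10) -> list[dict]:
    """Rank catalog items by cosine similarity to the query and return the top k."""
    q = embed(query)
    scores = catalog_vectors @ q          # cosine similarity, since vectors are unit length
    top = np.argsort(-scores)[:k]
    return [catalog[int(i)] for i in top]

# virtual_shelf("spinach") groups fresh, frozen, and canned varieties
# onto one generated "shelf" for the results page.
```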
krapht
It is nicer, but that interface latency would turn me off. When I search for groceries online, I want simple queries, e.g. spinach, to return varieties of spinach (e.g. frozen, fresh, and canned) as fast as possible.
CharlieDigital
This was the best of both worlds, in my opinion, because you had the native results and then the AI-generated personalized results.
Wowfunhappy
I don't know where we are on the LLM innovation S-curve, but I'm not convinced that plateau is going to be high enough for that. Even if we get an AI that could do what you describe, it won't necessarily be able to do it efficiently. It probably makes more sense to have the AI write some traditional computer code once which can be used again and again, at least until requirements change.
The alternative is basically https://ai-2027.com/ which obviously some people think is going to happen, but it's not the future I'm planning for, if only because it would make most of my current work and learning meaningless. If that happens, great, but I'd rather be prepared than caught off guard.
staunton
That leads to a kind of fluid distinction similar to interpreted vs. compiled languages.
You tell the AI what you want it to do. The AI does what you want. It might process the requests itself, working at the "code level" of your input, which is the prompt. It might also generate some specific bytecode, taking time and effort which is made up for by processing subsequent inputs more efficiently. You could have something like JIT, where the AI decides which program to use for a given request, occasionally making and caching a new one if none fit.
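A minimal sketch of that JIT analogy (classify, model_interpret, and model_generate_handler are all hypothetical placeholders, not any particular product's API):

```python
# Cache "compiled" handlers per request kind; fall back to the model otherwise.
from typing import Callable

handlers: dict[str, Callable[[dict], dict]] = {}   # cached generated programs, keyed by kind

def classify(request: dict) -> str:
    """Placeholder: bucket the request into a coarse kind, e.g. 'search' or 'checkout'."""
    return request.get("kind", "unknown")

def model_interpret(request: dict) -> dict:
    """Placeholder for the slow path: the model handles the request directly."""
    return {"handled_by": "model", "request": request}

def model_generate_handler(kind: str) -> Callable[[dict], dict]:
    """Placeholder for code generation: the model emits ordinary code for this kind."""
    return lambda request: {"handled_by": f"generated:{kind}", "request": request}

def serve(request: dict) -> dict:
    kind = classify(request)
    if kind in handlers:                              # fast path: reuse the cached program
        return handlers[kind](request)
    response = model_interpret(request)               # slow path: "interpreted" by the model
    handlers[kind] = model_generate_handler(kind)     # "JIT-compile" and cache for next time
    return response

# First request of a kind goes through the model; repeats hit the cached handler.
print(serve({"kind": "search", "q": "spinach"})["handled_by"])   # model
print(serve({"kind": "search", "q": "kale"})["handled_by"])      # generated:search
```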
foobahhhhh
Yeah, AI, at least right now, is so energy-inefficient. There is only so much sun hitting the earth that we can afford to be stupid with. Using AI for everything makes Electron apps seem efficient! Once the hype runs out and you pay full price, much of today's AI will be unattractive. Hopefully that leads to more efficient AI (which I suspect is more interesting).
adverbly
This is flying car thinking IMO.
You're entirely ignoring data ownership and privacy/security, energy/compute efficiency demands, latency.
kachapopopow
But... but the magic box said it is possible in theory!
--
Well, actually this is more real than flying cars. It would just be very, very, very slow and wouldn't survive longer than a few milliseconds at best.
bigfatkitten
> You're entirely ignoring data ownership and privacy/security, energy/compute efficiency demands, latency.
Web developers generally have very little regard for these things now.
krainboltgreene
Even if that were true, and it’s not, that’s not a good thing.
bathtub365
I think that’s the joke
jonchurch_
I think you’re joking in general, but your side note is already extremely close to websim[0], which is an art-adjacent site that takes a URL as a prompt and then creates the site. The point is effectively a hallucinated internet, and it is a neat toy.
jt2190
Exactly. People are too focused on shoehorning AI into today’s “humans driving computers” processes, instead of thinking about tomorrow’s “computers driving computers” processes. Exactly as you say, today it’s more efficient to create a “one size fits all” web site because human labor is so expensive, but with computer labor it will be possible to tailor content to each user’s tastes.
myko
This kind of thinking makes me consider the benefits of the Butlerian Jihad
nkrisc
So where do humans fit into this future? Ah, I suppose someone has to mine the rare earth minerals used to create the hardware.
tehjoker
Ah yes, let's have a society where there are no common points of references between people and everyone's pocketbook is maximally empty
jay_kyburz
I believe this is why there is so much fuss about AGI. Once you have humans out of the equation, hardware and power are the only limiting factor.
jsheard
> Side note: It would be really interesting to see a website that generates all the pages every time a user requests them, every time you navigate back it would look the same, but some buttons in different places, the live chat is in a different corner, the settings now have a vertical sidebar instead of a horizontal menu.
Please don't give A/B testers ideas, they would do that 100% unironically given the chance.
wheelie_boy
It's like that AI-generated Doom demo, where when you look at the floor and back up, you're in a totally different room.
surfingdino
I look forward to the day when AI returns a different list of my ailments for every new query from the hospital and suggests a new treatment for them.
peteforde
My hot take is that using Cursor is a lot like recreational drugs.
It's the responsibility of the coder/user/collaborator to interact with the code and suggestions any model produces in a mindful and rigorous way. Not only should you have a pretty coherent expectation of what the model will produce, you should also default to assuming that each round of suggestions won't be accepted until it actually has been. At the very least, be prepared to ask follow-up questions, and never blindly accept changes (the coding equivalent of drunk driving).
With Cursor and the like, the code being changed on a snippet basis instead of wholesale rewriting that is detached from your codebase means that you have the opportunity to rework and reread until you are on the same page. Given that it will mimic your existing style and can happily explain things back to you in six months/years, I suspect that much like self-driving cars there is a strong argument to be made that the code it's producing will on average be better than what a human would produce. It'll certainly be at least as consistent.
It might seem like a stretch to compare it to taking drugs, but I find that it's a helpful metaphor. The attitude and expectations that you bring to the table matter a lot in terms of how things play out. Some people will get too messed up and submit legal arguments containing imaginary case law.
In my case, I very much hope that I am so much better in a year that I look back on today's efforts with a bit of artistic embarrassment. It doesn't change the fact that I'm writing the best code I can today. IMO the best use of LLMs in coding is to help people who already know how to code rapidly get up to speed in domains that they don't know anything about. For me, that's been embedded development. I could see similar dynamics playing out in audio processing or shader development. Anything that gets you over that first few hard walls is a win, and I'll fight to defend that position.
As an aside, I find it interesting that there hasn't been more comparison between the hype around pair programming and what is apparently being called vibe coding. I find evidence that one is good and one is bad to be very thin.
ozgrakkurt
It is kind of like reviewing PRs from a very junior developer who might also give you very convincing but buggy code and takes no responsibility for it. I seriously don’t see the point of it except for copy-paste refactoring or writing throwaway scripts. Which is still a lot, so it is useful.
It needs to improve a lot more to match the expectations, and it probably will. It is a bit frustrating to realise a PR is AI-generated slop after reviewing 500 of its 1000 lines.
peteforde
This is actually supporting my point, though (again IMO).
There's a world of difference between a very junior dev producing 1000 line PRs and an experienced developer collaborating with Cursor to do iterative feature development or troubleshoot deadlocks.
Also, no shade to the fictional people in your example but if a junior gave me a 1000 line PR, it would be part of my job as the senior to raise warning bells about the size and origin of such a patch before dedicating significant time to reviewing it.
As a leader, it's your job to clearly define what LLMs are good and bad for, and what acceptable use looks like in the context and environment. If you make it clear that large AI-generated patches are Not Cool and they do it anyhow... that's a strike.
im3w1l
I think the best use case by far is when you don't know where to start. The AI will put something out there. Either it's right, in which case great, or it's wrong, and then trying to analyze why it's wrong often helps you get started too. To take a silly example, let's say you want to build a bridge for cars and the AI suggests using one big slab of papier-mâché. You reject this, but now you have two good questions: what material should it be? And what shape?
elliotec
Are you talking about set and setting? What recreational drugs do you mean? I’m not finding the analogy but actually curious where you’re coming from.
peteforde
I did start by disclaiming a hot take, so forgive my poetic license and unintentional lede burying.
What I'm trying to convey is a metaphorical association that describes moderation and overdoing it. I'm thinking about the articles I've read about college professors who are openly high functioning heroin users, for example.
Every recreational drug has different kinds of users: social drinkers vs. abusive alcoholics, people who microdose LSD or mushrooms vs. people who spend time in psych wards, people who smoke weed to relax vs. people who go all-in on the slacker lifestyle. And perhaps the best for last: people who occasionally use cocaine as a stimulant vs. whatever scene you want to quote from The Wolf of Wall Street.
I am personally convinced that there are positive use cases and negative use cases, and it usually comes down to how much and how responsible they are.
mock-possum
It’s really just another instance of ‘this is why we can’t have nice things’ - because the people who don’t have their shit together are always going to ruin it for the people that can handle their shit. And further, because the people who don’t know any better will fixate on one aspect or the other - drugs are the devil, drugs are god - AI is the devil, AI is god - because they’re bad at nuance and can’t just leave it alone.
dang
The opening of the article derives from (or at least relates to) Peter Naur's classic 1985 essay "Programming as Theory Building". (That's Naur of Algol and BNF btw.)
Naur argued that complex software is a shared mental construct that lives in the minds of the people who originally build it. Source code and documentation are lossy representations of the program—lossy because the real program (the 'theory' behind the code) can never be fully reconstructed from them.
Legacy code here would mean code where you still have the artifacts (source code and documentation), but have lost the theory, because the original builders have left the team. That means you've lost access to the original program, and can only make patchwork changes to the software rather than "deep improvements" (to quote the OP). Naur gives some vivid examples of this in his essay.
What this means in the context of LLMs seems to me an open question. In Naur's terms, do LLMs necessarily lack the theory of a program? It seems to me there are other possibilities:
* LLMs may already have something like a 'theory' when generating code, even if it isn't obvious to us
* perhaps LLMs can build such a theory from existing codebases, or will be able to in the future
* perhaps LLMs don't need such a theory in the way that human teams do
* if a program is AI-generated, then maybe the AI has the theory and we don't!
* or maybe there is still a theory, in Naur's sense, shared by the people who write the prompts, not the code.
There was an interesting recent article and thread about this:
Naur's "Programming as Theory Building" and LLMs replacing human programmers - https://news.ycombinator.com/item?id=43818169 - April 2025 (129 comments)
akoboldfrying
> * perhaps LLMs can build such a theory from existing codebases, or will be able to in the future
If the source code is a lossy representation of the theory, doesn't that answer this question with a conclusive "No"? (If there is anything -- LLM or not -- that can do this, then the source code is not lossy after all.)
EDIT: Clarify last sentence.
dang
I guess that's right, but by "such a theory" I didn't mean the original theory - I meant maybe they can figure a different one out that works well enough, at least for the LLM. This is all handwavey, of course.
krrishd
TIL / unconscious reference on my part, super interesting!
Link for the curious: https://pages.cs.wisc.edu/~remzi/Naur.pdf
dang
With lots of HN threads over the years!
Naur's "Programming as Theory Building" and LLMs replacing human programmers - https://news.ycombinator.com/item?id=43818169 - April 2025 (129 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=42592543 - Jan 2025 (44 comments)
Programming as Theory Building (1985) - https://news.ycombinator.com/item?id=38907366 - Jan 2024 (12 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=37263121 - Aug 2023 (36 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=33659795 - Nov 2022 (1 comment)
Naur on Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=31500174 - May 2022 (4 comments)
Naur on Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=30861573 - March 2022 (3 comments)
Programming as Theory Building (1985) - https://news.ycombinator.com/item?id=23375193 - June 2020 (35 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=20736145 - Aug 2019 (11 comments)
Peter Naur – Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=10833278 - Jan 2016 (15 comments)
Naur’s “Programming as Theory Building” (2011) - https://news.ycombinator.com/item?id=7491661 - March 2014 (14 comments)
Programming as Theory Building (by Naur of BNF) - https://news.ycombinator.com/item?id=121291 - Feb 2008 (2 comments)
gregwebs
One definition of legacy code I have seen is code without tests. I don't fully agree with that, but it is likely that your untested code is going to become legacy code quickly.
Everyone should be asking AI to write lots of tests - to me, that's what AI is best at. Similarly, you can ask it to make plans for changes and write documentation. Ensuring that high-quality code is being created is where we really need to spend our effort, but it's easier when AI can crank out tests quickly.
dang
My favorite definition of "legacy code" is "code that works".
Anybody know where that quip originated? (ChatGPT tells me Brian Kernighan - I doubt it. That seems like LLM-enabled quote hopping - https://news.ycombinator.com/item?id=9690517)
turtleyacht
After hearing legacy code defined as "code that runs in production," it reset my perception around value and thoughtful maintenance. Cannot find the reference, though.
nonethewiser
"Legacy code" is such a loaded term. There is definitely large intersection between people's working definition of "legacy code" and "code that works". At the same time this is usually NOT what people intend to include. But it fits their definition nonetheless.
Take the author's definition:
>When something is older, and someone else built it
That usually describes the most important, core part of any system that has existed for any significant amount of time. It may even be well understood by most people.
nicce
I would say that legacy code is something that one does not simply change.
You cannot update any dependencies because then everything breaks. You cannot even easily add new features because it is difficult to even run the old dependencies your code is using.
With LLMs, creating legacy code means using old APIs and old patterns to do something that is no longer relevant, but the LLM does not know that.
E.g. if you ask any LLM to use Tailwind CSS, it uses v3 no matter what you try, even though v4 is the latest. LLMs will tell you that the pure-CSS configuration is wrong and that you should use the .js config.
nonethewiser
In some ways, tests are the worst thing for LLMs to write, because you have to verify what the LLM generates. Are you going to write tests for the tests? No.
I don't think this _necessarily_ makes them bad for writing tests, but you really need to be careful: examine what it gives you, and even tweak it to fail in a way that verifies it's working. Conversely, you could write the tests yourself and then literally copy-paste the code-to-be-tested without any significant examination.
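A minimal example of that kind of examination (slugify and mylib are hypothetical; the point is only that a generated test should be forced to fail once before you trust it):

```python
# Suppose the LLM generated this test for a hypothetical slugify() helper in mylib.
from mylib import slugify   # hypothetical module under test

def test_slugify_collapses_whitespace():
    assert slugify("Hello   World") == "hello-world"

# Before trusting the generated test, make it fail once on purpose, e.g. change
# the expectation to "hello_world" or break slugify() itself, and confirm pytest
# actually reports a failure. A generated test that passes no matter what
# (e.g. `assert slugify("x") or True`) is worse than no test at all.
```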
gavmor
Michael Feathers, Working Effectively with Legacy Code.
https://understandlegacycode.com/assets/legacy-code-commitst...
lacker
It's tough to generalize about "AI code"; there's a huge difference between "please make a web frontend to this database that displays table X with some ability to filter it" and "please refactor this file so that instead of using plain strings, it uses the i18n api in this other file".
havkom
What is the difference between them? Both seem like quite trivial implementations.
Aperocky
Trivial doesn't mean the AI will get it right. A trivial request can be to move an elephant into a fridge. Simple concept, right?
Except the AI will probably destroy both the elephant and the fridge, and order 20 more fridges of all sizes and more elephants for testing in the meantime (if you're on MCP), before asking whether you meant a cold storage facility, or whether it's actually a good idea in the first place.
ludwik
Okay, but which one of the two is the elephant-destroying one?
logicchains
Building even a small web frontend involves a huge number of design decisions, and doing it well requires a detailed understanding of the user and their use cases, while internationalisation is a relatively mechanical task.
skydhash
And that kind of decision-making is what’s important. More often than not, you ask someone to explain their decision-making in building a feature, and what you get is “I don’t really know”. And the truth is that they have externalized their thinking.
klabb3
Damn I didn’t see your comment and wrote basically the same thing. Great minds think alike I guess. Oh well..
twodave
They’re inherently very different activities. Refactoring a file assumes you’ve made a ton of choices already and are just following a pattern (something LLMs are actually great at). Building a front-end from nothing requires a lot of thought, and rather than ask questions the LLM will just give you some naive version of what you asked for, disregarding all of those tough choices.
elliotec
Yeah, these are both extremely basic, great use cases for LLM-assisted programming. There’s no difference; I wonder what the OP thinks it is.
klabb3
Disagree. There is almost no decision making in converting to use i18n APIs that already have example use cases elsewhere. Building a frontend involves many decisions, such as picking a language, build system, dependencies, etc. I’m sure the LLM would finish the task, but it could make many suboptimal decisions along the way. In my experience it also does make very different decisions from what I would have made.
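To make the contrast concrete, a hypothetical before/after of the "mechanical" task; the i18n module and its t() helper stand in for whatever translation layer the codebase already uses:

```python
# Hypothetical before/after: route user-facing strings through an existing i18n API.
from i18n import t   # hypothetical helper, assumed to be used elsewhere in the codebase

# Before:
def greeting(name: str) -> str:
    return f"Welcome back, {name}!"

# After: same behavior, but the string is looked up through the i18n layer.
def greeting_i18n(name: str) -> str:
    return t("greeting.welcome_back", name=name)

# Building a new frontend, by contrast, starts with open-ended choices
# (framework, build tooling, routing, state management) that a one-line
# prompt usually doesn't pin down.
```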
nialv7
> It can infer why something may have been written in a particular way, but it (currently) does not have access to the actual/point-in-time reasoning the way an actual engineer/maintainer would.
Is that really true? A human programmer has hidden states, i.e. what is going on in their head cannot be fully recovered by just looking at the output. And that's why "Software evolves more rapidly under the maintenance of its original creator, and in proportion to how recently it was written", as is astutely observed by the author.
But transformer-based LLMs do not have this kind of hidden state. If you retain the text log of your conversation with an LLM, you can reproduce its inner-layer outputs exactly. In that regard, an LLM is actually much better than humans.
Retr0id
The internal state and accompanying transcripts of an LLM isn't really comparable to the internal state of a human developer.
mrweasel
My old boss and I used to defend ourselves to younger colleagues with the argument that "This is how you did it back in the day". Mostly it was a joke, to "cover up" our screw-ups and "back in the day" could be two weeks ago.
Still, for some things we weren't wrong: our weird hacks were due to crazy edge cases or integrations with systems designed in a different era. But we were around to help assess whether the code could be yanked, or at least attempt to yank it.
LLM-assisted coding could technically be better for technical debt, assuming that you store the prompts alongside the code. Letting someone see what prompt generated a piece of code could be really helpful. Imagine having "ensure to handle the edge case where the client is running AIX 6". That answers a lot of questions, and while you still don't know who was running AIX, you can now start investigating whether this is still needed.
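A minimal sketch of what "prompt stored alongside the code" could look like (the prompt text and the function below are both made up for illustration):

```python
# Hypothetical: the originating prompt is kept next to the code it produced,
# so the "why" behind the special case survives the original author leaving.
#
# Prompt (hypothetical): "Write a function that normalizes line endings in
# uploaded config files. Ensure to handle the edge case where the client is
# running AIX 6 and uploads files with mixed line endings."
def normalize_line_endings(text: str) -> str:
    # The mixed-endings edge case from the prompt above.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```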
int_19h
If you have that kind of thing in the prompt, LLMs will almost certainly put it in a comment in the relevant section of the code. Both Gemini and Sonnet are very comment-happy, in my experience, to the point where it's actually detrimental - but it covers your particular scenario for sure.
mplanchard
> ensure to handle the edge case where the client is running AIX 6
Regardless if the source was AI or not, this should just be a comment in the code, shouldn’t it? This is exactly the sort of thing I would ask for in code review, so that future authors understand why some weird code or optimization exists.
mrweasel
It should, but I think most of us have read enough old code to know that this doesn't happen as often as we'd like. With an LLM you've already written the prompt, so if you had an easy way to attach the prompt to the code, it would make it more likely that some form of documentation exists.
Sometimes you also fail to write the comment because, at the time, everyone knew why you did it like that - it's what everyone did. Now it's 10 years later and nobody knows that anymore. The LLM prompt could still require you to type out the edge case that everyone in your line of business knows about, but which might not have generalised across the entirety of the software industry.
mplanchard
I'm always a fan of more historical context on code, so maybe it'd be an okay fallback for a proper comment and git blame, but you'd have to find a way to separate the meaningful part of the prompt from the preamble that is implicitly inserted into most prompts ("you are a helpful assistant, only output code, no formatting markers, etc etc").
That said, I suspect it wouldn't actually be that useful for the "why" very often, which is usually what's lacking in these cases. I would guess that far more people prompt the LLMs with the "how," e.g. "write a function that takes these things as input and produces this output," which of course we can just read in the code.
aledalgrande
Or a good commit/PR description?
OnlyMortal
All code is legacy. Business needs shift.
The likes of Copilot are OK at boilerplate if they have an example or two to follow. They’re utterly useless at solving problems, though.
andrewflnr
> All code is legacy.
No. Plainly incorrect by any reasonable definition (hint: it's in the memory of the people working on it! As described in OP!), and would immediately render itself meaningless if it were true.
surfingdino
Any code becomes legacy code as soon as it goes into production. The only non-legacy code is the one you delete right after writing it.
andrewflnr
By what definition? You might be more reluctant to change it now, but that's not the same thing.
OnlyMortal
You’re quite clearly wrong.
You write code to fit the immediate business need and that shifts rapidly over a year or two.
If you do otherwise, you’re wasting your time and the money of the enterprise you work for.
You cannot see the future however smart you might be.
saulpw
"Rapidly over a year or two"
That time window is when the code is not legacy yet. When the developers who wrote the code are still working on the code, the code is loaded into their collective brain cache, and the "business needs" haven't shifted so much that their code architecture and model are burdensome.
It's pithy to say "all code is legacy" but it's not true. Or, as from the other reply, if you take the definition to that extreme, it makes the term meaningless and you might as well not even bother talking, because your words are legacy the instant you say them.
mediaman
Why are enterprises running code from 1985?
How long code needs to last is actually highly variable, and categorical absolutist statements like this tend to be generally wrong and are specifically wrong here. Some code will need to change in a year. Some will need to last for forty years. Sometimes it's hard to know which is which at the time it is written, but that's part of the job of technical leadership: to calibrate effort to the longevity of the expected problem and the risks of getting it wrong.
andrewflnr
You would start to have a case if you said "all code older than a year or two". You didn't, you just said "all", including code you wrote last week or five minutes ago. More to the point, you're including well-factored code that you know well and are used to working with day in and day out. If that's legacy code, then you've triggered the second half of my objection.
perrygeo
Obviously, code is constantly changing. That's not really the point. The point is that as soon as no one understands the code (thus no one on staff to effectively debug or change it) it's "legacy" code.
Let's say you need to make big sweeping changes to a system. There's a big difference if the company has the authors still happily on staff vs. a company that relies on a tangle of code that no one understands (the authors fired 3 layoffs ago). Guess which one has the ability to "shift rapidly"?
colesantiago
We are going to get more of it anyway.
Plenty of new jobs from AI because of AI code, vibe coding.
DevKoala
I’ve been very successful with AI generated code. I provide the requirements and design the system architecture, and AI generates the code I would usually delegate. If any specific part requires me to dig in, I do it myself.
PS: Also, some people act as if they have to remove their common sense when using Gen AI code. You have to review and test the generated code before merging it.
hollerith
All this legacy code is going to be hell on the AIs that will have to maintain it in the future
codr7
They don't have to do anything.
When society crumbles because nothing works anymore, it's going to be our problem.
closewith
Legacy code is invariably the highest earning code for any business, so this is not the angle you want to take with management if your intention is to dissuade them from AI coding tools.
fc417fc802
All the downsides of legacy without the upside of having been selected for by the market.
null
Forget AI "code", every single request will be processed BY AI!
People aren't thinking far enough, why bother with programming at all when an AI can just do it?
It's very narrow to think that we will even need these 'programmed' applications in the future. Who needs operating systems and all that when all of it can just be AI.
In the future we don't even need hardware specifications since we can just train the AI to figure it out! Just plug inputs and outputs from a central motherboard to a memory slot.
Actually forget all that, it'll just be a magic box that takes any kind of input and spits out an output that you want!
--
Side note: It would be really interesting to see a website that generates all the pages every time a user requests them, every time you navigate back it would look the same, but some buttons in different places, the live chat is in a different corner, the settings now have a vertical sidebar instead of a horizontal menu.