Naur's "Programming as Theory Building" and LLMs replacing human programmers
79 comments
· April 28, 2025
ryandv
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.
This idea has already been explored by thought experiments such as John Searle's so-called "Chinese room" [0]; an LLM cannot have a theory about a program, any more than the computer in Searle's "Chinese room" understands "Chinese" by using lookup tables to generate canned responses to an input prompt.
One says the computer lacks "intentionality" regarding the topics that the LLM ostensibly appears to be discussing. Their words aren't "about" anything, they don't represent concepts or ideas or physical phenomena the same way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.
smithkl42
The Chinese Room argument is a great thought experiment for understanding why the computational model is an inadequate explanation of consciousness and qualia. But it proves nothing about reason, which LLMs have clearly shown needs to be distinguished from consciousness. And theories fall into the category of reason, not of consciousness. Or another way of putting it that you might find more acceptable: maybe a computer will never, internally, know that it has developed a theory - but it sure seems like it will be able to act and talk as if it had, much like a philosophical zombie.
ryandv
> The Chinese Room argument is a great thought experiment for understanding why the computational model is an inadequate explanation of consciousness and qualia.
To be as accurate as possible with respect to the primary source [0], the Chinese room thought experiment was devised as a refutation of "strong AI," or the position that
> the appropriately programmed computer really is a mind, in the sense that computers given the right programs can be literally said to understand and have other cognitive states.
Searle's position?
> Rather, whatever purely formal principles you put into the computer, they will not be sufficient for understanding, since a human will be able to follow the formal principles without understanding anything. [...] I will argue that in the literal sense the programmed computer understands what the car and the adding machine understand, namely, exactly nothing.
[0] https://home.csulb.edu/~cwallis/382/readings/482/searle.mind...
slippybit
> maybe a computer will never, internally, know that it has developed a theory
Happens to people all the time :) ... especially if they don't have a concept of theories and hypotheses.
People are dumb and uneducated only until they aren't anymore, which takes, even in the worst cases, no more than a decade of effort put in over time. In fact, we don't even know how crazy fast neurogenesis and/or cognitive abilities might increase when a previously dense person reaches, or "breaks through", a certain plateau. I'm sure there is research, but this is not something we can formulate a satisfyingly precise answer for.
If I formulate a new hypothesis, the LLM can tell me, "nope, you are the only idiot believing this path is worth pursuing". And if I go ahead, the LLM can tell me: "that's not how this usually works, you know", "professionals do it this way", "this is not a proof", "this is not a logical link", "this is nonsense but I commend your creativity!", all the way until the actual aha-moment when everything fits together and we have an actual working theory ... in theory.
We can then analyze the "knowledge graph" in 4D, and the LLM could learn a theory of what it's like to have a potential theory even though there is absolutely nothing that supports the hypothesis or its constituent links at the moment of "conception".
Stay put, it will happen.
lo_zamoyski
> The Chinese Room argument is a great thought experiment for understanding why the computational model is an inadequate explanation of consciousness and qualia. But it proves nothing about reason
I think you misunderstand the Chinese Room argument [0]. It is exactly about how a mechanical process can produce results without having to reason.
dingnuts
> it proves nothing about reason, which LLMs have clearly shown needs to be distinguished from consciousness.
Uh, they have? Are you saying they know how to reason? Because if so, why is it that when I give a state of the art model documentation lacking examples for a new library and ask it to write something, it cannot even begin to do that, even if the documentation is in the training data? A model that can reason should be able to understand the documentation and create novel examples. It cannot.
This happened to me just the other day. If the model can reason, examples of the language, which it has, and the expository documentation should have been sufficient.
Instead, the model repeatedly inserted bullshitted code in the style of the language I wanted, but with library calls and names based on a version of the library for another language.
This is evidence of reasoning ability? Claude Sonnet 3.7 and Gemini Pro both exhibited this behavior last week.
I think this technology is fundamentally the same as it has been since GPT-2.
looofooo0
But the LLM interacts with the program and the world through a debugger, runtime feedback, linters, fuzzers, etc., and we can collect all the user feedback, usage patterns... Moreover, it can also get visual feedback, reason through other programs like physics simulations, or use a robot to physically interact with the device running the code. It can use a proof verifier like Lean to ensure its logical model of the program is sound, and do some back and forth between the logical model and the actual program through experiments. Maybe not now, but I don't see why the LLM needs to be kept in the Chinese Room.
CamperBob2
You're seriously still going to invoke the Chinese Room argument after what we've seen lately? Wow.
The computer understands Chinese better than Searle (or anyone else) understood the nature and functionality of language.
ryandv
You're seriously going to invoke this braindead reddit-tier of "argumentation," or rather lack thereof, by claiming bewilderment and offering zero substantive points?
Wow.
TeMPOraL
Wait, isn't the conclusion to take from the "Chinese room" literally the opposite of what you suggest? I.e. it's the most basic, go-to example of a larger system showing capability (here, understanding Chinese) that is not present in any of its constituent parts individually.
> Their words aren't "about" anything, they don't represent concepts or ideas or physical phenomena the same way the words and thoughts of a human do. The computer doesn't actually "understand Chinese" the way a human can.
That's very much unclear at this point. We don't fully understand how we relate words to concepts and meaning ourselves, but to the extent we do, LLMs are by far the closest implementation of those same ideas in a computer.
ryandv
> the conclusion to take from the "Chinese room"
We can hem and haw about whether or not there are others, but the particular conclusion I am drawing from is that computers lack "intentionality" regarding language, and indeed about anything at all. Symbol shunting, pencil pushing, and the mechanics of syntax are insufficient for the production of meaning and understanding.
That is, to oversimplify, the broad distinction drawn in Naur's article regarding the "programming as text manipulation" view vis-a-vis "programming as theory building."
> That's very much unclear at this point.
It's certainly a central point of contention.
vacuity
The Chinese room experiment was originally intended by Searle to (IIUC) do as you claim and justify computers as being capable of understanding like humans do. Since then, it has been used both in this pro-computer, "black box" sense and in the anti-computer, "white box" sense. Personally, I think both are relevant, and the issue with LLMs currently is not a theoretical failing but rather that they aren't convincing when viewed as black boxes (e.g. the Turing test fails).
dragonwriter
The Chinese Room is a mirror that reflects people’s hidden (well, often not very, but still) biases about whether the universe is mechanical or whether understanding involves dualistic metaphysical woo back at them as conclusions.
That's not why it was presented, of course, Searle aimed at proving something, but his use of it just illustrates which side of that divide he was on.
Jensson
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.
Human theory building works; we have demonstrated this. Our science, letting us build things on top of other things, proves it.
LLM theory building so far doesn't: they always veer in a wrong direction after a few steps. You would need to prove that LLMs can build theories, just like we proved that humans can.
jerf
You can't prove LLMs can build theories like humans can, because we can effectively prove they can't. Most code bases do not fit in a context window. And any "theory" an LLM might build about a code base, analogously to the recent reasoning models, itself has to carve a chunk out of the context window, at what would have to be a fairly non-trivial percentage expansion of tokens versus the underlying code base, and there's already not enough tokens. There's no way that is big enough to build a theory of a code base.
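A rough back-of-envelope sketch of that mismatch (every number below is an assumption chosen only to show the orders of magnitude, not a measurement of any particular model or codebase):

    TOKENS_PER_LINE = 10          # assumed average for source code
    CODEBASE_LINES = 2_000_000    # a large but not unusual codebase
    CONTEXT_WINDOW = 200_000      # assumed context size of a current frontier model
    THEORY_OVERHEAD = 0.3         # assumed: "theory" notes add another 30% of tokens

    code_tokens = TOKENS_PER_LINE * CODEBASE_LINES
    total_tokens = int(code_tokens * (1 + THEORY_OVERHEAD))

    print(f"code alone:      {code_tokens:,} tokens")    # 20,000,000
    print(f"code + 'theory': {total_tokens:,} tokens")   # 26,000,000
    print(f"context window:  {CONTEXT_WINDOW:,} tokens")
    print(f"fits in context? {total_tokens <= CONTEXT_WINDOW}")   # False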
"Building a theory" is something I expect the next generation of AIs to do, something that has some sort of memory that isn't just a bigger and bigger context window. As I often observe, LLMs != AI. The fact that an LLM by its nature can't build a model of a program doesn't mean that some future AI can't.
imtringued
This is correct. The model context is a form of short term memory. It turns out LLMs have an incredible short term memory, but simultaneously that is all they have.
What I personally find perplexing is that we are still stuck at having a single context window. Everyone knows that Turing machines with two tapes require significantly fewer operations than a single-tape Turing machine that needs to simulate multiple tapes.
The reasoning stuff should be thrown into a separate context window that is not subject to training loss (only the final answer).
dkarl
The article is about what LLMs can do, and I read it as what they can do in theory, as they're developed further. It's an argument based on principle, not on their current limitations.
You can read it as a claim about what LLMs can do now, but that wouldn't be very interesting, because it's obvious that no current LLM can replace a human programmer.
I think the author contradicts themselves. They argue that LLMs cannot build theories because they fundamentally do not work like humans do, and they conclude that LLMs can't replace human programmers because human programmers need to build theories. But if LLMs fundamentally do not work like humans, how do we know that they need to build theories the same way that humans do?
falcor84
> they always veer in a wrong direction after a few steps
Arguably that's the case for humans too in the general case, as per the aphorism "Beware of a guy in a room" [0]. But as for AIs, the thing is that they're exponentially improving at this, such that according to METR, "The length of tasks that AI can do is doubling every 7 months"[1].
[0] https://medium.com/machine-words/a-guy-in-a-room-bbbe058645e...
[1] https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...
Jensson
Even dumb humans learn to play and beat video games on their own, so humans don't fail on this. Some humans fail to update their world model based on what other people tell them or when they don't care, but basically every human can learn from their own direct experiences if they focus on it.
psychoslave
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.
That burden of proof is on you, since you are presumably human and you are challenging the need of humans to have more than a mere appearance of having a theory when they claim to have one.
Note that even when the only theoretical assumption we go with is that we will have a good laugh watching other people go crazy after random bullshit is thrown at them, we still have a theory.
dcre
I agree. Of course you can learn and use a theory without having developed it yourself!
IanCal
Skipping that they say it's fallacious at the start, none of the arguments in the article are valid if you simply have models that:
1. Run code
2. Communicate with POs
3. Iteratively write code
(a rough sketch of such a loop follows below)
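Every function in the sketch (llm, run_tests, ask_product_owner) is a hypothetical stand-in, not a real API; it only shows the shape of the loop:

    def build_feature(spec: str, llm, run_tests, ask_product_owner) -> str:
        """Hypothetical agent loop: write code, run it, talk to the PO, iterate."""
        context = spec
        code = llm(f"Write code for:\n{context}")
        for _ in range(10):                      # bound the number of iterations
            result = run_tests(code)             # 1. run the code
            if result.passed:
                return code
            if result.needs_clarification:       # 2. communicate with the PO
                context += "\n" + ask_product_owner(result.question)
            # 3. iteratively rewrite with the new feedback
            code = llm(f"Revise for:\n{context}\nFailure:\n{result.log}\n\n{code}")
        return code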
n4r9
I thought the fallacy bit was tongue-in-cheek. They're not actually arguing from authority in the article.
The system you describe appears to treat programmers as mere cogs. Programmers do not simply write and iterate code as dictated by POs. That's a terrible system for all but the simplest of products. We could implement that system, then lose the ability to make broad architectural improvements, effectively adapt the model to new circumstances, or fix bugs that the model cannot.
IanCal
> The system you describe appears to treat programmers as mere cogs
Not at all, it simply addresses the key issues raised: that they cannot have a theory of the program because they are reading it rather than actually writing it. So have them write code, fix problems, and iterate. Have them communicate with others to get more understanding of the "why".
> . Programmers do not simply write and iterate code as dictated by POs.
Communicating with POs is not the same as writing code directed by POs.
woah
These long winded philosophical arguments about what LLMs can't do which are invariably proven wrong within months are about as misguided as the gloom and doom pieces about how corporations will be staffed by "teams" of "AI agents". Maybe it's best just to let them cancel each other out. Both types of article seem to be written by people with little experience actually using AI.
falcor84
> First, you cannot obtain the "theory" of a large program without actually working with that program...
> Second, you cannot effectively work on a large program without a working "theory" of that program...
I find the whole argument and particularly the above to be a senseless rejection of bootstrapping. Obviously there was a point in time (for any program, individual programmer and humanity as a whole) that we didn't have a "theory" and didn't do the work, but now we have both, so a program and its theory can appear "de novo".
So with that in mind, how can we reject the possibility that as an AI Agent (e.g. Aider) works on a program over time, it bootstraps a theory?
Jensson
> So with that in mind, how can we reject the possibility that as an AI Agent (e.g. Aider) works on a program over time, it bootstraps a theory?
Lack of effective memory. That might have worked if you constantly retrained the LLM, incorporating the new wisdom iteratively like a human does, but the current LLM architecture doesn't enable that. The context provided is neither large enough nor can the model use it effectively enough for complex problems.
And this isn't easy to solve: you very quickly collapse the LLM if you try to do this in the naive ways. We need some special insight that lets us update the LLM continuously, in a positive direction, as it works, the way humans can.
falcor84
Yeah, that's a good point. I absolutely agree that it needs access to effective long-term memory, but it's unclear to me that we need some "special insight". Research is relatively early on this, but we already see significant sparks of theory-building using basic memory retention, when Claude and Gemini are asked to play Pokemon [0][1]. It's clearly not at the level of a human player yet, but it (particularly Gemini) is doing significantly better than I expected at this stage.
Jensson
They update the "Gemini Plays Pokemon" setup with new prompt engineering etc. when it gets stuck. So the learning there happens in a human, not in the LLM; the LLM can do a lot with trial and error, but if you follow it, it repeats the same action over and over and stays stuck until the prompt engineering kicks it into self-evaluation twenty steps later.
So that isn't just "ask it to play Pokemon"; it's a large program with tons of different prompts and memories that kick in at different times, and even with all that, plus updates to the program when it gets stuck, it still struggles massively and repeats mistakes over and over in ways a human never would.
raincom
Yes, indeed. They think that every circular argument is vicious. Not at all: there are two kinds of circularity, virtuous and vicious. Bootstrapping falls under the former. Check [1] and [2].
[1] https://www.hipkapi.com/2011/03/10/foundationalism-and-virtu...
[2] Brown, Harold I. “Circular Justifications.” PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1994 (1994): 406–14. http://www.jstor.org/stable/193045.
mrkeen
> So with that in mind, how can we reject the possibility that as an AI Agent (e.g. Aider) works on a program over time, it bootstraps a theory?
That's the appropriate level of faith for today's LLMs. They're not good enough to replace programmers. They're good enough that we can't reject the possibility of them one day being good enough to replace programmers.
2mlWQbCK
And good enough does not mean "as good as". Companies happily outsource programming jobs to worse, but much cheaper, programmers, all the time.
codr7
I for one wouldn't mind seeing more focus on probability than possibility here.
Possibility means practically nothing.
mlsu
The information needs to propagate through the network either forward (when the model has the codebase in context) or backward (when it updates its weights).
You can have the models pseudo “learn” by putting things in something like a system prompt but this is limited by context, and they will never permanently learn. But we don’t train at inference time with today’s LLMs.
We can explicitly reject this possibility by looking at the information that goes into the model at train and test time.
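A sketch of the two information paths described above; `model` and `optimizer` are hypothetical PyTorch-style placeholders, not a real library API:

    def forward_only(model, codebase_text: str, question: str) -> str:
        # In-context "learning": the codebase rides along in the prompt.
        # Nothing about the model persists after the call returns.
        return model.generate(codebase_text + "\n\n" + question)

    def backward_update(model, optimizer, codebase_batches) -> None:
        # Training: gradients flow backward and the weights change permanently.
        # Deployed LLMs do not do this at inference time.
        for batch in codebase_batches:
            loss = model.loss(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()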
BenoitEssiambre
Solomonoff induction says that the shortest program that can simulate something is its best explanatory theory. OpenAI researchers very much seem to be trying to do theory building ( https://x.com/bessiambre/status/1910424632248934495 ).
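A toy illustration of the "shorter description = better theory" intuition, using compressed size as a very crude stand-in for program length (an MDL-style proxy, not Solomonoff induction itself):

    import os
    import zlib

    def description_length(data: bytes) -> int:
        # Compressed size as a rough upper bound on "shortest program that
        # reproduces the data".
        return len(zlib.compress(data, level=9))

    structured = bytes(i % 7 for i in range(10_000))   # data generated by a simple rule
    random_ish = os.urandom(10_000)                    # data with no short rule

    print(description_length(structured))   # tiny: a short "theory" exists
    print(description_length(random_ish))   # ~10,000: no compressing theory found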
analyte123
The "theory" of a program is supposed to be majority embedded in its identifiers, tests, and type definitions. The same line of reasoning in this article could be used to argue that you should just name all your variables random 1 or 2 letter combinations since the theory is supposed to be all in your head anyway.
Indeed, it's quickly obvious where an LLM is lacking context because the type of a variable is not well-specified (or specified at all), the schema of a JSON blob is not specified, or there is some other secret constraint that maybe someone had in their head X years ago.
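A contrived before/after of that point; the domain and field names are made up:

    from dataclasses import dataclass

    def f(a, b):                       # the "theory" lives only in someone's head
        return a * (1 - b)

    @dataclass
    class LineItem:
        unit_price_cents: int          # prices stay in integer cents
        discount_fraction: float       # 0.0 .. 1.0, applied per item

    def discounted_price_cents(item: LineItem) -> int:
        # The constraints an LLM (or a new teammate) would otherwise have to
        # guess are now stated in the identifiers and types.
        return round(item.unit_price_cents * (1 - item.discount_fraction))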
ebiester
First, I think it's fair to say that today, an LLM cannot replace a programmer fully.
However, I have two counters:
- First, the rational argument right now is that one person plus money spent on LLMs can replace three or more programmers in total. This is the argument with a three-year bound. The current technology will improve, and developers will learn how to use it to its potential.
- Second, the optimistic argument is that a combination of the LLM model with larger context windows and other supporting technology around it will allow it to emulate a theory of mind similar to the average programmer's. Consider Go or chess: we didn't think computers had the theory of mind to be better than a human, but they found other ways. For humans, Naur's advice stands. We cannot assume it holds for tools with different strengths and weaknesses than humans.
ActionHank
I think that everyone is misjudging what will improve.
There is no doubt it will improve, but if you look at a car, it is still fundamentally the same "shape" as a Model T.
There are niceties and conveniences, efficiency went way up, but we don't have flying cars.
I think we are going to have something, somewhere in the middle, AI features will eventually find their niche, people will continue to leverage whatever tools and products are available to build the best thing they can.
I believe that a future of self-writing code pooping out products, AI doing all the other white collar jobs, and robots doing the rest cannot work. Fundamentally there is no "business" without customers and no customers if no one is earning.
rowanseymour
If you forced me to put a number on how much more productive having copilot makes me I think I would say < 5%, so I'm struggling to see how anyone can just assert that "the rational argument right now" is that I can be 200% more productive.
Maybe as a senior dev working on a large, complex, established project I don't benefit from LLMs as much as others, because as I and the project mature... productivity becomes less and less correlated with lines of code, and more about the ability to comprehend the bigger picture and how different components interact... things that even LLMs with bigger context aren't good at.
spacemadness
This is what I tried explaining to our management who are using lines of code metrics on engineers working on an established codebase. Other than lines of code being a terrible metric in general, they don’t seem to understand or care to understand the difference.
xpe
> Theories are developed by doing the work and LLMs do not do the work. They ingest the output of work.
This is often the case but does not _have_ to be so. LLMs can use chain of thought to “talk out loud” and “do the work”. It can use supplementary documents and iterate on its work. The quality of course varies, but it is getting better. When I read Gemini 2.5’s “thinking” notes, it indeed can build up text that is not directly present in its training data.
Putting aside anthropocentric definitions of "reasoning" and "consciousness" is key to how I think about the issues here. I'm intentionally steering completely clear of consciousness.
Modern SOTA LLMs are indeed getting better at what people call “reasoning”. We don’t need to quibble over defining some quality bar; that is probably context-dependent and maybe even arbitrary.
It is clear LLMs are doing better at “reasoning” — I’m using quotes to emphasize that (to me) it doesn’t matter if their inner mechanisms for doing reasoning don’t look like human mechanisms. Instead, run experiments and look at the results.
We’re not talking about the hard problem of consciousness, we’re talking about something that can indeed be measured: roughly speaking, the ability to derive new truths from existing ones.
(Because this topic is charged and easily misunderstood, let me clarify some questions that I’m not commenting on here: How far can the transformer-based model take us? Are data and power hungry AI models cost-effective? What viable business plans exist? How much short-term risk, to say, employment and cybersecurity? How much long-term risk to human values, security, thriving, and self-determination?)
Even if you disagree with parts of my characterization above, hear this: We should at least be honest to ourselves when we move the goal posts.
Don’t mistake my tone for zealotry. I’m open to careful criticism. If you do, please don’t try to lump me into one “side” on the topic of AI — whether it be market conditions, commercialization, safety, or research priorities — you probably don’t know me well enough to do that (yet). Apologies for the pre-defensive posture; but the convos here are often … fraught, so I’m trying to head off some of the usual styles of reply.
BiraIgnacio
Great post, and Naur's paper is really great. What I can't stop thinking about are the many other cases where something should-not-be because being is less than ideal, and yet it insists on being. In other words, LLMs should not be able to largely replace programmers, and yet they might.
codr7
Might, potentially; it's all wishful thinking.
I might one day wake up and find my dog to be more intelligent than me, not very likely but I can't prove it to be impossible.
It's still useless.
lo_zamoyski
In some respects, perhaps in principle they could. But what is the point of handing off the entire process to a machine, even if you could?
If programming is a tool for thinking and modeling, with execution by a machine as a secondary benefit, then outsourcing these things to LLMs contributes nothing to our understanding. By analogy, we do math because we wish to understand the mathematical universe, so to speak, not because we just want some practical result.
To understand, to know, are some of the highest powers of the human person. Machines are useful for helping us enable certain work or alleviate tedium to focus on the important stuff, but handing off understanding and knowledge to a machine (if it were possible, which it isn't) would be one of the most inhuman things you could do.
philipswood
> Theories are developed by doing the work and LLMs do not do the work. They ingest the output of work.
It isn't certain that this framing is true. As part of learning to predict the outcome of the work token by token, LLMs very well might be "doing the work" as an intermediate step via some kind of reverse engineering.
skydhash
> As part of learning to predict the outcome of the work token by token
They already have the full work available. When you're reading the source code of a program to learn how it works, your objective is not to learn which keywords are close to each other or to extract the common patterns. You're extracting a model, which is an abstraction over some real-world concept (or over other abstractions), plus rules for manipulating that abstraction.
After internalizing that abstraction, you can replicate it with whatever you want, extend it further, ... It's an internal model that you can shape as you please in your mind, then create a concrete realization once you're happy with the shape.
philipswood
As Naur describes this, the full code and documentation, and the resulting model you can build up from them, are merely "walking the path" (as the blog post put it); they do not encode "building the path".
I.e. the theory of the program as it exists in the minds of the development team might not be fully available for reconstruction from just the final code and docs, since it includes a lot of activity that does not end up in the code.
skydhash
It could be, if you were trying to only understand how the code does something. But more often, you're actively trying to understand how it was built by comparing assumptions with the code in front of you. It is not merely walking the path, if you've created a similar path and are comparing techniques.
MarkusQ
> the theory of the program as it exists in the minds of the development team might not be fully available for reconstruction from just the final code and docs
As an obvious and specific source of examples, all the features they decided to omit, "optimizations" they considered but rejected for various reasons, etc. are not present in the code and seldom in the comments or documentation.
Occasionally you will see things like "Full search rather than early exit on match to prevent timing attacks" or "We don't write it in format xyz because of patent issues" or some such, but the vast majority of such cases pass unremarked.
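A sketch of the timing-attack case mentioned above (in real code you would reach for hmac.compare_digest; this is only to show the kind of decision that usually goes unremarked):

    def tokens_equal(a: bytes, b: bytes) -> bool:
        # Full scan rather than early exit, so the time taken does not leak
        # how many leading bytes were correct.
        if len(a) != len(b):
            return False
        diff = 0
        for x, y in zip(a, b):
            diff |= x ^ y
        return diff == 0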
fedeb95
There's an additional difficulty: who told the man to build a road? This is the main thing that LLMs, or any other technology, currently seem to lack: the "why", a reason to do stuff a certain way and not another.
A problem as old as humanity itself.
lo_zamoyski
Yes, but it's more than that. As I've written before, LLMs (and all AI) lack intentionality. They do not possess concepts. They only possess, at best, conventional physical elements of signs whose meaning, and in fact identity as signs, are entirely subjective and observer relative, belonging only to the human user who interprets these signs. It's a bit like a book: the streaks of pigmentation on cellulose have no intrinsic meaning apart from being streaks of pigmentation on cellulose. They possess none of the conceptual content we associate with books. All of the meaning comes from the reader who must first treat these marks on paper as signs, and then interpret these signs accordingly. That's what the meaning of "reading" entails: the interpretation of symbols, which is to say, the assignment of meanings to symbols.
Formal languages are the same, and all physical machines typically contain are some kind of physical state that can be changed in ways established by convention that align with interpretation. LLMs, from a computational perspective, are just a particular application. They do not introduce a new phenomenon into the world.
So in that sense, of course LLMs cannot build theories strictly speaking, but they can perhaps rearrange symbols in a manner consistent with their training that might aid human users.
To make it more explicit: can LLMs/AI be powerful practically? Sure. But practicality is not identity. And even if an LLM can produce desired effects, the aim of theory in its strictest sense is understanding on the part of the person practicing it. Even if LLMs could understand and practice theory, unless they were used to aid us in our understanding of the world, who cares? I want to understand reality!
philipswood
The paper he quotes is a favorite of mine, and I think it has strong implications for the use of LLMs, but I don't think that this implies that LLMs can't form theories or write code effectively.
I suspect that the question to his final answer is:
> To replace human programmers, LLMs would need to be able to build theories by Ryle’s definition
skydhash
Having a theory of the program means you can argue about its current state or its transition to a new state, not merely describe what it is doing.
If you see "a = b + 1" it's obvious that the variable a is taking the value of variable b incremented by one. What LLMs can't do is explaining why we have this and why it needs to change to "a = b - 1" in the new iteration. Writing code is orthogonal to this capability.
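A contrived example of that point; the names and the indexing story are made up, but the "why" in the comments is exactly the part that lives outside the code:

    def ticket_number_for(position: int) -> int:
        # Iteration 1: list positions are 0-based, ticket numbers shown to
        # users start at 1.
        return position + 1

    def position_for(ticket_number: int) -> int:
        # Iteration 2: callers now pass ticket numbers instead of positions
        # (a decision made in review, not recorded in this file), so the
        # adjustment flips sign.
        return ticket_number - 1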
philipswood
> What LLMs can't do is explaining why we have this and why it needs to change to "a = b - 1" in the new iteration.
I did a search on Github for code containing `a=b+1` and found this:
https://github.com/haoxizhong/problem/blob/a2b934ee7bb33bbe9...
It looks to me that ChatGPT specifically does a more than OK job at explaining why we have this.
https://chatgpt.com/share/680f877d-b588-8003-bed5-b425e14a53...
While your use of 'theory' is reasonable, Naur uses a specific and more elaborate definition of theory.
Example from the paper:
>Case 1 concerns a compiler. It has been developed by a group A for a Language L and worked very well on computer X. Now another group B has the task to write a compiler for a language L + M, a modest extension of L, for computer Y. Group B decides that the compiler for L developed by group A will be a good starting point for their design, and get a contract with group A that they will get support in the form of full documentation, including annotated program texts and much additional written design discussion, and also personal advice. The arrangement was effective and group B managed to develop the compiler they wanted. In the present context the significant issue is the importance of the personal advice from group A in the matters that concerned how to implement the extensions M to the language. During the design phase group B made suggestions for the manner in which the extensions should be accommodated and submitted them to group A for review. In several major cases it turned out that the solutions suggested by group B were found by group A to make no use of the facilities that were not only inherent in the structure of the existing compiler but were discussed at length in its documentation, and to be based instead on additions to that structure in the form of patches that effectively destroyed its power and simplicity. The members of group A were able to spot these cases instantly and could propose simple and effective solutions, framed entirely within the existing structure. This is an example of how the full program text and additional documentation is insufficient in conveying to even the highly motivated group B the deeper insight into the design, that theory which is immediately present to the members of group A.
woah
This is like when I was making Wordpress sites in 2010 and I would hook it up with all the awesome admin panel plugins that I could find to provide the client the ability to customize any part of the site with one click and they still called me any time they needed to switch an image in the slideshow or publish a blog post
IanCal
What's the purpose of this?
> In this essay, I will perform the logical fallacy of argument from authority (wikipedia.org) to attack the notion that large language model (LLM)-based generative "AI" systems are capable of doing the work of human programmers.
Is any part of this intended to be valid? It's a very weak argument - is that the purpose?
n4r9
Although I'm sympathetic to the author's argument, I don't think they've found the best way to frame it. I have two main objections, i.e. points I guess LLM advocates might dispute.
Firstly:
> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.
To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.
Secondly:
> Theories are developed by doing the work and LLMs do not do the work
Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?