
GitHub CEO: manual coding remains key despite AI boom

agentultra

… because programming languages are the right level of precision for specifying a program you want. Natural language isn’t it. Of course you need to review and edit what it generates. Of course it’s often easier to make the change yourself instead of describing how to make the change.

I wonder if the independent studies that show Copilot increasing the rate of errors in software have anything to do with this less bold attitude. Most people selling AI are predicting the obsolescence of human authors.

soulofmischief

Transformers can be used to automate testing, create deeper and broader specifications, accelerate greenfield projects, rapidly and precisely expand a developer's knowledge as needed, navigate unfamiliar APIs without relying on reference docs, build out initial features, do code review and so much more.

Even if code is the right medium for specifying a program, transformers act as an automated interface between that medium and natural language. Modern high-end transformers have no problem producing code, while benefiting from a wealth of knowledge that far surpasses any individual.
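
Concretely, that interface is just natural language in, code out. A minimal sketch, assuming the OpenAI Python client and an API key in the environment (the model name and prompt are illustrative, not a recommendation):

    # Minimal sketch: a natural-language spec goes in, code comes out.
    # Assumes `pip install openai` and OPENAI_API_KEY set; the model
    # name is illustrative.
    from openai import OpenAI

    client = OpenAI()
    spec = "Write a Python function that parses ISO 8601 timestamps."
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Reply with code only."},
            {"role": "user", "content": spec},
        ],
    )
    print(response.choices[0].message.content)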

> Most people selling AI are predicting the obsolescence of human authors.

It's entirely possible that we do become obsolete for a wide variety of programming domains. That's simply a reality, just as weavers saw massive layoffs in the wake of the automated loom, or scribes lost work after the printing press, or human calculators became pointless after high-precision calculators became commonplace.

This replacement might not happen tomorrow, or next year, or even in the next decade, but it's clear that we are able to build capable models. What remains to be done is R&D around things like hallucinations, accuracy, affordability, etc. as well as tooling and infrastructure built around this new paradigm. But the cat's out of the bag, and we are not returning to a paradigm that doesn't involve intelligent automation in our daily work; programming is literally about automating things and transformers are a massive forward step.

That doesn't really mean anything, though; you can still be as involved in your programming work as you'd like. Whether you can find paid, professional work depends on your domain, skill level and compensation preferences. But you can always program for fun or personal projects, and decide how much or how little automation you use. I will recommend, though, that you take these tools seriously and not be too dismissive, or you could find yourself left behind in a rapidly evolving landscape, much as with the advent of personal computing and the internet.

interstice

> Modern high-end transformers have no problem producing code, while benefiting from a wealth of knowledge that far surpasses any individual.

It will also still happily turn your whole codebase into garbage rather than undo the first thing it tried to try something else. I've yet to see one that can back itself out of a logical corner.

jcalvinowens

This is it for me. If you ask these models to write something new, the result can be okay.

But the second you start iterating with them... the codebase goes to shit, because they never delete code. Never. They always bolt new shit on to solve any problem, even when there's an incredibly obvious path to achieve the same thing in a much more maintainable way with what already exists.

Show me a language model that can turn Rube Goldberg code into good readable code, and I'll suddenly become very interested in them. Until then, I remain a hater, because they only seem capable of the opposite :)

soulofmischief

That's a combination of current context limitations and a lack of quality tooling and prompting.

A well-designed agent can absolutely roll back code if given proper context and access to tooling such as git. Even flushing context/message history becomes viable for agents if the functionality is exposed to them.
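
A minimal sketch of what that can look like, assuming a git working tree and Python's subprocess module (the function names are hypothetical, not any particular agent framework's API):

    # Sketch of snapshot/rollback tools an agent could call. The agent
    # snapshots before each attempt; if it codes itself into a corner,
    # it hard-resets instead of piling more code on top.
    import subprocess

    def run(*args: str) -> str:
        return subprocess.run(
            args, check=True, capture_output=True, text=True
        ).stdout.strip()

    def snapshot(message: str) -> str:
        """Commit the current working tree and return the commit hash."""
        run("git", "add", "-A")
        run("git", "commit", "-m", message, "--allow-empty")
        return run("git", "rev-parse", "HEAD")

    def rollback(commit: str) -> None:
        """Abandon everything after `commit`."""
        run("git", "reset", "--hard", commit)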

recursive

> It will also still happily turn your whole codebase into garbage rather than undo the first thing it tried to try something else.

That's not true at all.

...

It's only pretending to be happy.

nitwit005

I don't disagree exactly, but the AI that fully replaces all the programmers is essentially a superhuman one. It's matching human output, but will obviously be able to do some tasks like calculations much faster, and won't need a lunch break.

At that point it's less "programmers will be out of work" as "most work may cease to exist".

johnnyjeans

> That's simply a reality, just as weavers saw massive layoffs in the wake of the automated loom, or scribes lost work after the printing press, or human calculators became pointless after high-precision calculators became commonplace.

See, this is the kind of conception of a programmer I find completely befuddling. Programming isn't like those jobs at all. There's a reason people who are overly attached to code and see their job as "writing code" are pejoratively called "code monkeys." Did CAD kill the engineer? No. It didn't. The idea is ridiculous.

soulofmischief

> Programming isn't like those jobs at all

I'm sure you understand the analogy was about automation and reduction in workforce, and that each of these professions have both commonalities and differences. You should assume good faith and interpret comments on Hacker News in the best reasonable light.

> There's a reason people who are overly attached to code and see their job as "writing code" are pejoratively called "code monkeys."

Strange. My experience is that "code monkeys" don't give a crap about the quality of their code or its impact with regards to the product, which is why they remain programmers and don't move on to roles which incorporate management or product responsibilities. Actually, the people who are "overly attached to code" tend to be computer scientists who are deeply interested in computation and its expression.

> Did CAD kill the engineer? No. It didn't. The idea is ridiculous.

Of course not. It led to a reduction in draftsmen, as now draftsmen can work more quickly and engineers can take on work that used to be done by draftsmen. The US Bureau of Labor Statistics states[0]:

  Expected employment decreases will be driven by the use of computer-aided design (CAD) and building information modeling (BIM) technologies. These technologies increase drafter productivity and allow engineers and architects to perform many tasks that used to be done by drafters.

Similarly, the other professions I mentioned were absorbed into higher-level professions. It has been stated many times that the future focus of software engineers will be less about programming and more about product design and management.

I saw this a decade ago at the start of my professional career and from the start have been product and design focused, using code as a tool to get things done. That is not to say that I don't care deeply about computer science, I find coding and product development to each be incredibly creatively rewarding, and I find that a comprehensive understanding of each unlocks an entirely new way to see and act on the world.

[0] https://www.bls.gov/ooh/architecture-and-engineering/drafter...

JoeOfTexas

Doesn't AI have diminishing returns on its pseudo-creativity? Throw all the training output of LLMs into a circle. If all input comes from other LLM output, the circle never grows. Humans constantly step outside the circle.

Perhaps LLMs can be modified to step outside the circle, but as of today, it would be akin to monkeys typing.

svachalek

I think you're either imagining the circle too small or overestimating how often humans step outside it. The typical programming job involves lots and lots of work, yet none of it creates wholly original computer science. Current LLMs can customize well-known UI/IO/CRUD/REST patterns with little difficulty, and these make up the vast majority of commercial software development.

msgodel

In my experience the best use of AI is to stay in the flow state when you get blocked by an API you don't understand or a feature you don't want to implement for whatever reason.

JamesBarney

The right level for exactly specifying program behavior in a global domain, without context.

But once you add repo context, domain knowledge, etc., programming languages are far too verbose.

sysmax

AI can very efficiently apply common patterns to vast amounts of code, but it has no inherent "idea" of what it's doing.

Here's a fresh example that I stumbled upon just a few hours ago. I needed to refactor some code that first computes the size of a popup, and then separately, the top left corner.

For brevity, one part used an "if", while the other one had a "switch":

    if (orientation == Dock.Left || orientation == Dock.Right)
        size = /* horizontal placement */
    else
        size = /* vertical placement */

    var point = orientation switch
    {
        Dock.Left => ...
        Dock.Right => ...
        Dock.Top => ...
        Dock.Bottom => ...
    };

I wanted the LLM to refactor it to store the position rather than applying it immediately. Turns out, it just could not handle two different constructs (an if vs. a switch) doing a similar thing. I tried several variations of prompts, but it leaned very strongly toward either two ifs or two switches, despite rather explicit instructions not to do so.

It sort of makes sense: once the model has "completed" an if, and then encounters the need for a similar thing, it will pick an "if" again, because, well, it is completing the previous tokens.

Harmless here, but in many slightly less trivial examples, it would just steamroll over nuance and produce code that appears good, but fails in weird ways.

That said, splitting tasks into smaller parts devoid of such ambiguities works really well. It's way easier to say "store size in m_StateStorage and apply on render" than to manually edit 5 different points in the code. Especially with stuff like Cerebras, which can chew through complex code at several kilobytes per second, expanding simple thoughts faster than you could physically type them.
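
For illustration, the shape I was nudging it toward looks roughly like this (a Python rendering with made-up names and geometry, not the actual C#):

    # Rough sketch of the target refactor: one table per orientation
    # yields both size and top-left point, which get stored and applied
    # later on render. Names and geometry are illustrative only.
    def compute_placement(orientation, anchor_x, anchor_y, w, h):
        table = {
            "left":   ((h, w), (anchor_x - w, anchor_y)),
            "right":  ((h, w), (anchor_x,     anchor_y)),
            "top":    ((w, h), (anchor_x, anchor_y - h)),
            "bottom": ((w, h), (anchor_x, anchor_y)),
        }
        size, point = table[orientation]
        return {"size": size, "point": point}  # applied on render, not here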

gametorch

Yeah that's one model that you happen to be using in June 2025.

Give it to o3 and it could definitely handle that today.

Sweeping generalizations about how LLMs will never be able to do X, Y, or Z coding task will all be proven wrong with time, imo.

npinsker

Sweeping generalizations about how LLMs will always (someday) be able to do arbitrary X, Y, and Z don't really capture me either

gametorch

In response to your sweeping generalization, I posit a sweeping generalization of my own, said the bard:

Whatever can be statistically predicted

by the human brain

Will one day also be

statistically predicted by melted sand

sysmax

I am working on a GUI for delegating coding tasks to LLMs, so I routinely experiment with a bunch of models doing all kinds of things. In this case, Claude Sonnet 3.7 handled it just fine, while Llama-3.3-70B just couldn't get it. But that is literally the simplest example that illustrates the problem.

When I tried giving top-notch LLMs harder tasks (scan an abstract syntax tree coming from a parser in a particular way, and generate nodes for particular things), they completely blew it. The output didn't even compile, never mind the logical errors and missed points. But once I broke the problem down into making lists of relevant parsing contexts and generating one wrapper class at a time, it saved me a whole ton of work. It took me a day to accomplish what would normally take a week.

Maybe they will figure it out eventually, maybe not. The point is, right now the technology has fundamental limitations, and you are better off knowing how to work around them, rather than blindly trusting the black box.

gametorch

Yeah exactly.

I think it's a combination of

1) wrong level of granularity in prompting

2) lack of engineering experience

3) autistic rigidity regarding a single hallucination throwing the whole experience off

4) subconscious anxiety over the threat to their jerbs

5) unnecessary guilt over going against the tide; anything pro AI gets heavily downvoted on Reddit and is, at best, controversial as hell here

I, for one, have shipped like literally a product per day for the last month and it's amazing. Literally 2,000,000+ impressions, paying users, almost 100 sign ups across the various products. I am fucking flying. Hit the front page of Reddit and HN countless times in the last month.

Idk if I break down the prompts better or what. But this is production grade shit and I don't even remember the last time I wrote more than two consecutive lines of code.

guappa

If you need a model per task, we're very far from AGI.

DataDaoDe

The interesting questions happen when you define X, Y, Z, and a time frame. For example, will LLMs be able to solve the P = NP problem in two weeks, six months, five years, a century? And then you can explore why or why not.

soulofmischief

> AI can very efficiently apply common patterns to vast amounts of code, but it has no inherent "idea" of what it's doing.

AI stands for Artificial Intelligence. There are no inherent limits around what AI can and can't do or comprehend. What you are specifically critiquing is the capability of today's popular models, specifically transformer models, and accompanying tooling. This is a rapidly evolving landscape, and your assertions might no longer be relevant in a month, much less a year or five years. In fact, your criticism might not even be relevant between current models. It's one thing to speak about idiosyncrasies between models, but any broad conclusions drawn outside of a comprehensive multi-model review with strict procedure and controls is to be taken with a massive grain of salt, and one should be careful to avoid authoritative language about capabilities.

It would be useful to be precise in what you are critiquing, so that the critique actually has merit and applicability. Even saying "LLM" is a misnomer, as modern transformer models are multi-modal and trained on much more than just textual language.

mattbee

What a ridiculous response, to scold the GP for criticising today's AI because tomorrow's might be better. Sure, it might! But it ain't here yet buddy.

Lots of us are interested in technology that's actually available, and we can all read date stamps on comments.

soulofmischief

You're projecting that I am scolding OP, but I'm not. My language was neutral and precise. I presented no judgment, but gave OP the tools to better clarify their argument and express valid, actionable criticism instead of wholesale criticizing "AI" in a manner so imprecise as to reduce the relevance and effectiveness of their argument.

> But it ain't here yet buddy . . . we can all read date stamps on comments.

That has no bearing on the general trajectory that we are currently on in computer science and informatics. Additionally, your language is patronizing and dismissive, trading substance for insult. This is generally frowned upon in this community.

You failed to actually address my comment, both by failing to recognize that it was mainly about using the correct terminology instead of criticizing an entire branch of research that extends far beyond transformers or LLMs, and by failing to establish why a rapidly evolving landscape does not mean that certain generalizations cannot yet be made, unless they are presented with several constraints and caveats, which includes not making temporally-invariant claims about capabilities.

I would ask that you reconsider your approach to discourse here, so that we can avoid this thread degenerating into an emotional argument.

taysix

I had a fun result the other day from Claude. I opened a script in Zed and asked it to "fix the error on line 71". Claude happily went and fixed the error on line 91....

1. There was no error on line 91; it did some inconsequential formatting on that line.

2. More importantly, it just ignored the very specific line I told it to go to. It's like I was playing telephone with the LLM, which felt so strange with text-based communication.

This was me trying to get better at using the LLM while coding and seeing if I could "one-shot" some very simple things. Of course me doing this _very_ tiny fix myself would have been faster. Just felt weird and reinforces this idea that the LLM isn't actually thinking at all.

klysm

LLMs probably have bad awareness of line numbers

mcintyre1994

I suspect if OP highlighted line 71 and added it to chat and said fix the error, they’d get a much better response. I assume Cursor could create a tool to help it interpret line numbers, but that’s not how they expect you to use it really.

recursive

How is this better than just using a formal language again?

senko

> This was me trying to get better at using the LLM while coding

And now you've learned that LLMs can't count lines. Next time, try asking it to "fix the error in function XYZ" or copy/paste the line in question, and see if you get better results.

> reinforces this idea that the LLM isn't actually thinking at all.

Of course it's not thinking, how could it? It's just a (rather big) equation.

throwdbaaway

As shared by Simon in https://news.ycombinator.com/item?id=44176523, a better agent will prepend the line numbers as a workaround, e.g. Claude Code:

    54 def dicts_to_table_string(
    55     headings: List[str], dicts: List[Dict[str, str]]
    56 ) -> List[str]:
    57     max_lengths = [len(h) for h in headings]
    58 
    59     # Compute maximum length for each column
    60     for d in dicts:

emp17344

That’s not what he’s saying there. There’s a separate tool that adds line numbers before feeding the prompt into the LLM. It’s not the LLM doing it itself.
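
The preprocessing step itself is trivial; something like this sketch (the actual Claude Code tool may format things differently):

    # Sketch of the workaround: number the lines before the file ever
    # reaches the model, so "line 71" refers to text it can actually see.
    def with_line_numbers(source: str) -> str:
        return "\n".join(
            f"{n:>5} {line}"
            for n, line in enumerate(source.splitlines(), start=1)
        )

    # prompt = with_line_numbers(open("script.py").read())
    # prompt += "\n\nFix the error on line 71."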

toephu2

Sounds like operator error to me.

You need to give LLMs context. Line number isn't good context.

meepmorp

> Line number isn't good context.

a line number is plenty of context - it's directly translatable into a range of bytes/characters in the file

mediaman

It's a tool. It's not a human. A line number works great for humans. Today, they're terrible for LLMs.

I can choose to use a screwdriver to hammer in a nail and complain about how useless screwdrivers are. Or I can realize when and how to use it.

We (including marketing & execs) have made a huge mistake in anthropomorphizing these things, because then we stop treating them like tools with specific use cases, to be used in certain ways, and start treating them like smart humans that don't need any of that.

Maybe one day they'll be there, but today they are screwdrivers. That doesn't make them useless.

lvl155

AI is still not there yet. For example, I constantly have to deal with mis-referenced documents and specifications. I think part of the problem is that it was trained on outdated materials and is unable to actively update on the fly. I can get around this problem but…

The problem with today’s LLM and GI solutions is that they try to solve all n steps when solving the first i steps would be infinitely more useful for human consumption.

exiguus

Personally, I define the job of a software engineer as transforming requirements into software. Software is not only code. Requirements are not only natural language. At the moment I can't manage to be faster with the AI than working manually, unless it's a simple task or simple software. In my experience, today's AIs are junior or mid-level developers, and in the last two years they haven't gotten significantly better.

nicbou

Most of the time, the requirements are not spelled out. Nobody even knows what the business logic is supposed to be. A lot of it has to be decided by the software engineer based on available information. It sometimes involves walking around the office asking people things.

It also requires a fair bit of wisdom to know where the software is expected to grow, and how to architect for that eventuality.

I can't picture an LLM doing a fraction of that work.

exiguus

I think that's my problem with AI. Let's say I have all the requirements, down to the smallest detail. Then I make my decisions at a micro level, formulate an architecture, take all the non-functionals into account. I would have to write a book as a prompt, and it would still not express my thoughts as accurately as writing the code right away. Apart from that, the prompt is a superfluous intermediate step in which I struggle to specify an exact program in imprecise natural language, with a result that is not reproducible.

mediaman

AI is less useful when I'm picking through specific, careful business logic.

But we still have lots of boilerplate code that's required, and it's really good at that. If I need a reasonably good-looking front-end UI, Claude will put that together quickly for me. Or if I need it to use a common algorithm against some data, it's good at applying that.

What I like about AI is that it lets me slide along the automation-specificity spectrum as I go through my work. Sometimes it feels like I've opened up a big stretch of open highway (some standard issue UI work with structures already well-defined) and I can "goose" it. Sometimes there's some tricky logic, and I just know the AI won't do well; it takes just as long to explain the logic as to write it in code. So I hit the brakes, slow down, navigate those tricky parts.

Previously, in the prior attempt to remove the need for engineers, no-code solutions instead trapped people. They pretended like everything is open highway. It's not, which is why they fail. But here we can apply our own judgment and skill in deciding when a lot of "common-pattern" code is required versus the times where specificity and business context is required.

That's also why these LLMs get over-hyped: people over-hype them for the same reason they over-hyped no-code; they don't understand the job to be done, and because of that, they actually don't understand why this is much better than the fanciful capabilities they pretend it has ("it's like an employee and automates everything!"). The tools are much more powerful when an engineer is in the driver's seat, exercising technical judgment on where and how much to use it for every given problem to be solved.

ra0x3

This is a really good point.

At a certain point, huge prompts (which you can readily find on GitHub, used by large LLM-based corps) are just... files of code? Why wouldn't I just write the code myself at that point?

Almost makes me think this "AI coding takeover" is strictly pushed by non-technical people.

layer8

One of the most useful properties of computers is that they enable reliable, eminently reproducible automation. Formal languages (like programming languages) not only allow us to unambiguously specify the desired automation to the utmost level of precision, they also allow humans to reason about the automation with precision and confidence. Natural language is a poor substitute for that. The ground truth of programs will always be the code, and if humans want to precisely control what a program does, they’ll be best served by understanding, manipulating, and reasoning about the code.

CoffeeOnWrite

“Manual” has a negative connotation. If I understand the article correctly, they mean “human coding remains key”. It's not clear to me that the GitHub CEO actually used the word “manual”; that would surprise me. Is there another source on this that's either more neutral or better at choosing accurate words? The last thing we need is to put down human coding as “manual”; human coders have a large toolbox of non-AI tools to automate their coding.

(Wow I sound triggered! sigh)

upghost

It's almost as bad as "manual" thinking!

anamexis

What is the distinction between manual coding and human coding?


ivanjermakov

Analog coding


dalyons

Acoustic coding

GuinansEyebrows

> Wow I sound triggered! sigh

this is okay! it's a sign of your humanity :)

layer8

How about “organic coding”? ;)

bad_haircut72

I think so many devs fail to realise that to your product manager / team lead, the interface between you and the LLM is basically the same. They write a ticket/prompt and get back a bunch of code that undoubtedly has bugs and misinterpretations in it, and it will probably go through a few rounds of back-and-forth revisions until it's good enough to ship (i.e. they tested it black-box style and it worked once). Then they can move on to the next thing, until whatever this ticket was about rears its ugly head again at some point in the future. If you aren't used to writing user stories / planning, you're really gonna be obsolete soon.

hnthrow90348765

My guess is they will settle on 2x the productivity of a pre-AI developer as the new skill floor, but then not take a look at how long meetings and other processes take.

Why not look at Bob, who takes like 2 weeks to write tickets on what he actually wants in a feature? Or Alice, who's really slow getting Figma designs done and validated? How nice would a "someone's bothered a developer" metric be, with the business seeking to get it to zero and talking very loudly about it, as they have about developers?

strict9

It's interesting to see a CEO express thoughts on AI and coding go in a slightly different direction.

Usually the CEO or investor says 30% (or some other made up number) of all code is written by AI and the number will only increase, implying that developers will soon be obsolete.

It's implied that 30% of all code submitted and shipped to production is from AI agents with zero human interaction. But of course this is not the case; it's the same developers as before, using tools to write code more rapidly.

And writing code is only one part of a developer's job in building software.

madeofpalk

He’s probably more right than not. But he also has a vested interest in this (just like the other CEOs who say the opposite), being in the business of human-mediated code.

yodon

Presumably you're aware that the full name of Microsoft's Copilot AI code authoring tool is "GitHub Copilot", that GitHub developed it, and that he runs GitHub.

Imustaskforhelp

Yea, which is why I was surprised too when he said this.

madeofpalk

Copilot. Not Pilot.

heisenbit

Well, I suspect GitHub's income is a function of the number of developers using it, so it is not surprising that he takes this position.

boshalfoshal

Imo this is a misunderstanding of what AI companies want AI tools to be and where the industry is heading in the near future. The endgame for many companies is SWE automation, not augmentation.

To expand:

1. Models "reason" and can increasingly generate code given natural language. It's not just fancy autocomplete; it's like having an intern to mid-level engineer at your beck and call to implement some feature. Natural language is generally sufficient when I interact with other engineers, so why is it not sufficient for an AI, which (in the limit) approaches an actual human engineer?

2. Business-wise, companies will not settle for augmentation. Software companies pay tons of money in headcount; it's probably most mid-sized companies' top or second line item. The endgame for leadership at these companies is to do more with less. This necessitates automation (in addition to augmenting the remaining roles).

People need to stop thinking of LLMs as "autocomplete on steroids" and actually start thinking of them as a "24/7 junior SWE who doesn't need to eat or sleep and can do small tasks at 90% accuracy with some reasonable spec." Yeah you'll need to edit their code once in a while but they also get better and cost less than an actual person.

throw234234234

That is the ideal for the management types and the AI labs themselves, yes. Copilots are just a way to test the market and gain data and adoption early. I don't think it is much of a secret anymore. We even see benchmarks created (e.g. OpenAI recently did one) that are solely about taking paid work away from programmers, measuring how many "paid tasks" a model can actually do. They have a benchmark; that's their target.

As a standard pleb, though, I can understand why this world, where the people with connections, social standing and capital have an advantage, isn't appealing to many on this forum. If anyone can do something, then other advantages that aren't as easy to acquire matter relatively more.

__loam

Folks who believe this are going to lose a lot of money fixing broken software and shipping products that piss off their users.

HexDecOctBin

Good for outsourcing companies. India used Y2K to build its IT sector and lift up its economy, hopefully Africa can do the same fixing AI slop.

h4kunamata

Too late. I am seeing developer after developer copy/paste from AI tools, and when asked, they have no idea how the code works because "it just works".

Google itself said 30% of their code is AI generated, and yet they had a recent worldwide outage. Coincidence?

You tell me.