AI slows down open source developers. Peter Naur can teach us why
176 comments · July 14, 2025
jwhiles
Thanks for the response, and apologies for misrepresenting your results somewhat! I'm probably not going to change the title since I am at heart a polemicist and a sloppy thinker, but I'll update the article to call out this misrepresentation.
That said, I think that what I wrote more or less encompasses three of the factors you call out as being likely to contribute: "High developer familiarity with repositories", "Large and complex repositories", and "Implicit repository context".
I thought more about experimenting on myself, and while I hope to do it - I think it will be very hard to create a controlled environment whilst also responding to the demands the job puts on me. I also don't have the luxury of a list of well scoped tasks that could feasibly be completed in a few hours.
seanwilson
If this makes sense, how is the study able to give a reasonable measure of how long an issue/task should have taken, vs how long it took with AI to determine that using AI was slower?
Or it's comparing how long the dev thought it should take with AI vs how long it actually took, which now includes the dev's guess of how AI impacts their productivity?
When it's hard to estimate how difficult an issue should be to complete, how does the study account for this? What percent speed up or slow down would be noise due to estimates being difficult?
I do appreciate that this stuff is very hard to measure.
antonvs
> Early-2025 AI slows down experienced open-source developers.
Even that's too general, because it'll depend on what the task is. It's not as if open source developers in general never work on tasks where AI could save time.
narush
We call this over-generalization out specifically in the "We do not provide evidence that:" table in the blog post and paper - I agree there are tasks these developers are likely sped up on with early-2025 tools.
munificent
> The inability of developers to tell if a tool sped them up or slowed them down is fascinating in itself, probably applies to many other forms of human endeavour, and explains things as varied as why so many people think that AI has made them 10 times more productive, why I continue to use Vim, why people drive in London etc.
In boating, there's a notion of a "set and drift" which describes how wind and current pushes a boat off course. If a mariner isn't careful, they'll end up far from their destination because of it.
This is because when you're sitting in a boat, your perception of motion is relative and local. You feel the breeze on your face, and you see how the boat cuts through the surrounding water. You interpret that as motion towards your destination, but it can equally consist of wind and current where the medium itself is moving.
I think a similar effect explains all of these. Our perception of "making progress" is mostly a sense of motion and "stuff happening" in our immediate vicinity. It's not based on a perception of the goal getting closer, which is much harder to measure and develop an intuition for.
So people tend to choose strategies that make them feel like they're making progress even if it's not the most effective strategy. I think this is why people often take "shortcuts" when driving that are actually longer. All of the twists and turns keep them busy and make them feel like they're making more progress than zoning out on a boring interstate does.
wrsh07
Something I noticed early on when using AI tools was that it was great because I didn't get blocked. Somehow, I always wanted to keep going and always felt like I could keep going.
The problem, of course, is that one might thoughtlessly invoke the ai tool when it would be faster to make the one line change directly
Edit
This could make sense with the driving analogy. If the road I was planning to take is closed, gps will happily tell me to try something else. But if that fails too, it might go back to the original suggestion.
thinkingemote
Exactly! Waze, the navigation app, tends to route users onto longer routes that feel faster. When driving, we perceive our journey as fast or slow not by the actual length but by our memories of what happened. Waze knows human drivers are happier driving a route that may be longer in time and distance if they feel like they are making progress with the twists and turns.
AI tools make programming feel easier. That it might actually be less productive is interesting, but we humans prefer the easier shortcuts. Our memories of coding with AI tell us that we didn't struggle and therefore we made progress.
tjr
That sounds like a navigation tool that I absolutely do not want! Occasionally I do enjoy meandering around, but usually fastest / shortest path would be preferred.
And I'm not sure about the other either. In my 20+ year career in aerospace software, the most memorable times were solving interesting problems, not days with no struggle just churning out code.
Alex_L_Wood
We all as humans are hardwired to prefer greedy algorithms, basically.
PicassoCTs
I also think that AI-written code is just not read. People hate code reviews, and actively refuse to read code, because that is hard work: reading into other people's thoughts and ideas.
This is why pushing for new code, rewrites, new frameworks is so popular. https://www.joelonsoftware.com/2000/04/06/things-you-should-...
So a ton of AI-generated code is just that, never read. It's generated, tested against test functions - and that's it. I wouldn't be surprised if some of these devs themselves have only a marginal idea of what's in their codebases and why.
tjr
I have mostly worked in aerospace software, and find this rather horrifying. I suppose, if your tests are in fact good and comprehensive enough, there could be a logical argument for not needing to understand the code, but if we're talking people's safety in the hands of your software, I don't know if there is any number of tests I would accept in exchange for willingly giving up understanding of the code.
blake1
I think a reasonable summary of the study referenced is that: "AI creates the perception of productivity enhancements far beyond the reality."
Even within the study, there were some participants who saw mild improvements to productivity, but most had a significant drop in productivity. This thread is now full of people telling their story about huge productivity gains they made with AI, but none of the comments contend with the central insight of this study: that these productivity gains are illusions. AI is a product designed to make you value the product.
In matters of personal value, perception is reality, no question. Anyone relying heavily on AI should really be worried that it is mostly a tool for warping their self-perception, one that creates dependency and a false sense of accomplishment. After all, it speaks a highly optimized stream of tokens at you, and you really have to wonder what the optimization goal was.
BriggyDwiggs42
I’ve noticed that you can definitely use them to help you learn something, but that your understanding tends to be more abstract and LLM-like that way. You definitely want to mix it up when learning too.
daxfohl
I've also had bad results with hallucinations there. I was trying to learn more about multi-dimensional qubit algorithms, and spent a whole day learning a bunch of stuff that was fascinating but plain wrong. I only figured out it was wrong at the end of the day when I tried to do a simulation and the results weren't consistent.
Early in the chat it substituted a `-1` for an `i`, and everything that followed was garbage. There were also some errors that I spotted real-time and got it to correct itself.
But yeah, IDK, it presents itself so confidently and "knows" so much and is so easy to use, that it's hard not to try to use as a reference / teacher. But it's also quite dangerous if you're not confirming things; it can send you down incorrect paths and waste a ton of time. I haven't decided whether the cost is worth the benefit or not.
Presumably they'll get better at this over time, so in the long run (probably no more than a year) it'll likely easily exceed the ROI breakeven point, but for now, you do have to remain vigilant.
tonyedgecombe
I keep wondering whether the best way to use these tools is to do the work yourself then ask the AI to critique it, to find the bugs, optimisations or missing features.
nico
> They are experienced open source developers, working on their own projects
I just started working on a 3-month old codebase written by someone else, in a framework and architecture I had never used before
Within a couple hours, with the help of Claude Code, I had already created a really nice system to replicate data from staging to local development. Something I had built before in other projects, and I knew that manually it would take me a full day or two, especially without experience in the architecture
That immediately sped up my development even more, as now I had better data to test things locally
Then a couple hours later, I had already pushed my first PR. All code following the proper coding style and practices of the existing project and the framework. That PR, would have taken me at least a couple of days and up to 2 weeks to fully manually write out and test
So sure, AI won’t speed everyone or everything up. But at least in this one case, it gave me a huge boost
As I keep going, I expect things to slow down a bit, as the complexity of the project grows. However, it’s also given me the chance to get an amazing jumpstart
Vegenoid
I have had similar experiences as you, but this is not the kind of work that the study is talking about:
“When open source developers working in codebases that they are deeply familiar with use AI tools to complete a task, they take longer to complete that task”
I have anecdotally found this to be true as well, that an LLM greatly accelerates my ramp up time in a new codebase, but then actually leads me astray once I am familiar with the project.
Navarr
> I have anecdotally found this to be true as well, that an LLM greatly accelerates my ramp up time in a new codebase, but then actually leads me astray once I am familiar with the project.
If you are unfamiliar with the project, how do you determine that it wasn't leading you astray in the first place? Do you ever revisit what you had done with AI previously to make sure that, once you know your way around, it was doing it the right way?
Vegenoid
In some cases, I have not revisited, as I was happy to simply make a small modification for my use only. In others, I have taken the time to ensure the changes are suitable for upstreaming. In my experience, which I have not methodically recorded in any way, the LLM’s changes at this early stage have been pretty good. This is also partly because the changes I am making at the early stage are generally small, usually not requiring adding new functionality but simply hooking up existing functionality to a new input or output.
What’s most useful about the LLM in the early stages is not the actual code it writes, but its reasoning that helps me learn about the structure of the project. I don’t take the code blind, I am more interested in the reasoning than the code itself. I have found this to be reliably useful.
quantumHazer
no, they just claim that AI coding tools are magic and drink their kool-aid
Gormo
> I have anecdotally found this to be true as well, that an LLM greatly accelerates my ramp up time in a new codebase, but then actually leads me astray once I am familiar with the project.
How does using AI impact the amount of time it takes you to become sufficiently familiar with the project to recognize when you are being led astray?
One of the worries I have with the fast ramp-up is that a lot of that ramp-up time isn't just grunt work to be optimized a way, it's active learning, and bypassing too much of it can leave you with an incomplete understanding of the problem domain that slows you down perpetually.
Sometimes, there are real efficiencies to be gained; other times those perceived efficiencies are actually incurring heavy technical debt, and I suspect that overuse of AI is usually the latter.
pragma_x
Not just new code-bases. I recently used an LLM to accelerate my learning of Rust.
Coming from other programming languages, I had a lot of questions that would be tough to nail down in a Google search, or combing through docs and/or tutorials. In retrospect, it's super fast at finding answers to things that _don't exist_ explicitly, or are implied through the lack of documentation, or exist at the intersection of wildly different resources:
- Can I get compile-time type information of Enum values?
- Can I specialize a generic function/type based on Enum values?
- How can I use macros to reflect on struct fields?
- Can I use an enum without its enclosing namespace, as I can in C++?
- Does rust have a 'with' clause?
- How do I avoid declaring lifetimes on my types?
- What is an idiomatic way to implement the Strategy pattern?
- What is an idiomatic way to return a closure from a function?
...and so on. This "conversation" happened here and there over the period of two weeks. Not only was ChatGPT up to the task, but it was able to suggest what technologies would get me close to the mark if Rust wasn't built to do what I had in mind. I'm now much more comfortable and competent in the language, but miles ahead of where I would have been without it.
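For concreteness, a couple of those questions have answers that boil down to a few lines of Rust. Here is a minimal sketch (names like `make_adder` and `Pricing` are made up for illustration, not from any real codebase): returning a closure idiomatically, and a simple trait-object take on the Strategy pattern.

```rust
// Returning a closure: `impl Fn` works for a single concrete closure type;
// `Box<dyn Fn>` would be needed if the closure varied at runtime.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// A simple Strategy pattern: behavior is selected at runtime via a trait object.
trait Pricing {
    fn price(&self, cents: u64) -> u64;
}

struct Regular;
struct Discounted;

impl Pricing for Regular {
    fn price(&self, cents: u64) -> u64 {
        cents
    }
}

impl Pricing for Discounted {
    fn price(&self, cents: u64) -> u64 {
        cents * 9 / 10 // 10% off, in integer cents
    }
}

fn checkout(strategy: &dyn Pricing, cents: u64) -> u64 {
    strategy.price(cents)
}

fn main() {
    let add_two = make_adder(2);
    assert_eq!(add_two(40), 42);
    assert_eq!(checkout(&Regular, 1000), 1000);
    assert_eq!(checkout(&Discounted, 1000), 900);
}
```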
OptionOfT
Now the question is: did you gain the same knowledge and proficiency in the codebase that you would've gained organically?
I find that when working with an LLM the difference in knowledge is the same as learning a new language. Learning to understand another language is easier than learning to speak another language.
It's like my knowledge of C++. I can read it, and I can make modifications of existing files. But writing something from scratch without a template? That's a lot harder.
davidclark
> That PR, would have taken me at least a couple of days and up to 2 weeks to fully manually write out and test
What is your accuracy on software development estimates? I always see these productivity claims matched against “It would’ve taken me” timelines.
But, it’s never examined if we’re good at estimating. I know I am not good at estimates.
It’s also never examined if the quality of the PR is the same as it would’ve been. Are you skipping steps and system understanding which let you go faster, but with a higher % chance of bugs? You can do that without AI and get the same speed up.
PaulDavisThe1st
TFA was specifically about people very familiar with the project and codebase that they are working on. Your anecdote is precisely the opposite of the situation it was about, and it acknowledged the sort of process you describe.
nico
Some additional notes given the comments in the thread
* I wasn’t trying to be dismissive of the article or the study, just wanted to present a different context in which AI tools do help a lot
* It’s not just code. It also helps with a lot of tasks. For example, Claude Code figured out how to “manually” connect to the AWS cluster that hosted the source db, tested different commands via docker inside the project containers and overall helped immensely with discovery of the overall structure and infrastructure of the project
* My professional experience as a developer has been that 80-90% of the time, results trump code quality. That’s just the projects and companies I’ve been personally involved with. Mostly SaaS products in which business goals are usually considered more important than the specifics of the tech stack used. This doesn’t mean that 80-90% of code is garbage, it just means that most of the time readability, maintainability and shipping are more important than DRY, clever solutions or optimizations
* I don’t know how helpful AI is or could be for things that require super clever algorithms or special data structures, or where code quality is incredibly important
* Having said that, the AI tools I’ve used can write pretty good quality code, as long as they are provided with good examples and references, and the developer is on top of properly managing the context
* Additionally, these tools are improving almost on a weekly or monthly basis. My experience with them has drastically changed even in the last 3 months
At the end of the day, AI is not magic, it’s a tool, and I as the developer, am still accountable for the code and results I’m expected to deliver
kevmo314
You've missed the point of the article, which in fact agrees with your anecdote.
> It's equally common for developers to work in environments where little value is placed on understanding systems, but a lot of value is placed on quickly delivering changes that mostly work. In this context, I think that AI tools have more of an advantage. They can ingest the unfamiliar codebase faster than any human can, and can often generate changes that will essentially work.
moogleii
That would be an aside, or a comment, not the point of the article.
antonvs
> You've missed the point of the article
Sadly clickbait headlines like the OP, "AI slows down open source developers," spread this misinformation, ensuring that a majority of people will have the same misapprehension.
raincole
Which is a good thing for people who are currently benefiting from AI, though. The slower other programmers adopt AI, the more edge those who are proficient with it have.
It took me an embarrassingly long time to realize a simple fact: using AI well is a shallow skill that everyone can learn in days or even hours if they want. And then my small advantage of knowing AI tools will disappear. Since that realization I've always been upvoting articles that claim AI makes you less productive (like the OP).
samtp
Well that's exactly what it does well at the moment. Boilerplate starter templates, landing pages, throwaway apps, etc. But for projects that need precision - like data pipelines or security - the code it generates has many subtle flaws that can/will cause giant headaches in your project unless you dig through every line produced
quantumHazer
You clearly have not read the study. The problem is developers thought they were 20% faster, but they were actually slower. Anyway, from a quick review of your profile, you have a conflict of interest around vibe coding, so I will definitely take your opinion with a grain of salt.
floren
> Anyway from a fast review about your profile you're in conflict of interest about vibe coding
Seems to happen every time, doesn't it?
trey-jones
Doing my own post-mortem of a recent project (the first that I've leaned on "AI" tools to any extent), my feeling was the following:
1. It did not make me faster. I don't know that I expected it to.
2. It's very possible that it made me slower.
3. The quality of my work was better.
Slower and better are related here, because I used these tools more to either check ideas that I had for soundness, or to get some fresh ideas if I didn't have a good one. In many cases the workflow would be: "I don't like that idea, what else do you have for me?"
There were also instances of being led by my tools into a rabbit hole that I eventually just abandoned, so that also contributes to the slowness. This might happen in instances where I'm using "AI" to help cover areas that I'm less of an expert in (and these were great learning experiences). In my areas of expertise, it was much more likely that I would refine my ideas, or the "AI" tool's ideas into something that I was ultimately very pleased with, hence the improved quality.
Now, some people might think that speed is the only metric that matters, and certainly it's harder to quantify quality - but it definitely felt worth it to me.
jpc0
I do this a lot and absolutely think it might even improve it, and this is why I like the current crop of AIs that are more likely to be argumentative and not just capitulate.
I will ask the AI for an idea and then start blowing holes in its idea, or will ask it to do the same for my idea.
And I might end up not going with its idea regardless, but it got me thinking about things I wouldn’t have thought about.
Effectively it's like chatting to a coworker who has a reasonable idea about the domain and can bounce ideas around.
trey-jones
I'm on record saying it's "like the smartest coworker I've ever had" (no offense).
yomismoaqui
Someone on X said that these agentic AI tools (Claude Code, Amp, Gemini Cli) are to programming like the table saw was to hand-made woodworking.
It can make some things faster and better than a human with a saw, but you have to learn how to use them right (or you will lose some fingers).
I personally find that agentic AI tools make me more ambitious in my projects; I can tackle things I wouldn't have thought about doing before. And I also delegate work that I don't like to them because they are going to do it better and quicker than me. So my mind is free to think about the real problems like architecture, the technical debt balance of my code...
Problem is that there is the temptation of letting the AI agent do everything and just commit the result without understanding YOUR code (yes, it was generated by an AI but if you sign the commit YOU are responsible for that code).
So as with any tool try to take the time to understand how to better use it and see if it works for you.
candiddevmike
> to programming like the table saw was to hand-made woodworking
This is a ridiculous comparison because the table saw is a precision tool (compared to manual woodworking) when agentic AI is anything but IMO.
marcellus23
The nature of the comparison is in the second paragraph. It's nothing to do with how precise it is.
bgwalter
"You are using it wrong!"
This is insulting to all pre-2023 open source developers, who produced the entire stack that the "AI" robber barons use in their companies.
It is even more insulting because no actual software of value has been demonstrably produced using "AI".
yomismoaqui
> It is even more insulting because no actual software of value has been demonstrably produced using "AI".
Claude Code and Amp (equivalent from Sourcegraph) are created by humans using these same tools to add new features and fix bugs.
Having used both tools for some weeks I can tell you that they provide great value to me, enough that I see paying $100 monthly as a bargain relative to that value.
Edit: typo
jdiff
GP is pointing out the distinct lack of AI driven development in the wild. At this point, agents should be visibly maintaining at least a few popular codebases across this world wide web. The fact that there aren't raises some eyebrows for the claims that are regularly made by proponents. Not just the breathless proponents, either. Even taking claims very conservatively, FOSS maintainer burnout should be a thing of the past, but the only noted interaction with AI seems to be amplifying it.
tomasz_fm
Only one developer in this study had more than 50h of Cursor experience, including time spent using Cursor during the study. That one developer saw a 25% speed improvement.
Everyone else was an absolute Cursor beginner with barely any Cursor experience. I don't find it surprising that using tools they're unfamiliar with slows software engineers down.
I don't think this study can be used to reach any sort of conclusion on use of AI and development speed.
narush
Hey, thanks for digging into the details here! Copying a relevant comment (https://news.ycombinator.com/item?id=44523638) from the other thread on the paper, in case it's helpful on this point.
1. Some prior studies that find speedup do so with developers that have similar (or less!) experience with the tools they use. In other words, the "steep learning curve" theory doesn't differentially explain our results vs. other results.
2. Prior to the study, 90+% of developers had reasonable experience prompting LLMs. Before we found slowdown, the only concern most external reviewers had about experience was about prompting -- as prompting was considered the primary skill. In general, the standard wisdom was/is that Cursor is very easy to pick up if you're used to VSCode, which most developers used prior to the study.
3. Imagine all these developers had a TON of AI experience. One thing this might do is make them worse programmers when not using AI (relatable, at least for me), which in turn would raise the speedup we find (but not because AI was better, just because without AI they are much worse). In other words, we're sorta in between a rock and a hard place here -- it's just plain hard to figure out what the right baseline should be!
4. We shared information on developer prior experience with expert forecasters. Even with this information, forecasters were still dramatically over-optimistic about speedup.
5. As you say, it's totally possible that there is a long-tail of skills to using these tools -- things you only pick up and realize after hundreds of hours of usage. Our study doesn't really speak to this. I'd be excited for future literature to explore this more.
In general, these results being surprising makes it easy to read the paper, find one factor that resonates, and conclude "ah, this one factor probably just explains slowdown." My guess: there is no one factor -- there's a bunch of factors that contribute to this result -- at least 5 seem likely, and at least 9 we can't rule out (see the factors table on page 11).
I'll also note that one really important takeaway -- that developer self-reports after using AI are overoptimistic to the point of being on the wrong side of speedup/slowdown -- isn't a function of which tool they use. The need for robust, on-the-ground measurements to accurately judge productivity gains is a key takeaway here for me!
(You can see a lot more detail in section C.2.7 of the paper ("Below-average use of AI tools") -- where we explore the points here in more detail.)
Art9681
This is exactly my same take. Any tool an engineer is inexperienced with will slow them down. AI is no different.
bluefirebrand
This runs counter to the starry eyed promises of AI letting people with no experience accomplish things
TeMPOraL
That promise is true, though, and the two claims are not opposite. The devil is in details, specifically in what you mean by "people" and "accomplish things".
If by "people" you mean "general public", and by "accomplish things" you mean solving some immediate problems, that may or may not involve authoring a script or even a small app - then yes, this is already happening, and is a big reason behind the AI hype as it is.
If by "people" you mean "experienced software engineers", and by "accomplish things" you mean meaningful contributions to a large software product, measured by high internal code and process quality standards, then no - AI tools may not help with that directly, though chances are greater when you have enough experience with those tools to reliably give them right context and steer away from failure modes.
Still, solving one-off problems != incremental improvements to a large system.
jonfw
AI lets people with no experience accomplish things. People who have experience can create those things without AI. Those experienced folks will likely outperform novices, even when novices leverage AI.
None of these statements are controversial. What we have to establish is: does the experienced AI builder outperform the experienced manual coder?
antimora
I'm one of the regular code reviewers for Burn (a deep learning framework in Rust). I recently had to close a PR because the submitter's bug fix was clearly written entirely by an AI agent. The "fix" simply muted an error instead of addressing the root cause. This is exactly what AI tends to do when it can't identify the actual problem. The code was unnecessarily verbose and even included tests for muting the error. Based on the person's profile, I suspect their motivation was just to get a commit on their record. This is becoming a troubling trend with AI tools.
dawnerd
That's what I love about LLMs. You can spot it doesn't know the answer, tell it that it's wrong and it'll go, "You're absolutely right. Let me actually fix it"
It scares me how much code is being produced by people without enough experience to spot issues or people that just gave up caring. We're going to be in for wild ride when all the exploits start flowing.
cogman10
My favorite LLM moment. I wrote some code, asked the LLM "Find any bugs or problems with this code" and of course what it did was hyperfocus on an out-of-date comment (that I didn't write). Since the problem identified in the comment no longer existed, the LLM just spat out like 100 lines of garbage to refactor the code.
rectang
> "You're absolutely right."
I admit a tendency to anthropomorphize the LLM and get irritated by this quirk of language, although it's not bad enough to prevent me from leveraging the LLM to its fullest.
The key when acknowledging fault is to show your sincerity through actual effort. For technical problems, that means demonstrating that you have worked to analyze the issue, take corrective action, and verify the solution.
But of course current LLMs are weak at understanding, so they can't pull that off. I wish that the LLM could say, "I don't know", but apparently the current tech can't know that it doesn't know.
And so, as the LLM flails over and over, it shamelessly kisses ass and bullshits you about the work it's doing.
I figure that this quirk of LLMs will be minimized in the near future by tweaking the language to be slightly less obsequious. Improved modeling and acknowledging uncertainty will be a heavier lift.
daxfohl
It'd be nice if github had a feature that updated the issue with this context automatically too, so that if this agent gives up and closes the PR, the next agent doesn't go and do the exact same thing.
candiddevmike
> tell it that it's wrong and it'll go, "You're absolutely right. Let me actually fix it"
...and then it still doesn't actually fix it
mlyle
So, I recently have done my first couple heavily AI augmented tasks for hobby projects.
I wrote a TON of LVGL code. The result wasn’t perfect for placement, but when I iterated a couple of times, it fixed almost all of the issues. The result is a little hacked together but a bit better than my typical first pass writing UI code. I think this saved me a factor of 10 in time. Next I am going to see how much of the cleanup and factoring of the pile of code it can do.
Next I had it write a bunch of low level code to init hardware. It saved me a little time compared to reading the reference manual, and was more pleasant, but it wasn’t perfectly correct. If I did not have domain expertise I would not have been able to complete the task with the LLM.
Retr0id
I was trying out Copilot recently for something trivial. It made the change as requested, but also added a comment that stated something obvious.
I asked it to remove the comment, which it enthusiastically agreed to, and then... didn't. I couldn't tell if it was the LLM being dense or just a bug in Copilot's implementation.
seunosewa
Some prompts can help:
"Find the root cause of this problem and explain it"
"Explain why the previous fix didn't work."
Often, it's best to undo the action and provide more context/tips.
Often, switching to Gemini 2.5 Pro when Claude is stumped helps a lot.
brazzy
My favourite recent experience was switching multiple times between using a library function and rolling its own implementation, each time claiming that it's "simplifying" the code and making it "more reliable".
colechristensen
Sometimes it does... sometimes.
I recently had a nice conversation looking for some reading suggestions from an LLM. The first round of suggestions were superb, some of them I'd already read, some were entirely new and turned out great. Maybe a dozen or so great suggestions. Then it was like squeezing blood from a stone but I did get a few more. After that it was like talking to a babbling idiot. Repeating the same suggestions over and over, failing to listen to instructions, and generally just being useless.
LLMs are great on the first pass but the further you get away from that they degrade into uselessness.
colechristensen
I also get things like this from very experienced engineers working outside their area of expertise. It's obviously less of a completely boneheaded suggestion, but it's still doing exactly the wrong thing suggested by AI, requiring a person to step in and correct it.
Macha
I recently reviewed a MR from a coworker. There was a test that was clearly written by AI, except I guess however he prompted it, it gave some rather poor variable names like "thing1", "thing2", etc. in test cases. Basically, these were multiple permutations of data that all needed to be represented in the result set. So I asked for them to be named distinctively, maybe by what makes them special.
It's clear he just took that feedback and asked the AI to make the change, and it came up with a change that gave them all very long, very unique names, that just listed all the unique properties in the test case. But to the extent that they sort of became noise.
It's clear writing the PR was very fast for that developer, I'm sure they felt they were X times faster than writing it themselves. But this isn't a good outcome for the tool either. And I'm sure if they'd reviewed it to the extent I did, a lot of that gained time would have dissipated.
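For what it's worth, the middle ground I was asking for is small. A made-up Rust sketch (illustrative only, not the actual MR): name each permutation by the one property that makes it special, rather than `thing1`/`thing2` or a name that lists every field.

```rust
// Hypothetical helper standing in for the logic under test.
fn included_in_results(active: bool, archived: bool) -> bool {
    active && !archived
}

#[test]
fn active_record_is_included() {
    assert!(included_in_results(true, false));
}

#[test]
fn archived_record_is_excluded() {
    // The distinguishing trait (archived) is the name, nothing more.
    assert!(!included_in_results(true, true));
}
```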
meindnoch
>a deep learning framework in Rust [...] This is becoming a troubling trend with AI tools.
The serpent is devouring its own tail.
TeMPOraL
OTOH, when they start getting good AI contributions, then... it'll be too late for us all.
LoganDark
Deep learning can be incredibly cool and not just used for AI slop.
jampa
> I suspect their motivation was just to get a commit on their record. This is becoming a troubling trend with AI tools.
It has been for a while, AI just makes SPAM more effective:
pennomi
This is the most frustrating thing LLMs do. They put wide try:catch structures around the code making it impossible to actually track down the source of a problem. I want my code to fail fast and HARD during development so I can solve every problem immediately.
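A minimal Rust sketch of the difference (illustrative names, not from any particular PR): the first version silences the failure the way those wide catch blocks do, the second propagates it so development fails fast and loud.

```rust
use std::fs;
use std::io;

// Anti-pattern: the error is swallowed, so a missing or unreadable config
// silently becomes an empty string and the root cause surfaces much later.
fn load_config_silent(path: &str) -> String {
    fs::read_to_string(path).unwrap_or_default()
}

// Fail fast: the error is propagated to the caller (or could be made to
// panic with .expect()), so the problem shows up immediately in development.
fn load_config_strict(path: &str) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() -> io::Result<()> {
    let _quiet = load_config_silent("app.toml"); // "" on failure, no signal
    let _loud = load_config_strict("app.toml")?; // returns Err right here
    Ok(())
}
```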
daxfohl
Seems like there's a need for github to create a separate flow for AI-created PRs. Project maintainers should be able to stipulate rules like this in English, and an AI "pre-reviewer" would check that the AI has followed all these rules before the PR is created, and chat with the AI submitter to resolve any violations. For exceptional cases, a human submitter is required.
Granted, the compute required is probably more expensive than github would offer for free, and IDK whether it'd be within budget for many open-source projects.
Also granted, something like this may be useful for human-sourced PRs as well, though perhaps post-submission so that maintainers can see and provide some manual assistance if desired. (And also granted, in some cases maybe maintainers would want to provide manual assistance to AI submissions, but I expect the initial triaging based on whether it's a human or AI would be what makes sense in most cases).
0xbadcafebee
> The "fix" simply muted an error instead of addressing the root cause.
FWIW, I have seen human developers do this countless times. In fact there are many people in engineering that will argue for these kinds of "fixes" by default. Usually it's in closed-source projects where the shittiness is hidden from the world, but trust me, it's common.
> I suspect their motivation was just to get a commit on their record. This is becoming a troubling trend with AI tools.
There was already a problem (pre-AI) with shitty PRs on GitHub made to try to game a system. Regardless of how they made the change, the underlying problem is a policy one: how to deal with people making shitty changes for ulterior motives. I expect the solution is actually more AI to detect shitty changes from suspicious submitters.
Another solution (that I know nobody's going to go for): stop using GitHub. Back in the "olden times", we just had CVS, mailing lists and patches. You had to perform some effort in order to get to the point of getting the change done and merged, and it was not necessarily obvious afterward that you had contributed. This would probably stop 99% of people who are hoping for a quick change to boost their profile.
nerdjon
I will never forget being in a code review for a upcoming release, there was a method that was... different. Like massively different with no good reason why it was changed as much as it was for such a small addition.
We asked the person why they made the change, and "silence". They had no reason. It became painfully clear that all they did was copy and paste the method into an LLM and say "add this thing" and it spit out a completely redone method.
So now we had a change that no one in the company actually knew just because the developer took a shortcut. (this change was rejected and reverted).
The scariest thing to me is no one actually knowing what code is running anymore with these models having a tendency to make change for the sake of making change (and likely not actually addressing the root thing but a shortcut like you mentioned)
tomrod
As a side question: I work in AI, but mostly python and theory work. How can I best jump into Burn? Rust has been intriguing to me for a long time
andix
What I noticed: AI development constantly breaks my flow. It makes me more tired, and I work for shorter time periods on coding.
It's a myth that you can code a whole day long. I usually do intervals of 1-3 hours for coding, with some breaks in between. Procrastination can even happen on work-related things, like reading other project members' code/changes for an hour. It has a benefit to some extent, but during this time I don't get my work done.
Agentic AI works the best for me. Small refactoring tasks on a selected code snippet can be helpful, but aren't a huge time saver. The worst are AI code completions (first-version Copilot style); they are much more noise than help.
lsy
Typically debugging, e.g., a tricky race condition in an unfamiliar code base would require adding logging, refactoring library calls, inspecting existing logs, and even rewriting parts of your program to be more modular or understandable. This is part of the theory-building.
When you have an AI that says "here is the race condition and here is the code change to make to fix it", that might be "faster" in the immediate sense, but it means you aren't understanding the program better or making it easier for anyone else to understand. There is also the question of whether this process is sustainable: does an AI-edited program eventually fall so far outside what is "normal" for a program that the AI becomes unable to model correct responses?
sodapopcan
This is always my thought whenever I hear the "AI let me build a feature in a codebase I didn't know in a language I didn't know" story (which is often; there is at least one in these comments). Great, but what have you learned? This is fine for small contributions, I guess, but I don't hear a lot of stories of long-term maintenance. Unpopular opinion, though, I know.
threetonesun
I guess it's a question of how anyone learns. There's some value in typing code, I suppose, but with tab complete that's been gone for a long time. Letting AI write something and then reading it seems as good as copying and pasting from some other source.
sodapopcan
I'm not super qualified to answer as I haven't gone deep into AI at all. But from my limited observations I'd say yes and no. You generally aren't copy/pasting entire features, just snippets that you yourself have to string together in a sensible way. Of course there are lots of people who still do this, and that's why I find most people in this industry infuriating to work with. It's all good when it's boilerplate, and that's actually my primary use of "AI"—it's essentially been a snippets replacement (and is quite good at that).
piker
My main two attempts at using an “agentic” coding workflow were trying to incorporate an Outlook COM interface into my Rust code base and to streamline an existing abstract Windows API interaction to avoid copying memory a couple of times. Both wasted tremendous amounts of time and were ultimately abandoned, leaving me only slightly more educated about Windows development. They make great autocompletion engines but I just cannot see them being useful in my project otherwise.
jdiff
They make great autocompletion engines, most of the time. It's nice when it can recognize that I'm replicating a specific math formula and expands out the next dozen lines for me. It's less nice when it predicts code that's not even syntactically valid for the language or the correct API for the library I'm using. Those times, for whatever reason, seem to be popping up a lot in the last few weeks so I find myself disabling those suggestions more often than not.
crinkly
This is typically what I see when I’ve seen it applied. And as always trying to hammer nails in with a banana.
Rather than fit two generally disparate things together it’s probably better to just use VSTO and C# (hammer and nails) rather than some unholy combination no one else has tried or suffered through. When it goes wrong there’s more info to get you unstuck.
piker
To be fair though, unsafe rust (where the COM lives) is basically just C, so I totally expected it to be tractable in the same way it has been tractable for the last 20ish years? But it isn’t.
Why is interacting with the OS’ API in a compiled language the wrong approach in 2025? Why must I use this managed Frankenstein’s monster of dotnet? I didn’t want to ship or expect a whole runtime for what should be a tiny convenience DLL. Insane
charcircuit
I had the opposite experience. Gemini was able to work with COM and accomplish what I needed despite me never using COM before.
tonyedgecombe
I've done a lot of work with COM over the years and that is the last technology I would trust to an AI. It's very easy to write COM code that appears to work but contains subtle bugs.
piker
Actually hadn’t tried Gemini with it yet. Perhaps worth taking a look.
doc_manhat
I directionally disagree with this:
``` It's common for engineers to end up working on projects which they don't have an accurate mental model of. Projects built by people who have long since left the company for pastures new. It's equally common for developers to work in environments where little value is placed on understanding systems, but a lot of value is placed on quickly delivering changes that mostly work. In this context, I think that AI tools have more of an advantage. They can ingest the unfamiliar codebase faster than any human can, and can often generate changes that will essentially work. ```
Reason: you cannot evaluate the work accurately if you have no mental model. If there's a bug given the systems unwritten assumptions you may not catch it.
Having said that it also depends on how important it is to be writing bug free code in the given domain I guess.
I like AI particularly for green field stuff and one-off scripts, as it lets you go faster here. Basically you build up the mental model as you're coding with the AI.
Not sure about whether this breaks down at a certain codebase size though.
horsawlarway
Just anecdotally - I think your reason for disagreeing is a valid statement, but not a valid counterpoint to the argument being made.
So
> Reason: you cannot evaluate the work accurately if you have no mental model. If there's a bug given the systems unwritten assumptions you may not catch it.
This is completely correct. It's a very fair statement. The problem is that a developer coming into a large legacy project is in this spot regardless of the existence of AI.
I've found that asking AI tools to generate a changeset in this case is actually a pretty solid way of starting to learn the mental model.
I want to see where it tries to make changes, what files it wants to touch, what libraries and patterns it uses, etc.
It's a poor man's proxy for having a subject matter expert in the code give you pointers. But it doesn't take anyone else's time, and as long as you're not just trying to dump output into a PR can actually be a pretty good resource.
The key is not letting it dump out a lot of code, in favor of directional signaling.
ex: Prompts like "Which files should I edit to implement a feature which does [detailed description of feature]?" Or "Where is [specific functionality] implemented in this codebase?" Have been real timesavers for me.
The actual code generation has probably been a net time loss.
Roscius
> I've found that asking AI tools to generate a changeset in this case is actually a pretty solid way of starting to learn the mental model.
This. Leveraging the AI to start to develop the mental model is an advantage. But, using the AI is a non-trivial skill set that needs to be learned. Skepticism of what it's saying is important. AI can be really useful just like a 747 can be useful, but you don't want someone picked off the street at random flying it.
bluefirebrand
> This. Leveraging the AI to start to develop the mental model is an advantage
Is there any evidence that AI helps you build the mental model of an unfamiliar codebase more quickly?
In my experience trying to use AI for this it often leads me into the weeds
doc_manhat
Yeah fair points particularly for larger codebases I could see this being a huge time saver.
narush
Hey HN -- study author here! (See previous thread on the paper here [1].)
I think this blog post is an interesting take on one specific factor that is likely contributing to slowdown. We discuss this in the paper [2] in the section "Implicit repository context (C.1.5)" -- check it out if you want to see some developer quotes about this factor.
> This is why AI coding tools, as they exist today, will generally slow someone down if they know what they are doing, and are working on a project that they understand.
I made this point in the other thread discussing the study, but in general, these results being surprising makes it easy to read the paper, find one factor that resonates, and conclude "ah, this one factor probably just explains slowdown." My guess: there is no one factor -- there's a bunch of factors that contribute to this result -- at least 5 seem likely, and at least 9 we can't rule out (see the full factors table on page 11).
> If there are no takers then I might try experimenting on myself.
This sounds super cool! I'd be very excited to see how you set this up + how it turns out... please do shoot me an email (in the paper) if you do this!
> AI slows down open source developers. Peter Naur can teach us why
Nit: I appreciate how hard it is to write short titles summarizing the paper (the graph title is the best I was able to do after a lot of trying) -- but I might have written this "Early-2025 AI slows down experienced open-source developers. Peter Naur can give us more context about one specific factor." It's admittedly less of a catchy-title, but I think getting the qualifications right are really important!
Thanks again for the sweet write-up! I'll hang around in the comments today as well.
[1] https://news.ycombinator.com/item?id=44522772
[2] https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf