
Your job is to deliver code you have proven to work

endorphine

> there’s one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers—or open source maintainers—and expects the “code review” process to handle the rest.

It's even worse than that: non-junior devs are doing it as well.

acedTrex

Juniors aren't even the problem here; they can and should be taught better, and that's the point.

It's when your PEERS do it that it's a huge problem.

analog31

Where are the junior devs while their code is being reviewed? I'm not a software developer, but I'd be loath to review someone's work unless they have enough skin in the game to be present for the review.

tyrust

Code review is rarely done live. It's usually asynchronous, giving the reviewer plenty of time to read, digest, and give considered feedback on the changes.

Perhaps a spicy patch would involve some kind of meeting. Or maybe in a mentor/mentee situation where you'd want high-bandwidth communication.

jopsen

Doing only IRL code reviews would certainly improve quality in some projects :)

It's probably also fairly expensive to do.

rootusrootus

As someone else mentioned, the process is async. But I achieve a similar effect by requiring my team to review their own PRs before they expect a senior developer to review them and approve for merging.

That solves some of the problem with people thinking it's okay to fire off a huge AI slop PR and make it the reviewer's responsibility to see how much the LLM hallucinated. No, you have to look at yourself first, because it's YOUR code no matter what tool you used to help write it.

stuaxo

Git PRs work on an async model for reviews.

DrewADesign

And even then, in my experience, they work more like support tickets than business email, for which there are loose norms for response time, etc. Unless there’s a specific reason it needs to be urgently handled, people will prioritize other tasks.

hnthrow0287345

> It's even worse than that: non-junior devs are doing it as well.

This might be unpopular, but that seems more like an opportunity if we want to continue allowing AI to generate code.

One of the annoying things engineers have to deal with is stopping whatever they're doing and doing a review. Obviously this gets worse if more total code is being produced.

We could eliminate that interruption by having someone do more thorough code reviews, full-time. Someone who is not bound by sprint deadlines and tempted to gloss over reviews to get back to their own work. Someone who has time to pull down the branch, actually run the code, and lightly test things from an engineer's perspective so QA doesn't hit super obvious issues. They can also be the gatekeeper for code quality and PR quality.

marcosdumay

A full-time code reviewer will quickly lose touch with all practical matters and steer the codebase into some unmaintainable mess.

This is not the first time somebody had that idea.

JohnBooty

I've often thought this could work if the code reviewer was full-time, but rotated regularly. Just like a lot of jobs do with on-call weeks, or weeks spent as release manager - like if you have 10 engineers, and once every ten weeks it's your turn to be on call.

That would definitely solve the "code reviewer loses touch with reality" issue.

Whether it would be a net reduction in disruption, I don't know.

jaggederest

I think it's workable if you make code review a primary responsibility, but not the only responsibility. I think this is a big thing at staff+ levels: doing more than your share of code review (and other high-level concerns, of course).

sorokod

> One of the annoying things engineers have to deal with is stopping whatever they're doing and doing a review.

I would have thought that reviewing PRs, and doing it well, is in the job description. You later mention "someone" a few times - who might that someone be?

bee_rider

Can we make an LLM do it?

“You are a cranky senior software engineer who loves to nitpick change requests. Here are your coding standards. You only sign off on a change after you are sure it works; if you run out of compute credits before you can prove it to yourself, reject the change as too complex.”

Balance things, pit the LLMs against each other.
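
For illustration, that pitted-reviewer idea might look something like this (a minimal sketch assuming an OpenAI-style chat completions API; the model name and the APPROVE/REJECT convention are made up for the example):

  from openai import OpenAI

  client = OpenAI()

  REVIEWER_PROMPT = (
      "You are a cranky senior software engineer who loves to nitpick "
      "change requests. Only sign off on a change after you are sure it "
      "works. Reply APPROVE if convinced; otherwise reply REJECT with "
      "your reasons."
  )

  def review(diff: str) -> bool:
      # Ask the "cranky reviewer" model for a verdict on a PR diff.
      response = client.chat.completions.create(
          model="gpt-4o",  # illustrative model name
          messages=[
              {"role": "system", "content": REVIEWER_PROMPT},
              {"role": "user", "content": diff},
          ],
      )
      verdict = response.choices[0].message.content
      return verdict is not None and verdict.strip().startswith("APPROVE")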

jms703

> One of the annoying things engineers have to deal with is stopping whatever they're doing and doing a review

Code reviews are a part of the job. Even at the junior level, an engineer should be able to figure out a reasonable time to take a break and shift efforts for a bit to handle things like code reviews.

gottagocode

> We could eliminate that interruption by having someone doing more thorough code reviews, full-time. Someone who is not being bound by sprint deadlines and tempted to gloss over reviews to get back to their own work.

This is effectively my role (outside of mentoring) as a lead developer over a team of juniors we train in-house. I'm not sure many engineers would enjoy a day of only reviewing, me included.

immibis

As I read once: all that little stuff that feels like it stops you from doing your job is your job.

ohwaitnvm

So pair programming?

sodapopcan

Yep, it eliminates code reviews altogether. Unfortunately it remains widely unpopular, with people even saying “AI” can be the pair.

aslakhellesoy

Shhh don’t let them know!

reedf1

it's even worse than that! non-devs are doing it as well

esafak

That's what democratization looks like. And the new participants are happy!

rvz

…Until you tell them to maintain all the technical debt that was generated, and to waste more time and money/tokens fixing the security issues when it breaks.

A great time to be a vibe-coding cleanup specialist (i.e., professional security software engineer).

snowstormsun

it's even worse than that! bots are doing it as well

marcosdumay

Abandon any platform that decides to put bots into your workflow without you telling it to.

Vote with your wallet.

pydry

Hopefully once this AI nonsense blows over they'll reach the same realisation they did after the mid-2000s outsourcing craze: that actually you gotta pay for good engineering talent.

vernrVingingIt

That's the goal. Through further training, whittle away at unnecessary states until only the electrical states that matter remain.

Developers have created too many layers of abstraction and indirection to do their jobs. We're burning a ton of energy managing state management frameworks that are many layers of indirection away from the computations that are salient to users.

All those DSLs, config syntaxes, layers of boilerplate waste a huge amount of electricity, when end users want to draw geometric shapes.

So a non-dev generates a mess, but in a way so do devs with Django, Elixir, RoR, Terraform. When really, at the end of the day, it's matrix math against memory and syncing that state to the display.

From a hardware engineer's perspective, the mess of devs and non-devs is the same abstract mess of electrical states that have nothing to do with the goal. All those frameworks can be generalized into a handful of electrical patterns, saving a ton of electricity.

rcbdev

This sounds like the exact kind of profound pseudo-enlightenment that one gets from psychedelics. Of course, it's all electrons in the end.

Trying to create a secure, reliable and scalable system that enables many people to work on one code base, share their code around with others and at the end of the day coordinate this dance of electrons across multiple computers, that's where all of these 'useless' layers of abstraction become absolutely necessary.

iwontberude

And here I thought people just used computers for the heat

whattheheckheck

What process / path do you take to get to such an enlightened state? Like books or experience or anything more about this please?

nanomonkey

There are some contradictory claims here.

Boilerplate comes when your language doesn't have affordances; you get around this with /abstraction/, which leads to DSLs (Domain Specific Languages).

Matrix math is generally done on more than raw bits provided by digital circuits. Simple things like numbers require some amount of abstraction and indirection (pointers to memory addresses that begin arrays).

My point is yes, we've gotten ourselves in a complicated tar pit, but it's not because there wasn't a simpler solution lower in the stack.

seanmcdirmid

Don’t accept PRs without test coverage? I mean, LLMs can do those also, but it’s something.
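
One mechanical way to enforce that, as a sketch: a coverage gate that fails the build below a threshold. This assumes a Python project using pytest with the pytest-cov plugin; the package name and threshold are illustrative.

  import sys
  import pytest

  if __name__ == "__main__":
      # Non-zero exit status blocks the merge when run as a CI step.
      sys.exit(pytest.main(["--cov=myproject", "--cov-fail-under=80"]))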

null

[deleted]

Aurornis

This mirrors my experience with the texting while driving problem: The debate started as angry complaints about how all the kids are texting while driving. Yet it’s a common problem for people of all ages. The worst offender I knew for years was in her 50s, but even she would get angry about the youths texting while driving.

Pretending it’s just the kids and young people doing the bad thing makes the outrage easier to sell to adults.

null

[deleted]

jennyholzer2

[flagged]

rootusrootus

One thing I've pushed developers on my team to do, since way before AI slop became a thing, is to review their own PR. Go through the PR diff and leave comments in places where it feels like a little explanation of your thought process could be helpful in the review. It's a bit like rubber duck debugging; I've seen plenty of things get caught that way.

As an upside, it helps with AI slop too. Because as I see it, what you're doing when you use an LLM is becoming a code reviewer. So you need to actually read the code and review it! If you have not reviewed it yourself first, I am not going to waste my time reviewing it for you.

It helps obviously that I'm on a small team of a half dozen developers and I'm the lead, and management hasn't even hinted at giving us stupid decrees like "now that you have Claude Code you can do 10x as many features!!!1!".

strangattractor

Not at Meta - their job is to "Move fast and break things". I think people are just doing what they've been told.

robgibbons

For what it's worth, writing good PRs applies in more cases than just AI generated contributions. In my PR descriptions, I usually start by describing how things currently work, then a summary of what needs to change, and why. Then I go on to describe what exactly is changing with the PR. This high level summary serves to educate the reviewer, and acts as a historical record in the git log for the benefit of those who come after you.

From there, I include explicit steps for how to test, including manual testing, and unit test/E2E test commands. If it's something visual, I try to include at least a screenshot, or sometimes even a brief screen capture demonstrating the feature.

Really go out of your way to make the reviewer's life easier. One benefit of doing all of this is that in most cases, the reviewer won't need to reach out to ask simple questions. This also helps to enable more asynchronous workflows, or distributed teams in different time zones.
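
A skeleton of that structure might look like this (section names are illustrative, not a standard):

  Context: how things work today, and why that needs to change.
  Changes: what this PR actually modifies, at a high level.
  How to test: exact unit/E2E test commands, plus manual steps.
  Evidence: a screenshot or short screen capture for anything visual.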

Hovertruck

Also, take a moment to review your own change before asking someone else to. You can save them the trouble of finding your typos or that test logging that you meant to remove before pushing.

To be fair, copilot review is actually alright at catching these sorts of things. It remains a nice courtesy to extend to your reviewer.

simonw

100%. There's no difference at all in my mind between an AI-assisted PR and a regular PR: in both cases they should include proof that the change works and that the author has put the work in to test it.

oceanplexian

At the last company I worked at (a large, popular tech company), it took an act of the CTO to get engineers to simply attach a JIRA ticket to the PR they were working on so we could track it for tax purposes.

The devs went in kicking and screaming. As an SRE, it seemed like, for SDEs, writing a description of the change, explaining the problem the code is solving, testing methodology, etc. is harder than actually coding. Ironically, AI is proving that this theory was right all along.

sodapopcan

Complaining about including a ticket number in the commit is a new one for me. Good grief.

p2detar

Strange, I thought this was actually the norm. Our PRs are almost always tagged with a corresponding Jira ticket. I think this is more helpful to developers than to other roles, because it allows them to have a history of what has been fixed.

One can also point QA or consultants to a ticket for documentation purposes or timeline details.

phito

And these people probably call themselves software engineers... What a joke.

phito

I often write PR descriptions in which I give a short explanation and try to anticipate some comments I might get. Well, every time I do, I will still get those exact comments, because nobody bothers reading the description.

Not to say you shouldn't write descriptions; I will keep doing it because it's my job. But a lot of people just don't care enough or are too distracted to read them.

simonw

For many of my PR and issue comments the intended audience is myself. I find them useful even a few days later, and they become invaluable months or years later when I'm trying to understand why the code is how it is.

ffsm8

After I accepted that, I then tried to preempt the comment by just commenting myself on the function/class etc that I thought might need some elaboration...

Well, I'm sure you can guess what happened after that - within the same file even

skydhash

I just point people to the description. No need to type things twice.

lanstin

Sadly, when communicating with people, important things have to be repeated over and over. Maybe less so with highly trained and experienced people, on something where their training and experience make the statement plausible, but if the thing is at all surprising or diverges from common experience, I've found a need to bang it out via multiple communication channels.

walthamstow

At my place nobody reads my descriptions because nobody writes them so they assume there isn't one!

reactordev

I do this too with our PR templates. They have the ticket/issue/story number, the description of the ask (you can copy pasta from the ticket). Current state of affairs. Proposed changes. Post state of affairs. Mood gif.

null

[deleted]

bob1029

> I try to include at least a screenshot

This is ~mandatory for me. Even if what I am working on is non-visual. I will take a screenshot of a new type in my IDE and put a red box around it. This conveys the focus of my attention and other important aspects of the work effort.

toomuchtodo

This is how PRs should be, but rarely are (in my experience as a reviewer, ymmv, n=1). Keep on keepin' on.

layer8

I’d go further and say while testing is necessary, it is not sufficient. You have to understand the code and convince yourself that it is logically correct under all relevant circumstances, by reasoning over the code.

Testing only “proves” correctness for the specific state, environment, configuration, and inputs the code was tested with. In practice that only tests a tiny portion of possible circumstances, and omits all kinds of edge and non-edge cases.
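
A toy illustration of that limit: the test below passes, yet the function is wrong for a case it never exercises.

  def average(xs):
      # Crashes with ZeroDivisionError when xs is empty.
      return sum(xs) / len(xs)

  def test_average():
      assert average([2, 4, 6]) == 4  # passes; proves nothing about []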

aspbee555

I find myself not really trusting just tests; I really need to try the app or new function in multiple ways with the goal of breaking it. In that process I may not break it, but I will notice something that might break, so I rewrite it better.

Nizoss

If you write your tests the Test-Driven Development way in that they first fail before production changes are introduced, you will be able to trust them a lot more. Especially if they are well-written tests that test behavior or contracts, not implementation details. I find that dependency injection helps a lot with this. I try to avoid mocking and complex dependencies as much as possible. This also allows me to easily refactor the code without having to worry about breaking anything if all the tests still pass.
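
As a minimal sketch of that red-first loop (the function and test names here are made up): the test states the contract and fails first, then the implementation turns it green.

  def test_discount_is_capped_at_100_percent():
      # Written first, against the behavior ("a discount never makes the
      # price negative"); fails until apply_discount enforces the cap.
      assert apply_discount(price=100.0, percent=150.0) == 0.0

  def apply_discount(price: float, percent: float) -> float:
      # Implementation added second, to make the test pass.
      return price * (1 - min(percent, 100.0) / 100.0)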

When it comes to agentic coding. I created an open source tool that enforces those practices. The agent gets blocked by a hook if it tries to do anything that violates those principles. I think it helps a lot if I may say so myself.

https://github.com/nizos/tdd-guard

Edit: I realize now that I misunderstood your comment. I was quick to respond.

lanstin

If you don't push your system to failure, you can't really say you understand it. And anyway, the precise failure modes under various conditions are important characteristics for stability/resiliency. (Does it shed load all the way up to the network bandwidth of SYNs; allocate all the memory and then exit; freeze up with deadlocks/disk contention; go unresponsive for a few minutes then recover if the traffic dies off; answer health check pings only and make no progress on actual work?)

simianwords

I would like to challenge this claim. I think LLMs are maybe accurate enough that we don't need to check every line and remember everything. High level design is enough.

shepherdjerred

A good type system helps with this quite a lot

crazygringo

It helps some. There are plenty of errors, a large majority I'd say, where types don't help at all. Types don't free up memory or avoid off-by-one errors or keep you from mixing up two counter variables.
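
A toy example of that point: the code below is fully type-annotated and a type checker is perfectly happy with it, but the off-by-one bug slips straight through.

  def sum_up_to(n: int) -> int:
      total: int = 0
      for i in range(1, n):  # off by one: should be range(1, n + 1)
          total += i
      return total

  print(sum_up_to(3))  # prints 3, not the intended 6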

user34283

I'd go further and say vibe coding it up, testing the green case, and deploying it straight into the testing environment is good enough.

The rest we can figure out during testing, or maybe you even have users willing to beta-test for you.

This way, while you're still on the understanding part and reasoning over the code, your competitor has already shipped ten features, most of them working.

Ok, that was a provocative scenario. Still, nowadays I am not sure you even have to understand the code anymore. Maybe having a reasonable belief that it does work will be sufficient in some circumstances.

TheTxT

This approach sounds like a great way to get a lot of security holes into your code. Maybe your competitors will be faster at first, but it's probably better to be a bit slower and not leak all your users' data.

WhyOhWhyQ

Isn't this in contradiction to your blog post from yesterday, though? It's impossible to prove a complex project made in 4.5 hours works. It might have passed 9000 tests, but surely there are always going to be edge cases. I personally wouldn't be comfortable claiming I've proven it works and saying the job is done, even if the LLM did the whole thing and all existing tests passed, until I'd played with it for several months. And even then I would assume I would need to rely on bug reports coming in, because it's running on lots of different systems. I honestly don't know if software is ever really finished.

My takeaway from your blog post yesterday was that with a robust enough testing system the LLM can do the entire thing while I do Christmas with the family.

(Before all the AI fans come in here. I'm not criticizing AI.)

trevor-e

> Your job is to deliver code you have proven to work.

Strong disagree here: your job is to deliver solutions that help the business solve a problem. In _most_ cases that means delivering code that you can confidently prove satisfies the requirements, like the OP mentioned, but I think this is an important, if nitpicky, distinction I didn't understand until later in my career.

newsoftheday

There's no distinction there, proving the work is correct is within the scope of helping the business solve a problem; not without and not beside it. So your point is hot air, making a distinction where none exists.

draw_down

Man we better hope the solution to that problem is working code. Otherwise we better start working the fryers or something.

dfxm12

> there’s one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers—or open source maintainers—and expects the “code review” process to handle the rest.

Is anyone else seeing this in their orgs? I'm not...

0x500x79

I am currently going through this with someone in our organization.

Unfortunately, this person is vibe coding completely, and even the PR process is painful:

* The coding agent reverts previously applied feedback
* Coding agent not following standards throughout the code base
* Coding agent re-inventing solutions that already exist
* PR feedback is being responded to with agent output
* 50k line PRs that required a 10-20 line change
* Lack of testing (there are some automated tests, but their validations are slim/lacking)
* Bad error handling/flow handling

LandR

Fire them?

0x500x79

I believe it is getting close to this. Things like this just take time, though, and when this person talks to management/leadership, they talk about how much they are producing and how everyone is blocking their work. So it becomes challenging political maneuvering, depending on the ability of certain leadership to see through the BS.

(By my organization, I meant my company - this person doesn't report to me or in my tree).

briliantbrandon

I'm seeing a little bit of this. However, I will add that the primary culprits are engineers that were submitting low quality PRs before they had access to LLMs, they can just submit them faster now.

lm28469

LLMs are tools that make mediocre devs 100x more "productive" and good devs 2x more productive

jennyholzer2

From my vantage I would argue LLMs make good devs around 0.65x as productive.

lunar_mycroft

[citation needed]. No study I've seen shows even a 50% productivity improvement for programming, let alone a 100% or 9900% improvement.

dfxm12

What's the ratio of people who do things the right way vs. not? I mean, is it a matter of giving them feedback to remind them what a "quality PR" is? Does that help?

briliantbrandon

It's roughly 1/10 that are causing issues. Not a huge deal but dealing with them inevitably takes up a couple hours a week. We also have a codebase that is shared with some other teams and our primary offenders are on one of those separate teams.

I think this is largely an issue that can be solved culturally within a team, we just unfortunately only have so much input on how other teams work. It doesn't help either when their manager doesn't seem to care about the feedback... Corporate politics are fun.

jennyholzer2

LLMs have dramatically empowered sociopath software developers.

If you are sufficiently motivated to appear more "productive" than your coworkers, you can force them to review thousands of lines of incorrect AI slop code while you sit back and mess around with your chatbots.

Your coworkers no longer have enough time to work on their in-progress PRs, so you can dominate the development team in terms of LOC shipped.

Understand that sociopaths are skilled at navigating social and bureaucratic environments. A sociopath who ships the most LOC will get the promotion every single time.

zx2c4

Voila:

https://github.com/WireGuard/wireguard-android/pull/82 https://github.com/WireGuard/wireguard-android/pull/80

In that first one, the double pasted AI retort in the last comment is pretty wild. In both of these, look at the actual "files changed" tab for the wtf.

null

[deleted]

IshKebab

Yeah this guys comment here is spot on: https://github.com/WireGuard/wireguard-android/pull/80#issue...

I recently reviewed a PR that I suspect is AI generated. It added a function that doesn't appear to be called from anywhere.

It's shit because AI is absolutely not on the level of a good developer yet. So it changes the expectation. If a PR is not AI generated then there is a reasonable expectation that a vaguely competent human has actually thought about it. If it's AI generated then the expectation is that they didn't really think about it at all and are just hoping the AI got it right (which it very often doesn't). It's rude because you're essentially pawning off work that the author should have done to the reviewer.

Obviously not everyone dumps raw AI generated code straight into a PR, so I don't have any problem with using AI in general. But if I can tell that your code is AI generated (as you easily can in the cases you linked), then you've definitely done it wrong.

fnands

A friend of mine is working for a small-ish startup (11 people), and he gets to work and sees that the CTO pushed 10k LOC of changes straight to main at 3 am.

Probs fine when you are still in the exploration phase of a startup, scary once you get to some kind of stability

ryandrake

I feel like this becomes kind of unacceptable as soon as you take on your first developer employee. 10K LOC changes from the CTO is fine when it's only the CTO working on the project.

Hell, for my hobby projects, I try to keep individual commits under 50-100 lines of code.

tossandthrow

The CTO is ultimately responsible for the outcome and will be there at 4 am to fix stuff.

pjc50

Yes .. and no. Someone who does this will definitely make the staff clean up after them.

peab

Lol I worked at a startup where the CTO did this. The problem was that it was pure spaghetti code. It was so bad it kept me up at night, thinking about how to fix things. I left within 30 days

jimbohn

I'd go mental if I was a SWE having to mop that up later

titzer

That's...idiotic.

jennyholzer2

LLMs are for idiots

davey48016

A friend of mine has a junior engineer who does this and then responds to questions like "Why did you do X?" with "I didn't, Claude did, I don't know why".

tossandthrow

That would be an immediate reason for termination in my book.

fennecfoxy

Yes, if they can't debug + fix the reason the production system is down or not working correctly then they're not doing their job, imo.

Developers aren't hired to write code that's never run (at least in my opinion). We're also responsible for running the code/keeping it running.

Ekaros

I think the words that would follow from me would get me sent to HR...

And if it was repeated... well, I would probably get fired...

insin

See also "Why did you do X?" → Flurry of new commits → Conversation marked as resolved

And not just from juniors

jennyholzer2

no hate but i would try to fire someone for saying that

stackskipton

Yep. Remember, people not posting on this website are just grinding away at jobs where their individual output does not matter, and their entire motivation is to work JUST hard enough not to get fired. They don't get stock grants, extremely favorable stock options, or anything else; they get salary and MAYBE a small bonus based on business factors they have little control over.

My eyes were opened when, two jobs ago, they said they would be blocking all personal web browsing from work computers. Multiple software devs were unhappy because they were using their work laptops for booking flights, dealing with their kids' school stuff, and other personal things. They did not have a personal computer at all.

nutjob2

They don't have phones?

ncruces

Yes, in the only successful OSS project that I “maintain.”

Fully vibe coded, which at least they admitted. And when I pointed out the thing is off by an order of magnitude, and as such doesn't implement said feature — at all — we get pressed on our AI policy, so as to not waste their time.

I don't have an AI policy, like I don't have an IDE policy, but things get ridiculous fast with vibe coding.

peab

Definitely seeing a bit of this, but it isn't constrained to junior devs. It's also pretty solvable by explaining to the person why it's not great, and just updating team norms.

kords

I agree that tests and automation are probably the best things we can do to validate our code, and the author of the PR should be more responsible. However, they can't prove that the code works. It's almost the opposite: if the tests pass and the coverage is good, the code has a better chance of working; if they fail, they prove the code doesn't work.
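
Property-based tests push in that same falsification direction while covering far more of the input space than hand-picked examples, so a pass carries more weight. A rough sketch using the hypothesis library:

  from hypothesis import given, strategies as st

  @given(st.lists(st.integers()))
  def test_sorting_is_idempotent(xs):
      # hypothesis generates many random input lists, including
      # edge cases like [] and single-element lists.
      once = sorted(xs)
      assert sorted(once) == once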

alexgotoi

The "deliver proven code" line is spot on, but the real failure mode right now is pretending that a 2000-line LLM dump + "plz review?" counts as "proven".

Nobody sane expects reviewers to babysit untested mega-PRs - that's not their job. The dirty secret is that good tests (unit, integration, property-based if you're feeling fancy) + a pre-commit CI gate are what actually prove code works before it ever hits anyone's inbox. AI makes those cheaper and faster to write, not optional.

Dropping untested behemoths and expecting the team to clean up hallucinations isn't shipping. It's just making everyone's Thursday worse.

Will include this one in my next https://hackernewsai.com/ newsletter.

thopkinson

If your org has engineers LLM dumping into your MRs/PRs, make it policy that code reviews require a live-review session where the engineer who wrote the code walks the other engineers through the code. This will rapidly fix your problem as the 'author' of the code who cannot explain his/her code will not want to be grilled this way ever again.

Accountability is the real answer. If you don't enable individual and team accountability, then you are part of the problem.

0xbadcafebee

Actually it's more specific than that. A company pays you not just to "write code", not just to "write code that works", but to write code that works in the real world. Not on your laptop. Not in CI tests. Not on some staging environment. But in the real world. It may work fine in a theoretical environment, but deflate like a popped balloon in production. This code has no value to the business; they don't pay you to ship popped balloons.

Therefore you must verify it works as intended in the real world. This means not shipping code and hoping for the best, but checking that it actually does the right thing in production. And on top of that, you have to verify that it hasn't caused a regression in something else in production.

You could try to do that with tests, but tests aren't always feasible. Therefore it's important to design fail-safes into your code that ALERT YOU to unexpected or erroneous conditions. It needs to do more than just log an error to some logging system you never check - you must actually be notified of it, and you should consider it a flaw in your work, like a defective pair of Nikes on an assembly line. Some kind of plumbing must exist to take these error logs (or metrics, traces, whatever) and send it to you. Otherwise you end up producing a defective product, but never know it, because there's nothing in place to tell you its flaws.
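
One minimal shape that plumbing can take, as a sketch: a logging handler that pushes ERROR-level records to a chat webhook so failures reach a human instead of rotting in a log file. The webhook URL is hypothetical.

  import json
  import logging
  import urllib.request

  class AlertHandler(logging.Handler):
      def emit(self, record: logging.LogRecord) -> None:
          payload = json.dumps({"text": self.format(record)}).encode()
          req = urllib.request.Request(
              "https://chat.example.com/hooks/oncall",  # hypothetical
              data=payload,
              headers={"Content-Type": "application/json"},
          )
          try:
              urllib.request.urlopen(req, timeout=5)
          except OSError:
              pass  # never let alerting take down the app itself

  logging.getLogger().addHandler(AlertHandler(level=logging.ERROR))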

Every single day I run into somebody's broken webapp or mobile app. Not only do the authors have no idea (either because they aren't notified of the errors, or don't care about them), there is no way for me to even e-mail the devs to tell them. I try to go through customer support, a chat agent, anything, and even they don't have a way to send in bug reports. They've insulated themselves from the knowledge of their own failures.

cynicalsecurity

It gets interesting when a company assigns 2 story points to a task that requires 6 minimum. No time for writing tests, barely any time to perform code reviews and QA. Also, next year the company tells you that since we have AI now, all tickets must be done twice as fast.

Who popped this balloon? I know I need to change my employer, but it's not so easy. And I'm not sure another employer is going to be any better.

roryirvine

Are you not involved in doing the estimation?

SunshineTheCat

I know this won't be popular; however, I think the idea of differentiating a "real developer" from one who relies mostly, or even solely, on an LLM is coming to an end. Right now, I fully agree that relying wholly upon an LLM and failing to test its output is very irresponsible.

LLMs do make mistakes. They do a sloppy job at times.

But give it a year. Two years. Five years. It seems unreasonable to assume they will hit a plateau that will prevent them from being able to build, test, and ship code better than any human on earth.

I say this because it's already happened.

It was thought impossible for a computer to reach the point of being able to beat a grandmaster at chess.

There was too much "art," experience, and nuance to the game for a computer to ever fully grasp or understand. Sure, there was the "math" of it all, but computers lacked the human intuition that many thought was essential to winning and could only be achieved through a lifetime of practice.

Many years after Deep Blue vs. Garry Kasparov, the best players in the world laugh at the idea of even getting close to beating Stockfish or even a mediocre game engine.

I say all of this as a 15-year developer. This happens over and over again throughout history. Something comes along to disrupt an industry or profession and people scream about how dangerous or bad it is, but it never matters in the end. Technology is undefeated.

newsoftheday

> There was too much "art," experience, and nuance to the game that a computer could ever fully grasp or understand.

That's the thing though, AI doesn't understand, it makes us feel like it understands, but it doesn't understand anything.

JackSlateur

> This happens over and over again throughout history.

Could you share a single instance of a machine that thinks? Are we sharing the same timeline?

mapontosevenths

I agree with this, except it glosses over security. Your job is to deliver SECURE code that you have proven to work.

Manual and automatic testing are still both required, but you must explicitly ensure that security considerations are included in those tests.
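
As a sketch of what "security considerations included in the tests" can mean (the safe_join helper here is hypothetical):

  import os.path
  import pytest

  def safe_join(base: str, name: str) -> str:
      # Hypothetical helper: resolve a user-supplied filename under a
      # base directory, refusing anything that escapes it.
      path = os.path.normpath(os.path.join(base, name))
      if not path.startswith(base + os.sep):
          raise ValueError("path escapes base directory")
      return path

  def test_rejects_path_traversal():
      with pytest.raises(ValueError):
          safe_join("/srv/uploads", "../../etc/passwd")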

The LLM doesn't care. Caring is YOUR job.

andy99

I think the problem is in what “proven” means. People who don’t know any better will just do it all with LLMs and still deliver the giant untested PRs, but with some LLM-written tests attached.

I vibe code a lot of stuff for myself, mostly for viewing data, when I don’t really need to care how it works. I’m coming around to the idea that outside of some specific circumstances where everyone has agreed they don’t need to care about or understand the code, team vibe coding is a bad practice.

If I’m paying an engineer, it’s for their work, unless explicitly agreed otherwise.

I think vibe coding is soon going to be seen the same way as “research” where you engage an offshore team (common e.g. in consulting) to give you a rundown on some topic and get back the first five google search results. Everyone knows how to do that, if it’s what they wanted they wouldn’t be hiring someone to do it.

simonw

That's why I emphasized the manual testing component as well. Attaching a screenshot or video of a feature working to your PR is a great way to prove that you've actually seen it work correctly - at least once, which is still a huge improvement over it not actually working at all.