
Vibe Coding and the Future of Software Engineering

Karrot_Kream

If you're writing one-off scripts though, I find vibe coding fantastic. I found myself in a work meeting where I was mostly there to let a junior present some joint work we did and answer any questions the junior couldn't. Since I wasn't really needed (the junior eng was awesome), I was fidgeting and wanted to analyze the results from an API I had access to. A few prompts to Claude and I was hitting the API, fetching results, using numpy to crunch what I needed, and getting matplotlib to give me nice, pretty graphs. I know Python and the ecosystem well, so it wasn't hard to guide Claude correctly.
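
For reference, a minimal sketch of the kind of throwaway analysis script being described here. The endpoint, the response fields, and the API_TOKEN env var are hypothetical stand-ins, not the actual API involved:

  import os

  import matplotlib.pyplot as plt
  import numpy as np
  import requests

  # Hypothetical endpoint and auth header; the token comes from an env var.
  resp = requests.get(
      "https://api.example.com/v1/results",
      headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
      timeout=30,
  )
  resp.raise_for_status()

  # Assume each record carries a numeric "latency_ms" field to crunch.
  latencies = np.array([r["latency_ms"] for r in resp.json()["results"]])
  print(f"p50={np.percentile(latencies, 50):.1f} ms, p99={np.percentile(latencies, 99):.1f} ms")

  # A quick histogram is usually all a meeting-time analysis needs.
  plt.hist(latencies, bins=50)
  plt.xlabel("latency (ms)")
  plt.ylabel("count")
  plt.title("API result latencies")
  plt.show()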

I probably got the whole thing done in 5 prompts and still had enough brain space to vaguely follow along with the presentation. Before, this kind of thing would have taken 20-30 min of heads-down coding. It would have been a strictly "after work" project, which means I probably wouldn't have done it (my real side projects and family need that time more than this analysis did). That's the kind of thing an experienced programmer can get out of vibe coding.

brundolf

Agreed. I've had a lot of success using ChatGPT for nontrivial bash one-liners. They're small in scope, there must be a huge amount of training data for them, I use them rarely enough that I don't remember the details off the top of my head, and they're intrinsically throwaway code.

Taylor_OD

I was doing some file management last weekend and wanted a little script to remove any numbers at the start of a file name and any region values at the end. I also wanted to ensure any duplicate files were moved into a separate folder so I could remove them.

I could have written that code. I'm sure there is a program I could have downloaded that does exactly that. But one prompt, a few tests to make sure it wasn't going to nuke all my files, and within 5 minutes I was completely done with the file management. For this type of stuff it just saves time.

AdieuToLogic

> I was doing some file management last weekend and wanted a little script to remove any numbers at the start of a file name and any region values at the end. I also wanted to ensure any duplicate files were moved into a separate folder so i could remove them.

> I could have written that code.

Not knowing precisely your desired result, the benefit of going through the effort of writing the script yourself is the experience gained and a deeper understanding of the tools involved.

For example, assume this file structure exists:

  .
  ├── 001name-eu-central1.ext
  ├── 001name-us-west.ext
  ├── 002name-eu-central1.ext
  ├── 003name-eu-central1.ext
  ├── keep
  └── research
The script logic to do what you describe could be similar to:

  for file in [0-9]*
  do
    # Strip the leading digits and the region suffix to get the canonical name.
    dest="$(echo "$file" | sed -E -e 's/^[0-9]*//' -e 's/-(eu-central1|us-west)//')"

    if [[ -f "keep/$dest" ]]
    then
      # A copy already exists under keep/, so treat this one as a duplicate.
      mv "$file" "research/$file"
    else
      mv "$file" "keep/$dest"
    fi
  done

This results in:

  .
  ├── keep
  │   └── name.ext
  └── research
      ├── 001name-us-west.ext
      ├── 002name-eu-central1.ext
      └── 003name-eu-central1.ext

The net-net is that the journey is sometimes more valuable than the destination.

thwarted

There's probably a program in /usr/bin on your machine that does that.

LPisGood

I recently tried this _exact_ same thing (except it was less of a meeting and more like a phone call) and I got stuck because it kept making up fake API endpoints, OR I had to manually log in somewhere to get the API key.

Taylor_OD

When? I've found over the last 6 months or so there has been a drastic change in quality of results. I used to have a ton of hallucinations like that. It still happens, but it is significantly less common now.

I also get better results when using a prompt window built into an IDE than something detached like the OpenAI website.

linsomniac

AFAICT, OpenAI o3-mini-high a couple of days ago completely hallucinated functionality within systemd's homectl that would mount a home directory from another machine using sshfs. It certainly wasn't available on Ubuntu 24.04, and searching the Internet I can't find those options documented anywhere.

Though, I do agree that hallucinations have dropped dramatically.

Karrot_Kream

Hm, not my experience at all. No idea what the difference was. I gave it my API endpoint and told it the response shape. As far as the key was concerned, I told Claude to look at a given env var I defined and place that in a header. Claude did all the rest. Are you using 3.7? I don't think I used the Extended Thinking mode in the UI for it.

adamq_q

Which model?

dkkergoog

[dead]

antirez

As I said in one of my latest YT videos: if you can't code, sure, go for it; with x -> 0 in the denominator, the difference tends to infinity. But if you can code, there are much better ways to use generative AI to aid you in your coding tasks. Ways that will make you faster while you learn more, understand every single line of the code that is in your application, and never let badly written code go into your code base. Maybe in the future AI will be able to write much better code than humans, and vibe coding will be the right way, but right now it's like when assembly was a hell of a lot better written by hand than what a compiler could do.

PartiallyTyped

Are you willing to elaborate about how they can accelerate you?

antirez

I have a YouTube channel with a playlist of coding with AI, where I show what I do with it (a small part, actually, but representative enough I hope). This is the first video of the series, I believe:

https://www.youtube.com/watch?v=_pLlet9Jrzc&list=PLrEMgOSrS_...

And here, Redis bugfixing with Claude Sonnet:

https://www.youtube.com/watch?v=rCIZflYEpEk&list=PLrEMgOSrS_...

jcgrillo

I don't get it. Watched the first video, and it seems like the LLMs provided no value at all? Like, you already knew where the bug was, and what the bug was, and they couldn't find it? So 20+ min of messing around with prompts and reading through pages of output... for nothing? How is this supposed to help?

PartiallyTyped

Many thanks! I will take a look!

jbreckmckye

Data point of one: ChatGPT 3.5, even the free product, is so much better at answering technical questions than Google.

Some questions I had successfully answered recently:

> "I would like to animate changing a snippet of code. I'll probably be using Remotion. Is there a JavaScript library that can animate changing one block of text into another?"

> "In Golang, how can I unit test a http mux? How can I test the routes I've registered without making real http calls?"

> "Hello ChatGPT. I have an application using OpenTelemetry. I want to test locally whether it can collect logs and metrics. The application is running in Tilt. How can my local integration test read the logs and metrics from the OTel collector?"

oortoo

ChatGPT is, on average, better than Google for arriving at a correct answer, for sure, but they fail in different ways. When Google fails, it's usually in the form of "I cannot find an answer; better ask someone smart for help," but when ChatGPT fails, it often gives an incorrect answer.

Depending on your fault tolerance and timeline, one will be better than the other. If you have a low tolerance for faults, ChatGPT is bad, but if you are in a crunch and decide it's OK to be confidently incorrect some small percentage of the time, then ChatGPT is a great tool.

Most industry software jobs, at least the high-paying ones, generally have low fault tolerance, and that's why ChatGPT is not entirely replacing anyone yet.

So, even in your example, and even if you write the code all yourself, there is still a risk that you are operating above your own competence level, do exactly as ChatGPT instructs, and then it fails miserably down the line because ChatGPT provided a set of steps that an expert would have seen the flaws in.

re-thc

> Data point of one: ChatGPT 3.5, even the free product, is so much better at answering technical questions than Google.

That's not the point of Google. It gives you a start to research the answer you need. ChatGPT just gives you an answer that might not be correct. So how do you define "successfully answered"?

In programming there are always tradeoffs. It's not about picking the 1st answer that looks like it "runs".

sergiotapia

One example: I had to send a report to a slack webhook, showing how many oban jobs were completed in the last 24 hours for specific use cases based on oban job params.

That's:

sql query, slack webhook api docs reading, ecto query for oban jobs with complex param filtering, oban job to run cron, cron syntax.

easily like a 2 hour job?

it took me 5 minutes with AI. then we decided to send the slack alert at 7am EST instead of 12pm PST. instead of doing all that math, I just ctrl+k and asked it to change it. 1 second change.

these gains are compounding. if you're an experienced engineer, you let go of minutiae and FLY. i believe syntax memorization is dead.

batshit_beaver

> i believe syntax memorization is dead.

Hold up, if you don't know a language's syntax, how can you verify that the answer returned by the LLM is correct (at a glance, because a) nobody writes exhaustive tests, LLMs included, and b) you wouldn't be able to read the tests to confirm their validity either)?

I struggle to think of a case where explaining a task to an LLM in a natural language is somehow faster than writing it yourself, specifically in the case where you know a programming language and related libs to accomplish the task, implying non-zero ROI on learning these things.

re-thc

> these gains are compounding.

It's not, really. You could have done the same using no-code tools in a similar time.

Question is, would you? It's alarming the amount of trust/praise that goes into "AI".

> if you're an experienced engineer

> it took me 5 minutes with AI

You'd at least read the code and fix things up so it wouldn't be 5 minutes.

goosejuice

antirez's series looks awesome. My two cents w/ composer:

Don't rely on the LLM for design. Start by defining some tests via comments or whatever your tools allow. The tab completion models can be helpful here. Start the interfaces / function defs and constrain them as much as possible. Let the LLM fill in the rest using TDD. The compiler / test loop along with the constraints can get you pretty far. You'll always need to do a review but I find this cuts out a lot of the wasted cycles you'll inevitably get if you "vibe" with it.
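
To make that concrete, a minimal Python sketch of the shape of that workflow; make_slug and its test cases are invented for illustration. The signature, docstring, and parametrized tests are written by hand as the constraint, and the function body is the part the model iterates on until review and the test loop both pass:

  import unicodedata

  import pytest


  def make_slug(raw: str) -> str:
      """Lowercase, ASCII-fold, and hyphen-join the words in `raw`."""
      # The body below is the part left to the LLM; it is only kept once the
      # tests underneath pass and the diff has been reviewed.
      folded = unicodedata.normalize("NFKD", raw).encode("ascii", "ignore").decode()
      cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in folded.lower())
      return "-".join(cleaned.split())


  @pytest.mark.parametrize("raw, expected", [
      ("Hello World", "hello-world"),
      ("  spaces   everywhere ", "spaces-everywhere"),
      ("Crème brûlée!", "creme-brulee"),
  ])
  def test_make_slug(raw, expected):
      assert make_slug(raw) == expected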

For Cursor specifically, you'll want to aim for prompts that will get you from A to B with the least amount of interactions. Cursor will do 25 tool calls (run the tests / compile, whatever) with one interaction. A little planning goes a long way to doing less and paying less.

jstummbillig

If you are optimizing for fewer tool calls, I think you are rapidly and increasingly optimizing for the wrong thing here.

Swizec

> Are you willing to elaborate about how they can accelerate you?

A few examples from my experience:

    - Here is a SQL query. Translate this back into our janky ORM
    - Write tests that cover cases X, Y, Z
    - Wtf is this code trying to do?
    - I want to do X, write the boilerplate to get me started
    - Reshape this code to follow the new conventions
And it often picks up on me doing a refactor, then starts making suggestions, so refactoring feels like tab tab tab instead of type type type.
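
Taking the first item in that list as an example, a hedged sketch of what that translation step can look like; the Order model and its columns are made up, and SQLAlchemy stands in here for whatever the "janky ORM" actually is:

  from datetime import datetime

  from sqlalchemy import Column, DateTime, Integer, String, func, select
  from sqlalchemy.orm import Session, declarative_base

  Base = declarative_base()


  class Order(Base):
      # Hypothetical model standing in for the in-house ORM layer.
      __tablename__ = "orders"
      id = Column(Integer, primary_key=True)
      status = Column(String)
      created_at = Column(DateTime)


  # Prompt: "Here is a SQL query, translate it back into our ORM:
  #   SELECT status, COUNT(*) FROM orders
  #   WHERE created_at >= :since GROUP BY status"
  def orders_by_status(session: Session, since: datetime):
      stmt = (
          select(Order.status, func.count())
          .where(Order.created_at >= since)
          .group_by(Order.status)
      )
      return session.execute(stmt).all()

The point is less the generated code than the review: the query stays small enough to check against the original SQL at a glance before merging.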

t-writescode

To add:

  - "I've never used this language / framework, this is what I'm trying to do, how would I do it?"
  - the documentation for these libraries is ... not useful. How do I do X?
    Followed by: "these parts didn't work due to these restrictions, tell me more".
    (I'm currently using this one to navigate Unity and UdonSharp's APIs. It is far from perfect
     but having *something* that half-works and moves me in the right direction of understanding
     how everything connects together is much much faster than sitting there, confused, unable
     to take a single step forward)
I find that a lot of cases where "just read the documentation" is the best route are situations where there is good (or any) documentation, organized in a single usable space, that doesn't require literal days' worth of study to understand the whole context needed to do what is, with all that context, a very simple task.

I'm reminded a bit of the days when I was a brand new Java programmer and I would regularly Google / copy-paste:

  public class Foo {
  
    public static void main(String[] args) {
        
    }
  }
Or new Python devs when they constantly have to look up

  if __name__ == '__main__':
    run_me()
because it's just a weird, magical incantation that's blocking their ability to do what they want to do.
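
For what it's worth, the incantation is less magical than it looks; a tiny experiment (the file name demo.py is made up) shows what the guard actually checks:

  # demo.py
  print(__name__)          # "__main__" when run directly, "demo" when imported

  if __name__ == '__main__':
      print("running as a script")   # skipped when another file does `import demo`

Running "python demo.py" prints both lines; importing demo from another module prints only the module name and skips the guarded block.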

null

[deleted]

genewitch

off-topic: You made both Redis and dump1090? I recognized your handle but not why, so I googled it, and dump1090 came up; I've used both.

on-topic: I have no desire to be a front-end dev, but my friends don't know a front-end dev from a front-end loader, so whenever they need something thrown together I usually reach for Ghost or whatever. But now they want custom "web apps" - and that's fine, thank goodness for AI.

Most of my "LLM" use is remembering phrases, terms of art, and the like. I have used aider and Cursor, and they're about as useful to me as Stack Overflow, I suppose. LLMs have definitely improved a lot; I don't get circular chats much except on Gemini. Gemini is the weakest LLM by far, which is ironic considering the extent of data they have.

antirez

Indeed! I'm the original author of both projects, and thank you for using them :) I agree about Gemini, but I understand the sentiment is not widespread. For me the strongest LLM for coding is Claude Sonnet.

gngoo

For me, vibe coding is the only logical way forward. I see that the term gets a lot of flak. But in almost 10 years of writing web applications for a living, this feels even more exciting than when I finally "got it". And it's measurable: I am sitting on 5 completed web apps with traffic just this year, working with 3 clients, and due to my ability to be this productive, I feel like I have a very stable future ahead. I even had my last client seek me out, for writing about coding like this, because he wants to replace his back-end and front-end coders with people who can "AI code" the full stack but aren't new to this. Well, it's sad, but I am likely replacing 3 people on their team. Only time will tell if that was the right decision for them, but I am not saying no to that.

But then again, I have been doing this for 10 years; that is my edge. Same exact stack (Django + boring frontend). I know the ins and outs of my stack, and quite obviously, every single day I see AI go in a direction that I know is going to produce a huge footgun along the way. I can just see that up ahead, suggest a different approach, and continue. IF I were entirely new to this, I would end up building stuff that breaks down after weeks or months of investment, not knowing when things went wrong or how to go forward. Regardless, I feel like my time has come, and I am definitely spending 95% of my time just prompting the AI versus writing actual code. Even for the most minor changes, like changing a CharField to a TextField, I don't even want to open models.py myself. In Cursor, I am averaging 5000-7000 fast requests per month, because in terms of ROI it pays off. I am looking forward to this getting better.

namaria

> Well, its sad, but I am likely replacing 3 people on their team.

Are you capturing this value? Are you getting paid 4x as much as before? If you are not capturing this increase in the value of your work, who is? If you're not getting paid 4x as much as before, which I doubt you are, why are you doing this?

gngoo

I am getting paid significantly more than a year ago, but definitely not 4x as much. However, I have a lot more free time than I did before, and this while working on my own side projects as well, which are slowly growing into ramen profitability. There are also other things to factor in, like living in Southeast Asia at a low cost of living and billing US and EU standard rates.

> why are you doing this?

Because I love doing this? Both web and software development are passions, AI feels like a lever; and making my money with this is nothing short of a dream come true.

namaria

Not why you build software, man; why are you taking on 3 jobs and forgoing 3 salaries?

weakfish

I think the main difference here is that you _know_ your stack and can foresee the issues if the AI goes in that direction. The vibes-based coder in the article seems to be incapable of seeing those problems, and solves them by just spamming the prompts harder.

digdugdirk

... The current global system just isn't going to make it, is it?

gngoo

I am not the judge of that, but I always felt like I freelanced/contracted out in the fringes. So I am definitely not the average of the industry here.

hnthrow90348765

My guess is CRUD app development will become commoditized like ordinary pentesting. The money in security is definitely not regular pentesting these days.

You'll see smaller team sizes at first, then teams continuing to shrink as individual positions take on a higher workload and a wider spread of knowledge.

I think "Vibe coding" is probably a canary for all of this, so it's worth paying attention to what a non-programmer can actually accomplish. This creates narratives that get picked up by managers and decision-makers.

The capabilities that take user input will surely be hacked eventually, so I think those are a non-starter, not least because of bitter, laid-off developers wanting to see you fail.

throwaway290

> You'll see smaller team sizes at first, then teams continuing to shrink as individual positions take on a higher workload and a wider spread of knowledge.

In other words: "programmers on whose work the LLMs were trained will lose their jobs"

cadamsdotcom

If a “non coder” makes a project by vibe coding and learns to code as a side effect, great! One more coder. And it IS a great way to learn to code: if you want to know how to do something, just ask for it, watch what gets generated, then ask why about anything you're curious about. Can't get much better as an education tool than that. Perfect for solo founders.

Alternatively if a “non coder” creates a project by vibe coding and it fails, maybe that failure happened faster with lower costs (especially their time, if it’s their own project) than if they’d had to go get financing, hire an offshore dev or two, and go back and forth for a few weeks or months.

Vibes are high on vibe coding.

mncharity

Tweaking that list of capabilities yields integrated pervasive user observation, user interviewing, UX refinement discussions and prototyping and validation.

Friday AI report: A user was observed seeming to struggle a bit with X; we had a pain point discussion; they suggested some documentation and UI changes; they were confused, but further discussion turned up plausible improvements; we iterated on drafts and prototypes; I did an expanding alpha with interviews, and beta with sampled surveys, and integrated some feedback; evaluation was above threshold with no blockers; I've pushed the change to prod, and fed the nice users cookies.

risyachka

>> Seems like programmers are terrified

Idk why terrified; vibe coding is nice, but everyone who has developed something bigger than a toy knows that code is 5% of the task and was never the bottleneck. It's not like FAANG employees write code all day long, or even half the day.

Ah, and you need to make sure it doesn't nuke your db or send weird emails to your users because someone prompt-engineered it badly.

d_burfoot

> why terrified

I'm terrified some colleague is going to vibe-code a product and use it to get promoted, then move to another team and dump the project on me to maintain.

risyachka

You can either vibe-code or have sharp skills. Can’t have both.

In that particular situation you will have an easier time switching jobs (which usually pays more than a promotion) than your colleague.

mixedump

Exactly.

I feel people/hypers keep rambling about many things and way too often seem to have no real-life experience.

The biggest problem I've seen over a 20-year career is people and the games they play (at different levels of leadership and management within an org) and their inability to agree on, and then verbalise, what on Earth they want. No matter whether it is a greenfield or some digital transformation project, they are plagued by fear and self-interest (of various kinds).

And even if/when they identify a problem (e.g. the game players), cutting the cord becomes risky, since one needs to identify who to rely on to clean up the mess, and their confidence (often) caves in while doing so; resolving to "let's lay off x number of 'leaf' employees" is the safe move of a scared leader with no vision, and it looks good on the stock market.

Software engineers (generally) kept delivering despite all that for years, across the board. And, generally, they kept their values and principles, and that seems to bother this managerial class a lot. They (in a way) can't wait to stop paying "those nerds" big salaries, and that's why those often unprincipled people with low values can't wait to see our backs and are getting hyped about this "AI will replace SD/SE" mantra.

All of the above is obviously generally speaking, but yeah, we people tend way too often to misplace our focus and solve lesser-priority problems, and this "AI replacing engineers" is one of them to a good degree.

And for the majority of software developers there is nothing to worry about. The sheer keep-knowledge-up-to-date demand this industry puts on us has primed us to be by far the most able professional group to jump into a career change in no time.

Which other group, in huge numbers, can just sit down and learn and work for 10 hours a day, 7 days a week, not complain or get (too) emotionally disturbed, and get the thing done, but us?

So in the worst-case scenario we will be fine; others should be scared if we, in numbers, pick their industry to move into.

null

[deleted]

Havoc

Does this really work for anyone?

I still find myself building in a more building-block style: "Make me a Python function that does X" and then stringing those together by hand.

mohsen1

It's interesting to see that a lot of senior folks are against this, arguing that if you're working on a larger software project it falls apart. Two things can be argued against that:

1. Context sizes are going to grow. Gemini with 2M tokens is already doing amazing feats

2. We all agree that we should break bigger problems into smaller ones. So if you can isolate the problem into something that fits in an LLM context, no matter how large the larger software system is, you can make a lot of quick progress by leveraging LLMs for that isolated piece of software.

makeitdouble

Compartmentalizing the code only matters if it works in the first place.

The main argument from senior folks is probably that vibe code won't cut it for an actual sizeable problem. There is complexity that can't be abstracted away just because we want it to be.

mohsen1

I think we're overestimating how much humans can keep in context. Throughout my career I've seen many instances where folks completely got it wrong and missed the context.

makeitdouble

You're right, but it's also why vibe coding doesn't work IMHO.

We had the same situation with TDD: can we give out succinct specs and ignore what happens in the code as long as the specs are met? For anything besides Hello World, the answer was no, absolutely not.

It still mattered that the logic was somewhat reasonable and that you were not building a Rube Goldberg machine giving the right answers to the given tests 95% of the time. Especially as the tests didn't cover all the valid inputs/outputs, nor all the possible error cases in the problem space.

It's because there's so much happening that we need simple blocks we can trust, and not black boxes that are already beyond our understanding.

apwell23

Do you have a real-life example of someone using AI to do a task in a big project like the Linux kernel?

All I see are toy examples. Great, you built tic-tac-toe with a prompt. Now what?

voidhorse

You don't understand software engineering. The goal isn't to produce a bunch of code. The goals are actually:

1. Build a computable model of some facet of reality to achieve certain goals.

2. Realize a system that manifests the model, satisfying a set of other constraints, such as resource constraints and performance.

3. Ensure a community of system owners comprehend the key decisions made in the system and model design, the degree to which certain constraints can be relaxed or tightened, and how the system can evolve such that its behavior remains predictable over time.

Basically none of that, in my view, is supported by LLM-driven "vibe" coding. Sure, your hobby project might be OK to treat like an art project, but, oh, I don't know, how about software for train communications, or aircraft guidance systems? Do you want to vibe code your way through those? Do you want a community of engineers who only dimly understand the invariants satisfied in individual components and across the system as a whole?

LLM fanatics are totally ignorant of the actual process of software development. Maybe one day we'll find a way to use them in a way that produces software that is predictable, well-specified, and safe, but we are not there yet.

mohsen1

Honestly, after the first line I decided to stop reading your comment. But to answer that one line: I probably do. You have probably used software that I've worked on.

chis

I honestly think it comes down to preferred work style in some cases. Once a codebase gets decently complex, it’s true that you have to supervise the AI extremely closely to get a good result. But at least for me I tend to enjoy doing that more than writing it myself. I’m a fast typer of prose, I like to plan my coding ahead of time anyways, so whether the AI does it or me is kinda immaterial

mohsen1

Same! And let's be honest, sometimes AI comes up with ideas that make us say "woah".

williamcotton

I vibe coded this entire DSL, changing syntax and grammar while following my whimsy:

https://github.com/williamcotton/webdsl

cadamsdotcom

Vibe coding will eat software.

Engineering, meaning working to constraints, including user needs (i.e. Product Management), is forever.

CyberDildonics

This vague, meaningless term that some blogger just made up based on how teenagers talk will "eat software"?

dakshbabbar

Karpathy is just 'some blogger' who talks in teenage English, sure.

CyberDildonics

The results speak for themselves. Here is clickbait trying to coin a nonsense word and you aren't defending it on substance, just on the fact that you recognize the name of the person who wrote it.

The most embarrassing part is everyone immediately going along with it. It is like an improv game or a man-on-the-street interview where they ask a pedestrian about an event that didn't happen and the person being interviewed just acts like they were there and knows all about it.