After months of coding with LLMs, I'm going back to using my brain

228 comments

·May 16, 2025

jclardy

I don't get the whole "all-in" mentality around LLMs. I'm an iOS dev by trade, I continue to do that as I always have. The difference now is I'll use an LLM to quickly generate a one-off view based on a design. This isn't a core view of an app, the core functionality, or really anything of importance. It's a view that promotes a new feature, or how to install widgets, or random things. This would normally take me 30-60 min to implement depending on complexity, now it takes 5.

I also use it for building things like app landing pages. I hate web development, and LLMs are pretty good at it because I'd guess that is 90% of their training data related to software development. For that I make larger changes, review them manually, and commit them to git, like any other project. It's crazy to me that people will just go completely off the rails for multiple hours and run into a major issue, then just start over when instead you can use a measured approach and always continue forward momentum.

mritchie712

How useful the various tools will be depends on the person and the problem. Take two hypothetical people working on different problems and consider if, for example, Cursor would be useful.

IF you're a:

* 10 year python dev

* work almost entirely on a very large, complex python code base

* have a pycharm IDE fine tuned over many years to work perfectly on that code base

* have very low tolerance for bugs (stable product, no room for move fast, break things)

THEN: LLMs aren't going to 10x you. An IDE like Cursor will likely make you slower for a very long time until you've learned to use it.

IF you're a:

* 1 year JS (react, nextjs, etc.) dev

* start mostly from scratch on new ideas

* have little prior IDE preference

* have high tolerance for bugs and just want to ship and try stuff

THEN: LLMs will to 10x you. An IDE like Cursor will immediately make you way faster.

infecto

These all-or-nothing takes on LLMs are getting tiresome.

I get the point you’re trying to make, LLMs can be a force multiplier for less experienced devs, but the sweeping generalizations don’t hold up. If you’re okay with a higher tolerance for bugs or loose guardrails, sure, LLMs can feel magical. But that doesn’t mean they’re less valuable to experienced developers.

I’ve been writing Python and Java professionally for 15+ years. I’ve lived through JetBrains IDEs, and switching to VS Code took me days. If you’re coming from a heavily customized Vim setup, the adjustment will be harder. I don’t tolerate flaky output, and I work on a mix of greenfield and legacy systems. Yes, greenfield is more LLM-friendly, but I still get plenty of value from LLMs when navigating and extending mature codebases.

What frustrates me is how polarized these conversations are. There are valid insights on both sides, but too many posts frame their take as gospel. The reality is more nuanced: LLMs are a tool, not a revolution, and their value depends on how you integrate them into your workflow, regardless of experience level.

mritchie712

my stance is the opposite of all-or-nothing. The note above is one example. How much value you get out of CURSOR specifically is going to vary based on person & problem. The Python dev in my example might immediately get value out of o3 in ChatGPT.

It's not all or nothing. What you get value out of immediately will vary based on circumstance.

lelandbatey

This doesn't seem like an "all or nothing" take. This person is trying to be clear about their claims, but they're not trying to state these are the only possible takes. Add the word "probably" after each "then" and I image their intended tone becomes a little clearer.

mexicocitinluez

> I get the point you’re trying to make, LLMs can be a force multiplier for less experienced devs, but the sweeping generalizations don’t hold up. If you’re okay with a higher tolerance for bugs or loose guardrails, sure, LLMs can feel magical. But that doesn’t mean they’re less valuable to experienced developers.

Amen. Seriously. They're tools. Sometimes they work wonderfully. Sometimes, not so much. But I have DEFINITELY found value. And I've been building stuff for over 15 years as well.

I'm not "vibe coding", I don't use Cursor or any of the ai-based IDEs. I just use Claude and Copilot since it's integrated.

motorest

I think you're trying very hard to pin LLMs as a tool for inexperienced developers. There's a hint of paternalism in them.

Back in the real world, LLMs are a tool that excels at generating and updating code based on their context and following your prompts. If you realize this fact, you'll understand that there's nothing in the description that makes them helpful exclusively to "inexperienced" developers. Do experience developers need to refactor code or write new software? Do you believe veteran software engineers are barred from writing proofs of concept? Is the job of pushing architecture changes a junior developer gig?

What exactly do you think an experienced developer does?

> What frustrates me is how polarized these conversations are. There are valid insights on both sides, but too many posts frame their take as gospel. The reality is more nuanced: LLMs are a tool, not a revolution, and their value depends on how you integrate them into your workflow, regardless of experience level.

I completely disagree: LLMs have a revolutionary impact on how software engineers do their job. Your workflows changed overnight. You can create new things faster, you can iterate faster, you can even rewrite whole applications and services in another tech stacks and frameworks in a few days. Things like TDD will become of critical importance as automates test suites are now a critical factor in providing feedback to LLMs. Things are no longer the way they were. At least to those who bothered learning.

cbm-vic-20

I've got over 30 years of professional development experience, and I've found LLMs most useful for

* Figuring out how to write small functions (10 lines) in canonical form in a language that I don't have much experience with. This is so I don't end up writing Rust code as if it were Java.

* Writing small shell pipelines that rely on obscure command line arguments, regexes, etc.

maccard

I’ve found the biggest thing LLMs and agents let me do are build the things that I really suck at to a prototype level. I’m not a frontend engineer, and pitching feature prototypes without a fronted is tough.

But with aider/claude/bolt/whatever your tool of choice is, I can give it a handful of instructions and get a working page to demo my feature. It’s the difference between me pitching the feature or not, as opposed to pitching it with or without the frontend.

CuriouslyC

16 year python dev who's done all that, lead multiple projects from inception to success, and I rarely manually code anymore. I can specify precisely what I want, and how I want it built (this is the key part), stub out a few files and create a few directories, and let an agent run wild but configured for static analysis tools/test suite to run after every iteration with the instructions to fix their mistakes before moving on.

I can deliver 5k LoC in a day easily on a greenfield project and 10k if I sweat or there's a lot of boilerplate. I can do code reviews of massive multi-thousand line PRs in a few minutes that are better than most of the ones done by engineers I've worked with throughout a long career, the list just goes on and on. I only manually code stuff if there's a small issue that I see the LLM isn't understanding that I can edit faster than I can run another round of the agent, which isn't often.

LLMs are a force multiplier for everyone, really senior devs just need to learn to use them as well as they've learned to use their current tools. It's like saying that a master archer proves bows are as good as guns because the archer doesn't know how to aim a rifle.

mrweasel

Assuming that your workflow works, and the rest of us just need to learn to use LLMs equally effective, won't that plateau us at the current level of programming?

The LLMs learn from examples, but if everyone uses LLMs to generate code, there's no new code to learn new features, libraries or methods from. The next generation of models are just going to be trained on the code generated by it's predecessors with now new inputs.

Being an LLM maximalist is basically freeze development in the present, now and forever.

mritchie712

were you immediately more productive in Cursor specifically?

my point is exactly inline with your comment. The tools you get immediate value out of will vary based on circumstance. There's no silver bullet.

otabdeveloper4

> we're back to counting programming projects in kloc, like it's the radical 1980's again

Yikes. But also lol.

WD-42

Let’s see the GitHub project for an easy 10k line day.

alvis

20+ years dev here, started coding with AI when Javis was still a thing (before it became Jasper, and way before Copilot or ChatGPT).

Back then, Javis wasn’t built for code, but it was a surprisingly great coding companion. Yes. It only gave you 80% working code, but because you had to get your hands dirty, you actually understand what was happening. It didn't give me 10x but I'm happy with 2x with good understanding on what's going on.

Fast-forward to now: Copilot, Cursor, roo code, windsurf and the rest are shockingly good at output, but sometimes the more fluent the AI, the sneakier the bugs. They hand you big chunks of code, and I bet most of us don't have a clear picture of what's going on at ground 0 but just an overall idea. It's just too tempting to blindly "accept all" the changes.

It’s still the old wisdom — good devs are the ones not getting paged at 3am to fix bugs. I'm with the OP. I'm more happy with my 2x than waking up at 3am.

palmotea

> IF you're a:

> * 1 year JS (react, nextjs, etc.) dev

> * start mostly from scratch on new ideas

> * have little prior IDE preference

> * have high tolerance for bugs and just want to ship and try stuff

> THEN: LLMs will to 10x you. An IDE like Cursor will immediately make you way faster.

And also probably dead-end you, and you'll stay the bug-tolerate 1 year JS dev for the next 10 years of your career.

It's like eating your seed corn. Sure you'll be fat and have it easy for a little while, but then next year...

prisenco

Agreed but the 1 year JS dev should know they're making a deal with the devil in terms of building their skillset long term.

diggan

I basically learned programming by ExpertSexchange, Google (and Altavista...), SourceForge and eventually StackOverflow + GitHub. Many people with more experience than me at the time always told me I was making a deal with the devil since I searched so much, didn't read manuals and asked so many questions instead of thinking for myself.

~15 years later, I don't think I'm worse off than my peers who stayed away from all those websites. Doing the right searches is probably as important as being able to read manuals properly today.

whiplash451

The jury is still out on that one.

Having a tool that’s embedded into your workflow and shows you how things can be done based on tons of example codebases could help a junior dev quite a lot to learn, not just to produce.

StefanBatory

From workplace perspective, they don't have a reason to care. What they'll care about is that you're productive right now - if you won't become better dev in the future? Your issue.

aqme28

Maybe. But I also think that ignoring AI tools will hamper your long-term skillsets, as our profession adapts to these new tools.

null

[deleted]

ekianjo

> THEN: LLMs will to 10x you. An IDE like Cursor will immediately make you way faster.

they will make you clueless about what the code does and your code will be unmaintanable.

llm_nerd

>I don't get the whole "all-in" mentality around LLMs

To be uncharitable and cynical for a moment (and talking generally rather than about this specific post), it yields content. It gives people something to talk about. Defining their personality by their absolutes, when in reality the world is an infinite shades of gradients.

Go "all in" on something and write about how amazing it is. In a month you can write your "why I'm giving up" the thing you went all in on and write about how relieved/better it is. It's such an incredibly tired gimmick.

"Why I dumped SQL for NoSQL and am never looking back" "Why NoSQL failed me"

"Why we at FlakeyCo are all in on this new JavaScript framework!" "Why we dumped that new JavaScript framework"

This same incredibly boring cycle is seen on here over and over and over again, and somehow people fall for it. Like, it's a huge indicator that the writer more than likely has bad judgment and probably shouldn't be the person to listen to about much.

Like most rational people that use decent judgement (rather than feeling I need to "all in" on something, as if the more I commit the more real the thing I'm committing to is), I leverage LLMs many, many times in my day to day. Yet somehow it has authored approximately zero percentage of my actual code, yet is still a spectacular resource.

stevepotter

I do a variety of things, including iOS and web. Like you mentioned, LLM results between the two are very different. I can't trust LLM output to even compile, much less work. Just last night, it told me to use an API called `CMVideoFormatDescriptionGetCameraIntrinsicMatrix`. That API is very interesting because it doesn't exist. It also did a great job of digging some deep holes when dealing with some tricky Swift 6 concurrency stuff. Meanwhile it generated an entire nextjs app that worked great on the first shot. It's all about that training data baby

kartoffelsaft

Honestly, with a lot of HN debating the merits of LLMs for generating code, I wish it were an unwritten rule that everyone states the stack they're using with it. It seems that the people who rave about it creating a whole product line in a weekend are asking it to write them a web iterface using [popular js framework] that connects to [ubiquitous database], and their app is a step or two away from being CRUD. Meanwhile, the people who say it's done nothing for them are writing against [proprietary in-house library from 2005].

The worst is the middleground of stacks that are popular enough to be known but not enough for an LLM to know them. I say worst because in these cases the facade that the LLM understands how to create your product will fall before you the software's lifecycle ends (at least, if you're vibe-coding).

For what it's worth, I've mostly been a hobbyist but I'm getting close to graduating with a CS degree. I've avoided using LLMs for classwork because I don't want to rob myself of an education, but I've occasionally used them for personal, weird projects (or tried to at least). I always give up with it because I tend to like trying out niche languages that the LLM will just start to assume work like python (ex: most LLMs struggle with zig in my experience).

englishspot

> Meanwhile, the people who say it's done nothing for them are writing against [proprietary in-house library from 2005].

there's MCP servers now that should theoretically help with that, but that's its own can of worms.

andrekandre

  > `CMVideoFormatDescriptionGetCameraIntrinsicMatrix`. That API is very interesting because it doesn't exist.

same experience, and its been not great for juniors around me cause they have no idea why its not compiling or that the thing it wrote doesn't even exist...

maerch

Exactly my thoughts. It seems there’s a lot of all-or-nothing thinking around this. What makes it valuable to me is its ability to simplify and automate mundane, repetitive tasks. Things like implementing small functions and interfaces I’ve designed, or even building something like a linting tool to keep docs and tests up to date. All of this has saved me countless hours and a good deal of sanity.

andy99

Overshooting the capabilities of LLMs is pretty natural when you're exploring them. I've been using them to partially replace stack overflow or get short snippets of code for ~2 years. When Claude code came out, I gave it increased responsibility until I made a mess with it, and now I understand where it doesn't work and am back to using LLMs more for ideas and advice. I think this arc is pretty common.

spacemadness

I've found LLMs are extremely hit or miss with iOS development. I think part of that might be how quickly Swift and SwiftUI is changing coupled with how bad Apple documentation is. I have loved using them to generate quick views and such for scaffolding purposes and quick iterations, but they tend to break down quickly around asynchronous coding and non-trivial business logic. I will say they're still incredibly useful to point you in a direction, but can be very misleading and send you down a hallucination rabbit hole easily.

arctek

Similar to my experience, it works well for small tasks, replacing search (most of the time) and doing alot of boilerplate work.

I have one project that is very complex and for this I can't and don't use LLMs for.

I've also found it's better if you can get it code generate everything in the one session, if you try other LLMs or sessions it will quickly degrade. That's when you will see duplicate functions and dead end code.

spiderfarmer

People just do stupid stuff like "going all in" for their blog posts and videos. Nuance, like rationalism, doesn't get engagement.

meander_water

The thing most LLM maximalists don't realize is that the bottleneck for most people is not code generation, it's code understanding. You may have doubled the speed at which you created something, but you need to pay double that time back in code review, testing and building a mental model of the codebase in your head. And you _need_ to do this if you want to have any chance of maintaining the codebase (i.e. bugfixes, refactoring etc.)

emushack

Totally agree! Reading code is harder than writing it, and I think I spend more time reading and trying to understand than I do writing.

But this CEO I just met on LinkedIn?

"we already have the possibility to both improve our productivity and increase our joy. To do that we have to look at what software engineering is. That might be harder than it looks because the opportunity was hidden in plain sight for decades. It starts with rethinking how we make decisions and with eliminating the need for reading code by creating and employing contextual tools."

Context is how AI is a whole new layer of complexity that SWE teams have to maintain.

I'm so confused.

emushack

Ah I got more info. Interesting concept: https://moldabledevelopment.com/

marssaxman

I have often had the same thought in response to the effusive praise some people have for their sophisticated, automated code editors.

kragen

I've found LLMs are pretty good at explaining my code back to me.

Too

That will only work for the most simple of functions. How would you even make such a prompt? "Plz explain this code for me?" It's going to come back with the same essential complexity except in English instead of Javascript, you still need to understand the surroundings. Often you need to approach the understanding from different angles. Are you going to ask it "What will happen given preconditions A, B and C". How many of those questions do you need?

Not to mention how slow this is, instead of just reading a piece of code or instantly query my already existing internal mental model of it, I have to type a prompt, wait for the response, read the response, hope it understood my intent, hope it understood the code, map it to my own understanding.

kragen

That isn't my experience at all.

VMG

This is not true.

It may be bad practice, but consider that the median developer does not care at all about the internals of the dependencies that they are using.

They care about the interface and about whether they work or not.

They usually do not care about the implementation.

Code generated by LLM is not that different than pulling in a random npm package or rust crate. We all understand the downsides, but there is a reason that practice is so popular.

emushack

"Code generated by LLM is not that different than pulling in a random npm package or rust crate"

So I really hope you don't pull in packages randomly. That sounds like a security risk.

Also, good packages tend have a team of people maintaining it. How is that the same exactly?

VMG

> So I really hope you don't pull in packages randomly. That sounds like a security risk.

It absolutely is, but that is besides the point

> Also, good packages tend have a team of people maintaining it. How is that the same exactly?

The famously do not https://xkcd.com/2347/

rurp

Popular packages are regularly being used and vetted by thousands of engineers and that level of usage generally leads to subtle bugs being found and fixed. Blindly copy/pasting some LLM code is the opposite of that. It might be regurgitating some well developed code, but it's at least as likely to be generating something that looks right but is completely wrong in some way.

qudat

> They usually do not care about the implementation.

[citation needed]

> Code generated by LLM is not that different than pulling in a random npm package or rust crate

It's not random, there's an algorithm for picking "good" packages and it's much simpler than reviewing every single line of LLM code.

VMG

>> They usually do not care about the implementation. > [citation needed]

Everybody agrees that e.g. `make` and autotools is a pile of garbage. It doesn't matter, it works and people use it.

> It's not random, there's an algorithm for picking "good" packages and it's much simpler than reviewing every single line of LLM code.

But you don't need to review every single line of LLM code just as you don't need to review every single line of dependency code. If it works, it works.

Why does it matter who wrote it?

skydhash

Everything compounds. Good architecture makes it easy to maintain things later. Bad code will slow you down to a snail pace and will result in 1000s of bug tickets.

lawn

> Code generated by LLM is not that different than pulling in a random npm package or rust crate.

Yes, LLM code is significantly worse than even a random package as it very often doesn't even compile.

marcosdumay

If you as a developer care so much about stuff that the software users won't care about, you should look for better tools.

rco8786

> So I do a “coding review” session. And the horror ensues.

Yup. I've spoken about this on here before. I was a Cursor user for a few months. Whatever efficiency gains I "achieved" were instantly erased in review, as we uncovered all the subtle and not-so-subtle bugs it produced.

Went back to vanilla VSCode and still use copilot but only when I prompt it to do something specific (scaffold a test, write a migration with these columns, etc).

Cursor's tab complete feels like magic at first, but the shine wore off for me.

Izkata

> Cursor's tab complete feels like magic at first, but the shine wore off for me.

My favorite thing here watching a co-worker is when Cursor tries to tab complete what he just removed, and sometimes he does it by reflex.

manmal

What kind of guardrails did you give the agent? Like following SOLID, linting, 100% code coverage, templates, architectural documents before implementing, architectural rules, DRY cleanup cycles, code review guidelines (incl strict rules around consistency), review by another LLM etc?

breckenedge

Not the OP, but in my experience LLMs are still not quite there on guardrails. They might be for 25-50% of sessions, but it’ll vary wildly.

manmal

Depends on the LLM, recent Gemini models are quite good in this regard.

dmazin

This rings true to me.

I still use LLMs heavily. However, I now follow two rules:

* Do not delegate any deep thought to them. For example, when thinking through a difficult design problem, I do it myself.

* Deeply review and modify any code they generate. I go through it line-by-line and edit it thoroughly. I have to do this because I find that much of what they generate is verbose, overly defensive, etc. I don't care if you can fix this through prompting; I take ownership over future maintainability.

"Vibe coding" (not caring about the generated code) gives me a bad feeling. The above approach leaves me with a good feeling. And, to repeat, I am still using them a lot and coding a lot faster because of it.

chuckadams

I delegate all kinds of deep analysis to the AI, but it's to create detailed plans with specific implementation steps and validation criteria, backed by data in reproducible reports (i.e. "generate a script to generate this json data and another to render this data"). Plans have a specific goal that is reflected in the report ("migrated total should be 100%"). It's still an iterative process, the generators and plans have to be refined as it misses edge cases, but that's plans in general, AI or no.

It takes a good hour or two to draw up the plans, but it's the kind of thing that would take me all day to do, possibly several as my ADHD brain rebels against the tedium. AI can do yeomans work when it just wings it, and sometimes I have just pointed at a task and did it in one shot, but they work best when they have detailed plans. Plus it's really satisfying to be able to point at the plan doc and literally just say "make it so".

rco8786

> Deeply review and modify any code they generate. I go through it line-by-line and edit it thoroughly

This is the issue, right? If you have to do this, are you saving any time?

tasuki

(Usually) Yes. The LLM can generate three functions, over 100 lines of code, and I spend perhaps 15 minutes rearranging it so it pleases me aesthetically. It would've taken me an hour or two to write.

I find most benefit in writing tests for a yet-inexistent function I need, then giving the LLM the function signature, and having it implement the function. TDD in the age of LLMs is great!

dmazin

I know what you’re saying, but I’m saving time by: * getting ideas for how logic etc could be implemented * boilerplate is a thing of the past

The other thing is that I have the LLM make the modifications I want.

I know how long it takes to get an extremely bad programmer to do what you want, but the LLM is far better than that, so I do come out ahead.

jgilias

I believe this depends on the individual. For me, yeah, I am. But I do have colleagues who wouldn’t be.

nicodjimenez

I tend to agree with this. These days I usually use LLMs to learn about something new or to help me generate client code for common APIs (especially boto3 these days). I tried Windsurf to help me make basic changes to my docker compose files, but when it couldn't even do that correctly, I lost a little enthusiasm. I'm sure it can build a working prototype of a small web app but that's not enough for me.

For me LLMs are a game changer for devops (API knowledge is way less important now that it's even been) but I'm still doing copy pasting from ChatGPT, however primitive it may seem.

Fundamentally I don't think it's a good idea to outsource your thinking to a bot unless it's truly better than you at long term decision making. If you're still the decision maker, then you probably want to make the final call as to what the interfaces should look like. I've definitely had good experiences carefully defining object oriented interfaces (eg for interfacing with AWS) and having LLMs fill in the implementation details but I'm not sure that's "vibe coding" per se.

pizzathyme

I had a similar experience as the author. I've found found that cursor / copilot are FANTASTIC at "smart autocomplete", or "write a (small function that does this)" and quick viral prototypes.

But after I got a week into my LLM-led code base, it became clear it was all spaghetti code and progress ground to a halt.

This article is a perfect snapshot of the state of the art. It might improve in the future, but this is where it is in May 2025.

lesser23

Many places are quite literally forcing their software engineers to use LLMs. Complete with cursor/copilot is the ability to see usage statistics and surely at these companies these statistics will eventually be used as firing criteria.

I gave them a fair shake. However, I do not like them for many reasons. Code quality is one major reason. I have found that after around a month of being forced to use them I felt my skill atrophy at an accelerated rate. It became like a drug where instead of thinking through the solution and coming up with something parsimonious I would just go to the LLM and offload all my thinking. For simple things it worked okay but it’s very easy to get stuck in a loop. I don’t feel any more productive but at my company they’ve used it as justification to increase sprint load significantly.

There has been almost a religious quality associated to LLMs. This seems especially true among the worst quality developers and the non-technical morons at the top. There are significant security concerns that extend beyond simple bad code.

To me we have all the indicators of the maximum of the hype cycle. Go visit LinkedIn for confirmation. Unless the big AI companies begin to build nuclear power it will eventually become too expensive and unprofitable to run these models. They will continue to exist as turbo autocomplete but no further. The transformer model has fundamental limitations and much like neural networks in the 80s it’ll become more niche and die everywhere else. Like its cousins WYSIWIG and NoCode in 30 more years it’ll rise again like a phoenix to bring “unemployment” to developers once more. It will be interesting to see who among us was swimming without clothes when the water goes out.

alexjplant

> I have found that after around a month of being forced to use them I felt my skill atrophy at an accelerated rate

I've started a "no Copilot Fridays" rule for myself at $DAYJOB to avoid this specifically happening.

bwfan123

My use of cursor is limited to auto-complete, and small snippets. Even then, i can feel my skills atrophying.

use-it-or-lose-it is the cognitive rule.

bwfan123

This [1] is an old and brilliant article titled "On the foolishness of natural language programming" by Dijkstra relevant to this debate.

The argument is that the precision allowed by formal languages for programming, math etc were the key enabler for all of the progress made in information processing.

ie, Vibe-coding with LLMs will make coding into a black-art known only to the shamans who can prompt well.

[1] https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...

VMG

I get it and I see the same problems as the author.

I'm working on a few toy projects and I am using LLM for 90% of it.

The result is 10x faster than if I coded it "by hand", but the architecture is worse and somewhat alien.

I'm still keeping at it, because I'm convinced that LLM driven code is where things are headed, inevitably. These tools are just crazy powerful, but we will have to learn how to use them in a way that does not create a huge mess.

Currently I'm repeatedly prompting it to improve the architecture this way or that way, with mixed results. Maybe better prompt engineering is the answer? Writing down the architecture and guidelines more explicitly?

Imagine how the whole experience will be if the latency was 1/10th of what it is right now and the tools are 10x better.

a7fort

I hope we get to that "10x better" point. I think the problem right now is people advertising LLMs as if we're there already. And it's not just the providers, it's also the enthusiasts on X/Reddit/etc that think they have found the perfect workflow or prompt.

Just like you're mentioning "maybe better prompt engineering", I feel like we're being conditioned to think "I'm just not using it right" where maybe the tool is just not that good yet.

VMG

Well "I'm just not using it right" is a perfectly reasonable thought for a new technology. Isn't this the default? When new powerful tools come along, they often fit awkwardly into existing processes.

jfim

One thing you can do is to define the classes and methods that you want to have, and have the LLM implement them. For tricky things, you can leave additional notes in the empty method body as to how things should be implemented.

This way you're doing the big picture thinking while having the LLM do what's it's good at, generating code within the limits of its context window and ability to reason about larger software design.

I mostly treat the LLM as an overly eager to please junior engineer that types very quickly, who can read the documentation really quickly, but also tends to write too much code and implement features that weren't asked for.

One of the good things is that the code that's generated is so low effort to generate that you can afford to throw away large chunks of it and regenerate it. With LLM assistance, I wrote some code to process a dataset, and when it was too screwy, I just deleted all of it and rewrote it a few times using different approaches until I got something that worked and was performant enough. If I had to type all of that I would've been disappointed at having to start over, and probably more hesitant to do so even if it's the right thing to do.

acureau

I've found a lot of value in this approach as well, I don't delegate any architecture decisions to LLMs. I build out the high-level and I see if the LLM can fill the gaps. I've found they are good at writing pure functions, and am good at composing them and managing state.

jacob019

This is where they shine, for prototyping greenfield projects, but as the project gets closer to production that 10x erodes. You have to be really intentional about the architecture, fixing core design issues later can turn 10x into 0.1x.

jdiff

At least currently, the only use pattern that can withstand complex codebases is as advanced speech-to-text, but without the speech if that makes sense. The main issue with that is phrasing things in English is often far more verbose, so without the speech, it's very often faster to just do it by hand.

geraneum

> Writing down the architecture and guidelines more explicitly?

Yes, very explicit like “if (condition) do (action)” and get more explicit when… oh wait!

skydhash

Yeah. I never understood where people are coming with “you need guardrails, extensive architecture docs, coding rules,…”. For every software and features I wrote, I already have a good idea of the objectives before I even start to code. I do the specific part with code, going back to the whiteboard when I need to think.

It’s an iterative process, not a linear one. And the only hige commits are the scaffolding and the refactorings. It’s more like sculpture than 3d printing, a perpetual refinement of the code instead of adding huge lines of code.

This is the reason I switched to Vim, then Emacs. They allow for fast navigation, and faster editing. And so easy to add your own tool as the code is a repetitive structure. The rare cases I needed to add 10s of lines of code is with a code generator, or copy-pasting from some other file.

mrighele

I have been trying LLMs in a couple of new small projects recently.

I got more success that I hoped for, but I had to adjust my usage to be effective.

First of all, treat the LLM as a less experienced programmer. Don't trust it blindly but always make code review of its changes. This gives several benefits.

1) It keeps you in touch with the code base, so when need arise you can delve into it without too much trouble

2) You catch errors (sometimes huge ones) right away, and you can have them fixed easily

3) You can catch errors on your specification right away. Sometimes I forget some detail and I realize it only when reviewing, or maybe the LLMs did actually handle it, and I can just tell it to update the documentation

4) You can adjust little by little the guidelines for the LLM, so that it won't repeat the same "mistakes" (wrong technical decisions) again.

In time you get a feeling of what it can and cannot do, where you need to be specific and where you know it will get it right, or where you don't need to go into detail. The time required will be higher than vibe coding, but decreases over time and still better than doing by myself.

There is another important benefit for me in using an LLM. I don't only write code, I do in fact many other things. Calls, writing documentation, discussing requirements etc. Going back to writing code requires a change of mental state and to recall into memory all the required knowledge (like how is the project structured, how to use some apis etc.). If I can do two hours of coding it is ok, but if the change is small, it becomes the part where I spend the majority of time and mental energy.

Or I can ask the LLM to make the changes and review them. Seeing the code already done requires less energy and will help me reminding stuff.

tuan

> LLMs are okay at coding, but at scale they build jumbled messes.

This reminds me of the day of Dreamweaver and the like. Everybody loved how quickly they could drag and drop UI components on a canvas, and the tool generated HTML code for them. It was great at the beginning, but when something didn't work correctly, you spent hours looking at spaghetti HTML code generated by the tool.

At least, back then, Dreamweaver used deterministic logic to generate the code. Now, you have AI with the capability to hallucinate...

Ozzie_osman

> One morning, I decide to actually inspect closely what’s all this code that Cursor has been writing.

You can't abdicate your responsibility as a builder to the LLM. You are still responsible for the architecture, for the integrity, for the quality. In the same way you wouldn't abdicate your responsibility if you hired a more junior engineer.

JimDabell

> I’ve never used Go or Clickhouse before

> I have no concept of Go or Clickhouse best practices.

> One morning, I decide to actually inspect closely what’s all this code that Cursor has been writing. It’s not like I was blindly prompting without looking at the end result, but I was optimizing for speed and I hadn’t actually sat down just to review the code.

> I’m defaulting to coding the first draft of that function on my own.

I feel like he’s learnt the wrong lesson from this. There is a vast gulf between letting an LLM loose without oversight in a language you don’t know and starting from scratch yourself. There’s absolutely nothing wrong with having AI do the first draft. But it actually has to be a first draft, not something you blindly commit without review.

> “Vibe coding”, or whatever “coding with AI without knowing how to code” is called, is as of today a recipe for disaster, if you’re building anything that’s not a quick prototype.

But that’s what vibe coding is. It’s explicitly about quick throwaway prototypes. If you care about the code, you are not vibe coding.

> There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. […] It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

— https://x.com/karpathy/status/1886192184808149383

He’s basically saying that vibe coding is a disaster after his experience doing something that is not vibe coding.

jdiff

That is what vibe coding is. The tweet says "It's not too bad for throwaway projects" but that does not limit the definition, it only limits the advisable application of it.

JimDabell

Vibe coding is “Forget the code even exists; it mostly works for throwaway stuff”. That is clearly not what this person wants.

You can’t pick up vibe coding and then complain that it’s behaving as described or that it isn’t giving you something that wasn’t promised.

jdiff

Don't disagree at all. It was never promised that LLMs could do what he's trying to do. But just because he's holding it wrong doesn't mean that he's not holding it.