
Typed languages are better suited for vibecoding

timuckun

It's been my experience that strongly opinionated frameworks are better for vibe coding regardless of the type system.

For example, if you are using Rails, vibe coding is great because there is an MCP server, there are published prompts, and there is basically only one way to do things in Rails. You know how files are to be named, where they go, what format they should take, etc.

Try the same thing in Go and you end up with a very different result, despite the fact that Go has stronger typing. Both Claude and Gemini have struggled to one-shot simple apps in Go but succeed with Rails.

topato

This is pretty anecdotal, but it feels like most of the published Rails source code you find online (and, by extension, that an LLM has found) comes from large, stable, and well-documented codebases.

rafamvc

Claude Code with Rails is amazing. Shout out to Obie for Claude on Rails. Works phenomenally well.

delifue

In my experience Gemini can one-shot Go apps. Settling this requires sound evals rather than anecdotes.

EGreg

Basically it's like this:

the more constraints you have, the more freedom you have to "vibe" code

and if someone actually built AI for writing tests, catching bugs and iterating 24/7 then you'd have something even cooler

woodruffw

> I am managing projects in languages I am not fluent in—TypeScript, Rust and Go—and seem to be doing pretty well.

This framing reminds me of the classic problem in media literacy: people know when a journalistic source is poor when they’re a subject matter expert, but tend to assume that the same source is at least passably good when less familiar with the subject.

I’ve had the same experience as the author when doing web development with LLMs: it seems to be doing a pretty good job, at least compared to the mess I would make. But I’m not actually qualified to make that determination, and I think a nontrivial amount of AI value is derived from engineers thinking that they are qualified as such.

muglug

Yup — this doesn't match my experience using Rust with Claude. I've spent 2.5 years writing Rust professionally, and I'm pretty good at it. Claude will hallucinate things about Rust code because it’s a statistical model, not a static analysis tool. When it’s able to create code that compiles, the code is invariably inefficient and ugly.

But if you want it to generate chunks of usable and eloquent Python from scratch, it’s pretty decent.

And, FWIW, I’m not fluent in Python.

bravesoul2

Which is why I only use it on stuff I can properly judge.

giantrobot

Gell-Mann amnesia.

woodruffw

Thank you! I couldn’t remember the term.

lukev

As has been said, actual evals are needed here.

Anecdotally, the worst and most common failure mode is when an agent starts spinning its wheels on some error: trying to fix it, failing, iterating wildly, and eventually landing on a bullshit "solution", if it lands on one at all.

In my experience, in TypeScript, these "spin out" situations are almost always type-related and often involve a lot of really horrible `any` casts.

resonious

Right, I've noticed agents are very trigger happy with 'any'.

I have had a good time with Rust. It's not nearly as easy to skirt the type system in Rust, and I suspect the culture is also more disciplined when it comes to 'unwrap' and proper error management. I find I don't have to explicitly say "stop using unwrap" nearly as often as I have to say "stop using any".
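
To make that concrete, here is a minimal TypeScript sketch (the names are illustrative, not taken from any project in this thread) of the `any` escape hatch agents tend to reach for, next to the disciplined alternative:

```
interface User {
  id: number;
  name: string;
}

// The shortcut an agent often reaches for: the cast silences the compiler
// but throws away every guarantee the interface was giving us.
const fromApi = JSON.parse('{"id": 1}') as any as User;
console.log(fromApi.name); // compiles, but prints undefined: the type lied

// The disciplined version: treat the parsed value as unknown and narrow it
// before trusting it, so bad payloads fail loudly at the boundary.
function parseUser(raw: string): User {
  const value: unknown = JSON.parse(raw);
  if (
    typeof value === "object" && value !== null &&
    typeof (value as { id?: unknown }).id === "number" &&
    typeof (value as { name?: unknown }).name === "string"
  ) {
    return value as User;
  }
  throw new Error("payload is not a User");
}

console.log(parseUser('{"id": 2, "name": "Ada"}'));
```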

smackeyacky

Experienced devs coming into TypeScript are also trigger-happy with 'any' until they work out what's going on, especially if they've come from JavaScript.

energy123

The question can be asked two ways:

(1) Are current LLMs better at vibe coding typed languages, under some assumptions about user workflow?

(2) Are LLMs as a technology more suited to typed languages in principle, and should RL pipelines gravitate that way?

mewpmewp2

This is why I have a very specific ruleset and linting setup for my LLMs: no `any` allowed at all, along with other quality checks.

Mtinie

Is this a shareable ruleset? I would completely understand if not but I’m interested in learning new ways to interact with my tools.

jbellis

I'm really shocked at how slow people are to realize this, because it's blindingly obvious. I guess that just shows how much the early adopter crowd is dominated by Python and JavaScript.

(BTW the answer is Go, not Rust, because the other thing that makes a language well suited for AI development is fast compile times.)

woodruffw

My experience with agent-assisted programming in Rust is that the agent typically runs `cargo check` instead of `cargo build` for this exact reason -- it's much faster and catches the relevant compilation errors.

(I don't have an opinion on one being better than the other for LLM-driven development; I've heard that Go benefits from having a lot more public data available, which makes sense to me and seems like a very strong advantage.)

anupshinde

I am comfortable with both Python and Go. I prefer Go for performance; however, my earlier issue with it was verbosity.

It is easier to write things using a Python dict than to create a struct in Go or use the weird `map[string]interface{}` and then deal with the resulting typecast code.

After I started using GitHub Copilot (before the Agents), that pain went away. It would auto-create the field names just by looking at the intent or a couple of fields. It was just a matter of TAB, TAB, TAB... and of course I had to read and verify, but the typing headache was gone.

I could refactor the code easily. The autocomplete is very productive. Type conversion was just a TAB. The loops are just a TAB.

With Agents, things have become even better - but also riskier, because I can't keep up with the code review now - it's overwhelming.

jjcm

I've noticed a fairly similar pattern. I particularly like vibecoding with Go. Go is extremely verbose, which makes it almost like an inverse Perl: writing Go is a bad experience, but reading Go is delightful. The verbosity means you're always able to jump in and understand the context, often from just a single file.

Pre-LLMs, this was an up-front cost of writing Go, which often made the cost/benefit tradeoff not worth it. With LLMs, the cost of writing verbose code not only goes down; the verbosity also forces the LLM to be strict about what it's writing and keeps it on track. The cost/benefit tradeoff has shifted greatly in Go's favor as a result.

herrington_d

The logic above can support exactly the opposite conclusion: LLMs can handle dynamically typed languages better, since they don't need to solve type errors and can save context tokens.

In practice, it has been reported that LLM-backed coding agents just work around type errors by using `any` in a gradually typed language like TypeScript. I have also personally observed this multiple times.

I also tried using LLM agents with more strongly typed languages like Rust. When complex type errors occurred, the agents struggled to fix them and eventually just used `todo!()`.

The experience above may be caused by insufficient training data. But it illustrates the importance of evals over ideological speculation.

mithras

In my experience you can get around this by adding a linter rule that disallows `any` and a local Claude instructions file telling the agent to fix the lint issues every time it does something.
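
For anyone who wants to try that, a minimal sketch of such a rule, assuming an ESLint flat config with typescript-eslint (the file name and setup here are illustrative, not the commenter's actual configuration):

```
// eslint.config.mjs -- make `any` a hard error, so an agent told to
// "fix all lint errors before finishing" cannot leave casts behind.
import tseslint from "typescript-eslint";

export default tseslint.config(
  ...tseslint.configs.recommended,
  {
    rules: {
      // Flags explicit `any` annotations and `as any` assertions.
      "@typescript-eslint/no-explicit-any": "error",
    },
  },
);
```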

vidarh

You can equally get around a significant portion of the purported issues with dynamically typed languages by having Claude run tests and try to run the actual code.

I have no problem believing they will handle some languages better than others, but I don't think we'll know whether typing makes a significant difference vs. other factors without actual tests.

herrington_d

It does not always work in my experience, due to complex type definitions. Also, extra tool calls and time are needed to fix the linting.

MattGaiser

Or just bad training data. I've seen "any" casually used everywhere.

linkage

This claim needs to be backed up by evals. I could just as well argue the opposite, that LLMs are best at coding Python because there are two orders of magnitude more Python in their training sets than C++ or Rust.

In any case, you can easily get most of the benefits of typed languages by adding a rule that requires the LLM to always output Python code with type annotations and validate its output by running ruff and ty.

yibers

I agree that the training sets for LLMs have much more data for Python than for Rust. But C++ has existed longer than Python, I believe, so I doubt there is two orders of magnitude more Python code than C++.

vidarh

It's not just a question of whether there is more actual code in a given language, but how much is available in the public and private training data.

I've done work on reviewing and fine-tuning training data with a couple of providers, and the amount of Python code I got to see out-distanced C++ code by far more than two orders of magnitude. It could be a heavily biased sample, but I have no problem believing it could also be representative.

hibikir

You're missing how many fewer programmers there were in the early years, how much of that code was ever public, and, even if it was, how useful it still is, given that C++ has changed drastically from what we used to write in, say, 2001.

dccsillag

I think you vastly overestimate the capabilities of Python's typing.

chrisjharris

I've been wondering about this for some time. My initial assumption was that LLMs would ultimately be the death of typed languages, because type systems are there to help programmers not make obvious mistakes, and near-perfect LLMs would almost never make obvious mistakes. So in a world of near-perfect LLMs, a type system just adds pointless overhead.

In this current world of quite imperfect LLMs, I agree with the OP, though. I also wonder whether, even if LLMs improve, we will be able to use type systems not exactly for their original purpose but more as a way of establishing that the generated code is really doing what we want it to, something similar to formal verification.

ImprobableTruth

Even near-perfect LLMs would benefit from the compiler optimizations that types allow.

However, perfect LLMs would just replace compilers and programming languages above assembly completely.

exclipy

The closest we got to vibe coding pre-LLMs was using a language with a very good strong type system in a good IDE and hitting Ctrl-Space to autocomplete your way to a working program.

I wonder if LLMs can use the type information more like a human with an IDE.

E.g. it generates "(blah blah...); foo." and at that point it is constrained to generate only tokens corresponding to public members of foo's type.

Just like how current-gen LLMs can reliably generate JSON that satisfies a schema, the next gen will be guaranteed to natively generate syntactically and type-correct code.
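
As a toy illustration of that constraint (all names hypothetical): once the prefix is `conn.`, the set of legal completions is exactly the member names the type system already knows about, so a type-aware decoder could mask every other token.

```
interface Connection {
  open(): void;
  close(): void;
  send(payload: string): void;
}

// The only legal completions after `conn.` are the members of Connection.
type LegalNext = keyof Connection; // "open" | "close" | "send"

const allowed: LegalNext[] = ["open", "close", "send"];
console.log(`legal completions after "conn.": ${allowed.join(", ")}`);
```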

koolba

> I wonder if LLMs can use the type information more like a human with an IDE.

Just throw more GPUs at the problem and generate N responses in parallel and discard the ones that fail to match the required type signature. It’s like running a linter or type check step, but specific to that one line.
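
A rough sketch of that generate-and-filter loop, with `generate` and `typechecks` passed in as stand-ins (both hypothetical; in practice the check might shell out to `tsc --noEmit` on the candidate):

```
// Hypothetical rejection sampling: request N candidates in parallel and keep
// the first one that passes the type check.
async function generateTyped(
  prompt: string,
  generate: (p: string) => Promise<string>,       // stand-in for the model call
  typechecks: (code: string) => Promise<boolean>, // stand-in for the type checker
  n = 8,
): Promise<string | undefined> {
  const candidates = await Promise.all(
    Array.from({ length: n }, () => generate(prompt)),
  );
  for (const code of candidates) {
    if (await typechecks(code)) return code; // first candidate that passes wins
  }
  return undefined; // every candidate failed the check
}
```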

xwolfi

We have infinite uranium anyway!

treyd

You can already use LLM engines that constrain generation according to an arbitrary CFG. I am not aware of any systems that apply that to generating actual programming-language code.

nu11ptr

Everything said here is true without AI as well, at least for me. I don't hate Python, and I like it for very small scripts, but for large programs the lack of static types makes it much too brittle IMO. Static typing gives you the confidence that not every single line needs testing, which reduces friction over the lifecycle of the code.
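
A tiny TypeScript illustration of that confidence (names are hypothetical): rename or retype a field and every call site stops compiling, so no test has to exist just to notice the drift.

```
interface Invoice {
  id: string;
  amountCents: number; // rename or retype this and `total` stops compiling
}

function total(invoices: Invoice[]): number {
  return invoices.reduce((sum, inv) => sum + inv.amountCents, 0);
}

console.log(total([
  { id: "a", amountCents: 1250 },
  { id: "b", amountCents: 800 },
])); // 2050
```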

SteveJS

I think this is true -- especially for new code.

I did this without knowing any Rust: https://github.com/KnowSeams/KnowSeams, and Rust felt like a very easy-to-use scripting language.

xwolfi

That seems a little dangerous; why not do it in a language you know? Plus, this is not launching rockets to the moon, it's a sentence splitter with a fancy state machine (probably very useful in your niche, not a critique). The difficulty was in putting in the effort to build a complicated state machine; the rest frankly didn't need an LLM, and now you can't maintain your own stuff without Nvidia burning uranium.

Did the LLM help at all in designing the core, the state machine itself?

SteveJS

Nah it was a hobby project because I was laid off for a bit.

Rust's regex crate was perfect because it doesn't allow anything that isn't a DFA. Yes-ish: the LLM facilitated designing the state machine, because it was part of the dev loop I was trying out.

The speed is primarily what enabled finding all of the edge cases I cared about. Given that it can split 'all' of a local Project Gutenberg mirror in under 10 seconds on my dev box, I could do things I wouldn't otherwise attempt.

The whole thing is there in the ~100 "completed tasks" directory.