Why Claude Code feels like magic?

54 comments

·June 17, 2025

GardenLetter27

This feels a bit too optimistic, in practice it often gets stuck going down a rabbit hole (and burning up your requests / tokens doing it!).

Like even when I tested it on a clean assessment (albeit with Cursor in this case) - https://jamesmcm.github.io/blog/claude-data-engineer/ - it did very well in agent mode, but the questions it got wrong were worrying because they're the sort of things that a human might not notice either.

That said I do think you could get a lot more accuracy between the agent checking and running its own answers, and then also sending its diff to a very strong LLM like o3 or Gemini Pro 2.5 to review it - it's just a bit expensive to do that atm.

The main issue on real projects is that just having enough context to even approach problems, and build and run tests is very difficult when you have 100k+ lines of code and it takes 15 minutes to clean build and run tests. And it feels like we're still years away from having all of the above, plus a large enough context window that this is a non-issue, for a reasonable price.

marliechiller

I find the use of the word intelligence to be a bit of a misnomer. Is something intelligent if all its doing is pattern matching? Is the evolution that led to owl butterflies appearing like an owl intelligent? Im not sure.

As an aside, its amusing we simulatenously have this article on the front page as well as [Generative AI coding tools and agents do not work for me](https://news.ycombinator.com/item?id=44294633) also on the front page. LLMs are really dividing the community at the moment and its exhausting to keep up with what I (as a dev) should be doing to stay sharp

viraptor

Keep in mind that as usual, mostly the extreme views are getting posted. The urge to both write and click on "sometimes I find LLMs useful for partial solutions in the right context" is low compared to "AI will replace all developers in 2 years". It may not be as dividing as we read here. It certainly isn't, when looking at what my co-workers do. You can chill and learn it like any other new tech. (Without following every detail day to day)

jorvi

I will die on the hill that for the foreseeable future, LLMs inside an IDE are just fancy autocomplete.

In a more general interface they're also nice for getting a birds-eye view on a topic you're unfamiliar with.

However, just as a counterexample of how dumb they really are: I asked both Gemini 2.5 Pro and Opus 4 if there were any extra settings for VSCode's UI density and without hesitation both of them made up a bunch of 'window.density' settings.

If they can't even get something so extremely basic and well-documented right, how are you going to trust them with giving you flawless C or Typescript?

ojosilva

Well the article briefly addresses this: it's about the iteration: given a problem and sufficient processing power we can attain an intelligent,correct answer by quickly iterating from prompt to results.

There's also a measurement vector for zero-shot LLM responses. But excelling at zero-shot is not a requirement for making LLMs useful.

The market is pointing the way, agents increase iteration capabilities, increasing usefulness. Reasoning models/architectures are another example where iterations make advances - the LLM iterates "in-band" and self-evals so that there's a better chance of a correct outcome.

All that in a mere 3.5 years since launch. To call it an autocomplete is very short sighted. Even if we reached LLMs ceiling, the choice of AI-oriented workflows (TTS, TDD, YOLO...), tooling, protocols and additional architecture adjustments (gigantic context windows, instant adaptors, speed, etc) will make up for any lack of precision the same way we work around human flaws to help us succeed in most tasks.

rolisz

I trust them more with Typescript because there's a compiler that gives them feedback and that has been used for training LLMs.

0x416c6578

| its exhausting to keep up with what I (as a dev) should be doing to stay sharp

That for me is the biggest thing I am feeling about LLMs at the moment, things are moving so quickly but to what end? I know this industry is constantly evolving and in some ways that is very exciting but I also feel like it is this exponential runaway that requires very deliberate attention focused on the bleeding edge to stay relevant, when a lot of my time in my day job doesn't facilitate this (which I have identified and have made the effort and will be changing company in a month).

My own two cents on LLMs (as a junior / low mid level early career software engineer) is that they work best as a better version of Google for any well explored issue, and being able to talk through problems in a conversational manner has been a game changer. But I do fear sometimes that I am not gaining the same amount of knowledge as I would before LLMs became mainstream, it's a shortcut that in the long run I fear is going to reduce the average problem solving ability and original / novel thinking ability of software engineers (whether that is even a requirement in most SWE jobs is up for debate).

vaylian

> its exhausting to keep up with what I (as a dev) should be doing to stay sharp

I have observed the JavaScript ecosystem producing one new framework after another. I decided to wait for the dust to settle. Turns out vanilla.js is still fine for the things I need to do.

ozim

I think this staying sharp is FOMO instilled by influencers and people selling guides/courses. Most of the stuff will be implemented by Anthropic, OpenAI etc.

You can run local models but it is like playing matchbox cars in your backyard and imagining you will be F1 driver some day.

Big guys have APIs you pay for to do serious work that’s all you need to know.

conartist6

Running models at all is playing with matchbox cars. If you want to play in the big leagues, you have to become the model.

jackstraw42

A bit unfair to call local models Matchbox cars compared to F1. There are plenty of uses for LLMs locally that don't require the largest models, it's not like it has to be all-or-nothing. For example, as a general browser assistant to help summarize articles, explain context, etc. the gemma-3-4B model does very well and is lightning fast on my old 3060 Ti.

csomar

> I find the use of the word intelligence to be a bit of a misnomer. Is something intelligent if all its doing is pattern matching? Is the evolution that led to owl butterflies appearing like an owl intelligent? Im not sure.

Is a random number generator intelligent? I don't think people perceive or understand intelligence equally. I don't think we have an answer to what exactly is intelligence or how to create it.

> LLMs are really dividing the community at the moment and its exhausting to keep up with what I (as a dev) should be doing to stay sharp

You could try at your comfortable pace. I only started using agents very recently. The dangerous thing is to go to extremes (all in on AI or completely refusing the tech)

j_crick

> Is something intelligent if all its doing is pattern matching?

Aren’t we humans doing just that either? If yes, then what?

marliechiller

Personally, I dont think so. I can understand a mathmatical axiom and reason with it. In a sequence of numbers I will be able to tell you N + 1, regardless of where N appears in the sequence. An LLM does not "know" this in the way a human does. It just applies whatever is the most likely thing that the training data suggests.

j_crick

But technically you can do that only because you recognize the pattern, because the pattern (sequence) is there and you were taught that it’s a pattern and how to recognize it. Publicly available LLMs of now are taught different patterns, and are also constrained by how they are made.

Maybe there’s something for LLMs in reflection and self-reference that has to be “taught” to them (or has to be not blocked from them if it’s already achieved somehow), and once it becomes a thing they will be “cognizant” in the way humans feel about their own cognition. Or maybe the technology, the way we wire LLMs now simply doesn’t allow that. Who knows.

Of course humans are wired differently, but the point I’m trying to make is that it’s pattern recognition all the way down both for humans and LLMs and whatnot.

monista

Would it surprise you seeing eg. seeing on the front page articles about both Nobel prize winners and Darwin award winners? What is intelligence, after all? We expect AI to be as smart as Einstein or Terens Tao, but so far, we see that LLMs are pretty good at behaving just like humans, that is, most times stupid.

talles

When talking about AI, intelligence is meaningless if you don't defined it beforehand. The common sense meaning of intelligence fails on this kind of discussion.

Uehreka

I keep seeing the same “middlebrow dismissals” of LLMs in HN comments, it’s getting pretty repetitive to have to cover all of this over and over, but here goes (I recognize GP is only saying one of these, I’m just trying to preempt the others).

- “LLMs don’t have real intelligence” - We as a society don’t have a rigorous+falsifiable consensus on what “intelligence” is to begin with. Also many things that we all agree are not intelligent (cars, CPUs, egg timers, etc.) are still useful.

- “But people are claiming they’re intelligent and that they’re AGI” - OK, well what if those people are wrong but LLMs are still useful for many things? Not all LLM users are AGI believers, many aren’t.

- “But people are forcing me to use them.” - They shouldn’t do that, that’s bad. It doesn’t mean LLMs are bad.

- “They’re just pattern-matchers, stochastic parrots, they can’t generalize outside their training data.” - All the academic arguments I’ve seen about this become irrelevant when I ask an LLM to write me code in a really esoteric programming language and it succeeds. I personally don’t think this is true, but if in fact they are categorically no more than pattern-matchers, then Pattern Matching Is All You Need to do many many jobs.

- “I have an argument why they are categorically useless for all tasks” - the existence of smart people using these things of their own accord, observing the results and continuing to use them should put a serious dent in this theory.

- “They can’t do my whole job” - OK, what if they can help you with part of your job?

- “I’m a programmer. If I use an AI Assistant, but still have to review its code, I haven’t saved any time.” - This can’t be categorically disproven, but also isn’t totally true, and in the gaps in this argument lie amazing things if you’re willing to keep an open mind.

- “They can’t do arithmetic, how can they be expected to do everyday tasks.” - I’ll admit that it’s weird that LLMs are useful despite failing at arithmetic, but they are. Rain Man had trouble with everyday tasks, how could he be expected to do arithmetic? The world is counterintuitive sometimes.

- “They can’t help me with any of my job, I do surgery all day” - Thank you and my condolences. Please be aware though that many jobs out there aren’t surgery.

- “The people who promote them are annoying. I call them ‘influencers’ to signal that they are not hackers like us.” - Many good things have annoying fans, if you follow this logic to its conclusion you will miss out on many good things.

- “I’ve tried them, I’ve tried them in a variety of ways, they’re just really not for me.” - That’s fine. I’d still recommend checking in on the field later on, but I can totally admit that these things can take some finagling to get right, and not everyone has time. They will get easier to use in the future.

- “No they won’t, we’ve hit a plateau! Attention isn’t all you need!” - If all LLM development were to stop today, all AI cloud services shut down and only the open weights LLMs were left, I predict we’d still be finding novel usage patterns for them for the next 3-5 years.

bgwalter

> What other tasks could be automated today with the current LLMs performance?

CEO speeches and pro-LLM blogs come to mind.

Again, there is a vague focus on "updating dependencies" where allegedly some time was saved. Take that to the extreme and we don't need any new software. Freeze Linux and Windows, do only security updates and fire everyone. Because the ultimate goal of LLM shills or self-hating programmers appears to be to eliminate all redundant work.

Be careful what you wish for. They won't reward you for shilling or automating, they'll just fire you.

msgodel

The primary use seems to be satisfying administrative demands that were never productive anyway.

Eddy_Viscosity2

This. They've been pushing these at my workplace and the only thing I can think to use it for is have the LLMs generate empty long-winded corporate-speak emails that I can send to managers when they ask for things that seem best answered by an empty long-winded corporate-speak email. Like "How are using using all these AI tools we are forcing on you without asking if you needed or wanted them?"

stpedgwdgfhgdd

The recent developments are impressive. I’m now using my IDE as a diff viewer. Everything goes through the terminal. If there is an error, CC can analyse and fix it.

Still needs a lot of handholding. I do not (yet) think big upfront plans will suddenly start working in the enterprise world. Let it write a failing test first.

arpowers

Has anyone actually gotten productivity improvements from Claude Code?

What’s the use case?

(I tried some things, and it blew up. Thus far my experience w agents in general)

ryandvm

I have used it on a fairly simple Kotlin Android application and was blown away. I have previously been using paid ChatGPT, Github Copilot, and Gemini. In my opinion, it's the complete access to your repo that really makes it powerful, whereas with the other plugins you kind of have to manually feed it the files in your workspace and keep them in sync.

I asked it to add Google Play subscription support to my application and it did, it required minimal tweaking.

I asked it to add a screen for requesting location permissions from the user and it did it perfectly. No adjustment.

I also asked it add a query parameter to my API (GoLang) which should result in a subtle change several layers deep and it had no problems with that.

None of this is rocket science and I think the key is that it's all been done and documented a million times on the Internet. At this point, Claude Code is at least as effective as junior developer.

Yes, I understand that this is a Faustian bargain.

lokimedes

Well, Intelligence is arguably represented in a “prior” that skews the result to an optimum faster, with fewer iterations. What the article is describing as intelligence is exactly the opposite, it’s just brute force.

ajkjk

normal English would be "Why does Claude Code feel like magic?"

edit: or "Why Claude Code feels like magic" without the ?.

ed_mercer

> What if Claude Code operated autonomously with massive parallel compute?

Afaik this is not possible as LLMs have linear conversations.

weiliddat

I guess if we interpreted it charitably, maybe every time there's a decision to be made, it just forked itself and ran with possible inputs it expects?

I would say that's how some devs operate too. Instead of waiting for the product/customer to come back, let's predict how they might think and make a couple of possible solutions and iterate over them. Some might be dead ends, we can effectively prune them, some might lead to more forks, some might lead down linear paths. But we can essentially get more coverage before really needing some input.

We might argue that it already does that in its chain-of-thought, or agent mode, but having a dedicated "forked" checkpoint lets us humans then check and rewind time in that sense.

revskill

LLM reflects YOUR intelligence, it's the secret truth.

rvnx

Many of the complainers don't know how to use them and how to write prompts, and then blame the LLMs.

Or simply use LLMs that struggle at writing good code (GPT, Gemini Pro, etc).

You need to be in the shoes of a product owner, and be able to express your requirements clearly and drive the LLM in your direction, and this requires to learn new skills (like kids learn how to use search engines).

timr

> Or simply use LLMs that struggle at writing good code (GPT, Gemini Pro, etc).

I love how one side of this debate seems to have embraced "No True Scotsman" as the preferred argument strategy. Anyone who points out that these things have practical limitations gets a litany of "oh you aren't using it right" or "oh, you just aren't using the cool model" in response. It reminds me of the hipsters in SF who always felt your music was a little too last week.

As someone who is currently using these every day, Gemini Pro is right up there with the very best models for writing code -- and "GPT" is not a single thing -- so I have no idea what you're talking about. These things have practical limitations.

rvnx

cainxinth

Just like all the people who think their LLM is sentient or an alien or a god are really just talking to themselves.

goodpoint

If anything it reflects the intelligence of the people whose work is being stolen.

rvnx

Ycombinator is an accomplice of this, and you know, all they will get is billions of tainted money as punishment. But I guess they can live with that.

talles

Technology feels like magic when you don't understand it.

amelius

Even more so when even the creators of the technology don't understand it.

mmh0000

The creators understand it well. The math is pretty a lot, but, you can literally do it with pen and paper. There are plenty of blog[1] posts showing the process.

Anyone claiming AI is a black box no one understands is a marketing-level drone trying to sell something that THEY don't understand.

[1] https://explainextended.com/2023/12/31/happy-new-year-15/

amelius

No, they only understand it on a superficial level. The behavior of these systems emerges from simpler stuff, yes, but the end result is difficult to reason about. Just have a look at Claude's prompt [1] that leaked some time ago, and which is an almost desperate attempt of the creators to nudge the system into a certain direction and make it not say the wrong things.

We probably need a New Kind of Soft Science™ to fill this gap.

[1] https://simonwillison.net/2025/May/25/claude-4-system-prompt...

revskill

Where did u master humor from ?

kypro

Kinda reminds me of something my old AI professor used to say, "every problem is a search problem".

Intelligence is really just a measure of ones's ability to accurately filter and iterate over the search space.

Evolution is one extreme where the heuristic is poor so it must do a huge amount of iteration over many bad solution to find reasonably good solutions. Then on the other hand you have expert systems which are great at refining the search space to always deliver quality answers, but filter too much and are therefore too narrow so lack the creativity and nuance of real intelligence.

LLMs provide good heuristics and agents with verifiable goals allow for iteration. This combination results in a system which demonstrates significantly more intelligent than either of its parts.

conartist6

Search is eventually an existential and philosophical problem. How do you know if you have found what you are searching for? How do you know how long you can afford to keep searching if you don't find it? An LLM lacks even the intelligence of a cat or a mouse if you stop treating the intelligence of the human using it as its intelligence.

To that I add this:

Every single LLM user is a hyperintelligent ultraproductive centaur if I understand correctly, so how is it possible that I, as a made-of-meat individual, am kicking the ass of several whole world-class teams of these LLM-using centaur-y juggernauts? It shouldn't be possible, right?

But I'm human, so it is

sylware

Is claude able to write rv64 assembly code?

For instance can you asked for a vector-based quicksort? Well, with a "vector size unit" of a "standard" cache line, namely 512bits/64bytes (rv22+ profile).

Veen

You could just ask it:

https://claude.ai/public/artifacts/5f4cb680-9a99-4781-8803-9...

(No idea how good that is. I just gave it your comment)

tomashubelbauer

And if you use Claude Code you can also tell it to compile it and test it and it will keep fixing problems until it gets it right or gives up or spirals to a dead end.

sylware

You can give a set of test cases?

sylware

I use noscript/basic (x)html browsers: I get only a 'enable javascript' thingy. Is there a clean web portal for this? Or could you pastebin that stuff on a decently implemented online service like https://paste.c-net.org ? Thx!

mavhc

https://github.com/simonw/llm