AI Is Like a Crappy Consultant
76 comments
May 13, 2025
phillipcarter
Some of the phenomena described in this post are strongly felt when using AI.
My own anecdote, from a codebase I'm familiar with: as the article mentions, it's indeed a terrible architect. The problem I was solving ultimately called for a different data structure, but it never had that realization, instead trying to fit the problem's shape into an existing, suboptimal way of representing the data.
When I mentioned that this part of the code was memory-sensitive, it indeed wrote good code! ...for the bad data structure. It even included some nice tests that I decided to keep, including memory benchmarks. But the code was ultimately really bad for the problem.
This is related to the sycophancy problem. AI coding assistants are biased toward assuming the code they're working with is correct, and that the person using them is also correct. But often neither is ideal! And you can absolutely have a model second-guess your own code and assumptions, but it takes a lot of persistent work, because these damn things just want to be "helpful" all the time.
I say all of this as a believer in this paradigm and one who uses these tools every day.
marcosdumay
> This is related to the sycophancy problem.
No, this is way more fundamental than sycophancy. It's related to the difficulty older AI had understanding "no".
Unless it has seen people recommending that you change your code into a different version, it has no way to understand that the better code is equivalent.
sdoering
Would it help to prompt accordingly (and adapt the system prompts for the coding assistants)? Something like:
> Do not assume the person writing the code knows what they are doing. Also do not assume the code base follows best practices or sensible defaults. Always check for better solutions/optimizations where it makes sense and check the validity of data structures.
Just a quick draft, and it would probably need waaaaaay more refinement. But might this at least help mitigate the felt issue a bit?
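A minimal sketch of wiring such a prompt into a request, assuming the OpenAI Python SDK (the model name and the SKEPTICAL_REVIEWER constant are illustrative, not any assistant's actual defaults):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # The draft prompt from above, as a reusable system message.
    SKEPTICAL_REVIEWER = (
        "Do not assume the person writing the code knows what they are doing. "
        "Do not assume the code base follows best practices or sensible defaults. "
        "Always check for better solutions/optimizations where it makes sense, "
        "and check the validity of data structures."
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # any chat-capable model works here
        messages=[
            {"role": "system", "content": SKEPTICAL_REVIEWER},
            {"role": "user", "content": "Review this function: ..."},
        ],
    )
    print(response.choices[0].message.content)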
I always think of AI as an overeager junior dev. So I tend to treat it that way when giving instructions, but even then...
... well, let's say the results are sometimes interesting.
phillipcarter
Yeah, that's what I do now -- and some coworkers have noted that it can often help with biasing towards design system components if you prompt it to do that -- but the challenge here is that the level of pushback I want from the AI depends on several factors that aren't encodable into rules. Sometimes the area of the code is exactly the way it should be, and sometimes I know exactly what to do! Or it's somewhere in between. And it's a lot of work to maintain a set of rules that plays well here.
SAI_Peregrinus
A central issue is that specifying what you require is difficult. It's hard for non-programmers to specify what they want to programmers, it's hard for people (programmers or not) to specify what they want to AIs, it's hard to specify exactly what requirements your system has to a model checker, etc. Specifying requirements isn't always the hardest part of making software, but it often is and it's not something with purely technical solutions.
layer8
Yes. The biggest issue with LLMs is their tunnel vision and general lack of awareness. They lack the ability to go meta, or to "take a step back" on their own, which given their construction isn't surprising. Adjusting the prompts is only a hack and doesn't solve the fundamental issue.
franktankbank
They are designed for executives. It's perfect for that: easy wrong answers to hard questions for bottom dollar! Get that bonus and bounce; how could it fail? /s
realbenpope
In six months AI has gone from an idiot savant intern to a crappy consultant. I'd call that progress.
unyttigfjelltol
It's never been completely safe to just do things you found on the Internet. Attaching another Rube Goldberg machine to the front doesn't fundamentally change that.
AI accelerates complex search 10x or maybe 100x, but it will still occasionally respond to a recipe request by telling you to substitute in some antimatter for extra calories.
bayindirh
> but still will occasionally respond to recipe requests by telling you to just substitute some anti-matter for extra calories.
or emit (or spew) pages of training data or raw output when you ask it to "please change all headers to green" - which I experienced recently.
echelon
AI is a million times better than Google search. I don't see how it doesn't replace Google search in a few years.
AI code completion is god mode. While I seldom prompt for new code, AI autocompletion during refactoring is 1000x faster than plumbing fields manually. I can do extremely complicated and big refactors with ease, and that's coming from someone who made heavy use of static typing, IDEs, and AST-based refactoring. It's legitimately faster than thought.
And finally, it's really nice to ask about new APIs, or to pose questions whose answers you would normally pore over docs for, or Google and find on Stack Overflow. It's so much better and faster.
We're watching the world change in the biggest way since smartphones and the internet.
AI isn't a crappy consultant. It's an expansion of the mind.
bayindirh
AI is just a weighted graph standing on the shoulders of a million giants. However, it can't cite, can't fact-check, doesn't know when it's hallucinating, and its creators don't respect any of the work they need to feed into that graph to let it fake its all-knowing accent.
The tech is useful; how it's built is very unethical, and how it's worshiped is sad.
danielbln
Modern LLM offerings can use tools, including search, and that (like most good RAG) enables citation and fact-checking. If you use LLMs like it's late 2022 and you just opened ChatGPT, then that's not indicative of how you should be using LLMs today.
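To make that concrete, here's a minimal sketch of tool-backed search, assuming the OpenAI Python SDK; the web_search function is a hypothetical stand-in for a real search API, and the URLs it returns are what the model can then cite:

    import json
    from openai import OpenAI

    client = OpenAI()

    # Hypothetical search backend; swap in any real search API.
    def web_search(query: str) -> list[dict]:
        return [{"title": "...", "url": "https://example.com", "snippet": "..."}]

    tools = [{
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web; return results with URLs to cite.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }]

    messages = [{"role": "user",
                 "content": "What's the latest stable PostgreSQL release? Cite sources."}]
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    msg = resp.choices[0].message

    if msg.tool_calls:  # the model chose to search rather than answer from memory
        messages.append(msg)
        for call in msg.tool_calls:
            results = web_search(**json.loads(call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": json.dumps(results)})
        final = client.chat.completions.create(model="gpt-4o", messages=messages)
        print(final.choices[0].message.content)  # answer grounded in the returned URLs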
jaoane
I think you are a bit outdated, since state of the art AIs can cite and fact check just fine.
ninetyninenine
You’re still in denial and possibly behind. AI cites stuff all the time and has become agentic.
At the opposite end of the spectrum from the worshippers are the naysayers and deniers. It's easy to see why there are delusional people at both ends.
The reason is that the promise of AI heralds both an amazing future of machines and a horrible future where machines surpass humanity.
skydhash
> AI code autocompletion during refactoring is 1000x faster than plumbing fields manually. I can do extremely complicated and big refactors with ease, and that's coming from someone who made big use of static typing, IDEs, and AST-based refactoring. It's legitimately faster than thought.
Unless you know Vim!
bayindirh
> Unless you know Vim!
or the IDE (or text editor, for that matter) well. People don't want to spend time understanding, appreciating, and learning the tools they use, and then call them useless...
codechicago277
This is true. Just like a crappy consultant, AI lets you offload the repetitive, monotonous work so that you can focus your time on the big architectural problems. Of course you can write a better function if you spend a lot of time on it, but there's magic in just letting the AI write the off-the-shelf version and moving on.
skydhash
Where is this repetitive, monotonous work, so I can send a job application there?
Even on a greenfield project, I rarely spend more than a day setting up the scaffolding and that’s for something I’ve not touched before. And for refactoring and tests, this is where Vim/Emacs comes in.
ninetyninenine
What humanity has achieved here is incredible. We couldn’t even build an idiot for decades.
What you’re referring to is popular opinion. AI has become so pervasive in our lives that we are used to it and the magnitude of achievement has been lost on us. The fact that it went from stochastic parrot to idiot savant to crappy consultant is from people in denial about reality and then slowly coming to terms with it.
In the beginning literally everyone on HN called it a stochastic parrot with the authority of an expert. Clearly they were all wrong.
SketchySeaBeast
Oh, it's still a stochastic parrot. What changed is that people realized it didn't have the authority of an expert. What's a stochastic parrot with dubious authority? It's a crappy consultant.
bee_rider
Were they wrong to call it a stochastic parrot, or was there some wrong implication about the usefulness of such a parrot?
biophysboy
I use LLMs regularly, but like a crappy consultant, their solutions are often not incisive enough. The answer I get is frequently 10x longer than I actually want. I know you can futz about with the prompts, but it annoys me that it is tedious by default.
amarcheschi
With Gemini, even if I implore it not to add extra safety checks, I usually get a shitton of superfluous code performing the checks I didn't want. More often than not, using it for entire chunks makes the whole thing much more verbose than necessary - sometimes these checks make sense, but often they're really superfluous and add nothing of value.
Zambyte
Interesting! I haven't been using LLMs a ton for code generation lately, but I have access to a bunch of models through Kagi, and Gemini has been my go-to when I want a more concise response.
amarcheschi
I don't know why, though; it's quite annoying, but not so annoying that I feel I need to switch. Given that I'm just following a uni course where the code won't be read again - except by colleagues in my group - I leave the safety slop in and put the burden of skipping 70% of the code on the shoulders of the colleagues who will read it.
Then they put my code into ChatGPT or whatever they use and ask it to adapt it to their code.
After a while we (almost) all realized that was just producing a huge clusterfuck.
BTW, I think it would have been much better to start from scratch with their own implementations, given we're analyzing different datasets, and it might not make sense to try to convert code for one dataset structure to another. A colleague didn't manage to draw a heatmap with my code and a simple CSV, for God knows what reason. And I think asking for a plot from scratch from a CSV would be quite easy for an LLM.
bulatb
Many companies are perfectly ok with crappy results from a human consultant. Getting those results for fractions of a cent per token? They're on that like flies on a...crappy consultant.
dgb23
There are tricks one can use to mitigate some of the pitfalls when using either a conversational LLM or a code assistant.
They emerge from the simple assumptions that:
- LLMs fundamentally pattern match bytes. It's stored bytes + user query = generated bytes.
- We have common biases and instinctively use heuristics. And we are aware of some of them. Like confirmation bias or anthropomorphism.
Some tricks:
1. Ask for alternate solutions or let them reword their answers. Make them generate lists of options.
2. When getting an answer that seems right, query for a counterexample or ask it to make the opposite case. This can sometimes help one remember that we're really just dealing with clever text generation. In other cases it can create tension (I need to research this more deeply or ask an actual expert). Sometimes it will solidify one of the two answers. (Tricks 1 and 2 are sketched in code after this list.)
3. Write in a consistent and simple style when using code assistants. They are the most productive and reliable when used as super-auto-complete. They only see the bytes, they can't reason about what you're trying to achieve and they certainly can't read your mind.
4. Let them summarize previous conversations or a code module from time to time. Correct them and add direction whenever they are "off", either with prompts or by adding comments. They simply needed more bytes to look at to produce the right ones at the end.
5. Try to get wrong solutions. Make them fail from time to time, or ask too much of them. This develops an intuition for when these tools work well and when they don't.
6. This is the most important and reflected in the article: Never ask them to make decisions, for the simple fact that they can't do it. They are fundamentally about _generating information_. Prompt them to provide information in the form of text and code so you can make the decisions. Always use them with this mindset.
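A minimal sketch of tricks 1 and 2 as plain prompts, assuming the OpenAI Python SDK; the ask helper and the example question are illustrative, not a prescribed interface:

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        """One-shot question to a chat model; swap in whatever interface you use."""
        resp = client.chat.completions.create(
            model="gpt-4o", messages=[{"role": "user", "content": prompt}])
        return resp.choices[0].message.content

    question = "Should this service use a message queue or direct RPC?"

    # Trick 1: force a list of options instead of one confident answer.
    options = ask(question + "\nGive three distinct options, with trade-offs for each.")

    # Trick 2: ask for the opposite case to surface what the answer glossed over.
    counter = ask("Here is a recommendation:\n" + options +
                  "\nNow argue the opposite case and give a concrete counterexample.")

    # Per trick 6: you compare the two outputs and make the decision yourself;
    # the model only generates text for you to evaluate.
    print(options, counter, sep="\n---\n")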
_fat_santa
I think there's something poetic about the fact that you can go on some AI prompt subreddits and see folks making posts about turning ChatGPT into a "super business consultant", and then come over here to read about how it's actually pretty bad at that.
But back on point: I find AI works best when given a full set of guardrails around what it should do. The other day I put it to work generating copy for my website. Typically it will go off the deep end if you try to make it generate entire paragraphs, but for small pieces of text (I'd say up to 3 sentences) it does surprisingly well, and because it's outputting such small amounts of text you can quickly make edits to remove places where it made a bad word choice or didn't describe something quite right.
But I would say I only got ChatGPT to do this after uploading 3-4 large documents that outline my product in excruciating detail.
As for coding tasks, again it works great when given max guardrails. I had several pages that pulled strings from an object, and I wanted those strings put back inline in the code and taken out of the object. The object has ~500 lines in it, so it would have taken all day, but I ended up doing it in about an hour by having AI do most of the work and just going in after the fact and verifying. This worked really well, but I would caution folks that this was a very, very specific use case. I've tried vibe coding once for shits and giggles, and I got annoyed and stopped after about 10 minutes. IMHO, if you're a developer at the "Senior" level, dealing with AI output is more cumbersome than just writing the damn code yourself.
pizzafeelsright
As a once crappy consultant I would say no.
Instant answers, correct or not.
Cheaper per answer by magnitudes.
Solutions provided with extensive documentation.
malfist
> Solutions provided with extensive documentation.
Solutions provided with extensive _made up_ documentation.
bingemaker
At the moment, I use Windsurf to explain to me how a feature is written and how to do 3rd-party integrations. I ask for the approach and write the code myself. Letting AI write the code has become very unproductive over time.
I'm still learning though
ramesh31
>I ask for the approach and write the code myself. Letting AI write the code has become very unproductive over time.
Ask it to write out the approach in a series of extensive markdown files that you will use to guide the build-out. Tell it to use checklists. Once you're happy with the full proposal, use @file mentions to keep the files in context as you prompt it through the steps. Works wonders.
esafak
I find that if you talk about architecture, it can give excellent advice. It can also refactor in accordance with your existing architecture. If you do not bring up architecture, I suppose it could use a bad one, though I have not had that issue, since I always mention the architecture when I ask it to implement a new feature - which is not "vibe coding". But then, why should I vibe code?
Another conclusion is that we could benefit from benchmarks for architectural quality.
skydhash
Architecture is best done on paper, or a whiteboard if you have contributors. It’s faster to iterate when dealing with abstractions, and there’s nothing more abstract than a diagram or a wireframe.
Once you’ve got a general gist of a solution, you can try coding it. Coding with no plan is generally a recipe for disaster (aka can you answer “what am I trying to do?” clearly)
benoau
AI is like a crappy consultant who doesn't care how many times you reject their code and will get it right if you feed them enough information.
The amount of time I save just by not having to write tests or jsdocs anymore is amazing. Refactoring is amazing.
And that's just the code - I also use AI for video production, 3d model production, generating art and more.
yannyu
Similar musings by speculative/science fiction author Ted Chiang: Will A.I. Become the New McKinsey? – https://www.newyorker.com/science/annals-of-artificial-intel...
Honestly I've already had to work with crappier consultants.
Also, there's a lot of value already in a crappy but fast and cheap consultant.