
I use Cursor daily - here's how I avoid the garbage parts

walthamstow

Eng leadership at my place are pushing Cursor pretty hard. It's great for banging out small tickets and improving the product incrementally kaizen-style, but it falls down with anything heavy.

I think it's weakening junior engineers' reasoning and coding abilities as they become reliant on it without having lived for long, or at all, in the before times. I think it may be doing the same to me too.

Personally, and quietly, I have a major concern about the conflict of interest of Cursor deciding which files to add to context then charging you for the size of the context.

As with so many products, it's cheap to start with, you become dependent on it, then one day it's not cheap and you're fucked.

rco8786

I’ve been a paying cursor user for 4-5 months now and feeling the same. A lot more mistakes leaking into my PRs. I feel a lot faster but there’s been a noticeable decrease in the quality of my work.

Obviously I could just better review my own code, but that’s proving easier said than done to the point where I’m considering going back to vanilla Code.

DanHulton

There's this concept in aviation of "ahead of or behind the plane". When you're ahead of the plane, you understand completely what it's doing and why, and you're literally thinking in front of it, like "in 30 minutes we have to switch to this channel, confirm new heading with ATC" and so forth. When you're behind the plane, it has done something unexpected and you are literally thinking behind it, like "why did it make that noise back there, and what does that mean for us?"

I think about coding assistants like this as well. When I'm "ahead of the code," I know what I intend to write, why I'm writing it that way, etc. I have an intimate knowledge of both the problem space and the solution space I'm working in. But when I use a coding assistant, I feel like I'm "behind the code" - the same feeling I get when I'm reviewing a PR. I may understand the problem space pretty well, but I have to basically pick up the pieces of the solution presented to me, turn them over a bunch, try to identify why the solution is shaped this way, if it actually solves the problem, if it has any issues large or small, etc.

It's an entirely different way of thinking, and one where I'm a lot less confident of the actual output. It's definitely less engaging, and so I feel like I'm way less "in tune" with the solution, and so less certain that the problem is solved, completely, and without issues. And because it's less engaging, it takes more effort to work like this, and I get tired quicker, and get tempted to just give up and accept the suggestions without proper review.

I feel like these tools were built without any sort of analysis of whether they _were_ actually an improvement on the software development process as a whole. It was just assumed they must be, since they seemed to make the coding part much quicker.

ygra

That's a great analogy. For me it is a very similar feeling that I get ripped out of "problem solving mode" into "code review mode" which is often a lot more taxing for me.

It also doesn't help when reviewing such code that surprisingly complex problems are sometimes solved correctly, while surprisingly easy parts can be subtly (or very) wrong.

Breza

This is such a great analogy! Exactly how I feel when using AI tools. I have had some incredibly productive conversations about high-level design where I explain my goals and the approaches I'm considering. But then the actual code will have subtle bugs that are hard to find.

speedbird

Also very much in the spirit of "children of the magenta line" https://www.computer.org/csdl/magazine/sp/2015/05/msp2015050...

matwood

Unlike an airplane you can stop using the assistant at any time and catch up. Those who learn to leverage AI will have an advantage.

ljm

Same result - I tried it for a while out of curiosity but the improvements were a false economy: time saved in one PR is time lost to unplanned work afterwards. And it is hard to spot the mistakes because they can be quite subtle, especially if you've got it generating boilerplate or mocks in your tests.

Makes you look more efficient but it doesn't make you more effective. At best you're just taking extra time to verify the LLM didn't make shit up, often by... well, looking at the docs or the source... which is what you'd do writing hand-crafted code lol.

I'm switching back to emacs and looking at other ways I can integrate AI capabilities without losing my mental acuity.

_rwo

> And it is hard to spot the mistakes because they can be quite subtle

aw yeah; recently I spent half a day pulling my hair out debugging some cursor-generated frontend code, just to find out the issue was buried in some... obscure experimental CSS properties which broke default button behavior across all major browsers (not even making this up).

Velocity goes up because you produce _so much code so quickly_, most of which seems to be working; managers are happy, developers are happy, people picking up the slack - not so much.

I obviously use LLMs to some extent during daily work, but going full-on blind mode on autopilot is gonna crash the ship at some point.

geoduck14

Can you elaborate on the mistakes you see? What languages are you working with?

Aeolun

I feel like this is also related to cursor getting worse, not better, over the past few months.

Torkel

I saw this post from a professor this morning:

https://x.com/lxeagle17/status/1899979401460371758

Students are not asking questions anymore.

Small assignments work well!

But then the big test comes and scores are at an all time low.

HenryBemis

Yes it is exactly what we think is causing it. Students use LLMs to get 10/10 in assignments so they don't learn 'the thing' and they tank on the big-closed-books-non-LLM-exams.

BUT (apologies for the caps), back in the day we didn't have calculators and now we do. And perhaps the next phase in academia is "solve this problem with the LLM of your choice - you can only use the free versions of Llama vX.Y, ChatGPT vA.B, etc. - no paid subscriptions allowed" (in the same spirit that for some exams you can use the simple calculator and not the scientific one). Because if they don't do it, they (academia/universities) will lose/bleed out even more credibility/students/participation.

The world is changing. Some 'parts' are lagging. Academia is 5 years behind (companies paying them for projects helps though), politicians are 10-15 years behind (because the 'donors' (aka bribers) prefer a wild-wild-west for a few years before rights are protected). Case in point: writers/actors applying a lot of pressure once they realized that politicians won't do anything until cornered.

meheleventyone

Calculators replaced books of look up tables and slide rules. As tools they're not really replacing thinking. They help calculate the result but not make good decisions about what to calculate.

LLMs are replacing thinking and for students the need to even know the basics. From the perspective of an academic program if they're stopping the students learning the materials they're actively harmful.

If you're saying that LLMs obviate the need to understand the basics I think that's dangerously wrong. You still need a human in the loop capable of understanding whether the output is good or bad.

vasachi

When you need to calculate a square root of 12.345 during a physics exam, professor doesn't care that you use a calculator, because the exam doesn't test your calculating ability. But it does test your knowledge of physics. What is the point of allowing LLM use during such a test?

dr_kiszonka

> I think it's weakening junior engineers' reasoning and coding abilities as they become reliant on it without having lived for long, or at all, in the before times.

I have a feeling there will be a serious shortage of strong "old-school" seniors in a few years. If students and juniors are reliant on AI, and we recruit seniors from our juniors, who will companies turn to when AI gets stuck?

mbil

I wrote about this recently. I agree, though I think it might take a little longer. There will be a deskilling where the easy problems are solved by AI, so new grad workers won’t have gradual experience to grow into seasoned experts.

mwgalloway

I’m looking forward to when the job market recovers, but I’m not looking forward to the prospect of a significant amount of future demand being in the realm of having to scale and maintain the AI slop code that’s being generated now.

ryandrake

It sounds like a pretty lucrative retirement plan, no matter how boring and frustrating the actual work would be.

Sadly, I don't think companies are going to hire graybeards to maintain AI slop code. They're just going to release low-quality garbage, and the bar for what counts as good software will get lower and lower.

HenryBemis

I feel (I can feel it in my own skin/life) that those 'oldies' with "ok" SME skills and "great" business acumen will be the ones harnessing the hordes of 'prompt engineers'.

It is like they will be the 'new nerds' and we will have the 'street-smarts'.

arkh

> I think it's weakening junior engineers' reasoning and coding abilities as they become reliant on it without having lived for long, or at all, in the before times.

Same as StackOverflow, same as Google, same as Wikipedia for students.

The problem is not using the tools, it's what you do with the result. There will always be lazy people who will just use this result and not think anymore about it. And then there will always be people who will use those results as a springboard to what to look for in the documentation of whatever tool / language they just discovered thanks to Cursor.

You want to hire from the second category of people.

m_fayer

Quantity has a quality all its own.

That is to say, as we strive for better and better tools along a single axis, at some point the social dynamics shift markedly, even though it’s just “more of the same”. Digital distribution was just better distribution, but it changed the very nature of journalism, music, and others. Writing on computers changed writing. And the list goes on.

“This is just the next thing in a long line of things” is how we technologists escape the disquieting notion that we are running more and more wild social experiments.

addicted

> Same as StackOverflow, same as Google, same as Wikipedia for students.

Is it though?

If I used stack overflow, for example, I still needed to understand the code well enough to translate it to my specific codebase, changing variable names at the very least.

breakfastduck

This is not the same. SO, Google, etc. require actual research and introspection; prompting an AI does not.

tomrod

You're describing superficial behavior -- akin to taking the first match on Google or SO and saying that people are only doing that regarding AI prompts.

Good prompting _does_ require engagement and, for most cases, some research, just like SO or Google.

Sure, you can throw out idle or lazy queries. The results will be worse generally.

KronisLV

> Personally, and quietly, I have a major concern about the conflict of interest of Cursor deciding which files to add to context then charging you for the size of the context.

> As with so many products, it's cheap to start with, you become dependent on it, then one day it's not cheap and you're fucked.

If it gets too expensive, then I guess the alternative becomes using something like Continue.dev or Cline with one of the providers like Scaleway that you can rent GPUs from or that have managed inference… either that, or having a pair of L4 cards in a closet somewhere (or a fancy Mac, or anything else with a decent amount of memory).

Whereas if there are no well priced options anywhere (e.g. the upfront investment for a company to buy their own GPUs to run with Ollama or something else), then that just means that running LLM based systems nowadays is economically infeasible for many.

lolinder

> Personally, and quietly, I have a major concern about the conflict of interest of Cursor deciding which files to add to context then charging you for the size of the context.

Can you elaborate on what you're referring to? I don't use Cursor extensively, but I do pay for it, and it was a flat-fee annual subscription for unlimited requests, with the "fast" queue being capped at a set number of requests per month with no reference to their size.

Claude Code does work the way you say, since you provide it your Anthropic API key. But I have not seen Cursor charging for context or completion tokens, which is actually the reason why I'm paying for it instead of using something like Aider.

https://www.cursor.com/pricing

evo_9

Has anyone tried limiting Cursor or Cline, etc., to a higher-level role such as analysis and outlining proposed changes, and then coding those changes yourself with minimal LLM interaction? Aka, ask it to define/outline only a high-level set of changes but make no actual changes to any file; then proceed through the outlined work, limiting Cursor to roughing out the work and hand-writing the actual critical bits? That's the approach I've been taking, a sort of best of both worlds that greatly speeds me up without taking the hands 100% off the wheel.

vineyardmike

This seems like the worst of both worlds. The human still has to do the "boring" work of writing out all the boilerplate stuff, but now there is a machine telling the human what to do. Oh, and the machine is famously not great at big-question type stuff while being much more performant at churning out boilerplate.

tgdude

I find the opposite. I tend to think through the problem myself, give cursor/claude my understanding, guide it through a few mistakes it makes, have it leave files at 80% good enough as it codes and gets stuck, and then spend the next 20 min or so cleaning up the changes and fixing the few wire up spots it got wrong.

Often I will decompose the problem into smaller subproblems and feed those to cursor one by one slowly building up the solution. That works for big tickets.

For me the time saving and force multiplier isn't necessarily in the problem solving, I can do that faster and better in most cases, but the raw act of writing code? It does that way faster than me.

mock-possum

Yeah that's been my approach as well - and honestly I'm not even sure that it's necessarily faster, it's just different. Sometimes I feel like getting my hands dirty and writing the code myself - LLMs can be good for getting yourself unstuck when you're facing an obstacle too. But other times, I'd rather just sit back and dictate requirements and approaches, and let the robot dream up the implementation itself.

real0mar

Yeah. Reasoning models like r1 tend to be good for architecting changes and less optimal for actually writing code. So this allows the best of both worlds.

j45

Heavier architecture / systems design stuff is sometimes easier to do with a differently configured cursor, or claude.

The key I find is experience doing that kind of stuff to begin with, vs domain experience as well, vs little to none and wanting to learn the ropes.

laborcontract

Cursor's current business model produces a fundamental conflict between the well-being of the user and the financial well-being of the company. We're starting to see these cracks form as LLM providers are relying on scaling through inference-time compute.

Cursor has been trying to do things to reduce the costs of inference, especially through context pruning. For instance, if you "attach" files to a conversation, Cursor no longer stuffs the code from those files into the prompt. Instead, it'll run function calls to open those files and read bits and pieces of the code until the model feels it has enough information. This seems like a perfectly reasonable strategy until you realize you cannot do the same thing with reasoning models, if you're limiting the reasoning to just the initial prompt.

If you prune out context from the initial prompt, instead of reasoning on richer context, the LLM reasons only on the prompt itself (w/ no access to the attached files). After the thinking process, Cursor runs function calls to retrieve more context, which entirely defeats the point of "thinking" and induces the model to create incoherent plans and speculative edits in its thinking process, thus explaining Claude's bizarre over-editing behavior. I suspect this is why so many Cursor users are complaining about Claude 3.7.
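To make that concrete, here's a rough sketch of the two context strategies (definitely not Cursor's actual code - `callModel` and the `read_file` tool are made-up stand-ins):

```typescript
// Rough sketch, not Cursor's implementation. `callModel` stands in for any
// chat-completion API; `read_file` is a hypothetical tool name.
import { readFileSync } from "node:fs";

type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ModelResponse = { content: string; toolCall?: { name: "read_file"; path: string } };

async function callModel(messages: Message[]): Promise<ModelResponse> {
  throw new Error("wire up a real model provider here");
}

// Strategy A: eager context. The model (including any reasoning phase)
// sees the attached files' contents from the very first turn.
async function eagerContext(task: string, attached: string[]): Promise<string> {
  const files = attached
    .map((p) => `--- ${p} ---\n${readFileSync(p, "utf8")}`)
    .join("\n");
  const res = await callModel([
    { role: "user", content: `${task}\n\nAttached files:\n${files}` },
  ]);
  return res.content;
}

// Strategy B: lazy context. Cheaper per request, but if the model only
// "thinks" on the first turn, that thinking happens before it has read
// a single line of the attached code.
async function lazyContext(task: string): Promise<string> {
  const messages: Message[] = [{ role: "user", content: task }];
  for (let turn = 0; turn < 10; turn++) {
    const res = await callModel(messages);
    if (!res.toolCall) return res.content; // model felt it had enough info
    messages.push({ role: "assistant", content: res.content });
    messages.push({ role: "tool", content: readFileSync(res.toolCall.path, "utf8") });
  }
  return "gave up";
}
```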

On top of this, Cursor has every incentive to keep the thinking effort for both o3-mini and Claude 3.7 to the very minimum so as to reduce server load.

Cursor is being hailed as one of the greatest SaaS growth stories but their $20/mo all-you-can-eat business model puts them in such a bad place.

rafaelmn

>Cursor has been trying to do things to reduce the costs of inference, especially through context pruning. For instance, if you "attach" files to a conversation, Cursor no longer stuffs the code from those files into the prompt. Instead, it'll run function calls to open those files and read bits and pieces of the code until the model feels it has enough information. While that seems like a perfectly reasonable strategy, it starts to fall apart when integrating reasoning models.

In general I feel like this was always the reason automatic context detection could not be good in fixed fee subscription models - providers need to constrain the context to stay profitable. I also saw that things like Claude Code happily chew through your codebase, and bank account, since they are charging by token - so they have the opposite incentive.

NitpickLawyer

> This seems like a perfectly reasonable strategy until you realize you cannot do the same thing with reasoning models, if you're limiting the reasoning to just the initial prompt.

Keep in mind that what we call "reasoning" models today are the first iteration. There's no fundamental reason why you can't do what you stated. It's not done now, but it can be done.

There's nothing stopping you from running "thinking" in "chunks" of 1-2 paragraphs, doing some search, and adding more context (maybe from a pre-reasoned cache) and continuing the reasoning from there.

There's also work being done on think - summarise - think - summarise - etc. And on various "RAG"-like thinking.

Roritharr

This is only surface-level deep. Cursor already has quotas for their paid plans and usage-based pricing for their larger models, which I run into, falling over to their usage-based model every month.

Imo most of their incentive on context-pruning comes not just from reducing the token count, but from the perception that you only have to find "the right way"™ to build that context window automatically to reach coding panacea. They just aren't there yet.

laborcontract

If you’re going to pay on the margin, why not use those incremental dollars running the same requests on cline? I’m assuming cost is the deciding factor here because, quality-wise, plugging directly into provider apis with cline always does a much better job for me.

Roritharr

Good callout, will try! I haven't considered switching tools; it's mostly the convenience of just continuing instead of stopping mid-way through and switching out the tools. But also I only code intermittently, a couple of days a week at most these days, because it's only part of what I do, so I get to experiment with new tooling much less than I'd like.

IanCal

> Instead, it'll run function calls to open those files and read bits and pieces of the code until the model feels it has enough information. This seems like a perfectly reasonable strategy until you realize you cannot do the same thing with reasoning models, if you're limiting the reasoning to just the initial prompt.

There's nothing about this that conflicts with reasoning models, I'm not sure what you mean here.

laborcontract

What I mean is that their implementation (thinking only on the first response) renders zero benefit because it doesn't see the code itself. They run multiple function calls to analyze your codebase in increments. If they ran the thinking model on the output of those function calls, then performance would be great but, so far, this is not what they are doing (yet). It also dramatically increases the cost of running the same operation.

IanCal

But the way those models work is to run everything once the function calls come in. Are you saying Cursor is not using the model you selected on function call responses?

throwaway314155

This sounds like a Cursor issue, not something that affects reasoning models in general.

edit: Ah, I see what you mean now.

MrBuddyCasino

> Cursor has been trying to do things to reduce the costs of inference, especially through context pruning.

You can also use cline with gemini-2.0-flash, which supports a huge context window. Cline will send it the full context and not prune via RAG, which helps.

laborcontract

I've just tried gemini-2.0-flash, this is an incredible model that's great for making edits. I haven't tried any heavy lifting with it yet but It's replaced Claude for a lot of my edits. It's also great at agentic stuff too!

laborcontract

I love cline but i’ve never tried the gemini models with it. I’ll give it a shot tonight, thanks for the tip!

greyman

Or you can also use Gemini Code Assist extension for VS Code, which is basically free, but so far, the code it wrote almost never worked for me. So far I use only Claude 3.7 or Grok in chat mode. Almost no model, as of today, is good at coding.

MrBuddyCasino

Did Grok 3 finally get an API?

sandbach

I think you're right, but what company's business model doesn't produce a conflict between the user's well-being and the company's finances?

namaria

Reflecting on your comment I realized that using a huge amount of GPUs is akin to a Turing machine approaching infinite speed. So I think the promise of LLMs writing code is basically saying: if we add a huge number of reading/writing heads with an unbounded number of rules, we can solve decidability. Because what is the ability to generate arbitrarily complex code if not solving the halting problem? Maybe there's a more elegant or logical way to postulate this, or maybe I'm just confused or plain wrong, but it seems to me that it is impossible to generate a program that is guaranteed to terminate unless you can solve decidability. And throwing GPUs at a huge tape is just saying that the tape approaches infinite size and the Turing machine approaches infinite speed...

Or put another way, isn't the promise of software that is capable of generating any software given a natural language description in finite time basically assuming P=NP? Because unless the time can be guaranteed to be finite, throwing GPU farms and memory at this most general problem (isn't the promise of using software to generate arbitrary software the same as the promise that any possible problem can be solved in polynomial time?) is not guaranteed to solve it in finite time.

cyprx

I had been using Cursor for a month until one day my house had no internet, and I realized that I had started forgetting how to write code properly.

risyachka

I had the exact same experience, pretty sure this happens in most cases, people just don’t realize it

ant6n

Just get a Mac Studio with 512GB RAM and run a local model when the internet is down.

jjude

Which local model would you recommend that comes close to Cursor in response quality? I have tried DeepSeek, Mistral, and a few others. None comes close to the quality of Cursor. I keep coming back to it.

ant6n

Possibly useful comment on local models, perhaps also fitting on machines with less ram:

https://news.ycombinator.com/item?id=43340989

_puk

Back up your $20 a month subscription with a $2000 Mac Studio for those days your internet is down.

Peak HN.

automatic6131

Lol he suggested a $10k Mac Studio

But you can at least resell that $10k Mac Studio, theoretically.

yohannesk

Even more absurd is that Mac Studio with 512GB RAM costs around $9.5K

timothygold

Maybe this "backup" solution.. developed into commodity hardware as an affordable open source solution that keeps the model and code locally and private at all times is the actual solution we need.

Let's say a cluster of Raspberry Pis / low-powered devices producing results as good as Claude 3.7 Sonnet. Would it be completely infeasible to create a custom model that is trained on your own code base and might not be a fully fledged LLM, but provides similar features to Cursor?

Have we all gone bonkers sending our code to third parties? The code is the thing you want to keep secret unless you're working on an open source project.

rullopat

2000$? You wish!

pknerd

Can one run cursor with local LLMs only?

SCdF

... to make *completely* sure that they forgot how to program?

eadmund

But then I’d be using a Mac, and that would slow my development down and be generally miserable.

vrnvu

Me too. I completely forgot the standard library and basic syntax of my daily language. Wow. I went back to VSCode and use Cursor only to ask the AI model questions.

jillesvangurp

The UX of tools like these is largely constrained by how good they are with constructing a complete context of what you are trying to do. Micromanaging context can be frustrating.

I played with aider a few days ago. Pretty frustrating experience. It kept telling me to "add files" that are in the damn directory that I opened it in. "Add them yourself" was my response. Didn't work; it couldn't do it somehow. Probably once you dial that in, it starts working better. But I had a rough time with it creating commits with broken code, not picking up manual file changes, etc. It all felt a bit flaky and brittle. Half the problem seems to be simple cache coherence issues and me having to tell it things that it should be figuring out by itself.

The model quality seems less important than the plumbing to get the full context to the AI. And since large context windows are expensive, a lot of these tools are cutting corners all the time.

I think that's a short-term problem. Not cutting those corners is valuable enough that a logical end state is tools that don't cut them and cost a bit more. Just load the whole project. Yes, it will make every question cost $2-3 or something like that. That's expensive now, but if it drops by 20x we won't care.

Basically large models that support huge context windows of millions/tens of millions of tokens cost something like the price of a small car and use a lot of energy. That's OK. Lots of people own small cars. Because they are kind of useful. AIs that have a complete, detailed context of all your code, requirements, intentions, etc. will be able to do a much better job than one that has to guess all of that from a few lines of text. That would be useful. And valuable to a lot of people.

Nvidia is rich because they have insane margins on their GPUs. They cost a fraction of what they sell them for. That means that price will crash over time. So, I'm optimistic that a lot of this stuff will improve rapidly.

mbeex

> aider [...] It kept telling me to "add files" that are in the damn directory that I opened it in.

That's intentional, and I like it. It limits the context dynamically to what is necessary (of course it makes mistakes). You can also add files with placeholders and in a number of other ways, but most of the time I let Aider decide. It has a repomap (https://aider.chat/docs/repomap.html), gradually building up knowledge, and makes proposals based on this and other information it gathered, also with token costs and out-of-context-window limits in mind.

As for manual changes: aider is opinionated regarding the role of Git in your workflow. At first glance, this repels some people and some stick to this opinion. For others, it is exactly one of the advantages, especially in combination with the shell-like nature of the tool. But the standard Git handling can still be overridden. For me personally, the default behavior becomes more and more smooth and second nature. And the whole thing is scriptable; I am only beginning to use the possibilities.

In general: Tools have to be learned, impatient one-shot attempts are simply not enough anymore.

jampekka

> Nvidia is rich because they have insane margins on their GPUs. They cost a fraction of what they sell them for. That means that price will crash over time. So, I'm optimistic that a lot of this stuff will improve rapidly.

OTOH currently the LLM companies are probably taking a financial loss with each token. Wouldn't be surprised if the price doesn't even cover the electricity used in some cases.

Also e.g. Gemini already runs on Google's custom hardware, skipping the Nvidia tax.

_heimdall

> Nvidia is rich because they have insane margins on their GPUs. They cost a fraction of what they sell them for. That means that price will crash over time. So, I'm optimistic that a lot of this stuff will improve rapidly.

That still leaves us with an ungodly amount of resources used both to build the GPUs and to run them for a few years before having to replace them with even more GPUs.

It's pretty amazing to me how quickly the big tech companies pivoted from making promises to "go green" to buying as many GPUs as possible to burn through entire power plants' worth of electricity.

myflash13

Try Claude Code. It figures out context by itself. I've been having a lot of success with it for a few days now, whereas I never caught on with Cursor due to the context problem.

_--__--__

I have not tried Claude Code, but besides the model lock-in the number one complaint I have heard is that it consistently over provides context leading to high token usage.

myflash13

I don't care about model lock-in as long it actually gets the job done. Claude Code is the only AI solution I've tried that can actually make deep, meaningful changes across frontend and backend on a mature enterprise codebase.

2sk21

I read this point in the article with bafflement:

"Learn when a problem is best solved manually."

Sure, but how? This is like the vacuous advice for investors: buy low and sell high

dkersten

By trying things and seeing what it’s good and bad at. For example, I no longer let it make data modelling decisions (both for client local data and database schemas), because it had a habit of coding itself into holes it had trouble getting back out of, eg duplicating data that it then has difficulty keeping in sync, where a better model from the start might have been a more normalised structure.
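A toy sketch of the data-shape difference described above (types invented for illustration):

```typescript
// Denormalised: customer data is copied onto every order and has to be
// kept in sync everywhere.
interface OrderDenormalised {
  id: string;
  customerName: string; // duplicated; must be updated on every order if the customer renames
  total: number;
}

// Normalised: the customer is stored once and referenced by id.
interface Customer { id: string; name: string }
interface Order { id: string; customerId: string; total: number }

function renameCustomer(customers: Map<string, Customer>, id: string, name: string): void {
  const c = customers.get(id);
  if (c) c.name = name; // one place to update; all orders stay consistent
}
```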

But I came to this conclusion by first letting it try to do everything and observing where it fell down.

Amekedl

Compounding the opinions of other commenters, I feel that using Cursor is a bad idea. It's a closed-source SaaS, and with these components involved, service quality can do wild swings on a daily basis, not something I'm particularly keen on.

Aurornis

AI tools aren’t all or nothing. You use them when it makes sense and you go back to regular coding when they don’t (or when they’re unavailable).

You’re also not limited to a single tool. You can switch to different tools and even have multiple editors open at the same time.

turnsout

There's always Aider with local models!

rco8786

This is true of every single service provider outside of fully OSS solutions, which are a teeny tiny fraction of the world's service providers.

blainm

I've found tools like Cursor useful for prototyping and MVP development. However, as the codebase grows, they struggle. It's likely due to larger files or an increased number of them filling up the context window, leading to coherence issues. What once gave you a speed boost now starts to work against you. In such cases, manually selecting relevant files or snippets from them yields better results, but at that point it's not much different from using the web interface to something like Claude.

Semaphor

I had that same experience with Claude Code. I tried to do a 95% "Idle Development RPG" approach to developing a music release organization software. At the beginning, I was really impressed, but with more and more complexity, it becomes increasingly incoherent, forgetting about approaches and patterns used elsewhere and reinventing the wheel, often badly.

turnsout

Agreed. One useful tip is to have Cursor break up large files into smaller files. For some reason, the model doesn't do this naturally. I've had several Cursor experiments grow into 3000+ line files because it just keeps adding.

Once the codebase is reasonably structured, it's much better at picking which files it needs to read in.

blitzar

Or the context window not being large enough for all the obscure functions and files to fit. I am too basic to have dug deep enough, but a simple (automatic) documentation context for the entire project would certainly improve things for me.

kevingadd

> Like mine will keep forgetting about nullish coalescing (??) in JS, and even after I fix it up it will revert my change in its future changes. So of course I put that rule in and it won't happen again.

I'm surprised that this sort of pattern - you fix a bug and the AI undoes your fix - is common enough for the author to call it out. I would have assumed the model wouldn't be aggressively editing existing working code like that.
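For reference, what the quoted rule is about - `??` only falls back on null/undefined, which is exactly the kind of subtle fix an apply model can silently revert:

```typescript
// Nullish coalescing (??) vs logical OR (||): ?? only falls back on
// null/undefined, while || also treats "", 0 and false as "missing".
const retries = 0;

const withOr = retries || 3;      // 3 -- 0 is falsy, so the default wins
const withNullish = retries ?? 3; // 0 -- only null/undefined fall through

console.log(withOr, withNullish); // "3 0"
```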

worldsayshi

Yeah I have seen this a bunch of times as well. Especially with deprecated function calls. It generates a bunch of code. I get deprecation warnings. I fix them. Copilot fixes them back. I have to explicitly point out that I made the change for it to not reintroduce the deprecations.

I guess code that compiles is easier to train for, but code without warnings less so?

I remember there are other examples of changes that I have to tell the AI I made to not have it change it back again, but can't remember any specific examples.

Aeolun

It's due to a problem with Cursor not updating the state of files that have been manually edited since the last time they were used in the chat, so it'll think the fix is not there and blindly output code that doesn't have it. The 'apply' model is dumb, so it just overwrites the corrected version with the wrong one.

I think the changelog said they fixed it in 0.46, but that’s clearly not the case.

oefrha

Yep I asked about this exact problem the other day: https://news.ycombinator.com/item?id=43308153 Having something like “always read the current version of the file before suggesting edits” in Cursor rules doesn’t help, the current file is only read by the agent sometimes. Guess no one has a reliable solution yet.

siquick

Cursor in agent mode + Sonnet 3.7 love nothing better than rewriting half your codebase to fix one small bug in a component.

I've stopped using agent mode unless it's for a POC where I just want to test an assumption. Applying each step takes a bit more time but means less rogue behaviour and better long-term results IME.

krets

Sounds like a human colleague of mine

worldsayshi

> love nothing better than rewriting half your codebase to fix one small bug in a component

Relatable though.

MarcelOlsz

Reminds me of my old co-worker who rewrote our code to be 10x faster but 100x more unreadable. AI agent code is often the worst of both of those worlds. I'm going to give [0] this guy's strategy a shot.

[0] https://www.youtube.com/watch?v=jEhvwYkI-og

WA

If you stopped using agent mode, why use Cursor at all and not a simple plugin for VSCode? Or is there something else that Cursor can do, but a VSCode plugin can't?

DeathArrow

Apart from the fact that it chews through fast requests like there's no tomorrow, I dislike how it makes changes I didn't ask for. And if I ask it to undo what it did without being asked, it goes on and breaks more code.

In my test application I had a service which checked the cache, asked the repository if no data was in the cache, then used external APIs to fetch some data, combined it, and updated the DB and the cache.
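Roughly this shape, to illustrate (names invented; just a minimal sketch of the cache -> repository -> external API chain described above):

```typescript
// Minimal sketch of the lookup flow, with hypothetical interfaces.
interface Quote { symbol: string; price: number; asOf: number } // asOf as Unix timestamp

interface Cache { get(k: string): Promise<Quote | null>; set(k: string, v: Quote): Promise<void> }
interface Repository { find(k: string): Promise<Quote | null>; save(v: Quote): Promise<void> }
interface ExternalApi { fetchQuote(symbol: string): Promise<Quote> }

async function getQuote(symbol: string, cache: Cache, repo: Repository, api: ExternalApi): Promise<Quote> {
  // 1. Check the cache first.
  const cached = await cache.get(symbol);
  if (cached) return cached;

  // 2. Fall back to the repository (DB), warming the cache on a hit.
  const stored = await repo.find(symbol);
  if (stored) {
    await cache.set(symbol, stored);
    return stored;
  }

  // 3. Neither cache nor DB has it: hit the external API, then persist both.
  const fresh = await api.fetchQuote(symbol);
  await repo.save(fresh);
  await cache.set(symbol, fresh);
  return fresh;
}
```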

I asked Cursor to change using DateTime type to using Unix timestamp. It did the changes but it also removed cache checks and calling external APIs, so my web app relied just on the data in DB. When asked to add back what it removed, it broke functionality in other parts of the application.

And that is with a small simple app.

torginus

I have been a religious Cursor + Sonnet user for like the past half a year, and maybe I'm an idiot, but I don't like this agentic workflow at all.

What worked for me is having it generate functions and classes, ranging from tens of lines of code to the low hundreds. That way I could quickly iterate on its output and check if it's actually what I wanted.

It created a prompt-check-prompt iterative workflow where I could make progress quite fast and be reasonably certain of getting what I wanted. Sometimes it required fiddling with manually including files in the context, but that was a sacrifice I was willing to make and if I messed up, I could quickly try again.

With these agentic workflows, and thinking models I'm at a loss.

To take advantage of them, you need very long and detailed prompts, they take a long time to generate and drop huge chunks of code on your head. What it generates is usually wrong due to the combination of sloppy or ambiguous requirements by me, model weaknesses, and agent issues. So I need to take a good chunk of time to actually understand what it made, and fix it.

The iteration time is longer, I have less control over what it's doing, which means I spend many minutes of crafting elaborate prompts, reading the convoluted and large output, figuring out what's wrong with it, either fixing it by hand, or modifying my prompt, rinse and repeat.

TLDR: Agents and reasoning models generate 10x as much code, which you then have to spend 10x the time reviewing, plus 10x as much time crafting a good prompt.

In theory it would come out as a wash, in practice, it's worse since the super-productive tight AI iteration cycle is gone.

Overall I haven't found these thinking models to be that good for coding, other than the initial project setup and scaffolding.

timrichard

I think you’re absolutely right and I’ve come to the same conclusion and workflow.

I work on one file at a time in Ask mode, not Composer/Agent. Review every change, and insist on revisions for anything that seems off. Stay in control of the process, and write manually whenever it would be quicker. I won’t accept code I don’t understand, so when exploring new domains I’ll go back with as many questions as necessary to get into the details.

I think Cursor started off this way as a productivity tool for developers, but a lot of Composer/Agent features were added along the way as it became very popular with Vibe Coders. There are inherent risks with non-coders copypasting a load of code they don’t understand, so I see this use case as okay for disposable software, or perhaps UI concept prototypes. But for things that matter and need to be maintained, I think your approach is spot on.

_heimdall

Have you found that this still saves you time overall? Or do you spend a similar amount of time acting as a code reviewer rather than coding it yourself?

timrichard

Yes, I think so. Often it doesn’t take much more than a glance for simpler edits.

theshrike79

Do you have any Cursor rules defined? Those tend to control its habit of trying to go off the rails and solve 42 problems at once instead of just the one.

yard2010

How can I stop Cursor from sending .env files with secrets as plain text? Nothing I tried from the docs works.

M4v3R

This is a huge issue that was already raised on their forums and it's very surprising they didn't address it yet.

[0] https://forum.cursor.com/t/environment-secrets-and-code-secu...

timrichard

I have been adding .env files to .cursorignore so far.

I can see from that thread that the approach hasn't been perfect, but it seems that the last two releases have tried to address that:

“0.46.x : .cursorignore now blocks files from being added in chat or sent up for tab completions, in addition to ignoring them from indexing.”
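Concretely, that's just a gitignore-style entry along these lines (adjust to your layout):

```
# .cursorignore - keep secrets out of indexing, chat context, and tab completions
.env
.env.*
```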

mexicocitinluez

lol "move fast and break stuff....like really, really break stuff. i mean, break it so bad you'll probably cause people to lose their jobs and livelihoods"

rhodescolossus

I've tried Cursor a couple of times but my complaint is always the same: why fork VS Code when all this functionality could just be an extension, the same as Copilot does?

Some VSCode extensions don't work, you need to redo all your configuration, add all your workspaces... and the gain vs Copilot is not that high

andrewl-hn

> why fork VS Code when all this functionality could just be an extension, the same as Copilot does?

Have you programmed extensions for VSCode before? While it seems like a fairly extensible system overall, the editor component in particular is very restrictive. You can add text (that's what extensions like ErrorLens and GitLens are doing), inlay hints, and on-hover popup overlays (those can only trigger on words, and not on punctuation). What Cursor does - the automatic diff-like views of AI suggestions with graphic outlines, floating buttons, and whatnot right on top of the text editing view - is not possible in vanilla VSCode.

This was originally driven by the necessity of tighter control over editor performance. In its early days VSCode was competing with Atom - another extensible JS-powered editor from GitHub - and while Atom had an early lead due to a larger extensions catalog, VSCode ultimately won the race because they managed to maintain lower latency in their text editor component. Nowadays they still don't want to introduce extra extension points to it, because newer, faster editors pop up all the time, too.

satvikpendem

Now the Atom (and Electron, hence the name) creators are working on Zed, which is even faster than VSCode, albeit not as extensible.

frereubu

> and the gain vs Copilot is not that high

I think that's (at least part of) your answer. More friction to move back from an entirely separate app rather than disabling an extension.