Prompt engineering playbook for programmers
165 comments · June 4, 2025 · DebtDeflation
dachris
Context is king.
Start out with TypeScript and ask it data science questions - it won't know its way around.
Start out with Python and ask the same question - great answers.
LLMs can't (yet) really transfer knowledge between domains; you have to prime them in the right way.
christophilus
Dunno. I was working on a side project in TypeScript, and couldn’t think of the term “linear regression”. I told the agent, “implement that thing where you have a trend line through a dot cloud”, or something similarly obtuse, and it gave me a linear regression in one shot.
I’ve also found it’s very good at wrangling simple SQL, then analyzing the results in Bun.
I’m not doing heavy data processing, but so far, it’s remarkably good.
whoknowsidont
Linear regression is a non-niche, well-understood topic that's used in many domains other than data science.
However, asking it to implement "that thing that groups data points into similar groups" needs a bit more context (I just tried it), as k-means is very much specific to machine learning.
nxobject
I see that as applying to niche platforms/languages without large public training datasets - if Rust were introduced today, the productivity differential would be so stacked against it that I'm not sure it would hypothetically survive.
0points
That's your made up magical explanation right there dude.
Every day tech broism gets closer to a UFO sect.
LPisGood
I think it’s not really a magical explanation; it’s pretty grounded in how LLMs work.
Obviously how exactly they work still isn’t fully explained, but calling basic principles magical is too far in my opinion.
lexandstuff
Even role prompting is totally useless imo. Maybe it was a thing with GPT3, but most of the LLMs already know they're "expert programmers". I think a lot of people are just deluding themselves with "prompt engineering".
Be clear with your requirements. Add examples, if necessary. Check the outputs (or reasoning trace if using a reasoning model). If they aren't what you want, adjust and iterate. If you still haven't got what you want after a few attempts, abandon AI and use the reasoning model in your head.
dimitri-vs
It's become more subtle, but it's still there. You can bias the model towards more "expert" responses with the right terminology. For example, a doctor asking a question will get a vastly different response than a layperson. A query with emojis will get more emojis back. Etc.
didgeoridoo
This is definitely something I’ve noticed — it’s not about naïve role-priming at all, but rather about language usage.
“You are an expert doctor, help me with this rash I have all over” will result in a fairly useless answer, but using medical shorthand — “pt presents w bilateral erythema, need diff dx” — gets you exactly what you’re looking for.
easyThrowaway
I get the best results with Claude by treating the prompt like a pseudo-SQL language, with words like "consider" or "think deeply" acting as keywords in a programming language. I also make use of their XML tags[1] to structure my requests.
I wouldn't be surprised if, a few years from now, some sort of formalized programming language for "gencoding" AI emerges.
[1]https://docs.anthropic.com/en/docs/build-with-claude/prompt-...
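A minimal sketch of that kind of structured request, assuming the anthropic Python SDK; the tag names and the model alias are illustrative, not anything Claude requires:

```python
# Sketch of an XML-structured prompt, assuming the anthropic Python SDK.
# The tags (<context>, <task>, <constraints>) are arbitrary labels that just
# delimit sections clearly; the model name is illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt = """
<context>
We have a Postgres table orders(id, customer_id, total, created_at).
</context>

<task>
Write a query returning each customer's total spend over the last 90 days.
Think step by step before writing the final SQL.
</task>

<constraints>
Return only the SQL, nothing else.
</constraints>
"""

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)
```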
petesergeant
One thing I've had a lot of success with recently is a slight variation on role-prompting: telling the LLM that someone else wrote something, and I need their help assessing the quality of it.
When the LLM thinks _you_ wrote something, it's nice about it, and deferential. When it thinks someone else wrote it, and that you're deciding how much to pay that person and which edits to request, it becomes much more cut-throat and direct.
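A rough illustration of the reframing (the wording is mine, just to show the shift in attribution):

```python
# Two framings of the same review request -- only the attribution changes.
# The wording is illustrative, not quoted from the comment above.

deferential = """Here is a function I wrote. Can you review it?

{code}"""

cut_throat = """A contractor submitted the function below. I need to decide how
much to pay them and which revisions to request. List every problem you would
flag, ordered by severity.

{code}"""
```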
dwringer
I've noticed this affects its tendency to just make things up in other contexts, too. I asked it to take a look at "my" GitHub, gave it a link, then asked it some questions; it started talking about completely different repos and projects I'd never heard of. When I simply said take a look at `this` GitHub and gave it a link, its answers had a lot more fidelity to what was actually there (within limits, of course - it's still far from perfect). [This was with Gemini Flash 2.5 on the web.] I have had similar experiences asking it to do style transfer from an example of "my" style versus "this" style, etc. Presumably this has something to do with the idea that, in training, every text that speaks in first person is in some sense seen as being from the same person.
coolKid721
The main problem, I think, is people trying to do everything in "one prompt" - one giant request throwing all the context at it. What you said is correct, but also: instead of making one massive request, break it down into parts and use multiple prompts with smaller context, each with structured output that you feed into the next (see the sketch below).
Make prompts focused, with explicit output formats and examples, and don't overload the context. Then the three things you said, basically.
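A minimal sketch of that kind of chaining, assuming the openai Python SDK; the model name, task, and JSON fields are made up for illustration:

```python
# Sketch of chaining small, focused prompts instead of one giant one.
# Assumes the openai Python SDK; model name and JSON schema are illustrative.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str, json_mode: bool = False) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"} if json_mode else {"type": "text"},
    )
    return resp.choices[0].message.content

# Step 1: one small prompt whose only job is to produce a structured plan.
plan = json.loads(ask(
    'Break this task into independent steps. Respond as JSON: {"steps": ["..."]}. '
    'Task: add retry logic with exponential backoff to our HTTP client.',
    json_mode=True,
))

# Step 2: one focused prompt per step, each carrying only the context it needs.
results = [ask(f"Write the Python for this single step, nothing else: {step}")
           for step in plan["steps"]]
```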
denhaus
Regarding point 3, my colleagues and I studied this for a use case in science: https://doi.org/10.1038/s41467-024-45563-x
caterama
Can you provide a "so what?" summary?
melagonster
>We test three representative tasks in materials chemistry: linking dopants and host materials, cataloging metal-organic frameworks, and general composition/phase/morphology/application information extraction. Records are extracted from single sentences or entire paragraphs, and the output can be returned as simple English sentences or a more structured format such as a list of JSON objects. This approach represents a simple, accessible, and highly flexible route to obtaining large databases of structured specialized scientific knowledge extracted from research papers.
denhaus
As a clarification, we used fine tuning more than prompt engineering because low or few-shot prompt engineering did not work for our use case.
faustocarva
Did you find it hard to create structured output while also trying to make it reason in the same prompt?
demosthanos
You use a two-phase prompt for this. Have it reason through the answer and respond with a clearly-labeled 'final answer' section that contains the English description of the answer. Then run its response through again in JSON mode with a prompt to package up what the previous model said into structured form.
The second phase can be with a cheap model if you need it to be.
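A minimal sketch of that two-phase flow, assuming the openai Python SDK; the model names, task, and schema are illustrative:

```python
# Phase 1: free-form reasoning ending in a labeled FINAL ANSWER section.
# Phase 2: a cheaper model packages that answer as JSON.
# Assumes the openai Python SDK; models, task, and schema are illustrative.
import json
from openai import OpenAI

client = OpenAI()

reasoning = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content":
        "Classify the sentiment of this review. Reason it through, then end with "
        "a section titled FINAL ANSWER containing only your conclusion.\n\n"
        "Review: The battery died after two days, but support replaced it fast."}],
).choices[0].message.content

structured = client.chat.completions.create(
    model="gpt-4o-mini",  # a cheap model is fine for the packaging step
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content":
        "Extract the FINAL ANSWER from the text below as JSON: "
        '{"sentiment": "positive|negative|mixed"}\n\n' + reasoning}],
).choices[0].message.content

print(json.loads(structured))
```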
faustocarva
Great, will try this! But, in a chain-based prompt or full conversational flow?
haolez
Sometimes I get the feeling that making super long and intricate prompts reduces the cognitive performance of the model. It might give you a feel of control and proper engineering, but I'm not sure it's a net win.
My usage has converged to making very simple and minimalistic prompts and doing minor adjustments after a few iterations.
taosx
That's exactly how I started using them as well. 1. Give it just enough context, the assumptions that hold, and the goal. 2. Review the answer and iterate on the initial prompt. It's also the economical way to use them. I've been burned one too many times by using agents (they just spin and spin, burn 30 dollars for one prompt, and either mess up the code base or converge on the previous code written).
I also feel the need to caution others: letting the AI write lots of code in your project makes it harder to advance it, evolve it, and just move on with confidence (code you didn't think through and write yourself doesn't stick as well in your memory).
apwell23
> they just spin and spin, burn 30 dollars for one prompt, and either mess up the code base or converge on the previous code written
My experience as well. I fear admitting this for fear of being labeled a Luddite.
scarface_74
How is that different than code I wrote a year ago or when I have to modify someone else’s code?
conception
I'd have to hunt for it, but there is evidence that using the vocabulary of an expert versus a layman will produce better results. Which makes sense: places where people talk "normally" are more likely to contain incorrect information, whereas places where people speak in the professional vernacular are more likely to be correct, and training will associate each register with its space.
ijk
At their heart, these are still just document-completion machines. Very clever ones, but still inherently trying to find a continuation that matches the part that came before.
heisenzombie
This seems right to me. I often ask questions in two phases to take advantage of this: (1) ask how a professional in the field would phrase this question, then (2) paste that rephrased question into a new chat.
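A minimal sketch of that rephrase-then-ask pattern, assuming the openai Python SDK; the model name and question are illustrative:

```python
# Step 1: ask how an expert would phrase the question.
# Step 2: ask the rephrased question in a fresh conversation (no shared history).
# Assumes the openai Python SDK; model name and question are illustrative.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content

rephrased = ask("How would a statistician phrase this question? Reply with the "
                "question only: why does my trend line change so much when I "
                "remove a few points?")

answer = ask(rephrased)  # separate call, so nothing from the first exchange leaks in
print(answer)
```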
tgv
For another kind of task, a colleague had written a very verbose prompt. Since I had to integrate it, I added some CRUD ops for prompts. For a test, I made a very short one, something like "analyze this as a <profession>". The output was pretty much comparable, except that the output on the longer prompt contained (quite a few) references to literal parts of that prompt. It wasn't incoherent, but it was as if that model (gemini 2.5, btw) has a basic response for the task it extracts from the prompt, and merges the superfluous bits in. It would seem that, at least for this particular task, the model cannot (easily) be made to "think" differently.
nico
That’s also been my experience
At the same time, I’ve seen the system prompts for a few agents (https://github.com/x1xhlol/system-prompts-and-models-of-ai-t...), and they are huge
How does that work?
sagarpatil
That has been my conclusion too, but then how do you explain the long-ass prompts from the AI labs themselves: https://docs.anthropic.com/en/release-notes/system-prompts#m...
haolez
Well, your prompt gets added on top of that baseline. The logic still applies.
pjm331
Yeah, I had this experience today: I had been running code review with a big, detailed prompt in CLAUDE.md, but then I ran it in a branch that didn't have that file yet and got better results.
dwringer
I would simplify this as "irrelevant context is worse than no context", but it doesn't mean a long prompt of relevant context is bad.
bsoles
There is no such thing as "prompt engineering". Since when did the ability to write proper and meaningful sentences become engineering?
This is even worse than "software engineering". The unfortunate thing is that there will probably be job postings for such things, and people will call themselves prompt engineers for their extraordinary ability to write sentences.
NitpickLawyer
> Since when the ability to write proper and meaningful sentences became engineering?
Since what's proper and meaningful depends on a lot of variables. Testing these, keeping track of them, logging and versioning take it from "vibe prompting" to "prompt engineering" IMO.
There are plenty of papers detailing this work. Some things work better than others (phrasing instructions as "do this" works better than "don't do this" - the pink-elephant problem). Structure is important. Style is important. Order of information is important. Re-stating the problem is important.
Then there are quirks within model families. If you're running an API-served model, you need internal checks to make sure a new version still behaves well on your prompts. These checks and tests are "prompt engineering" (a sketch follows below).
I feel a lot of people take the knee-jerk reaction to the hype and miss critical aspects because they want to dunk on the hype.
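A minimal sketch of what such a check might look like as a pytest-style regression test; the prompt, model name, and assertions are illustrative:

```python
# Sketch of a prompt regression test: if a model upgrade breaks the prompt's
# output contract, this fails in CI rather than in production.
# Assumes pytest and the openai Python SDK; prompt and assertions are illustrative.
import json
from openai import OpenAI

client = OpenAI()

PROMPT = ('Extract the currency and amount from this sentence as JSON: '
          '{"currency": "...", "amount": 0}. '
          'Sentence: "The invoice totals 1,250 EUR."')

def test_invoice_extraction_prompt():
    out = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[{"role": "user", "content": PROMPT}],
    ).choices[0].message.content
    data = json.loads(out)
    assert data["currency"] == "EUR"
    assert float(str(data["amount"]).replace(",", "")) == 1250.0
```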
gwervc
It's still very, very far from engineering. Like, how long and how much does one have to study to get an engineering degree? Five years, across many disciplines.
On the other hand, prompt tweaking can be learned in a few days just by experimenting.
NitpickLawyer
>the branch of science and technology concerned with the design, building, and use of engines, machines, and structures.
Not this one.
> (alt) a field of study or activity concerned with modification or development in a particular area. "software engineering"
This one ^^^
Too many people seem really triggered by this. I don't know why, but it's weird. It's just a term, and it's well understood by now. The first 5 pages on Google all state the same thing. Why bicker about something so trivial?
apwell23
> Some things work better than others
That could be said about ordering coffee at local coffee shop. Is there a "barista order engineering" we are all supposed to read?
> Re-stating the problem is important.
Maybe you can show us some examples?
liampulles
If my local barista were to start calling themselves a coffee engineer, I would treat that as a more credible title.
hansmayer
Yeah, if this catches on, we may well see the title "engineer" go the way of "manager" and "VP" over the last decades... So we may start seeing coffee engineers now :D
gwervc
Mixologist has already replaced bartender.
SchemaLoad
AI sloperators are desperate to make it look like they are actually doing something.
zelias
Since modern algorithmically-driven brainrot degraded the average consumer's ability to read a complete sentence, let alone write one.
sach1
I agree with yowlingcat's point but I see where you are coming from and also agree with you.
The way I see it, it's a bit like putting up a job posting for 'somebody who knows SSH'. While that is a useful skill, it's really not something you can specialize in since it's just a subset within linux/unix/network administration, if that makes sense.
mkfs
> The unfortunate thing is that there will probably be job postings for such things
I don't think you have to worry about that.
mseepgood
You don't even have to write proper sentences. "me get error X how fix here code:" usually works.
bicepjai
I would argue code is a meaningful sentence. So "software writer" is more appropriate, then? :)
heisenburgzero
In my own experience, if the problem is not solvable by an LLM, no amount of prompt "engineering" will really help. The only way forward is to partially solve it yourself (breaking it down into sub-tasks / examples) and let the model run its miles.
I'd love to be wrong though. Please share if anyone has a different experience.
TheCowboy
I think part of the skill in using LLMs is getting a sense for how to effectively break problems down, and also getting a sense of when and when not to do it. The article also mentions this.
I think we'll also see ways of restructuring, organizing, and commenting code to improve interaction with LLMs. I'd also expect LLMs to get better at doing this themselves, and maybe at suggesting ways for programmers to break down the problems they're struggling with.
stets
I think the intent of prompt engineering is to get better solutions quicker, in formats you want. But yeah, ideally the model just "knows" and you don't have to engineer your question
ColinEberhardt
There are so many prompting guides at the moment. Personally I think they are quite unnecessary. If you take the time to use these tools, build familiarity with them and the way they work, the prompt you should use becomes quite obvious.
Disposal8433
It reminds me of the hype and FOMO when Google became popular. Books were being written on the subject, and you had to buy them or you'd become a caveman in the near future. What happened is that anyone could learn the whole thing in a day, and that was it - no need to debate whether you'd miss anything if you didn't know all those tools.
verbify
I certainly have better Google fu than some relatives who are always asking me to find something online.
Timwi
I love the term “Google fu”. We should call it prompt fu or LLM fu instead of “prompt engineering”.
wiseowise
You’re only proving the opposite: there’s definitely a difference between “experienced Google user” and someone who just puts random words and expects to find what they need.
marliechiller
Is there? I feel like google has optimised heavily for the caveman input rather than the enlightened search warrior nowadays
sokoloff
I think there are people for whom reading a prompt guide (or watching an experienced user) will be very valuable.
Many people just won't put any conscious thought into trying to get better on their own, though some of them will read or watch one thing on the topic. I will readily admit to picking up several useful tips from watching other people use these tools and from discussing them with peers. That's improvement I don't think I'd achieve by using the tools solely on my own.
awb
Many years ago there were guides on how to write user stories: “As a [role], I want to be able to do [task] so I can achieve [objective]”, because it was useful to teach high-level thinkers how to communicate requirements with less ambiguity.
It may seem simple, but in my experience even brilliant developers can miss or misinterpret unstructured requirements, through no fault of their own.
TheCowboy
It's at least useful for seeing how other people are being productive with these tools. I also sometimes find a clever idea that improves what I'm already doing.
It documents the current state of this space as well. It's easy to have tried something a year ago and to assume these tools are still bad at it.
I also usually prefer researching an area before reinventing the wheel by trial and error myself. I appreciate it when people share what they've discovered on their own time, as I don't always have all the time in the world to explore it the way I would if I were still a teen.
baby
There are definitely tricks that are not obvious. For example, it seems like you should delete all politeness (e.g. "please").
orochimaaru
A long time back, for my MS in CS, I took a science-of-programming course. The approach to verification it taught has helped me craft prompts when I do data engineering work. Basically:
Given input (…) and preconditions (…), write me Spark code that gives me postconditions (…). If you can formally specify the input, preconditions, and postconditions, you usually get good working code.
1. The Science of Programming, David Gries. 2. Verification of concurrent and sequential systems.
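A rough template in the spirit of that recipe; the wording and the example task are mine, not the commenter's:

```python
# Sketch of a precondition/postcondition prompt template for Spark work.
# The table, columns, and task are illustrative.
SPARK_PROMPT = """
Input: a DataFrame `events` with columns (user_id: string, ts: timestamp, amount: double).
Preconditions: `ts` is never null; `amount` may be null and should be treated as 0.
Task: write PySpark code that produces a DataFrame `daily_totals`.
Postconditions: exactly one row per (user_id, date(ts)) pair; column `total` is the
sum of `amount` for that user and day; no nulls in any output column.
"""
```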
yuvadam
Seems like so much over (prompt) engineering.
I get by just fine with pasting raw code or errors and asking plain questions, the models are smart enough to figure it out themselves.
leshow
using the term "engineering" for writing a prompt feels very unserious
vunderba
I came across a pretty amusing analogy back when prompt "engineering" was all the rage a few years ago.
> Calling someone a prompt engineer is like calling the guy who works at Subway an artist because his shirt says ‘Sandwich Artist.’
All jokes aside, I wouldn't get too hung up on the title; the term "engineer" has long since been diluted to the point of meaninglessness.
theanonymousone
Why would I have a problem calling that guy a sandwich engineer?
guappa
It's cute that you think that being a sound engineer is something you can pick up in a few minutes, while it requires knowledge of acoustics, electronics, music theory and human perception.
wiseowise
Because you'll hurt op's huge ego. God forbid you put the godly title of ENGINEER near something as trivial as a sandwich.
guappa
Well in USA they have "sales engineers", which in my experience are people who have no clue how the thing they're supposed to sell works.
ndriscoll
I went into software instead, but IIRC sales and QA engineers were common jobs I heard about for people in my actual accredited (optical) engineering program. A quick search suggests it is common for sales engineers to have engineering degrees? Is this specifically about software (where "software engineers" frequently don't have engineering degrees either)?
guappa
In my (software) organisation, sales engineers were not aware that after entering a command in a Linux terminal you must press Enter for it to run.
They were also unaware that if you create a filename with spaces, you must escape/quote it for it to work.
They requested this important information to be included in the user manual (the users being sysadmins at very large companies).
dwringer
Isn't this basically the same argument that comes up all the time about software engineering in general?
leshow
I have a degree in software engineering and I'm still critical of its inclusion as an engineering discipline, given the level of rigour applied to typical software development.
When it comes to "prompt engineering", the argument is even less compelling. It's like saying typing in a search query is engineering.
klntsky
Googling was a required skill pre-LLMs. And prompting is not just for search if you build LLM pipelines: costs can commonly be cut 2x if you know what you are doing.
ozim
Because your imagination stopped at a chat interface asking for funny cat pictures.
There are prompts to be used with an API and inside automated workflows, and more to it than that.
kovac
IT is where words and their meanings come to die. I wonder if words ever needed to mean something :p
theanonymousone
I understand your point, but don't we already have e.g. AWS engineers? Or I believe SAP/Tableau/.. engineers?
liampulles
Absolutely. It's not appropriate to describe developers in general either. That fight has been lost I think and that's all the more reason to push against this nonsense now.
morkalork
For real. Editing prompts bears no resemblance to engineering at all; there is no accuracy or precision. Say you have a benchmark to test against and you're trying to make an improvement. Will your change to the prompt make the benchmark go up? Down? Why? Can you predict? No. It is not a science at all. It's just throwing shit and examples at the wall on hopes and prayers.
yawnxyz
updating benchmarks and evals is something closer to test engineering / qa engineering work though
echelon
> Will your change to the prompt make the benchmark go up? Down? Why? Can you predict? No, it is not a science at all.
Many prompt engineers do measure and quantitatively compare.
morkalork
Me too, but it's after the fact. I make a change, then measure; if it doesn't improve, I roll back. But it's as good as witchcraft or alchemy. Will I get gold with this adjustment? Nope, still lead. On to variation #243.
Kiyo-Lynn
At first I kept thinking the model just wasn't good enough - it just couldn't give me what I wanted. But over time, I realized the real problem was that I hadn't figured out what I wanted in the first place. I have to make my own thinking clear first, then let the AI help organize it. The more specific and patient I am, the better it responds.
jwr
I find the name "prompt engineering" so annoying. There is no engineering in throwing something at the wall and seeing if it sticks. There are no laws or rules that one can learn. It's not science, and it is certainly not engineering.
dimitri-vs
It's really just technical writing. The majority of tricks from the GPT-4 era are obsolete with reasoning models.
jorge_cab
I don't think it should be the role of developers to write "good prompts". Ideally, an intermediate layer should optimize the information passed to the LLM.
That's what agentic IDEs are starting to do. I don't copy-paste code in just the right way to optimize my prompt; I select the code I want. With MCPs picking up, you might not even have to paste input/output at all - the agent can run the code and pass the results to the LLM in an optimal way.
Of course, the quality of your instructions matters, but I think that falls outside of "prompt engineering".
air7
A few days ago Sergey Brin said "We don't circulate this too much in the AI community – not just our models but all models – tend to do better if you threaten them … with physical violence"
-- https://www.theregister.com/2025/05/28/google_brin_suggests_...
layman51
This reminds me of that funny detail in a YouTube video by “Programmers are also human” on professional vibe coders where he keeps ending his orders to the LLM with “.. or you go to jail.”
xigency
So that's why they dropped "Don't Be Evil."
In my experience there are really only three true prompt engineering techniques:
- In Context Learning (providing examples, AKA one shot or few shot vs zero shot)
- Chain of Thought (telling it to think step by step)
- Structured output (telling it to produce output in a specified format like JSON)
Maybe you could add what this article calls Role Prompting to that. And RAG is its own thing, where you're basically just having the model summarize the context you provide. But really, everything else boils down to telling it what you want in clear, plain language.
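A minimal sketch combining the three techniques above (few-shot examples, a think-step-by-step instruction, and a required JSON format), assuming the openai Python SDK; the task, examples, and schema are illustrative:

```python
# Few-shot examples + chain-of-thought + structured JSON output in one prompt.
# Assumes the openai Python SDK; task, examples, and schema are illustrative.
from openai import OpenAI

client = OpenAI()

prompt = """Classify the bug report's severity. Think step by step in a "reasoning"
field, then answer as JSON: {"reasoning": "...", "severity": "low|medium|high"}.

Example: "Typo in the footer" -> {"reasoning": "cosmetic only", "severity": "low"}
Example: "Checkout crashes for all users" -> {"reasoning": "blocks revenue", "severity": "high"}

Bug report: "Search results take 30 seconds to load on mobile."
"""

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```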