LLMs for Engineering: Teaching Models to Design High Powered Rockets
17 comments
· April 30, 2025 · Workaccount2
heisenzombie
My experience is that SOTA LLMs still struggle to read even the metadata from a mechanical drawing. They're getting better -- they now are mostly ok at reading things like a BOM or revision table -- but moderately complicated title blocks often trip them up.
As for the drawings themselves, I have found them pretty unreliable at reading even quite simple things (e.g. what's the ID of the thru hole?), even when they're specifically dimensioned. As soon as spatial reasoning is required (e.g. there's a dimension from A to B and from A to C and one asks for the dimension B to C), they basically never get it right.
This is a place where there's a LOT of room for improvement.
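To make the B-to-C case concrete, here is a minimal sketch. It assumes A, B, and C sit on one dimension line, with B and C on the same side of datum A. The function name and the numbers are made up for illustration and do not come from any real drawing.

```python
def derived_dimension(a_to_b: float, a_to_c: float) -> float:
    """Distance from B to C when both are dimensioned from the same datum A.

    Assumption: A, B, and C are collinear and B and C lie on the same
    side of A, so the chain reduces to a simple difference.
    """
    return abs(a_to_c - a_to_b)


# Hypothetical drawing: A-to-B is 12.5 mm, A-to-C is 40.0 mm.
print(derived_dimension(12.5, 40.0))
```

The arithmetic is trivial once the two dimensions are read correctly; the hard part described above is reading them off the drawing and knowing which features they refer to.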
Terr_
I'm scared of something like the Xerox number-corruption bug [0], where some models will subtly fuck everything up in a way that is too expensive to recover from by the time it's discovered.
[0] https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres...
discordance
Mechanical drawings and schematics are visualizations for humans.
If you look at the data structure of a Gerber or DWG file, it’s vectors and metadata. These happen to be great for LLMs.
My hypothesis is that we haven’t done the work on that yet because the market is more interested in things like Ghibli imagery.
flipflipper
Try having it output the circuit in SPICE. It actually works surprisingly well: it does a good job picking out component values for parts and can describe the connectivity well. It falls apart when it writes the SPICE itself (professionally, there isn’t really one well-accepted syntax) and when making the wires to connect your components; like you say, it’s missing the mind’s eye. But I can imagine adding a ton of SPICE schematics with detailed descriptions, maybe with an LLM-optimized SPICE syntax, to the training data set… it’ll be designing and simulating circuits in no time.
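For anyone who hasn't seen what "the circuit in SPICE" looks like as text, here is a rough sketch that emits a netlist for an RC low-pass filter. The dialect is an assumption (ngspice-style element lines and a `.ac` sweep), and the function names, node names, and component values are hypothetical, picked only for illustration.

```python
import math


def rc_lowpass_netlist(r_ohms: float, c_farads: float) -> str:
    """Return an ngspice-style netlist for a single-pole RC low-pass filter.

    Connectivity is carried entirely by node names: 'in', 'out', and
    ground ('0'). No wire geometry is involved.
    """
    lines = [
        "* RC low-pass filter (illustrative)",  # first line is the title
        "V1 in 0 AC 1",                          # AC source between in and ground
        f"R1 in out {r_ohms:g}",                 # series resistor
        f"C1 out 0 {c_farads:g}",                # shunt capacitor to ground
        ".ac dec 10 1 1Meg",                     # 10 points/decade, 1 Hz to 1 MHz
        ".end",
    ]
    return "\n".join(lines) + "\n"


def cutoff_hz(r_ohms: float, c_farads: float) -> float:
    """-3 dB corner frequency of the RC filter: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


print(rc_lowpass_netlist(1000.0, 1e-7))
print(f"cutoff ~ {cutoff_hz(1000.0, 1e-7):.1f} Hz")
```

The point is that the netlist is just named nodes and values, which is why the component-picking and connectivity parts fit a text model well, while drawing the wires does not.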
tintor
Problem #1 with text-to-image models is that the focus is on producing visually attractive, photo-realistic, artistic images, which is completely orthogonal to what is needed for engineering: accurate, complete, self-consistent, and error-free diagrams.
Problem #2 is low control over outputs of text-to-image models. Models don't follow prompts well.
slicktux
Electrical schematics can be represented with linear algebra and Boolean logic… Maybe their being able to “understand” such schematics is just a matter of them becoming better at mathematical logic…which is pretty objective.
neodypsis
Try one of the models with good vision capabilities and ask it to output code using build123d.
yieldcrv
Tell it how to read schematics in the prompt
simianwords
More evidence that we need fine-tuned, domain-specific models. Someone should come up with a medical LLM fine-tuned on a 640B model. What better RL dataset can you have than a patient with symptoms and the correct diagnosis?
akomtu
Imagine a fake engineer who read books about engineering as scifi, and thanks to his superhuman memory, he's mastered the engineer-speak so well that he sounds more engineery than top engineers in the world. Except that he has no clue about engineering and to him it's the same as literature or prose. Now he's tasked with designing a bridge. He pauses for a second and starts speaking, in his usual polished style: "sure, let me design a bridge for you." And while he's talking, he's staring at you with his perfectly blank facial expression, for his mind is blank as well.
Think of the absurdity of trying to understand the Pi number by looking at its first billion digits and trying to predict the next digit. And think of what it takes to advance from memorizing digits of such numbers and predicting continuation with astrology-style logic to understanding the math behind the digits of Pi.
DaiPlusPlus
> Think of the absurdity of trying to understand the Pi number by looking at its first billion digits and trying to predict the next digit. And think of what it takes to advance from memorizing digits of such numbers and predicting continuation with astrology-style logic to understanding the math behind the digits of Pi.
I'm prepared to believe that a sufficiently advanced LLM around today will have some "neural" representation of a generalization of a Taylor series, thus allowing it to "natively predict" digits of Pi.
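For reference, this is roughly what "the math behind the digits" looks like when computed from a Taylor series. It is only a sketch: Machin's formula is my choice of series here, the function names are hypothetical, and nothing in it says anything about what an LLM represents internally.

```python
def arctan_inv(x: int, scale: int) -> int:
    """Return scale * arctan(1/x) using the Taylor series of arctan.

    arctan(1/x) = 1/x - 1/(3*x^3) + 1/(5*x^5) - ...
    All arithmetic is on integers scaled by `scale`.
    """
    term = scale // x      # scale / x^(2k+1), starting at k = 0
    total = term
    x_squared = x * x
    n = 3                  # odd denominator of the next series term
    sign = -1
    while term:
        term //= x_squared
        total += sign * (term // n)
        n += 2
        sign = -sign
    return total


def pi_digits(digits: int) -> str:
    """Pi truncated to `digits` decimal places, via Machin's formula:
    pi = 4 * (4*arctan(1/5) - arctan(1/239)).
    """
    guard = 10  # extra digits to absorb truncation error in the series
    scale = 10 ** (digits + guard)
    pi_scaled = 4 * (4 * arctan_inv(5, scale) - arctan_inv(239, scale))
    text = str(pi_scaled // 10 ** guard)
    return text[0] + "." + text[1:]


print(pi_digits(20))
```

A dozen lines of series arithmetic produce as many digits as you like, which is the contrast with memorizing a billion digits and guessing the next one.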
weq
You have described enron musk perfectly without probably even meaning to. I concur that we have "software engineers" in every role at our tech company now that the general populace has learnt how to use ChatGPT. This leads to some interesting conversations as above.
revskill
How about the halting problem? I often see LLMs get stuck in infinite recursion.
aaron695
I think what might work is people coming together around this LLM like a God.
Similar to Rod of Iron Ministries (The Church of the AR-15): taking what it says, fine-tuning it, testing it, feeding it back in, and mostly waiting as LLMs improve.
LLMs will never be smarter than humans, but they can be a meeting place where people congregate to work on goals and worship.
Like QAnon, that's where the collective IQ and power come from, something to believe in. At the micro level this is also mostly how LLMs are used in practical ways.
If you look to the Middle East there is a lot of work on rockets but a limited community working together.
My hypothesis is that until they can really nail down image-to-text and text-to-image, such that training on diagrams and drawings can produce fruitful multimodal output, classic engineering is going to be a tough nut to crack.
Software engineering lends itself greatly to LLMs because it just fits so nicely into tokenization. Whereas mechanical drawings or electronic schematics are sort of more like a visual language. Image art but with very exacting and important pixel placement, with precise underlying logical structure.
In my experience so far, only o3 can kind of understand an electronic schematic, but really only at a "Hello World!" level of difficulty. I don't know how easy it will be to get to the point where it can render a proper schematic or edit one it is given to meet some specified electronic characteristics.
There are programming languages that are used to define drawings, but the training data would be orders of magnitude less than what is written for humans to learn from.