
Honda: 2 years of ML vs. 1 month of prompting - here's what we learned

pjc50

Crucially, this is:

    - text classification, not text generation
    - operating on existing unstructured input
    - existing solution was extremely limited (string matching)
    - comparing LLM to similar but older methods of using neural networks to match
    - seemingly no negative consequences for warranty customers themselves from misclassification (the data is used to improve the process, not to make decisions)

pards

> Over multiple years, we built a supervised pipeline that worked. In 6 rounds of prompting, we matched it. That’s the headline, but it’s not the point. The real shift is that classification is no longer gated by data availability, annotation cycles, or pipeline engineering.

stego-tech

And this is where the strengths of LLMs really lie: making performant ML available to a wider audience, without requiring a PhD in Computer Science or Mathematics to build. It’s consistently where I spend my time tinkering with these, albeit in a local-only environment.

If all the bullshit hype and marketing would evaporate already (“LLMs will replace all jobs!”), stuff like this would float to the top more and companies with large data sets would almost certainly be clamoring for drop-in analysis solutions based on prompt construction. They’d likely be far happier with the results, too, instead of fielding complaints from workers about it (AI) being rammed down their throats at every turn.

Veliladon

^ This. I'm waiting for an LLM where I can just point it to a repo, slurp it up, and let me ask questions about it.

nmfisher

$ git clone repo && cd repo
$ claude

Ask away. Best method I’ve found so far for this.

cpursley

GitHub Copilot somewhat does this.

etothet

This is exactly what Devin (https://devin.ai) is designed to do. Their deepwiki feature is free. I’ve personally had decent success with it, but YMMV.

stogot

This was fun to read

“Fun fact: Translating French and Spanish claims into German first improved technical accuracy—an unexpected perk of Germany’s automotive dominance.”

happimess

I wonder how they came up with that. Was it a human idea, or did the AI stumble upon it?

Given that it was inside a 9-step text preprocessing pipeline, it would be surprising if the AI had that much autonomy.

yahoozoo

I wonder if text embeddings and semantic similarity would be effective here?
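One way that could look, sketched with NumPy only: embed each claim (with any sentence-embedding model; the vectors below are made-up toy values), average the embeddings per class into a centroid, and assign a new claim to the class whose centroid is most cosine-similar. The labels and vectors here are hypothetical, not from the article.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def classify(claim_vec, centroids):
    # Pick the label whose class centroid is most similar to the claim embedding.
    return max(centroids, key=lambda label: cosine(claim_vec, centroids[label]))

# Toy 3-d "embeddings"; real ones would come from a sentence-embedding model.
labeled = {
    "engine": [np.array([0.9, 0.1, 0.0]), np.array([0.8, 0.2, 0.1])],
    "brakes": [np.array([0.1, 0.9, 0.2]), np.array([0.0, 0.8, 0.3])],
}
centroids = {label: np.mean(vecs, axis=0) for label, vecs in labeled.items()}

print(classify(np.array([0.85, 0.15, 0.05]), centroids))  # → engine
```

Whether this beats TF-IDF features depends heavily on how domain-specific the warranty vocabulary is.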

davidsainez

> We tried multiple vectorization and classification approaches. Our data was heavily imbalanced and skewed towards negative cases. We found that TF-IDF with 1-gram features paired with XGBoost consistently emerged as the winner.
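The winning combination they describe is straightforward to sketch with scikit-learn. This is a toy illustration, not their pipeline: the claims and labels are invented, and `GradientBoostingClassifier` stands in for XGBoost's `XGBClassifier` (a near drop-in) to keep the example self-contained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
# Stand-in for xgboost.XGBClassifier, the classifier named in the article.
from sklearn.ensemble import GradientBoostingClassifier

# Invented toy claims; real data would be imbalanced toward negative cases.
claims = [
    "engine stalls at idle", "engine misfire on cold start",
    "brake pedal feels soft", "grinding noise when braking",
    "no issue found after inspection", "customer reports no fault",
]
labels = ["engine", "engine", "brakes", "brakes", "negative", "negative"]

# 1-gram TF-IDF features feeding a gradient-boosted tree classifier.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 1))),
    ("gbt", GradientBoostingClassifier(random_state=0)),
])
clf.fit(claims, labels)
print(clf.predict(["engine stalls when cold"]))
```

Handling the class imbalance (e.g. via sample weights or `scale_pos_weight` in XGBoost) is omitted here but matters a lot on data skewed toward negatives.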