
Curl: We still have not seen a valid security report done with AI help

danielvf

I handle reports for a one million dollar bug bounty program.

AI spam is bad. We've also never had a valid report written by an LLM (that we could tell).

People using them will take any explanation of why a bug report is not valid, any questions, or any requests for clarification, and run them back through the same confused LLM. The second pass generates even deeper nonsense.

It's getting to the point where responding with anything but "closed as spam" isn't worth the time.

I believe that one day there will be great code examining security tools. But people believe in their hearts that that day is today, and that they are riding the backs of fire breathing hack dragons. It's the people that concern me. They cannot tell the difference between truth and garbage.

phs318u

>It's the people that concern me. They cannot tell the difference between truth and garbage.

Suffice to say, this statement is an accurate assessment of the current state of many more domains than merely software security.

immibis

This has been going on since years before AI - they say we live in a "post-truth society". The generation and non-immediate-rejection of AI slop reports could be another manifestation of post-truth rather than a cause of it.

Seb-C

> I believe that one day there will be great code examining security tools.

As with programming, I think that we will simply continue to have incrementally better tools based on sane and appropriate technologies, as we have had forever.

What I'm sure about is that no such tool can come out of anything based on natural language, because it's simply the worst possible interface to interact with a computer.

cratermoon

People have been trying various iterations of "natural language programming" since programming languages were a thing. Even COBOL was supposed to be more natural than other languages of its era.

https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...

VladVladikoff

This sounds more like an influx of scammers than security researchers leaning too hard on AI tools. The main problem is the bounty structure, and I don't think this influx of low-quality reports will go away, or even get any less aggressive, as long as there is money to attract the scammers. Perhaps these bug bounty programs need to develop an automatic pass/fail test of all submitted bug code, to ensure the reporter really found a bug, before the report is submitted to the vendor.
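A minimal sketch of what such an automatic pre-submission check might look like, assuming the reporter has to attach a runnable proof of concept; the command, crash heuristics, and sandboxing are illustrative assumptions, not a feature any bounty platform currently offers.

  import signal
  import subprocess

  def poc_demonstrates_crash(poc_cmd: list[str], timeout_s: int = 30) -> bool:
      """Run a reporter-supplied proof of concept (ideally inside a container
      or VM, omitted here) and require objective evidence of a fault before
      the report ever reaches a human triager."""
      try:
          result = subprocess.run(poc_cmd, capture_output=True,
                                  timeout=timeout_s, text=True)
      except subprocess.TimeoutExpired:
          return False  # a hung PoC is rejected, not escalated

      # A negative return code means the process died on a signal (e.g. SIGSEGV);
      # sanitizer output is another strong signal of a real memory bug.
      killed_by_signal = result.returncode < 0 and -result.returncode in (
          signal.SIGSEGV, signal.SIGABRT)
      sanitizer_hit = "AddressSanitizer" in result.stderr
      return killed_by_signal or sanitizer_hit

  # Hypothetical usage against an ASan-instrumented build of the target:
  # if not poc_demonstrates_crash(["./poc.sh"]):
  #     reject("no reproducible fault, not forwarded to the vendor")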

rwmj

It's unfortunately widespread. We don't offer bug bounties, but we still get obviously LLM-generated "security reports" which are just nonsense and waste our time. I think the motivation may be trying to get credit for contributing to open source projects.

holuponemoment

Simply charge a fee to submit a report. At 1% of the payout for low bounties, it's perfectly reasonable. Maybe progressively scale that percentage down a bit as the bounty goes up. But even for a $50k bounty you know is correct, it's only $500.

Jean-Papoulos

No need to make it a percentage; charge $1 and the spammers will stop extremely quickly, since none of their reports are valid.

But I do think established individuals and institutions should have free access; leave a choice between going through an identification process and paying the fee. That is, if it's such a big problem that you REALLY need to do something; otherwise, just keep marking reports as spam.

cedws

If you charge a fee, the motivation for good-samaritan reports goes to zero.

ponector

You are adding more incentive to go directly to the black market to sell the vulnerability.

Also, I've heard of many cases where a company refused to pay the bounty for whatever reason.

And taxes: how would you tax it internationally? Sales tax? VAT?

imtringued

Why charge a fee? All you need is a reputation system where low-reputation bounty hunters need a reputable person to vouch for them. If the report turns out to be false, both take a hit. If it's true, the voucher gets to be a co-author and gets a share of the bounty.
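A minimal sketch of that vouching scheme in Python; the reputation thresholds, penalties, and the 20% voucher share are made-up numbers for illustration, not an existing bounty-platform rule.

  from dataclasses import dataclass

  # Illustrative constants; the real thresholds would be a policy decision.
  MIN_REP_TO_SUBMIT_ALONE = 100
  VOUCH_PENALTY = 50          # both parties lose this on a bogus report
  VOUCHER_BOUNTY_SHARE = 0.2  # voucher's cut of a valid report's bounty

  @dataclass
  class Hunter:
      name: str
      reputation: int = 0

  def resolve_report(reporter: Hunter, voucher: Hunter | None,
                     is_valid: bool, bounty: float) -> dict[str, float]:
      """Low-reputation reporters need a voucher; a bogus report costs both
      of them reputation, a valid one rewards both and splits the bounty."""
      if reporter.reputation < MIN_REP_TO_SUBMIT_ALONE and voucher is None:
          raise ValueError("low-reputation reporter needs someone to vouch")

      payouts: dict[str, float] = {}
      if is_valid:
          reporter.reputation += 100
          if voucher is not None:
              voucher.reputation += 25
              payouts[voucher.name] = bounty * VOUCHER_BOUNTY_SHARE
              payouts[reporter.name] = bounty * (1 - VOUCHER_BOUNTY_SHARE)
          else:
              payouts[reporter.name] = bounty
      else:
          reporter.reputation -= VOUCH_PENALTY
          if voucher is not None:
              voucher.reputation -= VOUCH_PENALTY
      return payouts

  # Example: a newcomer vouched for by an established hunter.
  newbie, veteran = Hunter("newbie"), Hunter("veteran", reputation=5000)
  print(resolve_report(newbie, veteran, is_valid=True, bounty=10_000))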

lucyjojo

gentle reminder that the median salary of a programmer in japan is 60k USD a year. 500 usd is a lot of money (i would not be able to afford it personally).

i suspect 1usd would do the job perfectly fine without cutting out normal non-american people.

justsid

It could also be made refundable when the bug report is found to be valid. Although of course the problem then becomes that some kid somewhere who is into computers and hacking finds something but can't easily report it because the barrier to entry is too high. I don't think there is a good solution, unfortunately.

datatrashfire

> I believe that one day there will be great code examining security tools.

Based on the current state, what makes you think this is a given?

ASalazarMX

The improvement history of tools besides LLMs, I suspect. First we had syntax highlighting, and we were amazed. Now we have fuzzers and sandboxed malware analysis; who knows what the future will bring?

michaelcampbell

> They cannot tell the difference between truth and garbage.

I honestly think that in this context, they don't care - they put in essentially zero effort on the minuscule chance that you'll pay out something.

It's the same reason we have spam. The return rates are near zero, but so is the effort.

unsnap_biceps

For those of you who don't want to click through to LinkedIn, https://hackerone.com/reports/3125832 is the latest example of an invalid curl report.

harrisi

This is interesting because they've apparently made a couple thousand dollars reporting things to other companies. Is it just a case of a broken clock being right twice a day? Seems like a terrible use of everyone's time and money. I find it hard to believe a random person on the internet using ChatGPT is worth $1000.

billyoneal

There are places that will pay bounties on even very flimsy reports to avoid the press / perception that they aren't responding to researchers. But that will only remain the case as long as a very small number of people are doing this.

It's easy for reputational damage to exceed $1'000, but if 1000 people do this...

cratermoon

One might even call it reputational blackmail. "Give me $1000 for this invalid/useless bug report or I'll go to the most click-baity, incompetent tech press outlets with how your product is the worst thing since ILOVEYOU."

bluGill

$1000 is cheap... The real question is when will companies become wise to this scam?

Most companies make you fill in expense reports for every trivial purchase. It would be cheaper to just let employees take the cash, and most employees are honest enough. However, the dishonest employee isn't why they do expense reports (there are other ways to catch dishonest employees). There used to be a scam where someone would just send a bill for "services", and those got paid often enough; once companies realized the costs, they started making everyone do expense reports so they could track the little expenses.

jasinjames

Can someone explain the ip address in the hackerone profile[0]? I can't tell if 139.224.130.174 is a reference to something real or just hallucinated by the LLM to look "cool". Wikipedia says that this /8 is controlled by "MIX"[1] but my google-fu is failing me atm.

[0] https://hackerone.com/evilginx?type=user [1] https://en.wikipedia.org/wiki/List_of_assigned_/8_IPv4_addre...

dpifke

Per WHOIS, it's assigned to Alibaba Cloud (could be a VM there):

  inetnum:        139.224.0.0 - 139.224.255.255
  netname:        ALISOFT
  descr:          Aliyun Computing Co., LTD
  descr:          5F, Builing D, the West Lake International Plaza of S&T
  descr:          No.391 Wen'er Road, Hangzhou, Zhejiang, China, 310099
  country:        CN
  admin-c:        ZM1015-AP
  tech-c:         ZM877-AP
  tech-c:         ZM876-AP
  tech-c:         ZM875-AP
  abuse-c:        AC1601-AP
  status:         ALLOCATED PORTABLE
  mnt-by:         MAINT-CNNIC-AP
  mnt-irt:        IRT-ALISOFT-CN
  last-modified:  2023-11-28T00:57:06Z
  source:         APNIC
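For anyone who wants to reproduce that lookup without a whois client: WHOIS (RFC 3912) is just a one-line query over TCP port 43. A quick sketch; the only assumption beyond the record quoted above is APNIC's public server name, whois.apnic.net.

  import socket

  def whois_query(server: str, query: str) -> str:
      """Send a raw WHOIS query (RFC 3912): one line over TCP port 43,
      then read until the server closes the connection."""
      with socket.create_connection((server, 43), timeout=10) as sock:
          sock.sendall((query + "\r\n").encode())
          chunks = []
          while data := sock.recv(4096):
              chunks.append(data)
      return b"".join(chunks).decode(errors="replace")

  # The record above says "source: APNIC", so ask APNIC's server directly.
  print(whois_query("whois.apnic.net", "139.224.130.174"))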

darkoob12

You can tell it's ChatGPT from the stupid icon. In one of the iterations they started using these emojis, which I find disturbing. The answer to the first question has an obvious ChatGPT writing style.

Aachen

Daniel posting about the LinkedIn post: https://mastodon.social/@bagder/114455578286549482

Recent toots on the account have the news as well.

nneonneo

Good god did they hallucinate the segmentation fault and the resulting GDB trace too? Given that the diffs don’t even apply and the functions don’t even exist, I guess the answer is yes - in which case, this is truly a new low for AI slop bug reports.

terom

The git commit hashes in the diff are interesting: 1a2b3c4..d4e5f6a

I think my wetware pattern-matching brain spots a pattern there.

terom

Going a bit further, it seems like there's a grain of truth here, HTTP/2 has a stream priority dependency mechanism [1] and this report [2] from Imperva describes an actual Dependency Cycle DoS in the nghttp implementation.

Unfortunately that's where it seems to end... I'm not that familiar with QUIC and HTTP/2, but I think the closest it gets is that the GitHub repo exists and has a `class QuicConnection` [3]. Beyond that, the QUIC protocol layer doesn't have any concept of exchanging stream priorities [4], and HTTP/2 priorities are something the client sends, not the server? The PoC also mentions HTTP/3 and PRIORITY_UPDATE frames, but those are from the newer RFC 9218 [5] and lack the stream dependencies used in HTTP/2 PRIORITY frames (a rough comparison is sketched after the references).

I should learn more about HTTP/3!

[1] https://blog.cloudflare.com/adopting-a-new-approach-to-http-...

[2] https://www.imperva.com/docs/imperva_hii_http2.pdf

[3] https://github.com/aiortc/aioquic/blob/218f940467cf25d364890...

[4] https://datatracker.ietf.org/doc/html/rfc9000#name-stream-pr...

[5] https://www.rfc-editor.org/rfc/rfc9218.html#name-the-priorit...
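To make that comparison concrete, here is a rough Python sketch of the two priority signals, based on RFC 7540 (section 6.3) and RFC 9218. It is a simplified illustration of the wire formats, not code from curl, ngtcp2, or aioquic, and the RFC 9218 parser is deliberately naive (real implementations use Structured Fields parsing).

  import struct

  def parse_h2_priority_payload(payload: bytes) -> dict:
      """HTTP/2 PRIORITY frame payload (RFC 7540, section 6.3): a 31-bit
      stream dependency with an exclusive flag in the top bit, plus an
      8-bit weight. This is the dependency tree a cycle attack would need."""
      dep_and_flag, weight = struct.unpack("!IB", payload)
      return {
          "exclusive": bool(dep_and_flag >> 31),
          "depends_on_stream": dep_and_flag & 0x7FFFFFFF,
          "weight": weight + 1,  # wire value is weight minus one
      }

  def parse_rfc9218_priority(field_value: str) -> dict:
      """RFC 9218 priority signal, as carried by HTTP/3 PRIORITY_UPDATE
      frames: just an urgency (0-7, default 3) and an 'incremental' flag.
      There is no stream-dependency field to form a cycle with."""
      prio = {"urgency": 3, "incremental": False}
      for item in filter(None, (p.strip() for p in field_value.split(","))):
          if item == "i":
              prio["incremental"] = True
          elif item.startswith("u="):
              prio["urgency"] = int(item[2:])
      return prio

  # An HTTP/2 stream exclusively depending on stream 3 with weight 16...
  print(parse_h2_priority_payload(struct.pack("!IB", (1 << 31) | 3, 15)))
  # ...versus an HTTP/3 priority signal, which has no dependency at all.
  print(parse_rfc9218_priority("u=3, i"))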

mitchellpkt

Excellent catch! I had to go back and take a second look, because I completely missed that the first time.

3abiton

This is a whole new problem open source projects will be facing: AI slop PRs and vulnerability reports, which will only be solved by using AI tools to filter through the unholy volume.

bluGill

A real report would have a GDB trace that looks like that, so it isn't hard to create such a trace. Many of us could create a real-looking GDB trace by hand just as well - it would be tedious, boring, and pointless, but we could.

nneonneo

Oh, I'm fully aware an LLM can hallucinate a GDB trace just fine.

My complaint is: if you're trying to use an AI to help you find bugs, you'd sincerely hope that there would be *some* attempt to actually run the exploit. Having the LLM invent fake evidence that you have done so, when you haven't, is just evil, and should result in these people being kicked straight off H1 completely.

xk_id

Not sure what timeline this is anymore where a tech website loads up a completely blank page on my mobile device.

seanp2k2

Welcome to the web in 2025, where it takes 5MB of JS and everything else to load a blog post containing 640B of text.

bogwog

If I wanted to slip a vulnerability into a major open source project with a lot of eyes on it, using AI to DDoS their vulnerability reports so they're less likely to find a real report from someone who caught me seems like an obvious (and easy) step.

Looking at one of the bogus reports, it doesn't even seem like a real person. Why do this if you're not trying to gain recognition?

jsheard

> Why do this if you're not trying to gain recognition?

They're doing it for money; a handful of their reports did result in payouts. Those reports aren't public, though, so there's no way to know if they actually found real bugs or if the reviewer rubber-stamped them without doing due diligence.

0x500x79

It should be called "Denial of Attention" attack!

vessenes

Reading the straw-that-broke-the-camel's-back report illustrates the problem really well: https://hackerone.com/reports/3125832 . This shit must be infuriating to dig through.

I wonder if reputation systems might work here - you could give anyone who IDs with an AML/KYC provider some reputation, enough for two or three reports, let people earn reputation digging through zero-rep submissions, and give someone like 10,000 reputation for each accurate vulnerability found, and 100s for any accurate promoted vulnerabilities. This would let people interact anonymously if they want to, quickly if they found something important and are willing to do AML/KYC, and it would privilege quality people.

Either way, AI is definitely changing the economics of this stuff, in this case enshittifying it first.

bflesch

There is a reputation system already. According to HackerOne's reputation system, this is a credible reporter. It's really bad.

hedora

The vast majority of developers are 10-100x more likely to find a security hole in a random tool than to spend time improving their reputation on a bug bounty site that pays < 10% of their salary.

That makes it extremely hard to build a reputation system for a site like that. Almost all the accounts are going to be spam, and the highest-quality accounts are going to be freshly created and take ~1 action on the platform.

Aachen

Or a deposit system: pay 2€ for a human to read this message; you get it back if it's not spam.

What if the human marks it as spam but you're actually legit? Deposit another 2€ to have the platform (like HackerOne or whichever you're reporting via) give a second opinion; you'll get the 4€ back if you weren't spamming. What to do with the proceeds from spammers? The first X euros of spam reports go to upkeep of the platform, the rest to a good cause defined by the projects to whom the reports were submitted; they were the ones who had to deal with reading the slop, so they get at least this much out of it.

Raise the deposit cost so long as the slop volume remains unmanageable.

This doesn't discriminate against people who aren't already established, but it may be a problem if you live in a low-income country and can't easily afford 20€ (assuming it ever gets to that deposit level). Perhaps it wouldn't work, but it can first be trialed at a normal cost level. Another concern is anonymity and payment; we hackers are often a paranoid lot. One can always support cash in the mail, though; the sender can choose whether their privacy is worth a postage stamp.
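A minimal sketch of that deposit flow; the 2€ amounts come from the comment above, while the function names and the two-stage appeal logic are just one illustrative way to wire it up.

  from enum import Enum, auto

  DEPOSIT_EUR = 2  # raised if slop volume stays unmanageable, per above

  class Outcome(Enum):
      REFUNDED = auto()   # not spam: deposit(s) returned
      FORFEITED = auto()  # spam confirmed: deposit(s) kept by the platform

  def triage(report_is_spam: bool, reporter_appeals: bool = False,
             appeal_upheld: bool = False) -> tuple[Outcome, int]:
      """Return the outcome and how many euros go back to the reporter."""
      if not report_is_spam:
          return Outcome.REFUNDED, DEPOSIT_EUR
      if not reporter_appeals:
          return Outcome.FORFEITED, 0
      # Second opinion: another deposit is staked; both come back if the
      # reporter turns out to be legitimate, otherwise both are forfeited.
      if appeal_upheld:
          return Outcome.REFUNDED, 2 * DEPOSIT_EUR
      return Outcome.FORFEITED, 0

  # Forfeited deposits would fund platform upkeep first, then a good cause
  # chosen by the projects that had to read the slop.
  print(triage(report_is_spam=True, reporter_appeals=True, appeal_upheld=True))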

emushack

Reputation systems for this kind of thing sound like rubbing anti-itch cream on a bullet wound. The problem seems to me to be behavior, not a technology issue.

Personally I can't imagine how miserable it would be for my hard-earned expertise to be relegated to sifting through SLOP where maybe 1 in hundreds or even thousands of inquiries is worth any time at all. But it also doesn't seem prudent to just ignore them.

I don't think better ML/AI technology or better information systems will make a significant difference on this issue. It's fundamentally about trust in people.

delusional

I consider myself a left-leaning soyboy, but this could be the outcome of too "nice" a discourse. I won't advocate for toxicity, but I am considering whether we bolster the self-image of idiots when we refuse to call them idiots. Because you're right, this is fundamentally a people problem; specifically, we need people to filter this themselves.

I don't know where the limit would go.

orthecreedence

Shame is a useful social tool. It can be overused or underused, but it's still a tool and people like this should be made to publicly answer for their obnoxious and destructive behavior.

bigiain

I'm now imagining old-Linus responding to an AI slop bug report on lkml...

Analemma_

> The problem seems to me to be behavior, not a technology issue.

To be honest, this has been a grimly satisfying outcome of the AI slop debacle. For decades, the general stance of tech has been, “there is no such thing as a behavioral/social problem, we can always fix it with smarter technology”, and AI is taking that opinion and drowning it in a bathtub. You can’t fix AI slop with technology because anything you do to detect it will be incorporated into better models until they evade your tests.

We now have no choice but to acknowledge the social element of these problems, although considering what a shitshow all of Silicon Valley’s efforts at social technology have been up to now, I’m not optimistic this acknowledgement will actually lead anywhere good.

senordevnyc

> You can't fix AI slop with technology because anything you do to detect it will be incorporated into better models until they evade your tests.

How is that a bad thing? At a certain point, it’s no longer AI slop!

https://xkcd.com/810/

squigz

I guess I'm confused by your position here.

> The problem seems to me to be behavior, not a technology issue.

Yes, it's a behavior issue, but that doesn't mean it can't be solved, or at least minimized, by technology, particularly as technology is what's exacerbating the issue?

> It's fundamentally about trust in people.

Who is lacking trust in who here?

me_again

Vulnerability reports are interesting from a trust point of view, because each party has a different financial incentive. You can't 100% trust the vendor to accurately assess the severity of an issue - they have a lot riding on downplaying an issue in some cases. The person reporting the bug is also likely looking for bounty and reputational benefit, both of which are enhanced if the issue is considered high severity. So a user of the supposedly-vulnerable program can't blindly trust either party.

Seb-C

IMO, this AI crap is just the next step on the "let's block criminal behavior with engineering" path we have followed for decades. It might very well be the last straw, as it is very unlikely we can block this one efficiently and reliably.

It's about time we ramp up our justice systems to make people truly responsible for, and punished for, their bad behavior online, including all kinds of spam, scams, phishing, and disinformation.

That might involve the end of anonymity on the internet, and lately I feel that the downsides of that are getting smaller and smaller compared to its upsides.

parliament32

Didn't even have to click through to the report in question to know it would be all hallucinations -- both the original patch file and the segfault ("ngtcp2_http3_handle_priority_frame"... "There is no function named like this in current ngtcp2 or nghttp3."). I guess these guys don't bother to verify; they just blast out AI slop and hope one of them hits?

indigodaddy

Reminds me of when some LLM (might have been DeepSeek) told me I could add wasm_mode=True to my FastHTML Python code, which would allow me to compile it to WebAssembly, when of course there is no such feature in FastHTML. This was even when I had provided it the full llms-ctx.txt.

alabastervlog

I had Google's in-search "AI" invent a command line switch that would have been very helpful... if it existed. Complete with usage caveats and warnings!

This was like two weeks ago. These things suck.

j_w

My favorite is when their in-search "AI answer" hallucinates on the Golang standard lib. Always makes me happy to see.

sidewndr46

Isn't there a website that builds git man pages this way? By just stringing together random concepts into sentences that seem vaguely like something Git would implement. I thought it was silly and potentially harmful the first time I saw it. Apparently, it may have just been ahead of the curve.

bigiain

<conspiracy theory> Google's internal version of that tool _does_ implement that command line switch...

pixl97

>"ngtcp2_http3_handle_priority_frame"

I wonder if you could use AI to classify the probability that something is AI bullshit and deprioritize it?
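In that spirit, a small sketch of a triage queue that deprioritizes (but never auto-closes) likely slop. `estimate_slop_probability` is a hypothetical stand-in for whatever classifier you trust, an LLM judge or a simple model trained on past reports; the keyword heuristic here is only a placeholder.

  import heapq
  from dataclasses import dataclass, field

  @dataclass(order=True)
  class QueuedReport:
      priority: float  # lower scores get human attention sooner
      report_id: str = field(compare=False)
      body: str = field(compare=False)

  def estimate_slop_probability(body: str) -> float:
      """Hypothetical classifier returning 0.0 (likely genuine) to 1.0
      (likely AI slop); a placeholder keyword heuristic stands in here."""
      suspicious = ["as an ai", "in conclusion", "this critical vulnerability"]
      hits = sum(phrase in body.lower() for phrase in suspicious)
      return min(1.0, hits / len(suspicious))

  def enqueue(queue: list[QueuedReport], report_id: str, body: str) -> None:
      # Probable slop sinks to the back of the queue but still gets
      # human eyes eventually, so an unlucky real finding isn't lost.
      heapq.heappush(queue, QueuedReport(estimate_slop_probability(body),
                                         report_id, body))

  queue: list[QueuedReport] = []
  enqueue(queue, "rep-1", "Heap overflow reproduced; ASan log attached.")
  enqueue(queue, "rep-2", "In conclusion, this critical vulnerability is severe.")
  print([r.report_id for r in sorted(queue)])  # genuine-looking report first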

pacifika

AI red tape.

spiffyk

> I guess these guys don't bother to verify, they just blast out AI slop and hope one of them hits?

Yes. Unfortunately, some companies seem to pay out the bug bounty without even verifying that the report is actually valid. This can be seen on the "reporter"'s profile: https://hackerone.com/evilginx

soraminazuki

Considering that even the reporter responded to requests for clarification with yet more AI slop, they likely lack the technical background.

bigiain

"they likely lack the ethical background."

FTFY

kazinator

A prominent project in which people have a stake in seeing bugs fixed can afford to charge a refundable deposit against reporters.

Say, $100.

If your report is true, or even if it is incorrect but honestly mistaken, you get your $100 back.

If it is time-wasting slop with hallucinated gdb crash traces, then you don't get your money back (and so you don't pay the deposit in the first place, and don't send such a report, unless you're completely stupid, or too rich to care about $100).

If AI slopsters have to pay to play, with bad odds and no upside, they will go elsewhere.

rdtsc

> evilginx updated the severity from none to high

Well, the reporter stated in the report that they are open for employment: https://hackerone.com/reports/3125832 Anyone want to hire them? They can play with ChatGPT all day and spam random projects with AI slop.

gorbachev

Growth hack: hire this person to find vulnerabilities in competitors' products.

bigiain

Effective altruism: hire this guy to manipulate software companies' stock prices with highly publicized "vulnerabilities" in their products...


uludag

I can imagine that most LLMs, if you ask them to find a security vulnerability in a given piece of code, will make something up completely out of thin air. I've (mistakenly) sent valid code with an unrelated error, and to this day I get nonsense "fixes" for these errors.

This alignment problem, between responding with what the user wants (e.g. a security report, flattering responses) and going against the user, seems like a major issue limiting the effectiveness of such systems.

ianbutler

Counterpoint: we have a CVE attributable to our tool, and I suspect the difference is that my co-founder was an offensive kernel researcher, so our system is tuned for this in a way your average... ambulance chaser's is not.

https://blog.bismuth.sh/blog/bismuth-found-the-atop-bug

https://www.cve.org/CVERecord?id=CVE-2025-31160

The number of bad reports curl in particular has gotten is staggering, and it's all from people who have no background, just latching onto a tool that won't elevate them.

Edit: Also, shoutout to one of our old professors, Brendan Dolan-Gavitt, who now works on offensive security agents and has a highly ranked vulnerability agent, XBOW.

https://hackerone.com/xbow?type=user

So these tools are there and doing real work; it's just that there are so many people looking for a quick buck that you really have to tease the signal from the noise.

pizzalife

I would try to find a better example than CVE-2025-31160. If you ask me, this kind of 'vulnerability' is CVE spam.

ianbutler

Except that, if you read the blog post, we helped a very confused maintainer when they had this dropped on them on Hacker News with no explanation except "oooh, potential scary heap vuln".

jacksnipe

Something that really frustrates me about interacting with (some) people who use AI a lot is that they will often tell me things that start with "I asked ChatGPT and it said…" Stop it!!! If the chatbot taught you something and you understood it, explain it to me. If you didn't understand it or didn't trust it, then keep it to yourself!

cogman10

I recently had this happen with a senior engineer. What's really frustrating is that I TOLD them the issues and how to fix them. Instead of listening to what I told them, they plugged it into GPT and responded with "Oh, interesting, this is what GPT says" (which, spoiler, was similar to but lacking compared with what I'd said).

Meaning, instead of listening to a real-life expert in the company telling them how to handle the problem they ignored my advice and instead dumped the garbage from GPT.

I really fear that a number of engineers are going to use GPT to avoid thinking. They view it as a shortcut to problem solving, and it isn't.

colechristensen

>They view it as a shortcut to problem solving, and it isn't

Oh but it is, used wisely.

One: it's a replacement for googling a problem, and much faster. Instead of spending half an hour or half a day digging through bug reports, forum posts, and Stack Overflow for the solution, LLMs are a lot faster, occasionally correct, and very often at least rather close.

Two: it's a replacement for learning how to do something I don't want to learn how to do. Case study: I have to create a decent-enough-looking static error page for a website. I could do an awful job with my existing knowledge, I could spend half a day relearning and tweaking CSS, elements, etc., or I could ask an LLM to do it and then tweak the results. Five minutes for "good enough", and it really is.

LLMs are not a replacement for real understanding, for digging into a codebase to really get to the core of a problem, or for becoming an expert in something, but in many cases I do not want to, and moreover it is a poor use of my time. Plenty of things are not my core competence or anywhere near the goals I'm trying to achieve. I just need a quick solution for a topic I'm not interested in.

ijidak

This exactly!

There are so many things that a human worker or coder has to do in a day and a lot of those things are non-core.

If someone is trying to be an expert on every minor task that comes across their desk, they were never doing it right.

An error page is a great example.

There is functionality that sets a company apart and then there are things that look the same across all products.

Error pages are not core IP.

At almost any company, I don't want my $200,000-300,000 a year developer mastering the HTML and CSS of an error page.

vuserfcase

>Oh but it is, used wisely.

A sufficiently advanced orange juice extractor is the solution to any problem. Doesn't necessarily mean you should build the sufficient part.

>One: it's a replacement for googling a problem and much faster

This has more to do with the fact that Google results have gone downhill very rapidly. It used to be that you could find what you were looking for very fast and solve a problem.

>I could ask an LLM to do it and then tweak the results. Five minutes for "good enough" and it really is.

When the cost of failure is low, a hack job can be economical, like a generated picture for entertainment or a static error page. Miscreating a support for a bridge is not very economical.

jsight

I wonder if this is an indication that they didn't really understand what you said to begin with.

colechristensen

If I had a dollar for every time I told someone how to fix something and they did something else...

Let's just say not listening to someone and then complaining that doing something else didn't work isn't exactly new.

silversmith

I often do this - ask an LLM for an answer when I already have it from an expert. I do it to evaluate the ability of the LLM. Usually not in the presence of said expert, though.

namaria

Just from using LLMs on the (few) things I have specialist knowledge of, it's clear they are extremely limited. I get absurdly basic mistakes, and I am very wary of even reading LLM output about topics I don't command. It's easy to get stuck in dead ends, reasoning-wise, even from getting noisy input.

tharant

Is it possible that what happened was an impedance mismatch between you and the engineer such that they couldn’t grok what you told them but ChatGPT was able to describe it in a manner they could understand? Real-life experts (myself included, though I don’t claim to be an expert in much) sometimes have difficulty explaining domain-specific concepts to other folks; it’s not a flaw in anyone, folks just have different ways of assembling mental models.

kevmo314

Whenever someone has done that to me, it's clear they didn't read the ChatGPT output either and were sending it to me as some sort of "look someone else thinks you're wrong".

cogman10

Definitely a possibility.

However, I have a very strong suspicion they also didn't understand the GPT output.

To flesh out the situation a bit further, this was a performance-tuning problem with highly concurrent code. This engineer was initially tasked with the problem, and they hadn't bothered to even run a profiler on the code. I did, shared my results with them, and the first action they took with my shared data was dumping a thread dump into GPT and asking it where the performance issues were.

Instead, they've simply been littering the code with timing logs in hopes that one of them will tell them what to do.

delusional

Those people weren't engineers to start with.

layer8

Software engineers rarely are.

I’m saying this tongue in cheek, but there’s some truth to it.

throwanem

You should ask yourself why this organization wants engineering advice from a chatbot more than from you.

I doubt the reason has to do with your qualities as an engineer, which must be basically sound. Otherwise, why bother to launder the product of your judgment, as you described someone doing here?

tharant

> I really fear that a number of engineers are going to use GPT to avoid thinking. They view it as a shortcut to problem solving, and it isn't.

How is this sentiment any different from my grandfather's sentiment that calculators and computers (and probably his grandfather's view of industrialization) are a shortcut to avoid work? From my perspective, most tools are used as a shortcut to avoid work; that's kinda the whole point - to give us room to think about/work on other stuff.

parliament32

Because calculators aren't confidently wrong the majority of the time.

stevage

Did your grandpa think that calculators made engineers worse at their jobs?

evandrofisico

It is supremely annoying when I ask in a group if someone has experience with a tool or system and some idiot copies my question into some LLM and pastes the answer. I can use the LLM just like anyone; if I'm asking for EXPERIENCE, it is because I want the opinion of a human who actually had to deal with stuff like corner cases.

ModernMech

It's the 2025 version of lmgtfy.

layer8

Nah, that’s different. Lmgtfy has nothing to do with experience, other than experience in googling. Lmgtfy applies to stuff that can expediently be googled.

soulofmischief

The whole point of paying a domain expert is so that you don't have to google shit all day.

jacksnipe

That’s exactly how I feel

jsheard

If it's not worth writing, it's not worth reading.

floren

Reminds me of something I wrote back in 2023: "If you wrote it with an LLM, it wasn't worth writing" https://jfloren.net/b/2023/11/1/0

ToValueFunfetti

There's a lot of documentation that was left unwritten but that I would have loved to read.

pixl97

I mean, there is a lot of hand-written crap too, so even that isn't a good rule.

Frost1x

I work in a corporate environment, as I'm sure many others do. Many executives have it in their head that LLMs are this brand-new efficiency gain they can pad profit margins with, so you should be using them. There's a lot of push for that everywhere I work.

I see email blasts suggesting I should be using it, I get peers saying I should be using it, I get management suggesting I should use it to cut costs… and there is some truth there but as usual, it depends.

I, like many others, can't be asked to take on inefficiency in the name of efficiency on top of the currently most efficient ways to do my work. So I too say "ChatGPT said: …" because I dump lots of things into it now. Some things I can't quickly verify, some things are off, and in general it can produce far more information than I have time to check. Saying "ChatGPT said…" is the current CYA caveat in a world of "use this thing, but also take liability for it." No, if you practically mandate that I use something, the liability falls on you or that thing. If it's a quick verification, I'll integrate it into my knowledge. A lot of things aren't.

parliament32

> I see email blasts suggesting I should be using it, I get peers saying I should be using it, I get management suggesting I should use it to cut costs

The ideal scenario: you write a few bullet points and ask Copilot to turn them into a long-form email to send out. Your receiving coworker then asks Copilot to distill it back into a few bullet points they can skim.

You saved 5 minutes but one of your points was ignored entirely and 20% of your output is nonsensical.

Your coworker saved 2 minutes but one of their bulletpoints was hallucinated and important context is missing from the others.

Microsoft collects a fee from both of you and is the only winner here.

rippleanxiously

It just feels to me like a boss walking into a car mechanic's shop holding some random tool, walking up to a mechanic, and:

"Hey, whatcha doin?"

"Oh hi, yea, this car has a slight misfire on cyl 4, so I was just pulling one of the coilpacks to-"

"Yea alright, that's great. So hey! You _really_ need to use this tool. Trust me, it's gonna make your life so much easier"

"umm... that's a 3d printer. I don't really think-"

"Trust me! It's gonna 10x your work!"

...

I love the tech. It's the evangelists that bug me - the ones who don't seem to bother researching the tech beyond making an account and asking it to write a couple of scripts. And then they proclaim it can replace a bunch of other stuff they don't/haven't ever bothered to research or understand.

yoyohello13

Seriously. Being able to look up stuff using AI is not unique. I can do that too.

It's kind of the same with any AI-generated art. Like, I can go generate a bunch of cool images with AI too; why should I give a shit about your random Midjourney output?

kristopolous

ComfyUI workflows, fine-tuning models, keeping up with the latest arXiv papers, patching academic code to work with generative stacks - this stuff is grueling.

Here's an example https://files.meiobit.com/wp-content/uploads/2024/11/22l0nqm...

Being dismissive of AI art is like those people who dismiss electronic music because there's a drum machine.

Doing things well still requires an immense amount of skill and an exhaustive amount of effort. It's wildly complicated.

codr7

Makes even less sense when you put it like that: why not invest that effort into your own skills instead?

alwa

I mean… I have a fancy phone camera in my pocket too, but there are photographers who, with the same model of fancy phone camera, do things that awe and move me.

It took a solid hundred years to legitimate photography as an artistic medium, right? To the extent that the controversy still isn’t entirely dead?

Any cool images I ask AI for are going to involve a lot less patience and refinement than some of these things the kids are using AI to turn out…

For that matter, I’ve watched friends try to ask for factual information from LLMs and found myself screaming inwardly at how vague and counterproductive their style of questioning was. They can’t figure out why I get results I find useful while they get back a wall of hedging and waffling.

namaria

> It took a solid hundred years to legitimate photography as an artistic medium, right?

Not really.

"In 1853 the Photographic Society, parent of the present Royal Photographic Society, was formed in London, and in the following year the Société Française de Photographie was founded in Paris."

https://www.britannica.com/technology/photography/Photograph...

h4ck_th3_pl4n3t

How can you be so harsh on all the new kids with Senior Prompt Engineer in their job titles?

They have to prove to someone that they're worth their money. /s

esafak

I had to deal with someone who tried to check in hallucinated code with the defense "I checked it with chatGPT!"

If you're just parroting what you read, what is it that you do here?!

qmr

I hope you dealt with them by firing them.

esafak

Yes, unfortunately. This was the last straw, not the first.

giantg2

Manage people?

tough

then what the fuck are they doing committing code? leave that to the coders

hashmush

As much as I'm also annoyed by that phrase, is it really any different from:

- I had to Google it...

- According to a StackOverflow answer...

- Person X told me about this nice trick...

- etc.

Stating your sources should surely not be a bad thing, no?

mentalpiracy

It is not about stating a source; the bad thing is treating ChatGPT as an authoritative source, as if it were a subject-matter expert.

silversmith

But is "I asked chatgpt" assigning any authority to it? I use precisely that sentence as a shorthand for "I didn't know, looked it up in the most convenient way, and it sounded plausible enough to pass on".

stonemetal12

In general, those point to the person's understanding being shallow. So far, when someone says "GPT said...", it marks a new low in understanding, and there is no further article they googled or second StackOverflow answer with a different take on it. It is the end of the conversation.

spiffyk

Well, it is not, but the three "sources" you mention are not worth much either, much like ChatGPT.

bloppe

SO at least has reputation scores and people vote on answers. An answer with 5000 upvotes, written by someone with high karma, is probably legit.

gruez

>but the three "sources" you mention are not worth much either, much like ChatGPT.

I don't think I've ever seen anyone lambasted for citing Stack Overflow as a source. At most, they get chastised for not reading the comments, but there's nowhere near as much pushback as for LLMs.

dpoloncsak

...isn't that exactly why someone states that?

"Hey, I didn't study this, I found it on Google. Take it with a grain of caution, as it came from the internet" has been shortened to "I googled it and...", which is now evolving to "Hey, I asked chatGPT, and...."

rhizome

All three of those should be followed by "...and I checked it to see if it was a sufficient solution to X..." or words to that effect.

billyoneal

The complaint isn't about stating the source. The complaint is about asking for advice, then ignoring that advice. If one asks how to do something, gets a reply, then replies to that reply with 'but Google says...', that's just as rude.

kimixa

It's a "source" that cannot be reproduced or actually referenced in any way.

And all the other examples will have a chain of "upstream" references, data and discussion.

I suppose you can use those same phrases to reference things without that - random "summaries" without references or research, "expert opinion" from someone without any experience in the sector, opinion pieces from similarly reputation-less people, etc. - but I'd say those are equally worthless as references, just like "According to GPT...", and should be treated similarly.

hx8

It depends on whether they are just repeating things without understanding, or whether they have understanding. My issue with people who say "I asked GPT" is that they often do not have any understanding themselves.

Copying and pasting from ChatGPT has the same consequences as copying and pasting from Stack Overflow, which is to say you're now on the hook for supporting code in production that you don't understand.

tough

We cannot blame the tools for how they are used by those wielding them.

I can use ChatGPT to teach me and help me understand a topic, or I can use it to give me an answer and just copy-paste without double-checking.

It just shows how much you care about the topic at hand, no?

nraynaud

The first two bullet points give you an array of answers/comments helping you cross-check (also, I'm a freak: even on SO, I generally click on the posted documentation links).

JohnFen

I agree wholeheartedly.

"I asked X and it said..." is an appeal to authority and suspect on its face whether or not X is an LLM. But when it's an LLM, then it's even worse. Presumably, the reason for the appeal is because the person using it considers the LLM to be an authoritative or meaningful source. That makes me question the competence of the person saying it.

godelski

  > Something that really frustrates me about interacting with
Something that frustrates me with LLMs is that they are optimized such that errors are as silent as possible.

It is just bad design. You want errors to be as loud as possible, so they can be traced and resolved. LLMs, on the other hand, optimize for human preference (or some proxy of it). While humans prefer accuracy, it would be naive to ignore all the other things that also optimize this objective. Specifically, humans prefer answers that they don't know are wrong over those that they do know are wrong.

This doesn't make LLMs useless, but it certainly should strongly inform how we use them. Frankly, you cannot trust the outputs, so you have to verify. I think this is where there's a big divergence between LLM users (and non-users): those that blindly trust and those that don't (the extreme case being non-users). If you need to constantly verify AND recognize that verification is extra hard (because errors are optimized to be invisible to you), it can create extra work, not less.

It really is two camps and I think it says a lot:

  - "Blindly" trust
  - "Trust" but verify
Wide range of opinions in these two camps, but I think it comes down to some threshold of default trust or default suspicion.

meindnoch

The solution is simple. Before submitting a security report, the reporter must escrow $10 which is awarded to the reviewer if the submission turns out to be AI slop.