AI Slop vs. OSS Security
16 comments
November 6, 2025 · goalieca
jsheard
The other fundamental problem is that to a grifter, it doesn't matter that the output is plausible but often wrong. Plausible is all they need.
Jean-Papoulos
The solution isn't to block aggressively or to allow everything, but to prioritize. Put accounts older than the AI boom at the top, and allow them to give "referrals", ie stake a part of their own credibility to boost another account on the priority ladder.
Referral systems are very efficient at filtering noise.
skydhash
It only works when it's a true stake, like losing privileges when a referral turns out to be a dunce. The downside is tribalism.
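A rough sketch of how the staking mechanic discussed above might be wired up. Everything here is hypothetical: the names, the scoring formula, and the numbers are made up for illustration, not taken from any real triage system.

```python
from dataclasses import dataclass

@dataclass
class Account:
    name: str
    age_days: int            # accounts predating the AI boom start higher
    reputation: float = 0.0  # adjusted by referral outcomes

    def priority(self) -> float:
        # Hypothetical scoring: seniority plus earned (or lost) reputation.
        return self.age_days / 365 + self.reputation

@dataclass
class Referral:
    sponsor: Account   # established account staking its credibility
    newcomer: Account  # account being boosted up the ladder
    stake: float       # reputation the sponsor puts at risk

def boost(r: Referral) -> None:
    # The newcomer moves up the ladder immediately, on the strength of the stake.
    r.newcomer.reputation += r.stake

def resolve(r: Referral, referral_was_good: bool) -> None:
    # The "true stake" part: if the referred account turns out to be a dunce,
    # both the sponsor and the newcomer lose reputation.
    if referral_was_good:
        r.sponsor.reputation += 0.1 * r.stake  # small reward for good vouching
    else:
        r.sponsor.reputation -= r.stake
        r.newcomer.reputation -= 2 * r.stake
```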
beeburrt
I didn't realize how bleak the future looks wrt CVE infrastructure, both at MITRE and at the National Vulnerability Database.
What do other countries do for infrastructure like this?
Maro
Manufacturing vulnerability submissions that look like real vulnerability submissions, but the vulnerability isn't there and the submitter doesn't understand what the report is saying.
It's a cargo cult. Maybe the airplanes will land and bring the goodies!
dvt
> Requiring technical evidence such as screencasts showing reproducibility, integration or unit tests demonstrating the fault, or complete reproduction steps with logs and source code makes it much harder to submit slop.
If this isn't already a requirement, I'm not sure I understand what even non-AI-generated reports look like. Isn't the bare minimum of CVE reporting a minimally reproducible example? Like, even if you find some function that, for example, doesn't do bounds-checking on some array, you can trivially write some unit testing code that's able to break it.
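The kind of evidence being asked for can indeed be tiny. A minimal sketch of what a "unit test demonstrating the fault" might look like, with a hypothetical parse_record standing in for the buggy code (none of this comes from curl or the article):

```python
import unittest

def parse_record(buf: bytes) -> bytes:
    """Hypothetical parser: the first byte declares the payload length.
    It trusts that length field and never checks it against len(buf)."""
    declared_len = buf[0]
    return buf[1:1 + declared_len]  # missing bounds check

class MalformedLengthTest(unittest.TestCase):
    def test_declared_length_exceeds_buffer(self):
        # Declared length 200, but only 3 payload bytes actually follow.
        # A robust parser should reject this input; the buggy one silently
        # returns a truncated payload instead, so this test fails against it
        # and pinpoints the missing check in a reproducible way.
        with self.assertRaises(ValueError):
            parse_record(bytes([200]) + b"abc")

if __name__ == "__main__":
    unittest.main()
```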
bawolff
As someone who worked on the receiving end of security reports, often not. They can be surprisingly poorly written.
You sort of want to reject them all, but occasionally a gem gets submitted, which makes you reluctant.
For example, years ago I was responsible for triaging bug bounty reports at a SaaS company I worked at at the time. One of the most interesting reports was that someone found a way to bypass our OAuth flow using a bug in Safari that allowed them to bypass most OAuth forms. The report was barely understandable, written in broken English. The impression I got was that they had tried to send it to Apple but Apple ignored them. We ended up rewriting the report and submitting it to Apple on their behalf (we made sure the reporter got all the credit).
If we ignored poorly written reports we would have missed that. Is it worth it, though? I don't know.
hshdhdhehd
In the AI age I'd prefer poorly written reports in broken English. Just as long as that doesn't become a known bypass, with the AI then instructed to sound broken.
noirscape
The problem is that a lot of CVEs don't represent "real" vulnerabilities, but merely theoretical ones that could hypothetically be combined to make a real exploit.
Regex exploitation is the forever example to bring up here, as it's generally the main reason that "autofail the CI system the moment an auditing command fails" doesn't work on certain codebases. This happens because it's trivial to craft a string that wastes significant resources when a regex is matched against it, so the moment you have a function that accepts a user-supplied regex pattern, that's suddenly an exploit... which gets a CVE. A lot of projects then have CVEs filed against them because internal functions take regex patterns as arguments, even if they're in code the user is flat-out never going to be able to interact with (i.e. several dozen layers deep in framework soup there's a regex call somewhere, in a way the user won't be able to access unless a developer several layers up starts breaking the framework they're using in really weird ways on purpose).
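For concreteness, a small self-contained demonstration of the resource-exhaustion effect (a hedged sketch using Python's backtracking re engine; the pattern and inputs are invented for illustration, not taken from any real CVE):

```python
import re
import time

# A pattern with nested quantifiers. Matching it against a long run of "a"
# followed by a character that forces the overall match to fail makes a
# backtracking engine (like Python's re) explore exponentially many ways
# to split the run between the inner and outer quantifiers.
pattern = re.compile(r"^(a+)+$")

for n in (18, 20, 22, 24):
    subject = "a" * n + "!"              # trailing "!" guarantees the match fails
    start = time.perf_counter()
    pattern.match(subject)
    elapsed = time.perf_counter() - start
    print(f"n={n:2d}: {elapsed:.3f}s")   # time roughly doubles per extra "a"
```

Whether that constitutes a real vulnerability depends entirely on whether an attacker can ever reach the call with their own input, which is exactly the point being made here.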
The CVE system is just completely broken and barely serves as an indicator of much of anything, really. The approval process, from what I can tell, favors acceptance over rejection: the people reviewing the initial CVE filing aren't the same people who actively investigate whether the CVE is bogus, and the incentive for the CVE system is literally to encourage companies to give a shit about software security (a fact that is also often exploited to create beg bounties). CVEs have been filed against software for what amounts to "a computer allows a user to do things on it" even before AI slop made everything worse; the system was questionable in quality 7 years ago at the very least, and is even worse these days.
The only indicator it really gives is that a real security exploit can feel more legitimate if it gets a CVE assigned to it.
wwfn
Wealth generated on top of underpaid labor is a recurring theme -- and in this case maybe surprisingly exacerbated by LLMs.
Would this be different if the underlying code had a viral license? If Google's infrastructure were built on a GPL'ed libcurl [0], would they have invested in the code and in a team with the resources to evaluate security reports (slop or otherwise)? Ditto for libxml.
Does the GPL help the Linux kernel get investment from its corporate users?
[0] Perhaps an impossible hypothetical. Would Google have skipped over the imaginary GPL'ed libcurl or libxml for a more permissively licensed library? And even if they didn't, would a big company's involvement in an openly developed ecosystem create asymmetric funding/goals, a la XMPP or Nix?
big-and-small
Copyleft licenses are made to support freedom for everyone, particularly end users. They only limit the freedom of developers / maintainers to exploit the code and its users.
> Does the GPL help the Linux kernel get investment from its corporate users?
The GPL has helped the Linux kernel as a project greatly, but companies invest in it out of their own self-interest. They want to benefit from upstream improvements, and playing nicely by upstreaming changes is just much cheaper than maintaining their own kernel fork.
On the other side you have companies like Sony, which have used BSD OS code in their game consoles for decades and contributed shit.
So... two unrelated things.
pksebben
> The model has no concept of truth—only of plausibility.
This is such an important problem to solve, and it feels soluble. Perhaps a layer with heavily biased weights, trained on carefully curated definitional data. If we could train in a sense of truth - even a small one - many of the hallucinatory patterns would disappear.
Hats off to the curl maintainers. You are the xkcd jenga block at the base.
jcattle
I am assuming that millions of dollars have already been spent trying to get LLMs to hallucinate less.
Even if problems feel soluble, they often aren't. You might have to invent an entirely new paradigm of text generation to solve the hallucination problem. Or it could be the Collatz Conjecture of LLMs, where it "feels" so possible, but you never really get there.
big-and-small
Nuclear fusion was always 30 years away (c)
pjc50
The "fact database" is the old AI solution, e.g. Cycorp; it doesn't quite work either. Knowing what is true is a really hard, unsolved problem in philosophy, see e.g. https://en.wikipedia.org/wiki/Gettier_problem . The secret to modern AI is just to skip that and replace unsolvable epistemology with "LGTM", then sell it to investors.
> This is the fundamental problem: AI can generate the form of security research without the substance.
I think this is the fundamental problem of LLMs in general. Some of the time the output looks just right enough to seem legitimate. Luckily, the rest of the time it doesn't.