AI-Generated Voice Evidence Poses Dangers in Court
88 comments
·March 11, 2025
fhd2
Sounds like reasonable changes.
Generally speaking, I think evidence tampering is not a new problem, and even though it's easy in some cases, I don't think it's _that_ widespread. Just like it's possible to lie on the stand, but people usually think twice before they do it, because _if_ they are found to have lied, they're in trouble.
My main concern is rather that legit evidence can now easily be called into question. That seems to me like a much higher risk than fake evidence, considering the overall dynamics.
But ultimately: Humanity has coped without photo, audio or video evidence for most of its existence. I suppose it will cope again.
mitthrowaway2
I agree. The article didn't touch on this aspect, but we're now at the point where even authentic recordings could be plausibly denied and claimed to be fake. So the entire usage of recordings as evidence will suffer a hit. We may essentially be knocked back to an 18th century level of reliance on eyewitness testimony. One wonders what the consequences for justice will be.
Ukv
I wouldn't say we'd be quite back to pre-photo evidence days. I feel a lot, if not most, of the value in a video/audio recording is not just that the medium has traditionally been difficult to edit, but that it's attesting to a lot of details with high specificity. There's a lot to potentially get caught out on, and not a lot of wiggle room when inconsistencies are spotted (compared to recalling from memory). Document scans and static images are still useful despite having long been trivial to edit, for instance.
thereddaikon
There's already a process for this: it's called chain of custody. If you can't prove the evidence has a solid chain of custody, then it was potentially tampered with and isn't reliable.
AnthonyMouse
The usual chain of custody goes something like: The store has a video surveillance system which the police collect the footage from, so the chain of custody goes through the store and the police which implies that nobody other than those two have tampered with it.
But then you have an inside job where the perpetrators work for the store and have doctored the footage before the police come to pick it up, or a corrupt cop who wants to convict someone without proving their case or is accepting bribes to convict the wrong person and now has easy access to forgeries. Chain of custody can't help you in either of these cases, and both of those things definitely happen in real life, so how do you determine when they happen or don't?
mrandish
Yep, "chain of custody." Came here hoping to see that concept discussed since it's how the system already deals with cases of potential evidence tampering. If the evidence is of material importance and there's no sufficiently credible chain of custody, then its validity can be questioned. The concept started around purely physical evidence but applies to image, audio and video. The good thing about the ubiquity of deepfake memes on social media is that it familiarizes judges and juries with how easy it now is to create plausible fake media.
LPisGood
This is unironically a use case for blockchain.
_DeadFred_
This is such BS. The government is ALWAYS deferred to when the chain of custody is broken, because 'good faith' is applied. As long as 'good faith' is routinely dispensed, 'chain of custody' is nothing but propaganda for the justice system, not an actual tool used for justice.
As long as chain of custody can be discarded because of 'good faith' whenever it becomes inconvenient, it is not a real thing.
nine_k
I can easily imagine a future where video evidence is only acceptable in the form of chemically developed analog film, at resolutions that are prohibitively expensive to model, and audio recordings of any kind are not admissible as evidence at all. Signatures on paper, faxes, etc. are, of course, inadmissible too.
AyyEye
> we're now at the point where even authentic recordings could be plausibly denied and claimed to be fake.
We've been there for at least two years.
https://arstechnica.com/tech-policy/2023/04/judge-slams-tesl...
datadrivenangel
Digital forensics will continue to be an in-demand skill!
treetalker
I question the wisdom of setting the judge up as a superjury / gatekeeper for this kind of situation. This seems like a reliability / weight of the evidence scenario, not a reliability / qualification of the witness scenario (as with an expert witness).
Why would the judge be better qualified to determine whether the voice was authentic, as opposed to the witness? And why should the judge effectively determine the witness's credibility or ability to discern, when that's what juries are for?
All that said, emulated voices do pose big problems for litigation.
bluGill
I don't know what the latest is, but often judges are supposed to not allow "expert testimony" without checking that the person is an expert. However, this is a really complex area. Judges don't want to be the ones deciding a case, but not allowing some "expert" is, in a way, deciding the case.
zusammen
There’s a built-in design paradox. How does a judge assess an expert in a field where he is not one? There’s probably some improvement that comes from experience but it’s not perfect.
stogot
Qualifications = academics (usually) and career experience. There's no way to know if they actually learned something and aren't a fraud. Even corporations that do thousands of interviews get duped.
hiatus
Isn't it up to the opposing party, and not the judge, to impeach an expert?
ARandumGuy
If it were solely up to the opposing party, they'd strike down every single expert. The judge is still the ultimate decider on what evidence (including expert testimony) is admissible.
_DeadFred_
You mean like expert witness polygraphers that were treated as fact by the courts for years and still used to re-incarcerate people on parole/probation?
Or gun matching (ballistics) that are no longer considered conclusive but subjective? https://www.mdcourts.gov/data/opinions/coa/2023/10a22.pdf
Hair and Fiber expert analysis that was wrong? https://innocenceproject.org/fbi-agents-gave-erroneous-testi...
Or do you mean bite mark analysis that was again wrong?
Many of these forensic methods were used for decades, presented and treated as conclusive evidence, and led to wrongful convictions before being challenged. But yes, let's have 'certified' voice experts whose living is based on being hired by the government and giving testimony to convict people. Surely this time it will be altruistic and scientific.
klipklop
What always worried me is that nobody ever challenged these methods. We just accepted them as fact. Endless crime TV shows reinforce the idea that these methods are infallible. People who watch these shows then become jurors. Scary.
whywhywhywhy
Gell-Mann Amnesia effect comes to mind.
NoMoreNicksLeft
Excluding fake evidence is very much a responsibility of the judge. In the age of Fox News, letting the jury decide for themselves whether or not made-up bullshit is actual evidence seems like a recipe for disaster, and not necessarily one that errs on the side of caution.
treetalker
To the contrary, in US courts the jury determines whether evidence (documentary, testimonial) is credible or not, and what weight (if any) to assign it. (Experts, à la Daubert etc., are a different matter because they give expert opinions, not factual evidence based on personal witnessing of the events in the case, so the judge does perform a gatekeeping function, essentially to ensure the underlying field/science is reliable.)
While certainly Fox News headlines would not reach the jury in most instances, that is on account of hearsay, lack of qualification, materiality, relevance, and similar rules. It is not a prior credibility or weight determination by the judge, as I understand TFA to be advocating. So: did the witness hear a voice that he believed to be the one in question? If so, jury gets to decide (unless unfairly prejudicial or some other overriding rule comes into play).
pharrington
IANAL, and I am assuming you are a lawyer - I interpreted @NoMoreNicksLeft as saying that it's a judge's responsibility to determine whether or not evidence is admissible - before it gets to a jury. And that's also what the article is talking about - "The examples should illustrate circumstances that may satisfy the authentication requirement while still leaving judges discretion to exclude an item of evidence if there is other proof that it is a fake. "
NoMoreNicksLeft
>While certainly Fox News headlines would not reach the jury in most instances, that is on account of hearsay,
Nor should obviously fake evidence reach the jury. They can judge for themselves whether testimony is credible, but this is far different than admitting faked evidence. And if you can't see the difference, I'm not sure I'm qualified to explain it to you.
PolieBotics
I've developed a novel approach to creating tamper-evident video via cryptographic feedback loops between projectors and cameras. The process works as follows:
1. A projector displays a challenge pattern (Perlin noise derived from a hash)
2. A camera captures this projection
3. The system hashes the captured image concatenated with the previous hash and uses it to derive the next projection
4. This chain demonstrates true temporal sequentiality that's difficult to forge
By incorporating random noise derived from Byzantine Fault Tolerant networks and using these networks as timestamping servers, the proofs inherit the network's decentralization properties. ML then confirms that the feature distributions in projection-photograph pairs match expected patterns from the training dataset.
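A minimal sketch of the loop in Python, with hypothetical project()/capture() hooks standing in for the real projector and camera I/O (the actual implementation lives in the repo linked below):

```python
import hashlib

def next_hash(prev_hash: bytes, captured_image: bytes) -> bytes:
    # Step 3: hash the captured image concatenated with the previous hash.
    return hashlib.sha256(prev_hash + captured_image).digest()

def run_loop(seed: bytes, n_frames: int, project, capture) -> list[bytes]:
    """project(h) renders a noise pattern seeded by h onto the scene;
    capture() photographs the illuminated scene. Both are device hooks."""
    chain = [seed]
    h = seed
    for _ in range(n_frames):
        project(h)               # step 1: display the challenge pattern
        image = capture()        # step 2: record the scene under that pattern
        h = next_hash(h, image)  # steps 3-4: extend the tamper-evident chain
        chain.append(h)
    return chain  # a verifier replays the recording and recomputes this chain
```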
Demo video and GitHub repo available here: https://www.reddit.com/r/PoliePals/comments/1j8qm2j/truth_be...
o11c
There are actually (at least) 3 different places where cryptography is needed here:
* Proof that this starts after a given time. Traditionally this has used methods like "this is the headline of a major newspaper today", which is limited to 1-day granularity and has problems if you can just generate a large number of expected headlines and use them in parallel. But with crypto, we can just query any random-number-timestamp-signing server, and a network of such servers can mutually sign each other's previous packets so it's very reliable both against downtime and against attacks.
* Proof of sequencing. This is trivial with a chain of hashes, though it does prevent recompression.
* Proof that this ends before a given time. This requires actively submitting your signature data to a timestamp server for additional signing, which is a much more complicated task than the initial half. It is still possible to eliminate the single source of vulnerability, but much more work.
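A rough sketch of how those three pieces might compose, with a hypothetical randomness-beacon value standing in for the start bound and a stubbed timestamp server for the end bound:

```python
import hashlib
import time

def chain(beacon_value: bytes, frames: list[bytes]) -> bytes:
    # Start bound: the beacon value was unknowable before its publication,
    # so the chain cannot have begun earlier than that.
    h = hashlib.sha256(beacon_value).digest()
    for frame in frames:
        # Sequencing: each link commits to everything before it.
        h = hashlib.sha256(h + frame).digest()
    return h

def countersign(final_hash: bytes) -> dict:
    # End bound: a timestamp server signs (hash, time); stubbed here.
    return {"hash": final_hash.hex(),
            "time": time.time(),
            "sig": "<server signature over hash||time>"}
```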
"Camera looks at monitor" is going to be a much cheaper way to make this air-gapped than adding a projector. And this doesn't strictly need to be continuous; most things are tolerant of one-day granularity and almost everything of 15-minute granularity.
PolieBotics
Largely true, although submitting timestamp hashes to the blockchain is probably the easiest bit.
"Camera looking at a monitor": While that might be simpler in some setups, it doesn’t really solve my main issue. I want the signal to permeate the entire scene, not just appear in the corner of a display or overlaid on the video. By projecting the challenge onto all visible surfaces, we create a physical environment that’s difficult to fake (since you’d have to convincingly generate or remove those patterns in real time). Air-gapping isn’t really the goal right now.
Finally, we need much finer granularity than 15 minutes! The point is to lower the generation time below what is achievable via generative models.
Thank you for the comment, and I hope these clarifications are useful. It's a new concept, so please forgive the clumsiness with which I may be communicating it.
o11c
"The blockchain" is just a wasteful way of doing things compared to the plain crypto.
"Overlay the entire scene" doesn't actually appear to add any information-theoretic value compared to simply bounding the timestamp at which the video was made. Nothing either of us is talking about will actually prevent fakes (before the camera signs it), only constrain the time at which the fakes are made.
Slower-than-real-time generation of fakes is still significantly inhibited by the fact that the hash sequence can be checked for continuity across long lengths of time whose bounds are verified using the other steps.
asddubs
pretty clever, kind of like cipher block chaining
PolieBotics
Thank you! It is indeed a little like a signature based on proof-of-projection.
As they say, once you have a signature, you have most of a cryptosystem. I've been experimenting with those and other applications of non-linear functions in projector-camera systems.
Lammy
In the future this stuff will get so good that the public will beg to be surveilled at all times because it will be the only way to prove what you didn't do. You will learn to love Total Information Awareness. Consent status: manufactured :)
nostrademons
It could easily go the other way, where the public doesn't care what people think they did or didn't do and just does whatever they want, because they don't respect the state and believe the social contract has been broken. "Fuck justice, talk to my AR-15."
There's ample evidence that this is already happening, eg. recent headlines about kids being radicalized at increasingly younger ages, groups like No Lives Matter that embrace violent nihilism, increased domestic terrorism, record high gun ownership across both sides of the political spectrum, authority figures that just do whatever they want and ignore any form of law or accountability, etc.
bilbo0s
I don't know man?
We're already at a place where most people don't care what other people think of them.
Issue is, as long as the government has the big guns, what the government thinks of you will still matter in a major way.
In such an environment, most people are going to choose to have some kind of way to prove to the government what they did and what they didn't do. Not because they care what other people think as you're implying, but rather because they very much care that the government not get the wrong idea about them. Because the government getting the wrong idea about you can be fatal.
nostrademons
Government is based on the threat of the use of force, not the actual use of force. If you're a government that is regularly using force against a significant proportion of your citizens, you have problems, and probably will not remain the government for long.
I suspect that we're saying the same thing but with reversed causality. Both of us agree that non-deterministic enforcement breaks down the incentives needed for pro-social behavior. You're saying that this will cause people to demand ways to improve the government's enforcement abilities. I'm saying that this will cause people to adapt their behavior to the new, lessened enforcement abilities. In defense of my point, I'd point out that changes to government are a coordination problem while adaptation of behavior is an individual-only response, and it is much easier to effect changes to your own behavior than it is to convince 300 million people to agree on a solution and implement it, particularly when the root problem is a lack of enforcement ability.
dghlsakjg
How is AI voice faking any different than any other type of faking? How is it different than a manipulated recording, or a recording where someone is imitating another?
It is just as easy to fake many paper documents, and we have accepted documents as evidence for centuries.
Photos can be faked, video can be edited or faked, witnesses lie or misremember.
Is this just about telling lawyers that unvetted audio recordings can be unreliable? Because that shouldn't be news.
Edit: this is a good-faith question; I'm legitimately just curious. Splicing and editing have been around since recording was invented, so I wonder why voice recordings would have been given extra evidential weight when manipulating recordings is a known possibility.
rout39574
Presuming good faith here, faking recordings has been harder to do, easier to detect, and less equivocal in the past than it is now.
If it takes an FX house to generate a plausible recording of me saying something I didn't say, that's a risky enterprise with a lot of witnesses.
If my enemy can do it in their basement with an hour of research, the exposure risk goes way down, and consequently the expectation you'll see it in real life goes way up.
ceejayoz
Nuclear weapons require nation-state levels of resources. Now imagine you can build one in your basement for $10k. Does that change things at all?
dghlsakjg
I understand what you are saying, but my point is that I could make plausible sounding recordings that did not reflect reality by, for example, cutting recordings up with freeware like Audacity, or even using a consumer level double tapedeck before that. It wasn't Manhattan project levels of effort before this.
This seems more like people losing their minds over 3d printed guns, when hobbyists with a drill press have been making guns in the garage for decades.
Yeah, it's easier now to fake voice, but it's not as if what this article warns against wasn't possible before the latest AI hype cycle. And it is also worth noting that voice cloning/changing technology is not particularly new either (I've been able to sound like Morgan Freeman using a phone app for at least half a decade).
I agree that courts should be cautious around accepting voice recording evidence, I just don't think that the ability to do this is new.
Majromax
> I could make plausible sounding recordings that did not reflect reality by, for example, cutting recordings up with freeware like Audacity,
Cutting up what recordings, exactly? If you're mixing-and-matching, you need a pretty broad corpus of source material with fairly consistent recording quality (background noise, etc), and even still you're limited to reproducing words that are already present. You can't cut-and-splice together a recording of me saying 'Yes, dghlsakjg, I embezzled $1 million and blew it on the ponies' because there will be no recording of me saying your username, 'embezzled', or 'ponies.'
The problem comes with the voice-cloning technology that can construct entirely new sentences based on relatively short voice profile samples.
> I've been able to sound like Morgan Freeman using a phone app for at least half a decade
You've been able to sound like Morgan Freeman because of specific, hard tuning work put into the voice changer. Now, you can sound like your boss, or your neighbour, or your ex.
ceejayoz
> I could make plausible sounding recordings that did not reflect reality by, for example, cutting recordings up with freeware like Audacity, or even using a consumer level double tapedeck before that.
Sure, and https://en.wikipedia.org/wiki/David_Hahn managed to get quite a bit of nuclear material.
Being able to do it at scale, convincingly, in real-time, for any arbitrary text, with just 30s or so of someone's voice as a sample, changes the calculus a lot.
gdulli
"We've had projectile murder for centuries with arrows, how are these machine guns and missiles any different?"
tgv
I'll say it again, even though it is rather unpopular here: there has never been a need to develop these tools, nor one to make them easy to deploy, nor one to make them easy to use. Yet all this has happened, and now it may occur that someone is acquitted because AI generated media is so good, the evidence might be artificial. If that happens, and the suspect commits another crime, it's on the conscience of the people that contributed to this. You cannot create something and pretend its use has nothing to do with you.
The tools aren't perfect yet, so it's not too late to stop. Stop the ridiculous image and audio generation tools before it's too late. Nothing of value is lost when these models are made private again, and research is simply halted.
bdangubic
> You cannot create something and pretend its use has nothing to do with you.
has always worked with guns, so it'll always work with anything else :) and guns will kill more people than AI-generated shit for the foreseeable future
cootsnuck
It actually is too late for that. For anyone unaware, the open source models are already more than sufficient for imitations and deepfakes. For better or worse, there's no going back.
Personally, I'd rather we all know this tech is out there and develop defense mechanisms rather than thinking hiding it away will prevent harm.
The cyber security industry exists because of all the privacy and security issues posed by all the tech we already have had for the past several decades.
I'm confident the same will happen for AI simply because it is a business opportunity and other businesses and institutions are already talking about these issues.
qwertox
I do want a computer which talks to me like a person, with that agility. I am used to listening to voice.
I am tired of reading so much stuff which could well be spoken to me. Like this comment here, which you now need to read.
It's only very recently that living beings on this world have learned to read and write; it's not normal. It's normal to communicate through sound.
triceratops
Reading and writing isn't "normal" but using a computer is? Humans have been doing the former for way longer.
krainboltgreene
What an incredible line to end on. “It’s not normal to read/write”.
qwertox
I'm not aware of any other species which is capable of reading and writing. But many can certainly hear and make sounds.
gowld
You don't need writing, if you want to live in a cave or wander a prairie, I guess?
educasean
Fatalities due to automobile accidents are a major drag. We should've stuck with horses
a2128
There is some use to bringing deepfakes to mass adoption. The thing is that since the tech exists, powerful actors with lots of resources will develop these tools for their own use either way. The question is whether they'll be able to fool the masses who are unaware that such realistic deepfakes can exist, or whether they will have no effect as everyone and their mom have already seen similar AI slop on their Facebook feed
gortok
What shocks (and irritates!) me is that Charles Schwab keeps wanting me to set up voice ID. Why would I want to set up a voice ID for something that is now trivially spoofed?
alamortsubite
When was the last time you encountered this? I remember getting nags up until around the end of last year, but not lately. I like to think they dropped the program because I expressed concerns about it to so many reps, but more likely I've just been dialing in on a different number.
MichaelDickens
Schwab used to have an 8 character maximum on passwords (although at least they changed that). They have never been a paragon of good security practices.
recursive
Here's my plan.
1. Cryptographically hash each piece of media when it's recorded.
2. Submit the hash to a "trusted" authority.
3. It will add a timestamp and sign the result.
4. Now, as long as you keep the original, without re-compressing, and you trust the authority, you have some evidence that the media existed at a timestamp. On or before.
This doesn't prove authenticity, but in many cases, establishing a timestamp would be enough. Forgeries probably wouldn't be created until later, after the shit hit the fan.
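A toy version of steps 1-4, using an Ed25519 key from the `cryptography` package to play the trusted authority (a real deployment would use something like an RFC 3161 timestamping service):

```python
import hashlib
import time
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

authority_key = Ed25519PrivateKey.generate()  # held by the trusted authority

def timestamp(media_hash: bytes) -> tuple[float, bytes]:
    # Steps 2-3: the authority binds the hash to the current time and signs.
    ts = time.time()
    return ts, authority_key.sign(media_hash + str(ts).encode())

# Step 1: hash the original recording (kept as-is, never re-compressed).
with open("original.mp4", "rb") as f:
    digest = hashlib.sha256(f.read()).digest()

ts, sig = timestamp(digest)

# Step 4: anyone holding the authority's public key can check the claim;
# verify() raises InvalidSignature if the hash, time, or media changed.
authority_key.public_key().verify(sig, digest + str(ts).encode())
print(f"Existed on or before {time.ctime(ts)}")
```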
Or maybe this doesn't work at all.
AnthonyMouse
Thieves are planning an inside job. They forge the surveillance video ahead of time, do the theft and submit the forgery to be timestamped while it's happening.
Also, a lot of surveillance systems are purposely kept offline to prevent them from being compromised, but your system doesn't allow that because they would need external connectivity to get signatures.
lolc
Sure when you're controlling the source, you can fake it. But requiring the fakes be prepared ahead of time locks the faker into one story that may be contradicted by other evidence.
AnthonyMouse
That's pretty much what happens anyway. The police are going to come collect the footage as soon as the crime is reported and you can't change it after they do.
carra
Wait some more time and photos or even video recordings will be deemed just as dangerous. And then what? Even if there is real evidence, it will have to be discarded if it can't be sufficiently validated. It will get very hard to prove anything.
bluGill
Or perhaps camera manufacturers will start putting traces in. Anyone who makes surveillance systems (like stores use) should build in some end-to-end verification so they can say "this camera took that recording and we can see that it wasn't tampered with by..." (If anyone works for a surveillance company, please run with this!) With encryption we can verify a lot of things, but it sets the bar higher than "someone took a picture."
Of course in a darkroom someone skilled could always make a fake photo - but the bar is a lot lower with AI.
AnthonyMouse
Cryptography doesn't really fix it. There are a zillion camera makers and all it takes is one of them to have poor security and leak the keys. Then the forger uses any of the cameras with extractable keys to sign their forgery.
Or they just point a camera at a high resolution playback of a forged video.
This also assumes you can trust all the camera makers, because by doing this you're implicitly giving them each authorization to produce forged videos. Recall that many of these companies are based in China.
gs17
> Or they just point a camera at a high resolution playback of a forged video.
Exactly, it's just like DRM or cheating in video games. Even if everything in software is blocked, there's always the analogue hole.
bluGill
The point is the camera maker certifies in court under perjury penalty that their cameras are not compromised and that is their image. "Other camera systems are compromised, but ours is not...".
Lanolderen
The smart encryption people will fix it. At some point it had to be fixed anyway, since LLMs have only lowered the effort required.
paulnpace
If it goes that direction, one silver lining could be that they also realize video conference tools used for court proceedings should also go away.
nottorp
So... those talking head "influencers" who leave multiple hours of voice and video samples on social networking for anyone to download and clone are the most at risk for an attack like this?
TriangleEdge
AI threatens all digital perceptions, not just voice. Images, videos, recordings, ... I think soon enough proving things in court beyond a reasonable doubt when the evidence is digital media will be difficult/impossible.
exe34
On a related note, why oh why does Lloyds Bank insist on grabbing my voice for login every time I call them? I have to keep saying "no, fcuk off!" a dozen times until it gives up.