Copilot broke audit logs, but Microsoft won't tell customers
90 comments
·August 20, 2025
TheRoque
In my opinion, using AI tools for programming at the moment, unless in a sandboxed environment and on a toy project, is just ludicrous. The amount of shady things going on in this domain (AI trained on stolen content, no proper attribution, no proper way to audit what's going out to third-party servers, etc.) should be a huge red flag for any professional developer.
AdieuToLogic
> In my opinion, using AI tools for programming at the moment, unless in a sandboxed environment and on a toy project, is just ludicrous.
Well put.
The fundamental flaw is in trying to employ nondeterministic content generation, based on statistical relevance defined by an unknown training data set (which is what commercial LLM offerings are), in an effort to repeatably produce content satisfying a strict mathematical model (program source code).
mlyle
Nearly as bad: trying to use systems made out of meat, evolved from an unrelated background and trained on an undocumented and chaotic corpus of data, to produce content satisfying a strict mathematical model.
cookiengineer
The difference: Meatbags created something like an education system, where the track record in it functions as a ledger for hiring companies.
There is no such thing for AI. No ledger, no track record, no reproducibility.
AdieuToLogic
Except that the "systems made out of meat" are the entities which both define the problem needing to be solved and are the sole determiners of whether said problem has been solved.
Of note too is that the same "systems made out of meat" have been producing content satisfying the strict mathematical model for decades and continue to do so beyond the capabilities of the aforementioned algorithms.
jdiff
Meaty feet can be held to a fire. To quote IBM, "A computer can never be held accountable."
jp0d
Countless bodies consisting of said meat have been responsible for the advancement of technology so far. If these meat brains don't contribute any new advancements, the corpus of data will stay stagnant!
matt3210
Use AI to audit what’s produced by AI. Problem solved! /sarcasm
ThrowawayTestr
Companies won't use open source software because of licensing concerns, but if you launder it through an LLM it's hunky-dory.
cut3
[flagged]
neuroelectron
The icing on the shit cake is a text editor programmed in TypeScript with an impossible-to-secure plugin architecture.
Sparkyte
AI is a pump and dump scheme promoted by large companies who can't innovate, in order to drive up sales. It isn't even AI; it is just weighted variables.
lokar
Wait, copilot operates as some privileged user (that can bypass audit?), not as you (or better, you with some restrictions)
That can’t be right, can it?
catmanjan
As someone else mentioned, the file isn't actually accessed by Copilot; rather, Copilot is reading the pre-indexed contents of the file in a search engine...
Really, Microsoft should be auditing the search that Copilot executes. It's actually a bit misleading to audit the file as accessed when Copilot has only read the indexed content of the file. I don't say I've visited a website when I've found a result of it in Google.
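To make that concrete, here's a toy sketch of auditing at the index layer instead of the file layer. Every name in it is hypothetical, not Microsoft's actual API; the point is just that a search hit which returns indexed file content should itself generate an access record:

    from datetime import datetime, timezone

    class AuditedSearchIndex:
        def __init__(self, index, audit_log):
            self.index = index          # dict: query -> list of (file_id, snippet)
            self.audit_log = audit_log  # append-only list of audit events

        def search(self, user, query):
            hits = self.index.get(query, [])
            for file_id, _snippet in hits:
                # The user never "opened" the file, but its content was
                # disclosed, so record an access for every file whose
                # snippet is returned.
                self.audit_log.append({
                    "time": datetime.now(timezone.utc).isoformat(),
                    "user": user,
                    "file": file_id,
                    "via": "search-index",
                })
            return hits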
internetter
> I don't say I've visited a website when I've found a result of it in Google
I mean, it depends on how large the index window is, because if Google returned the entire webpage content without leaving (AMP moment), you did visit the website. Fine line.
tomrod
Sure sounds like, for Microsoft, an audit log is optional when it comes to cramming garbage AI integrations in places they don't belong.
ceejayoz
> That can’t be right, can it?
dhosek
That was a laugh-out-loud moment in that film.
lokar
lol. I’ve avoided MS my entire (30+ year) career. Every now and then I’m reminded I made the right choice.
tomrod
Brilliant.
jjkaczor
So... basically like when Delve was first introduced and was improperly security-trimming the things it was suggesting in search results.
... Or... a very long time ago, when SharePoint search would display results and synopses for search terms where a user couldn't open the document, but could see that it existed and could get a matching paragraph or two... The best example I would tell people of the problem was users searching for things like "Fall 2025 layoffs"... if the document existed, then things were being planned...
Ah Microsoft, security-last is still the thing, eh?
ocdtrekkie
I would say "insecure by default".
I talked to some Microsoft folks around the Windows Server 2025 launch, where they claimed they would be breaking more compatibility in the name of their Secure Future Initiative.
But Server 2025 will load malicious ads on the Edge start screen [1] if you need to access a web interface of an internal thing from your domain controller, and they gleefully announced including winget, a wonderful malware delivery tool with zero vetting or accountability, in Server 2025.
Their response to both points was I could disable those if I wanted to. Which I can, but was definitely not the point. You can make a secure environment based on Microsoft technologies, but it will fight you every step of the way.
[1] As a fun fact, this actually makes Internet Explorer a drastically safer browser than Edge on servers! By default, IE's Enhanced Security Configuration (ESC) mode on servers basically refused to load any outside websites.
beart
I've always felt that Microsoft's biggest problem is the way it manages all of the different teams, departments, features, etc. They are completely disconnected and have competing KPIs. I imagine the edge advertising team has a goal to make so much revenue, and the security team has a goal to reduce CVEs, but never the twain shall meet.
Also you probably have to go up 10 levels of management before you reach a common person.
ValveFan6969
I can only assume that Microsoft/OpenAI have some sort of backdoor privileges that allow them to view our messages, or at least analyze and process them.
I wouldn't be surprised.
faangguyindia
I've disabled Copilot; I don't even find it useful. I think most people who use Copilot have not seen "better".
Spooky23
No, it accesses data with the users privilege.
gpm
Are you telling me I, a normal unprivileged user, have a way to read files on windows that bypasses audit logs?
Spooky23
If there is a product defect? Sure.
The dude found the bug, reported the bug, they fixed the bug.
This isn't uncommon; there are bugs like this frequently in complex software.
lokar
I'm guessing they are making an implicit distinction between access as the user, vs with the privs of the user.
In the second case, the process has permission to do whatever it wants; it elects to restrain itself. Which is obviously subject to many more bugs than the first approach.
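A toy illustration of the two models (all names and data here are hypothetical):

    FILES = {"/hr/plans.docx": "..."}
    ACLS = {"/hr/plans.docx": {"alice"}}

    def read_as_user(user, path):
        # Model 1: the process runs *as* the user; the check lives in one
        # place, below the application, and can't be forgotten by a feature.
        if user not in ACLS.get(path, set()):
            raise PermissionError(path)
        return FILES[path]

    def read_privileged(user, path, check_acl=True):
        # Model 2: the process can read anything and merely *elects* to
        # check. Any call site that passes check_acl=False (or skips the
        # audit write) silently bypasses the control, which is the class
        # of bug at issue here.
        if check_acl and user not in ACLS.get(path, set()):
            raise PermissionError(path)
        return FILES[path]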
jeanlucas
A better title would be: Microsoft Copilot isn't HIPAA compliant
A title like this will get it fixed faster.
fulafel
> CVEs are given to fixes deployed in security releases when customers need to take action to stay protected. In this case, the mitigation will be automatically pushed to Copilot, where users do not need to manually update the product and a CVE will not be assigned.
Is this a feature of CVE or of Microsoft's way of using CVE? It would seem this vulnerability would still benefit from having a common ID to be referenced in various contexts (e.g. vulnerability research). Maybe there needs to be another numbering system that will enumerate these kinds of cases and doesn't depend on the vendor.
degamad
One thing that's not clear in the write-up here: *which* audit log is he talking about? Sharepoint file accesses? Copilot actions? Purview? Something else?
RachelF
Lots of things aren't clear.
Copilot is accessing the indexed contents of the file, not the file itself, when you tell it not to access the file.
The blog writer/marketer needs to look at the index access logs.
internetter
> The blog writer/marketer needs to look at the index access logs.
How can you say this if Microsoft is issuing a fix?
nzeid
Hard to count the number of things that can go wrong by relying directly on an LLM to manage audit/activity/etc. logs.
What was their bug fix? Shadow prompts?
jsnell
> Hard to count the number of things that can go wrong by relying directly on an LLM to manage audit/activity/etc. logs.
Nothing in this post suggests that they're relying on the LLM itself to append to the audit logs. That would be a preposterous design. It seems far more likely the audit logs are being written by the scaffolding, not by the LLM, but they instrumented the wrong places. (I.e. emitting on a link or maybe a link preview being output, rather than e.g. on the document being fed to the LLM as a result of RAG or a tool call.)
(Writing the audit logs in the scaffolding is probably also the wrong design, but at least it's just a bad design rather than a totally absurd one.)
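A minimal sketch of that instrumentation distinction, with made-up names and a dict standing in for the document store:

    DOCS = {"doc-42": "quarterly plan..."}

    def retrieve_for_prompt(user, doc_id, audit_log):
        text = DOCS[doc_id]
        # Instrument HERE, when the document enters the model's context...
        audit_log.append({"user": user, "doc": doc_id, "event": "fed-to-llm"})
        return text

    def render_answer(answer):
        # ...NOT here. Emitting the audit event only when a link or preview
        # is rendered is exactly the instrumentation mistake that lets
        # "summarize it, but don't give me a link" leave no trace.
        return answer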
nzeid
Heard, but since the content or its metadata must be surfaced by the LLM, what's the fix?
nzeid
Thinking about this a bit - you'd have to isolate any interaction the LLM has with any content to some sort of middle end that can audit the LLM itself. I'm a bit out of my depth here, though. I don't know what Microsoft does or doesn't do with Copilot.
verandaguy
I'm very sceptical of using shadow prompts (or prompts of any kind) as an actual security/compliance control or enforcement mechanism. These things should be done using a deterministic system.
ath3nd
I bet you are a fan of OpenAI's groundbreaking study mode feature.
gpm
I'd hope that if a tool the LLM uses reveals any part of the file to the LLM it counts as a read by every user who sees any part of the output that occurred after that revelation was added to the context.
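Something like this toy rule, say (hypothetical names throughout):

    class Conversation:
        def __init__(self, audit_log):
            self.files_in_context = set()
            self.audit_log = audit_log

        def tool_revealed(self, file_id):
            # A tool call put some part of this file into the context window.
            self.files_in_context.add(file_id)

        def show_output(self, user, text):
            # Anyone who sees output generated after the reveal gets a "read".
            for file_id in self.files_in_context:
                self.audit_log.append(
                    {"user": user, "file": file_id, "event": "read"})
            return text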
downrightmike
Shadow copies
zavec
Just to make sure I'm understanding footnote one correctly: it shows up (sometimes before, and hopefully every time now) as a Copilot event in the log, and there's no corresponding SharePoint event?
From a brief glance at the O365 docs, it seems like the `AISystemPluginData` field indicates that the event in the screenshot showing the missing access is a Copilot event (or maybe they all get collapsed into one event; I'm not super familiar with O365 audit logs), and I'm inferring from the footnote that there's not another SharePoint event somewhere in either the old or new version. But if there is one, that could at least be a mitigation if you needed to do such a search on the activity before the fix.
jayofdoom
Generally speaking, anyone can file a CVE. Go file one yourself and force their response. This blogpost puts forth reasonably compelling evidence.
thombles
Is there value in requesting a CVE for a service that only Microsoft runs? What's a user supposed to do with that?
aspenmayer
It’s true. The form is right here. When they support PGP, I suspect they know what they’re doing and why, and have probably been continuously doing so for longer than I have been alive. Just look at their sponsors and partners.
Please only use this for legitimate submissions.
db48x
Fun, but it doesn’t deserve a CVE. CVEs are for vulnerabilities that are common across multiple products from multiple sources. Think of a vulnerability in a shared library that is used in most Linux distributions, or is statically linked into multiple programs. Copilot doesn’t meet those criteria.
Honestly, the worst thing about this story is that apparently the Copilot LLM is given the instructions to create audit log entries. That’s the worst design I could imagine! When they use an API to access a file or a url then the API should create the audit log. This is just engineering 101.
gpm
Huh, there are CVEs for windows components all the time, random example: https://msrc.microsoft.com/update-guide/vulnerability/CVE-20...
Including for end user applications, not libraries, another random example: https://msrc.microsoft.com/update-guide/vulnerability/CVE-20...
ecb_penguin
> CVEs are for vulnerabilities that are common across multiple products from multiple sources.
This is absolutely not true. I have no idea where you came up with this.
> Honestly, the worst thing about this story is that apparently the Copilot LLM is given the instructions to create audit log entries.
That's not at all what the article says.
> That’s the worst design I could imagine!
Ok, well, that's not how they designed it.
> This is just engineering 101.
Where is the class for reading 101?
HelloImSteven
CVEs aren’t just for common dependencies. The “Common” part of the name is about having standardized reporting that over time helps reveal common issues occurring across multiple CVEs. Individually they’re just a way to catalog known vulnerabilities and indicate their severity to anyone impacted, whether that’s a hundred people or billions. There are high severity CVEs for individual niche IoT thermostats and light strips with obscure weaknesses.
Technically, CVEs are meant to only affect one codebase, so a vulnerability in a shared library often means a separate CVE for each affected product. It’s only when there’s no way to use the library without being vulnerable that they’d generally make just one CVE covering all affected products. [1]
Even ignoring all that, people are incorporating Copilot into their development process, which makes it a common dependency.
immibis
More accurately, CVEs are for vulnerabilities that may be present on many systems. Then, the CVE number is a reference point that helps you when discussing the vulnerability, like asking whether it's present on a particular system, or what percentage of systems are patched. This vulnerability was only present on one system, so it doesn't need a CVE number. It could have a Microsoft-assigned bug number, but it doesn't need a CVE.
heywire
I am so tired of Microsoft cramming Copilot into everything. Search at $dayjob is completely borked right now. It shows a page of results, but then immediately pops up some warning dialog you cannot dismiss that Copilot can’t access some file “” or something. Every VSCode update I feel like I have to turn off Copilot in some new way. And now apparently it’ll be added to Excel as well. Thankfully I don’t have to use anything from Microsoft after work hours.
troad
> Every VSCode update I feel like I have to turn off Copilot in some new way.
This has genuinely made me work on switching to neovim. I previously demurred because I don't trust supply chains that are random public git repos full of emojis and Discords, but we've reached the point now where they're no less trustworthy than Microsoft. (And realistically, if you use any extensions on VS Code you're already trusting random repos, so you might as well cut out the middle man with an AI + spyware addiction and difficulties understanding consent.)
TheRoque
Same. Actually made me switch to neovim more and more. It's a great time to do so, with the new native package manager (now working in nightly 0.12)
candiddevmike
RE: VSCode copilot, you're not crazy, I'm seeing it too. And across multiple machines, even with settings sync enabled, I have to periodically go on each one and uninstall the copilot extension _again_. I'll notice the Add to chat... in the right click context menu and immediately know it got reinstalled somehow.
I'd switch to VSCodium but I use the WSL and SSH extensions :(
userbinator
> Thankfully I don’t have to use anything from Microsoft after work hours.
There are employers where you don't have to use anything from Microsoft during work hours either.
keyle
Everything except the best thing they could have brought back: Clippy! </3
fragmede
Louis Rossmann put out a YouTube video encouraging internet users to change their profile pictures to an image of Clippy as a form of silent protest against unethical conduct by technology companies, so it's making a comeback!
sgentle
The coercion will continue until metrics improve.
troad
Microsoft's ham-fisted strategy for trying to build a moat around its AI offering, by shoving everyone's documents in it without any real informed consent, genuinely beggars belief.
It will not successfully create a moat (turns out files are portable), but it will successfully peeve off a huge number of users and institutions, and inevitably cause years of litigation and regulatory attention.
Are there no adults left at Microsoft? Or is it now just Copilot all the way up?
p_ing
Copilot pulls from the substrate, like many other apps. No files are stored in Copilot. They’re usually on ODSP but could be in Dataverse or a non-Microsoft product like Confluence (there goes your moat!).
overgard
I don’t know much about audit logs, but the more concerning thing to me is it sounds like it’s up to the program reading the file to register an access? Shouldn’t that be something at the file system level? I’m a bit baffled why this is a copilot bug instead of a file system bug unless copilot has special privileges? (Also to that: ick!)
IcyWindows
I suspect this might be typical RAG, where there is a vector index or chunked data it looks at.
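Roughly this shape, as a sketch with made-up names and a fake relevance function (real systems use embeddings, not word overlap):

    def chunk(doc_id, text, size=200):
        # Split a document into fixed-size chunks; a stand-in for real chunking.
        return [(doc_id, text[i:i + size]) for i in range(0, len(text), size)]

    def retrieve(index, query_terms, user, audit_log, k=3):
        # index is a list of (doc_id, chunk_text) pairs.
        scored = sorted(index, key=lambda c: -sum(w in c[1] for w in query_terms))
        top = scored[:k]
        for doc_id, _snippet in top:
            # The model only ever sees these chunks, never the file itself,
            # so this retrieval step is where the audit record must be written.
            audit_log.append(
                {"user": user, "doc": doc_id, "event": "chunk-retrieved"})
        return top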