Skip to content(if available)orjump to list(if available)

Watching AI drive Microsoft employees insane

diggan

Interesting that every comment has "Help improve Copilot by leaving feedback using the or buttons" suffix, yet none of the comments received any feedback, either positive or negative.

> This seems like it's fixing the symptom rather than the underlying issue?

This is also my experience when you haven't setup a proper system prompt to address this for everything an LLM does. Funniest PRs are the ones that "resolves" test failures by removing/commenting out the test cases, or change the assertions. Googles and Microsofts models seems more likely to do this than OpenAIs and Anthropics models, I wonder if there is some difference in their internal processes that are leaking through here?

The same PR as the quote above continues with 3 more messages before the human seemingly gives up:

> please take a look

> Your new tests aren't being run because the new file wasn't added to the csproj

> Your added tests are failing.

I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.

Another PR: https://github.com/dotnet/runtime/pull/115732/files

How are people reviewing that? 90% of the page height is taken up by "Check failure", can hardly see the code/diff at all. And as a cherry on top, the unit test has a comment that say "Test expressions mentioned in the issue". This whole thing would be fucking hilarious if I didn't feel so bad for the humans who are on the other side of this.

surgical_fire

> I can't imagine how the people who have to deal with this are feeling. It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.

That comparison is awful. I work with quite a few Junior developers and they can be competent. Certainly don't make the silly mistakes that LLMs do, don't need nearly as much handholding, and tend to learn pretty quickly so I don't have to keep repeating myself.

LLMs are decent code assistants when used with care, and can do a lot of heavy lifting, they certainly speed me up when I have a clear picture of what I want to do, and they are good to bounce off ideas when I am planning for something. That said, I really don't see how it could meaningfully replace an intern however, much less an actual developer.

safety1st

These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted. There's a PM who takes your task and gives it to a "developer" who potentially has never actually written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something. After dozens of hours billed you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours, and send you a new copy of the project, where the test always fails. It's exactly like this.

Nice to see that Microsoft has automated that, failure will be cheaper now.

dkdbejwi383

This gives me flashbacks to when my big corporate former employer outsourced a bunch of work offshore.

An outsourced contractor was tasked with a very simple job as their first task - update a single dependency, which required just a bump of the version and no code changes - after three days of them seemingly struggling to even understand what they were asked to do, inability to clone the repo, failure to install the necessary tooling on their machine, they ended up getting fired from the project. Complete waste of money, and the time of those of us having to delegate and review this work.

AbstractH24

> These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted

Those have long been the folks I’ve seen at the biggest risk of being replaced by AI. Tasks that didn’t rely on human interaction or much training, just brute force which can be done from anywhere.

And for them, that $3/hr was really good money.

voxic11

Actually the AI might still be more expensive at this point. But give it a few years I'm sure they will get the costs down.

kamaal

>>These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted.

This level of smugness is why outsourcing still continues to exist. The kind of things you talk about were rare. And were mostly exaggerated to create anti-outsourcing narrative. None of that led to outsourcing actually going away simply because people are actually getting good work done.

Bad quality things are cheap != All cheap things are bad.

Same will work with AI too, while people continue to crap on AI, things will only improve, people will be more productive with AI, get more and bigger things done for cheaper and better. This is just inevitable given how things are going now.

>>There's a PM who takes your task and gives it to a "developer" who potentially has never actually written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something.

In the peak of outsourcing wave. Both the call center people and IT services people had internal training and graduation standards that were quite brutal and mad attrition rates.

Exams often went along the lines of having to write whole ass projects without internet help in hours. Theory exams that had like -2 marks on getting things wrong. Dozens of exams, projects, coding exams, on-floor internships, project interviews.

>>After dozens of hours billed you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours, and send you a new copy of the project, where the test always fails. It's exactly like this.

Most IT services billing had pivoted away from hourly billing, to fixed time and material in the 2000s itself.

>>It's exactly like this.

Very much like outsourcing. AI is here to stay man. Deal with it. Its not going anywhere. For like $20 a month, companies will have same capability as a full time junior dev.

This is NOT going away. Its here to stay. And will only get better with time.

sbarre

I think that was the point of the comparison..

It's not like a regular junior developer, it's much worse.

spacemadness

And yet it got the job and lots of would be juniors didn’t, and it seems to be costing the company more in compute and senior dev handholding. Nice work silicon valley.

preisschild

> That said, I really don't see how it could meaningfully replace an intern however

And even if it could, how do you get senior devs without junior devs? ^^

surgical_fire

What is making it difficult for Junior devs to be hired is not AI. That is a diversion.

The raise in interest rates a couple of years ago triggered many layoffs in the industry. When that happens salaries are squeezed. Experienced people work for less, and juniors have trouble finding job because they are now competing against people with plenty of experience.

lazide

Sounds like a next quarter problem (I wish it was /s).

PKop

Did you miss the "except" in his sentence? He was making the point this is worse than junior devs for all reasons listed.

surgical_fire

I was agreeing with him, by saying that the comparison is awful.

Not sure how it can be read otherwise.

yubblegum

This field (SE - when I started out back in late 80s) was enjoyable. Now it has become toxic, from the interview process, to imitating "big tech" songs and dances by small fry companies, and now this. Is there any joy left in being a professional software developer?

bluefirebrand

Making quite a bit of money brings me a lot of joy compared to other industries

But the actual software part? I'm not sure anymore

diggan

> This field (SE - when I started out back in late 80s) was enjoyable. Now it has become toxic

I feel the same way today, but I got started around 2012 professionally. I wonder how much of this is just our fading optimism after seeing how shit really works behind the scenes, and how much the industry itself is responsible for it. I know we're not the only two people feeling this way either, but it seems all of us have different timescales from when it turned from "enjoyable" to "get me out of here".

null

[deleted]

salawat

My issue stems from the attitudes of the people we're doing it for. I started out doing it for humanity. To bring the bicycle for the mind to everyone.

Then one day I woke up and realized the ones paying me were also the ones using it to run over or do circles around everyone else not equipped with a bicycle yet; and were colluding to make crippled bicycles that'd never liberate the masses as much as they themselves had been previously liberated; bicycles designed to monitor, or to undermine their owner, or more disgustingly, their "licensee".

So I'm not doing it anymore. I'm not going to continue making deliberately crippled, overly complex, legally encumbered bicycles for the mind, purely intended as subjects for ARR extraction.

bwfan123

It happens in waves. For a period, there was an oversupply of cs engineers, and now, the supply will shrink. On top of this, the BS put out by AI code will require experienced engineers to fix.

So, for experienced engineers, I see a great future fixing the shit show that is AI-code.

sweman

>> Is there any joy left in being a professional software developer?

Yes, when your 100k quarterly RSU drop lands

camdenreslink

A very very small percentage of professional software developers get that.

iamleppert

No, there is absolutely no joy left.

coldpie

I've been looking at getting a CDL and becoming a city bus driver, or maybe a USPS driver or deliveryman or clerk or something.

yubblegum

I hear you. Same boat just can't figure out the life jacket yet. (You do fine wood work, why not that? I am considering finding entry level work in architecture myself - kicking myself for giving that up for software now. Did not see this shit show coming.)

mrweasel

At least we can tell the junior developers to not submit a pull-request before they have the tests running locally.

At what point does the human developers just give up and close the PRs as "AI garbage". Keep the ones that works, then just junk the rest. I feel that at some point entertaining the machine becomes unbearable and people just stops doing it or rage close the PRs.

pydry

When their performance reviews stop depending upon them not doing that.

Microsoft's stock price is dependent on them proving that this is a success.

Qem

> Microsoft's stock price is dependent on them proving that this is a success.

Perhaps this explains the recent firings that affected faster CPython and other projects. While they throw money at AI but sucess still doesn't materialize, they need to make the books look good for yet another quarter through the old-school reliable method of laying off people left and right.

null

[deleted]

mrweasel

What happens when they can't prove that and development efficiency starts falling, because developers spend 50% of their time battling copilot?

throwup238

> Interesting that every comment has "Help improve Copilot by leaving feedback using the or buttons" suffix, yet none of the comments received any feedback, either positive or negative.

The feedback buttons open a feedback form modal, they don’t reflect the number of feedback given like the emoji button. If you leave feedback, it will reflect your thumbs up/down (hiding the other button), it doesn’t say anything about whether anyone else has left feedback (I’ve tried it on my own repos).

vasco

> improve Copilot by leaving feedback using the or buttons" suffix, yet none of the comments received any feedback, either positive or negative

Why do they even need it? Success is code getting merged 1st shot, failure gets worse the more requests for changes the agent gets. Asking for manual feedback seems like a waste of time. Measure cycle time and rate of approvals and change failure rate like you would for any developer.

dfxm12

It's like you have a junior developer except they don't even read what you're telling them, and have 0 agency to understand what they're actually doing.

Anyone who has dealt with Microsoft support knows this feeling well. Even talking to the higher level customer success folks feels like talking to a brick wall. After dozens of support cases, I can count on zero hands the number of issues that were closed satisfactorily.

I appreciate Microsoft eating their dogfood here, but please don't make me eat it too! If anyone from MS is reading this, please release finished products that you are prepared to support!

xnorswap

> How are people reviewing that? 90% of the page height is taken up by "Check failure",

Typically, you wouldn't bother manually reviewing something until the automated checks have passed.

diggan

I dunno, when I review code, I don't review what's automatically checked anyways, but thinking about the change/diff in a broader context, and whatever isn't automatically checked. And the earlier you can steer people in the right direction, the better. But maybe this isn't the typical workflow.

xnorswap

The reality is more nuanced, there are situations where you'd want to glance over it anyway, such as looking for an opportunity to coach a junior dev.

I'd rather hop in and get them on the right path rather than letting them struggle alone, particularly if they're struggling.

If it's another senior developer though I'd happily leave them to it to get the unit tests all passing before I take a proper look at their work.

But as a general principle, please at least get a PR through formatting checks before assigning it to a person.

Cthulhu_

It's a waste of time tbh; fixing the checks may require the author to rethink or rewrite their entire solution, which means your review no longer applies.

Let them finish a pull request before spending time reviewing it. That said, a merge request needs to have an issue written before it's picked up, so that the author does not spend time on a solution before the problem is understood. That's idealism though.

phkahler

>> And the earlier you can steer people in the right direction, the better.

The earliest feedback you can get comes from the compiler. If it won't build successfully don't submit the PR.

prossercj

This comment on that PR is pure gold. The bots are talking to each other:

https://github.com/dotnet/runtime/pull/115732#issuecomment-2...

ncr100

Q: Does Microsoft report its findings or learnings BACK to the open source community?

The @stephentoub MS user suggests this is an experiment (https://github.com/dotnet/runtime/pull/115762#issuecomment-2...).

If this is using open source developers to learn how to build a better AI coding agent, will MS share their conclusions ASAP?

EDIT: And not just MS "marketing" how useful AI tools can be.

bramhaag

Seeing Microsoft employees argue with an LLM for hours instead of actually just fixing the problem must be a very encouraging sight for businesses that have built their products on top of .NET.

mikrl

I remember before mass LLM adoption, reading an issue on GitHub where an increasingly frustrated user was failing to properly describe a blocking issue, and the increasingly frustrated maintainer was failing to get them to stick to the issue template.

Now you don’t even need the frustrated end user!

shultays

one day both sides will be AI so we can all relax and enjoy our mojitos

marcosdumay

Well, people have been putting M-x doctor to talk with M-x eliza for decades.

some_random

when that day arrives we'll won't be relaxing, we will be put through a wood chipper

nashashmi

I sometimes feel like that is the right outcome for bad management and bad instructions. Only this time they can’t blame the junior engineer and are left to only blame themselves.

qoez

They'll probably blame openai/the AI instead.

null

[deleted]

nashashmi

AI has reproducible outcomes. If someone else can make it work, then they should too.

gwervc

Especially painful when one of said employee is Stephen Toub, who is famous for his .net performance blog posts.

svaha1728

I was thinking that too. He's a great programmer, and at this point I can't imagine he's having fun 'prompting' an LLM to write correct code.

daveguy

I hope he writes a personal essay about the experience after he leaves Microsoft. Not that he will leave anytime soon, but the first hand accounts of how they are talking about these systems internally are going to be even more entertaining than the wtf PRs.

mock-possum

You don’t think he’s having fun getting laid a ton for playing with computers?

null

[deleted]

empath75

The point of this exercise for Microsoft isn't to produce usable code right now, but to use and improve copilot.

svick

You don't want them to experiment with new tools? The main difference now is that the experiment is public.

stickfigure

It's pretty obviously a failed experiment. Why keep repeating it? Try again in another 3 months.

The answer is probably that the Copilot team is using the rest of the engineering organization as testers. Great for the Copilot team, frustrating for everyone else.

gmm1990

I wouldn't necessarily call that just an experiment if the same requests aren't being fixed without copilot and the ai changes could get merged.

I would say the copilot system isn't really there yet for these kinds of changes, you don't have to run experiments on a language framework to figure that out.

flmontpetit

By all means. Just not on one of the most popular software development frameworks in the world. Maybe that can wait until after the concept is proven.

mystified5016

Yeah, seems to me that breaking .NET with this garbage will be, uh, extremely bad

PKop

Nah I'd prefer they focus on writing code themselves to improve .NET not babysitting a spam-machine

lloydatkinson

That is essentially what I tried to say in my comment there but don't think they wanted to hear it.

ozim

That is why they just fired 7k people so they don’t argue with LLM but let it do the work /s

kruuuder

A comment on the first pull request provides some context:

> The stream of PRs is coming from requests from the maintainers of the repo. We're experimenting to understand the limits of what the tools can do today and preparing for what they'll be able to do tomorrow. Anything that gets merged is the responsibility of the maintainers, as is the case for any PR submitted by anyone to this open source and welcoming repo. Nothing gets merged without it meeting all the same quality bars and with us signing up for all the same maintenance requirements.

abxyz

The author of that comment, an employee of Microsoft, goes on to say:

> It is my opinion that anyone not at least thinking about benefiting from such tools will be left behind.

The read here is: Microsoft is so abuzz with excitement/panic about AI taking all software engineering jobs that Microsoft employees are jumping on board with Microsoft's AI push out of a fear of "being left behind". That's not the confidence inspiring the statement they intended it to be, it's the opposite, it underscores that this isn't the .net team "experimenting to understand the limits of what the tools" but rather the .net team trying to keep their jobs.

Verdex

The "left behind" mantra that I've been hearing for a while now is the strange one to me.

Like, I need to start smashing my face into a keyboard for 10000 hours or else I won't be able to use LLM tools effectively.

If LLM is this tool that is more intuitive than normal programming and adds all this productivity, then surely I can just wait for a bunch of others to wear themselves out smashing the faces on a keyboard for 10000 hours and then skim the cream off of the top, no worse for wear.

On the other hand, if using LLMs is a neverending nightmare of chaos and misery that's 10x harder than programming (but with the benefit that I don't actually have to learn something that might accidentally be useful), then yeah I guess I can see why I would need to get in my hours to use it. But maybe I could just not use it.

"Left behind" really only makes sense to me if my KPIs have been linked with LLM flavor aid style participation.

Ultimately, though, physics doesn't care about social conformity and last I checked the machine is running on physics.

spiffytech

There's a third way things might go: on the way to "superpower for everyone", we go through an extended phase where AI is only a superpower in skilled hands. The job market bifurcates around this. People who make strong use of it get first pick of the good jobs. People not making effective use of AI get whatever's left.

Kinda like how word processing used to be an important career skill people put on their resumes. Assuming AI becomes as that commonplace and accessible, will it happen fast enough that devs who want good jobs can afford to just wait that out?

Vicinity9635

If you're not using it where it's useful to you, then I still wouldn't say you're getting left behind, but you're making your job harder than it has to be. Anecdotally I've found it useful mostly for writing unit tests and sometimes debugging (can be as effective as a rubber duck).

It's like the 2025 version not not using an IDE.

It's a powerful tool. You still need to know when to and when not to use it.

marcosdumay

> It's like the 2025 version not not using an IDE.

That's right on the mark. It will save you a little bit of work on tasks that aren't the bottleneck on your productivity, and disrupt some random tasks that may or may not be important.

It's makes so little difference that plenty of people in 2025 don't use an IDE, and looking at their performance from the outside one just can't tell.

Except that LLMs have less potential to improve your tasks and more potential to be disruptive.

static_void

Tests are one of the areas where it performs least well. I can ask an LLM to summarize the functionality of code and be happy with the answer, but the tests it writes are the most facile unit tests, just the null hypothesis tests and the like. "Here's a test that the constructor works." Cool.

null

[deleted]

the-lazy-guy

This is Stephen Toub, who is the lead of many important .NET projects. I don't think he is worried about losing job anytime soon.

I think, we should not read too much into it. He is honestly exploring how much this tool can help him to resolve trivial issues. Maybe he was asked to do so by some of his bosses, but unlikely to fear the tool replacing him in the near future.

n8cpdx

They don’t have any problem firing experienced devs for no reason. Including on the .NET team (most of the .NET Android dev team was laid off recently).

https://www.theregister.com/2025/05/16/microsofts_axe_softwa...

Perhaps they were fired for failing to show enthusiasm for AI?

low_tech_love

I love the fact that they seem to be asking it to do simple things because ”AI can do the simple boring things for us so we can focus on the important problems” and then it floods them with so many meaningless mumbo jumbo that they could have probably done the simple thing in a fraction of the time they take to keep correcting it continuously.

sensanaty

Didn't M$ just fire like 7000 people, many of which were involved in big important M$ projects? The CPython guys, for example.

spacemadness

Anyone not showing open AI enthusiasm at that level will absolutely be fired. Anyone speaking for MS will have to be openly enthusiastic or silent on the topic by now.

hnthrow90348765

TBF they are dogfooding this (good) but it's just not going well

dmix

> Microsoft employees are jumping on board with Microsoft's AI push out of a fear of "being left behind"

If they weren't experimenting with AI and coding and took a more conservative approach, while other companies like Anthropic was running similar experiments, I'm sure HN would also be critiquing them for not keeping up as a stodgy big corporation.

As long as they are willing to take risks by trying and failing on their own repos, it's fine in my books. Even though I'd never let that stuff touch a professional github repo personally.

username135

i dont think hey are mutually exclusive. jumping on board seems like the smart move if you're worried about losing your career. you also get to confirm your suspicions.

lcnPylGDnU4H9OF

This is important context given that it would be absurd for the managers to have already drawn a definitive conclusion about the models’ capabilities. An explicit understanding that the purpose of the exercise is to get a better idea of the current strengths and weaknesses of the models in a “real world” context makes this actually very reasonable.

mrguyorama

So why in public, and why in the most ham-fisted way, and why on important infrastructure, and why in such a terrible integration that it can't even verify that things compile before opening a PR!

In my org, we would have had to bypass precommit hooks to do this!

rsynnott

Beyond every other absurdity here, well, maybe Microsoft is different, but I would never assign a PR that was _failing CI_ to somebody. That that's happening feels like an admission that the thing doesn't _really_ work at all; if it worked even slightly, it would at least only assign passing PRs, but presumably it's bad enough that if they put in that requirement there would be no PRs.

sbarre

I feel like everyone is applying a worse-case narrative to what's going on here..

I see this as a work in progress.. I am almost certain the humans in the loop on these PRs are well aware of what's going on and have their expectations in check, and this isn't just "business as usual" like any other PR or work assignment.

This is a test. You can't improve a system without testing it on real world conditions.

How do we know they're not tweaking the Copilot system prompts and settings behind the scenes while they're doing this work?

Can no one see the possibility that what is happening in those PRs is exactly what all the people involved expected to have happen, and they're just going through the process of seeing what happens when you try to refine and coach the system to either success or failure?

When we adopted AI coding assist tools internally over a year ago we did almost exactly this (not directly in GitHub though).

We asked a bunch of senior engineers to see how far they could get by coaching the AI to write code rather than writing it themselves. We wanted to calibrate our expectations and better understand the limits, strengths and weaknesses of these new tools we wanted to adopt.

In most of those early cases we ended up with worse code than if it had been written by humans, but we learned a ton. We can also clearly see how much better things have gotten over time, since we have that benchmark to look back on.

rco8786

I think people would be more likely to adopt this view if the overall narrative about AI is that it’s a work in progress and we expect it to get magnitudes better. But the narrative is that AI is already replacing human software engineers.

codyvoda

[flagged]

phkahler

>> I see this as a work in progress.. I am almost certain the humans in the loop on these PRs are well aware of what's going on and have their expectations in check, and this isn't just "business as usual" like any other PR or work assignment.

>> This is a test. You can't improve a system without testing it on real world conditions.

Software developers know to fix build problems before asking for a review. The AIs are submitting PRs in bad faith because they don't know any better. Compilers and other build tools produce errors when they fail, and the AI is ignoring this first line of feedback.

It is not a maintainers job to review code for syntax errors, or use of APIs that don't actually exist, or other silly mistakes. That's the compilers job and it does it well. The AI needs to take that feedback and fix the issues before escalating to humans.

sbarre

Like I said, I think you may be missing the point of the whole exercise.

mieubrisse

I was looking for exactly this comment. Everybody's gloating, "Wow look how dumb AI is! Haha, schadenfreude!" but this seems like just a natural part of the evolution process to me.

It's going to look stupid... until the point it doesn't. And my money's on, "This will eventually be a solved problem."

roxolotl

The question though is what is the time horizon of “eventually”. Very different decisions should be made if it’s 1 year, 2 years, 4 years, 8 years etc. To me it seems as if everyone is making decisions which are only reasonable if the time horizon is 1 year. Maybe they are correct and we’re on the cusp. Maybe they aren’t.

Good decision making would weigh the odds of 1 vs 8 vs 16 years. This isn’t good decision making.

Qem

> It's going to look stupid... until the point it doesn't. And my money's on, "This will eventually be a solved problem."

AI can remain stupid longer than you can remain solvent.

grewsome

Sometimes the last 10% takes 90% of the time. It'll be interesting to see how this pans out, and whether it will eventually get to something that could be considered a solved problem.

I'm not so sure they'll get there. If the solved problem is defined as a sub-standard but low cost, then I wouldn't bet against that. A solution better than that though, I don't think I'd put my money on that.

Workaccount2

To some people, it will always look stupid.

I have met people who believe that automobile engineering peaked in the 1960's, and they will argue that until you are blue in the face.

null

[deleted]

solids

You are not addressing the point in the comment, why are failing CI changes assigned?

sbarre

I believe I did address that when I said "this is not business as usual work"..

So the typical expectations or norms of how code reviews and PRs work between humans don't really apply here.

That's my guess at least. I have no more insider information than you.

beefnugs

This is the exact reason AI sucks : there is no proper feedback loop.

EVERY single prompt should have the opportunity to get copied off into a permanent log where the end user triggers it : log all input, all output, human writes a summary of what he wanted to happen but did not, what he thinks might have went wrong, what he thinks should have happened (domain specific experts giving feedback about how things are fucking up) And then its still only useful with long term tracking like how someone actually made a training change to fix this exact failure scenario.

None of that exists, so just like "full self driving" was a pie in the sky bullshit dream that proved machine learning has an 80/20 never gonna fully work problem, same thing here

Dlanv

They said in the comments that currently the firewall is blocking it from checking tests for passing, and they need to fix that.

Otherwise it would check the tests are passing.

robotcapital

Replace the AI agent with any other new technology and this is an example of a company:

1. Working out in the open

2. Dogfooding their own product

3. Pushing the state of the art

Given that the negative impact here falls mostly (completely?) on the Microsoft team which opted into this, is there any reason why we shouldn't be supporting progress here?

JB_Dev

100% agree. i’m not sure why everyone is clowning on them here. This process is a win. Do people want this all being hidden instead in a forked private repo?

It’s showing the actual capabilities in practice. That’s much better and way more illuminating than what normally happens with sales and marketing hype.

rco8786

Satya says: "I’d say maybe 20%, 30% of the code that is inside of our repos today and some of our projects are probably all written by software".

Zuckerberg says: "Our bet is sort of that in the next year probably … maybe half the development is going to be done by AI, as opposed to people, and then that will just kind of increase from there".

It's hard to square those statements up with what we're seeing happen on these PRs.

SketchySeaBeast

These are AI companies selling AI to executives, there's no need to square the circle, the people that they are talking to have no interest in what's happening in a repo, it's about convincing people to buy in early so they can start making money off their massive investments.

polishdude20

The fact that Zuck is saying "sort of" and "probably" is a big giveaway it's not going to happen.

daveguy

> Satya says: "I’d say maybe 20%, 30% of the code that is inside of our repos today and some of our projects are probably all written by software".

Well, that makes sense to me. Microsoft's software has gotten noticably worse in the last few years. So much that I have abandoned it for my daily driver for the first time since the early 2000s.

throwaway844498

"Pushing the state of the art" and experimenting on a critical software development framework is probably not the best idea.

Dlanv

Why not, when it goes through code review by experienced software engineers who are experts on the subject in a codebase that is covered by extensive unit tests?

Draiken

I don't know about you, but it's much more likely for me to let a bug slip when I'm reviewing someone else's code than when I'm writing it myself.

This is what's happening right now: they are having to review every single line produced by this machine and trying to understand why it wrote what it wrote.

Even with experienced developers reviewing and lots of tests, the likelihood of bugs in this code compared to a real engineer working on it is much higher.

Why not do this on less mission critical software at the very least?

Right now I'm very happy I don't write anything on .NET if this is what they'll use as a guinea pig for the snake oil.

constantcrying

Who is "we" and how and why would "we" "support" or not "support" anything.

Personally I just think it is funny that MS is soft launching a product into total failure.

mrguyorama

>supporting progress

This presupposes AI IS progress.

Nevermind that what this actually shows is an executive or engineering team that so buys their own hype that they didn't even try to run this locally and internally before blasting to the world that their system can't even ensure tests are passing before submitting a PR. They are having a problem with firewall rules blocking the system from seeing CI outcomes and that's part of why it's doing so badly, so why wasn't that verified BEFORE doing this on stage?

"Working out in the open" here is a bad thing. These are issues that SHOULD have been caught by an internal POC FIRST. You don't publicly do bullshit.

"Dogfooding" doesn't require throwing this at important infrastructure code. Does VS code not have small bugs that need fixing? Infrastructure should expect high standards.

"Pushing the state of the art" is comedy. This is the state of the art? This is pushing the state of the art? How much money has been thrown into the fire for this result? How much did each of those PRs cost anyway?

lawn

Because they're using it on an extremely popular repository that many people depend on?

And given the absolute garbage the AI is putting out the quality of the repo will drop. Either slop code will get committed or the bots will suck away time from people who could've done something productive instead.

globalise83

Malicious compliance should be the order of the day. Just approve the requests without reviewing them and wait until management blinks when Microsoft's entire tech stack is on fire. Then quit your job and become a troubleshooter on x3 the pay.

sbarre

I know this is meant to sound witty or clever, but who actually wants to behave this way at their job?

I'll never understand the antagonistic "us vs. them" mentality people have with their employer's leadership, or people who think that you should be actively sabotaging things or be "maliciously compliant" when things aren't perfect or you don't agree with some decision that was made.

To each their own I guess, but I wouldn't be able to sleep well at night.

HelloMcFly

It’s worth recognizing that the tension between labor and capital historical reality, not just a modern-day bad attitude. Workers and leadership don’t automatically share goals, especially when senior management incentives often prioritize reducing labor costs which they always do now (and no, this wasn't always universally so).

Most employees want to do good work, but pretending there’s no structural divergence in interests flattens decades of labor history and ignores the power dynamics baked into modern orgs. It’s not about being antagonistic, it’s about being clear-eyed where there are differences between the motivations of your org. leadership and your personal best interests. After a few levels remove from your position, you're just headcount with loaded cost.

sbarre

Great comment.. It's of course more complex than I made it out to be, I was mostly reacting to the idea of "malicious compliance" at your place of employment and how at odds that is with my own personal morals and approach.

But 100% agreed that everyone should maintain a realistic expectation and understanding of their relationship with their employer, and that job security and employment guarantees are possibly at an all-time low in our industry.

Frost1x

I suppose that depends on your relationship with your employer. If your goals are highly aligned (e.g. lots of equity based compensation, some degree of stability and security, interest in your role, healthy management practices that value their workforce, etc.) then I agree, it’s in your own self interest to push back because it can effect you directly.

Meanwhile a lot of folks have very unhealthy to non-existent relationships with their employers. There may be some mixture where they may be temporary hired/viewed as highly disposable or transient in nature having very little to gain from the success of the business, they may be compensated regardless of success/failure, they may have toxic management who treat them terribly (condescendingly, constantly critical, rarely positive, etc.). Bad and non-existent relationships lead to this sort of behavior. In general we’re moving towards “non-existent” relationships with employers broadly speaking for the labor force.

The counter argument is often floated here “well why work there” and the fact is money is necessary to survive, the number of positions available hiring at any given point is finite, and many almost by definition won’t ever be the top performers in their field to the point they truly choose their employers and career paths with full autonomy. So lots of people end up in lots of places that are toxic or highly misaligned with their interests as a survival mechanism. As such, watching the toxic places shoot themselves in the foot can be some level of justice people find where generally unpleasant people finally get to see consequences of their actions and take some responsibility.

People will prop others up from their own consequences so long as there’s something in it for them. As you peel that away, at some point there’s a level of poetic justice to watch the situation burn. This is why I’m not convinced having completely transactional relationships with employers is a good thing. Even having self interest and stability in mind, certain levels of toxicity in business management can fester. At some point no amount of money is worth dealing with that and some form of correction is needed there. The only mechanism is to typically assure poor decision making and action is actually held accountable.

sbarre

Another great comment, thanks! Like I said elsewhere I agree things are more complicated than I made them out to be in my short and narrow response.

I agree with all your points here, the broader context of one's working conditions really matter.

I do think there's a difference between sitting back and watching things go bad (vs struggling to compensate for other people's bad decisions) and actively contributing to the problems (the "malicious compliance" part)..

Letting things fail is sometimes the right choice to make, if you feel like you can't effect change otherwise.

Being the active reason that things fail, I don't think is ever the right choice.

nope1000

On the other hand: why should you accept that your employer is trying to fire you but first wants you to train the machine that will replace you? For me this is the most "them vs us" it can be.

early_exit

To be fair, "them" are actively working to replace "us" with AI.

Xori71

I agree. It doesn’t help that once things start breaking down, the employer will ask the employees to fix the issue themselves, and thus they’ll have to deal with so much broken code that they’ll be miserable. It’ll become a spiral.

anonymousab

When the issues arise because of the tool being trained explicitly to respect/fire you, then that sounds like an apt and appropriate resulting level of job security.

null

[deleted]

whywhywhywhy

> but who actually wants to behave this way at their job?

Almost no one does but people get ground down and then do it to cope.

mhuffman

>I'll never understand the antagonistic "us vs. them" mentality people have with their employer's leadership

Interesting because "them" very much have an antagonistic mentality vs "us". "Them" would fire you in a fucking heartbeat to save a relatively small amount (10%). "Them" also want to aggressively pay you the least amount for which they can get you to do work for them, not what they "value" you at. "Us" depends on "them" for our livelihoods and the lives of people that depend on us, but "them" doesn't doesn't have any dependency on you that can't be swapped out rather quickly.

I am a capitalist, don't get me wrong, but it is a very one-sided relationship not even-footed or rooted in two-way respect. You describe "them" as "leadership" while "Them" describe you as a "human resource" roughly equivalent to the way toilet paper and plastics for widgets are described.

If you have found a place to work where people respect you as a person, you should really cherish that job, because most are not that way.

sbarre

Yep maybe I've been lucky but in my 30-year career, I've worked at over a dozen companies (big and small), and I've always been well-treated and respected, and I've never felt the kind of dynamic you describe. But that isn't to say that I don't think it exists or happens. I'm sure it does.

It's everyone's personal choice to put their own lens on how they believe other people think - like your take on how "leadership" thinks of their employees.

I guess I choose to be more positive about it - having been in leadership positions myself, including having to oversee layoffs as part of an eventual company wind-down - but I readily acknowledge that my own biases come into this based on my personal career experiences.

null

[deleted]

tantalor

> when Microsoft's entire tech stack is on fire

Too late?

MonkeyClub

Just in time for marshmallows!

weird-eye-issue

That's cute, but the maintainers themselves submitted the requests with Copilot.

xyst

At some point code pilot will just delete the whole codebase. Can’t fail integration tests if there is no code :)

otabdeveloper4

That would be logical, but alas LLMs can't into logic.

Bloating the codebase with dead code is much more likely.

hello_computer

Might as well when they’re going to lay you off no matter what you do (like the guy who made an awesome TypeScript compiler in Go).

balazstorok

At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.

Also, trying something new out will most likely have hiccups. Ultimately it may fail. But that doesn't mean it's not worth the effort.

The thing may rapidly evolve if it's being hard-tested on actual code and actual issues. For example it will be probably changed so that it will iterate until tests are actually running (and maybe some static checking can help it, like not deleting tests).

Waiting to see what happens. I expect it will find its niche in development and become actually useful, taking off menial tasks from developers.

Frost1x

It might be a safer option in a forked version of the project that the public can’t see. I have to wonder about the optics here from a sales perspective. You’d think they’d test this out more internally before putting it in public access.

Now when your small or medium size business management reads about CoPilot in some Executive Quarterly magazine and floats that brilliant idea internally, someone can quite literally point to these as examples of real world examples and let people analyze and pass it up the management chain. Maybe that wasn’t thought through all the way.

Usually businesses tend to hide this sort of performance of their applications to the best of their abilities, only showcasing nearly flawless functionality.

6uhrmittag

> At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.

However, every PR adds load and complexity to community projects.

As another commenter suggested, doing these kind of experiments on separate forks sound a bit less intrusive. Could be a take away from this experiment and set a good example.

There are many cool projects on GitHub that are just accumulating PRs for years, until the maintainer ultimately gives up and someone forks it and cherry-picks the working PRs. I've than that myself.

I'm super worried that we'll end up with more and more of these projects and abandoned forks :/

cesarb

> At least opening PRs is a safe option, you can just dump the whole thing if it doesn't turn out to be useful.

There's however a border zone which is "worse than failure": when it looks good enough that the PRs can be accepted, but contain subtle issues which will bite you later.

UncleMeat

Yep. I've been on teams that have good code review culture and carefully review things so they'd be able to catch subtle issues. But I've also been on teams where reviews are basically "tests pass, approved" with no other examination. Those teams are 100% going to let garbage changes in.

camdenreslink

Even when you review human-written code carefully, subtle bugs can sneak through. Software development is hard.

ecb_penguin

Funny enough, this happens literally every day with millions of developers. There will be thousands upon thousands of incidents in the next hour because a PR looked good, but contained a subtle issue.

xnickb

> I expect it will find its niche in development and become actually useful, taking off menial tasks from developers.

Reading AI generated code is arguably far more annoying than any menial task. Especially if the said code happens to have subtle errors.

Speaking from experience.

ecb_penguin

This is true for all code and has nothing to do with AI. Reading code has always been harder than writing code.

The joke is that PERL was a write-once, read-none language.

> Speaking from experience.

My experience is all code can have subtle errors, and I wouldn't treat any PR differently.

cyanydeez

Unfortunately,if you believe LLMs really can learn to code with bugs, then the nezt step would be to curate a sufficiently bug free data set. Theres no evidence this has occured, rather, they just scraped whayecer

petetnt

GitHub has spent billions of dollars building an AI that struggles with things like whitespace related linting errors on one of the most mature repositories available. This would be probably okay for a hobbyist experiment, but they are selling this as a groundbreaking product that costs real money.

marcosdumay

> This would be probably okay for a hobbyist experiment

It's perfectly ok for a professional research experiment.

What's not ok is their insistence on selling the partial research results.

sexy_seedbox

Nat Friedman must be rolling in his grave...

oh wait

ocdtrekkie

He's rolling in money for sure.

Philpax

Stephen Toub, a Partner Software Engineer at MS, explaining that the maintainers are intentionally requesting these PRs to test Copilot: https://github.com/dotnet/runtime/pull/115762#issuecomment-2...

Crosseye_Jack

I do love one bot asking another bot to sign a CLA! - https://github.com/dotnet/runtime/pull/115732#issuecomment-2...

pm215

That's funny, but also interesting that it didn't "sign" it. I would naively have expected that being handed a clear instruction like "reply with the following information" would strongly bias the LLM to reply as requested. I wonder if they've special cased that kind of thing in the prompt; or perhaps my intuition is just wrong here?

Bedon292

A comment on one of the threads, when a random person tried to have copilot change something, said that copilot will not respond to anyone without write access to the repo. I would assume that bot doesn't have write access, so copilot just ignores them.

Quarrel

AI can't, as I understand it, have copyright over anything they do.

Nor can it be an entity to sign anything.

I assume the "not-copyrightable" issue, doesn't in anyway interfere with the rights trying to be protected by the CLA, but IANAL ..

I assume they've explicitly told it not to sign things (perhaps, because they don't want a sniff of their bot agreeing to things on behalf of MSFT).

candiddevmike

Are LLM contributions effectively under public domain?

90s_dev

Well?? Did it sign it???

jsheard

Not sure if a chatbot can legally sign a contract, we'd better ask ChatGPT for a second opinion.

gortok

At least currently, to qualify for copyright, there must be a human author. https://www.reuters.com/world/us/us-appeals-court-rejects-co...

I have no idea how this will ultimately shake out legally, but it would be absolutely wild for Microsoft to not have thought about this potential legal issue.

TuringNYC

There is some unfortunate history here, though not a perfect analog: https://en.wikipedia.org/wiki/2010_United_States_foreclosure...

b0ner_t0ner

Just need the chatbot to connect to an MCP to call my robotic arm to sign it.

tessierashpool9

offer it more money, then it will sign

Hamuko

I would imagine it can't sign it, especially with the options given.

>I have sole ownership of intellectual property rights to my Submissions

I would assume that the AI cannot have IP ownership considering that an AI cannot have copyright in the US.

>I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer.

Surely an AI would not be classified as an employee and therefore would not have an employer. Has Microsoft drafted an employment contract with Copilot? And if we consider an AI agent to be an employee, is it protected by the Fair Labor Standards Act? Is it getting paid at least minimum wage?

marcosdumay

It didn't. It completely ignored the request.

(Turns out the AI was programmed to ignore bots. Go figure.)

nikolayasdf123

that's the future, AI talking to other AI, everywhere, all the time

thallium205

Is this the first instance of an AI cyber bullying another AI?

Quarrelsome

rah, we might be in trouble here. The primary issue at play is that we don't have a reliable means of measuring developer performance, outside of subjective judgement like end of year reviews.

This means its probably quite hard to measure the gain or the drag of using these agents. On one side, its a lot cheaper than a junior, but on the other side it pulls time from seniors and doesn't necessarily follow instruction well (i.e. "errr your new tests are failing").

This combined with the "cult of the CEO" sets the stage for organisational dissonance where developer complaints can be dismissed as "not wanting to be replaced" and the benefits can be overstated. There will be ways of measuring this, to project it as huge net benefit (which the cult of the CEO will leap upon) and there will be ways of measuring this to project it as a net loss (rabble rousing developers). All because there is no industry standard measure accepted by both parts of the org that can be pointed at which yields the actual truth (whatever that may be).

If I might add absurd conjecture: We might see interesting knock-on effects like orgs demanding a lowering of review standards in order to get more AI PRs into the source.

rco8786

> its a lot cheaper than a junior

I’m not even sure if this is true when considering training costs of the model. It takes a lot of junior engineer salaries to amortize the billions spent building this thing in the first place.

Quarrelsome

sure, but for an org just buying tokens its cheaper and more disposable than an employee. At least it looks better on paper for the bean counters.