> Microsoft CEO Satya Nadella said that 20%-30% of code inside the company’s repositories was “written by software” — meaning AI
Does it mean AI, though? Lots of lines in repositories are generated by software that isn’t AI: dependency lock files, code generated from proto files, etc.
IMO the wording is intentionally misleading.
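To put a rough number on the "generated by software that isn’t AI" point, here is a minimal sketch that estimates what share of a repo's lines sit in machine-generated files. The pattern list and the line counts are made up for illustration; a real repo would need its own list, and nothing here reflects how Microsoft actually measures anything.

```python
from fnmatch import fnmatchcase

# Hypothetical list of file-name patterns for machine-generated files:
# dependency lock files and protobuf compiler output.
GENERATED_PATTERNS = [
    "package-lock.json",
    "yarn.lock",
    "Cargo.lock",
    "*_pb2.py",   # Python stubs emitted by protoc
    "*.pb.go",    # Go stubs emitted by protoc
]


def is_generated(path: str) -> bool:
    """Return True if the file name matches one of the generated-file patterns."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in GENERATED_PATTERNS)


def generated_share(line_counts: dict[str, int]) -> float:
    """Fraction of all lines that live in generated files (0.0 for an empty repo)."""
    total = sum(line_counts.values())
    if total == 0:
        return 0.0
    generated = sum(n for path, n in line_counts.items() if is_generated(path))
    return generated / total


# Made-up line counts for a toy repository.
toy_repo = {
    "src/app.py": 700,
    "package-lock.json": 250,
    "api/user_pb2.py": 50,
}
print(f"{generated_share(toy_repo):.0%} of lines were written by software")
```

In this toy repo nearly a third of the lines are "written by software" and none of it involves an LLM.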
Really, really, really depends on what sort of code is being 'written'. 20 years ago, IDEs would automatically create boilerplate getters and setters. In large projects that's a non-trivial amount of code. IDEs can autocomplete stuff already. For most of the folks I know doing non-trivial projects, AI tools are... useful autocompletes, but not much more. So... 25% of your code was done by AI, but is it the hard nitty-gritty stuff? The value prop of your whole company? Or is it just lots of boilerplate that is necessary because of all the abstractions we have at our disposal today (or... all the abstractions that are required to do anything 'modern', to put a negative light on it)?
Percentages are misleading.
How many lines in a diff are actually relevant code? Anyone who does reviews knows the answer.
That is one of the reasons why lean, terse languages are often better to review.
We can guess from those companies' preferred coding styles and technologies whether their codebases are lean and terse or full of straw. And that should give us an estimate.
Of course, I could be wrong. They could be doing this measurement after removing irrelevant changes.
The better choice would be not to publish those sorts of claims if there is not a clear methodology that explains how the number was achieved.
I swear it's as if the entire last 5 years of AI hype and sales were fueled by drugs.
The last few years of AI have been the longest decade.
This was reported as actually just IDE tab completion back when Google claimed this stat last year.
Google's internal and extremely sophisticated LLM completion thing is driven by IDE tab completion.
It also drives my Gmail sentence autocomplete. It does not mean “30% of my email is written by AI”.
I don’t believe that datum for a second considering how big their existing code bases are.
If any line of any codebase was written by AI, then up to 30% can be true, in the way that my ISP will claim ~900 Mbps still qualifies as up to 2 Gbps.
I don't believe it either, but they probably meant 30% of the code committed, not of their total codebase.
It doesn’t have to be contiguous AI-written chunks. This could also mean X% accepted AI suggestions on Y% of the codebase, where X*Y < 30%.
AI suggestions can also be as simple as autocomplete and still be counted for the sake of engagement metrics.
Oh, and in enterprise settings, especially MS shops, GitHub Copilot is being pushed everywhere, so (forced) adoption rates are much higher than the market average.
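A quick back-of-the-envelope for the X*Y point, as a sketch. It assumes X is the fraction of code in the covered areas that came from accepted suggestions and Y is the fraction of the codebase where suggestions were in use at all; the function name and the sample numbers are invented for illustration.

```python
def ai_share(acceptance_rate: float, coverage: float) -> float:
    """Overall fraction of code attributable to accepted AI suggestions.

    acceptance_rate: X, share of code in covered areas that came from accepted suggestions.
    coverage: Y, share of the codebase where suggestions were in use.
    """
    if not (0.0 <= acceptance_rate <= 1.0 and 0.0 <= coverage <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return acceptance_rate * coverage


# Made-up scenarios: very different X and Y can land on similar headline numbers.
scenarios = {
    "heavy use in a small area": (0.90, 0.30),
    "light use almost everywhere": (0.30, 0.90),
    "moderate use in half the code": (0.50, 0.50),
}
for label, (x, y) in scenarios.items():
    print(f"{label}: {ai_share(x, y):.0%}")
```

Every scenario stays under 30%, so the headline figure alone says nothing about which one you are looking at.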
Right, they're not going to want to risk breaking legacy application behavior.
Generated by software -> generated by AI seems like a huge logical leap. Then again, maybe it holds for a given meaning of "AI".
Not that 30% of code being automatically generated from templates or in some algorithmic way seems unbelievable. There is likely a lot of code that could be generated by other code, and it might even be a reasonable choice.
Technically 0.5% qualifies as up to 30% to the marketing crowd.
> Satya Nadella said that 20%-30% of code inside the company’s repositories was “written by software”
and
> The Microsoft CEO said the company was seeing mixed results in AI-generated code across different languages, with more progress in Python and less in C++.
So the CEO of Microsoft is saying that 20 - 30% of their code is being produced by computer systems that write poor code?
Honestly, it is disappointing. Even joke jobs are taken away by LLMs. For shame.
My current VBA coding is all generated by ChatGPT/Claude/DeepSeek.
There is no use in writing VBA these days :@
So was it AI or code-gen tools (interface generators, scaffolding, etc.)?
Horrible. Why would you not let people eat on earth, and in the name of saving take away jobs? Let AI be helpful as a tool to help people, not just to take away jobs.
Explains a lot.
Based on how eager these models[0] are to over-engineer a solution, it’s not surprising. I would imagine the real number is significantly lower.
[0] Claude 3.7 in my recent experience
Doesn't mean 30% more productivity
> Of course, it’s unclear how exactly Microsoft and Google are measuring what’s AI-generated versus not, so these figures are best taken with a grain of salt.
This does seem to me to be the key question: is anyone transparent about this? If not, why not?
The perception of having fallen behind in AI adversely impacts your stock price and the amount of capital you can marshal to actually compete in AI. What I think is actually happening industry-wide is that any sort of "intelligence" in software is slowly being rebranded as AI.
Why not? They want companies to buy their AI-enabled slop.
Wouldn’t being transparent with how effective it was at writing code be a good sales tool?
Only if it's effective at writing code, which it of course is not.
Any time you see companies refusing to even vaguely define what metrics like this mean (or, for that matter, using non-standard metrics, like disclosing weekly active users but not monthly), it's generally a very strong signal that they're not interested in being transparent because the truth is, ah, not what they would like it to be.
Yes, and that's exactly why they're not transparent about it.
And 75% of my code was “written” by copy/pasting…
If this isn’t jumping the shark it’s darn close.
...as anyone who's used MS Teams can attest.
Good news, then. If the last update to Outlook I received is any indicator, they're coming for that next.
The actual quote (https://www.nbclosangeles.com/news/business/money-report/sat...):
> "I'd say maybe 20%, 30% of the code that is inside of our repos today and some of our projects are probably all written by software," Nadella said during a conversation before a live audience with Meta CEO Mark Zuckerberg.
Which is clearly untrue. One assumes he meant "20%, 30% written since 2023 was partially generated by an LLM operated by a developer", but that doesn't sell stock.