Model Comparison
Model                                             Editorial  Structural  Class          Conf  SETL  Theme
claude-haiku-4-5-20251001                             +0.19       +0.30  Mild positive  0.39  0.63  Labor Rights & Honest Discourse
@cf/meta/llama-4-scout-17b-16e-instruct (lite)        +0.24          ND  Mild positive  0.80  0.00  Tech Industry
@cf/meta/llama-3.3-70b-instruct-fp8-fast (lite)        0.00          ND  Neutral        0.80  0.00  AI productivity
Section        claude-haiku-4-5-20251001    @cf/meta/llama-4-scout-17b-16e-instruct (lite)    @cf/meta/llama-3.3-70b-instruct-fp8-fast (lite)
Preamble 0.15 ND ND
Article 1 0.20 ND ND
Article 2 -0.20 ND ND
Article 3 -0.15 ND ND
Article 4 ND ND ND
Article 5 ND ND ND
Article 6 ND ND ND
Article 7 0.20 ND ND
Article 8 ND ND ND
Article 9 ND ND ND
Article 10 0.10 ND ND
Article 11 ND ND ND
Article 12 ND ND ND
Article 13 ND ND ND
Article 14 ND ND ND
Article 15 ND ND ND
Article 16 ND ND ND
Article 17 ND ND ND
Article 18 0.50 ND ND
Article 19 0.60 ND ND
Article 20 ND ND ND
Article 21 0.20 ND ND
Article 22 0.40 ND ND
Article 23 0.60 ND ND
Article 24 -0.20 ND ND
Article 25 0.30 ND ND
Article 26 -0.15 ND ND
Article 27 0.20 ND ND
Article 28 0.20 ND ND
Article 29 0.15 ND ND
Article 30 ND ND ND
Where's the shovelware? Why AI coding claims don't add up (mikelovesrobots.substack.com) · Editorial +0.19 · Structural +0.30
770 points by dbalatero 179 days ago | 482 comments on HN | Mild positive · Editorial v3.7 · 2026-02-28 13:20:26
Summary · Labor Rights & Honest Discourse · Advocates
This Substack article by developer Mike Judge advocates for honest assessment of AI coding tool productivity claims, arguing that widespread industry claims of 10x productivity gains are unsupported by evidence and are being used to justify unfair employment practices. The piece centrally engages Articles 19 (Free Expression), 23 (Labor Rights), and 22 (Economic Justice), using data analysis to challenge narratives that have justified layoffs and salary suppression, while defending developers' intellectual freedom and right to informed decision-making.
Article Heatmap
Preamble: +0.15
Article 1: +0.20 (Freedom, Equality, Brotherhood)
Article 2: -0.20 (Non-Discrimination)
Article 3: -0.15 (Life, Liberty, Security)
Article 4: ND (No Slavery)
Article 5: ND (No Torture)
Article 6: ND (Legal Personhood)
Article 7: +0.20 (Equality Before Law)
Article 8: ND (Right to Remedy)
Article 9: ND (No Arbitrary Detention)
Article 10: +0.10 (Fair Hearing)
Article 11: ND (Presumption of Innocence)
Article 12: ND (Privacy)
Article 13: ND (Freedom of Movement)
Article 14: ND (Asylum)
Article 15: ND (Nationality)
Article 16: ND (Marriage & Family)
Article 17: ND (Property)
Article 18: +0.50 (Freedom of Thought)
Article 19: +0.60 (Freedom of Expression)
Article 20: ND (Assembly & Association)
Article 21: +0.20 (Political Participation)
Article 22: +0.40 (Social Security)
Article 23: +0.60 (Work & Equal Pay)
Article 24: -0.20 (Rest & Leisure)
Article 25: +0.30 (Standard of Living)
Article 26: -0.15 (Education)
Article 27: +0.20 (Cultural Participation)
Article 28: +0.20 (Social & International Order)
Article 29: +0.15 (Duties to Community)
Article 30: ND (No Destruction of Rights)
Aggregates
Editorial Mean: +0.19 · Structural Mean: +0.30
Weighted Mean: +0.22 · Unweighted Mean: +0.18
Max: +0.60 (Article 19) · Min: -0.20 (Article 2)
Signal: 17 articles · No Data: 14
Volatility: 0.25 (Medium)
Negative: 4 · Channels: E 0.6 / S 0.4
SETL: +0.63 (Editorial-dominant)
FW Ratio: 56% (45 facts · 36 inferences)
Evidence: 39% coverage
5 H · 12 M · 14 ND
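The aggregate figures can be reproduced from the per-article editorial scores listed above. A minimal sketch, assuming "Unweighted Mean" is the plain average of the scored (non-ND) articles and "Volatility" is their population standard deviation:

```python
# Reproduce the aggregate stats from the per-article editorial scores.
# 17 of the 31 rows carry a score; the other 14 are ND (no data).
scores = {
    "Preamble": 0.15, "Article 1": 0.20, "Article 2": -0.20, "Article 3": -0.15,
    "Article 7": 0.20, "Article 10": 0.10, "Article 18": 0.50, "Article 19": 0.60,
    "Article 21": 0.20, "Article 22": 0.40, "Article 23": 0.60, "Article 24": -0.20,
    "Article 25": 0.30, "Article 26": -0.15, "Article 27": 0.20, "Article 28": 0.20,
    "Article 29": 0.15,
}

vals = list(scores.values())
mean = sum(vals) / len(vals)                          # unweighted mean, ~= +0.18
var = sum((v - mean) ** 2 for v in vals) / len(vals)  # population variance
volatility = var ** 0.5                               # ~= 0.25 ("Medium")
print(f"signal={len(vals)} mean={mean:+.2f} max={max(vals):+.2f} "
      f"min={min(vals):+.2f} volatility={volatility:.2f} "
      f"negative={sum(v < 0 for v in vals)}")
```

The weighted/SETL figures depend on channel weights the page does not spell out, so they are not recomputed here.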
Theme Radar
Foundation: 0.05 (3 articles)
Security: -0.15 (1 article)
Legal: 0.15 (2 articles)
Privacy & Movement: 0.00 (0 articles)
Personal: 0.50 (1 article)
Expression: 0.40 (2 articles)
Economic & Social: 0.28 (4 articles)
Cultural: 0.03 (2 articles)
Order & Duties: 0.17 (2 articles)
HN Discussion 20 top-level · 30 replies
wrs 2025-09-03 21:57 UTC link
This makes some sense. We have CEOs saying they're not hiring developers because AI makes their existing ones 10X more productive. If that productivity enhancement was real, wouldn't they be trying to hire all the developers? If you're getting 10X the productivity for the same investment, wouldn't you pour cash into that engine like crazy?

Perhaps these graphs show that management is indeed so finely tuned that they've managed to apply the AI revolution to keep productivity exactly flat while reducing expenses.

bjackman 2025-09-03 22:03 UTC link
There is actually a lot of AI shovelware on Steam. Sort by newest releases and you'll see stuff like a developer releasing 10 puzzle games in one day.

I have the same experience as OP, I use AI every day including coding agents, I like it, it's useful. But it's not transformative to my core work.

I think this comes down to the type of work you're doing. I think the issue is that most software engineering isn't in fields amenable to shovelware.

Most of us either work in areas where the coding is intensely brownfield (AI is great, but not doubling anyone's productivity), or in areas where the productivity bottlenecks are nowhere near the code.

com2kid 2025-09-03 22:04 UTC link
Multiple things can be true at the same time:

1. LLMs do not increase general developer productivity by 10x across the board for general purpose tasks selected at random.

2. LLMs dramatically increase productivity for a limited subset of tasks

3. LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.

LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.

Fixing up existing large code bases? Productivity is at best a wash.

Setting up a scaffolding for a new website? LLMs are amazing at it.

Writing mocks for classes? LLMs know the details of using mock libraries really well and can get it done far faster than I can, especially since writing complex mocks is something I do a couple times a year and completely forget how to do in-between the rare times I am doing it.
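The comment doesn't name a mocking library, but the fiddly setup it describes looks roughly like this with Python's stdlib unittest.mock (the PaymentClient-style `charge` API here is invented for illustration):

```python
# Sketch of the kind of mock setup the comment describes, using stdlib
# unittest.mock. The charge() API and its response shape are hypothetical.
from unittest.mock import MagicMock

def charge_user(client, user_id, cents):
    resp = client.charge(user_id, amount=cents)
    return resp["status"] == "ok"

client = MagicMock()
client.charge.return_value = {"status": "ok"}      # canned success response
assert charge_user(client, "u1", 500) is True
client.charge.assert_called_once_with("u1", amount=500)

# Simulating a failure path is a one-liner, once you remember the API:
client.charge.side_effect = TimeoutError("gateway down")
try:
    charge_user(client, "u2", 500)
except TimeoutError:
    pass  # exercise the error-handling branch
```

Remembering details like `side_effect` vs `return_value` is exactly the twice-a-year knowledge the commenter says an LLM retrieves faster than re-reading the docs.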

Navigating a new code base? LLMs are ~70% great at this. If you've ever opened up an over-engineered WTF project, just finding where HTTP routes are defined at can be a problem. "Yo, Claude, where are the route endpoints in this project defined at? Where do the dependency injected functions for auth live?"

Right tool, right job. Stop using a hammer on nails.

kenjackson 2025-09-03 22:06 UTC link
Shovelware may not be a good way to track additional productivity.

That said, I’m skeptical that AI is as helpful for commercial software. It’s been great at automating my workflow because I suck at shell scripting and AI is great at it. But with most of the code I write, I honestly halfway don’t know what I’m going to write until I write it. The prompt itself is where my thinking goes - so the time savings would be fairly small, but I also think I’m fairly skilled (except at scripting).

captainkrtek 2025-09-03 22:06 UTC link
This tracks with my own experience as well. I’ve found it useful in some trivial ways (eg: small refactors, type definition from a schema, etc.) but so far tasks more than that it misses things and requires rework, etc. The future may make me eat my words though.

On the other hand, I’ve lately seen it misused by less experienced engineers trying to implement bigger features who eagerly accept all it churns out as “good” without realizing the code it produced:

- doesn’t follow our existing style guide and patterns.

- implements some logic from scratch where there certainly is more than one suitable library, making this code we now own.

- is some behemoth of a PR trying to do all the things.

some-guy 2025-09-03 22:07 UTC link
These claims wouldn't matter if the topic weren't so deadly serious. Tech leaders everywhere are buying into the FOMO, convinced their competitors are getting massive gains they're missing out on. This drives them to rebrand as AI-First companies, justify layoffs with newfound productivity narratives, and lowball developer salaries under the assumption that AI has fundamentally changed the value equation.

This is my biggest problem right now. The types of problems I'm trying to solve at work require careful planning and execution, and AI has not been helpful for it in the slightest. My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company". The mass hysteria among SVPs and PMs is absolutely insane right now, I've never seen anything like it.

larve 2025-09-03 22:08 UTC link
In case the author is reading this, I have the receipts on how there's a real step function in how much software I build, especially lately. I am not going to put any number on it because that makes no sense, but I certainly push a lot of code that reasonably seems to work.

The reason it doesn't show up online is that I mostly write software for myself and for work, with the primary goal of making things better, not faster. More tooling, better infra, better logging, more prototyping, more experimentation, more exploration.

Here's my opensource work: https://github.com/orgs/go-go-golems/repositories . These are not just one-offs (although there's plenty of those in the vibes/ and go-go-labs/ repositories), but long-lived codebases / frameworks that are building upon each other and have gone through many many iterations.

jryio 2025-09-03 22:10 UTC link
I completely agree with the thesis here. I also have not seen a massive productivity boost with the use of AI.

I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...

Yes, AI is not the 2x or 10x technology of the future™ it was promised to be. It may be the case that any productivity boost is happening within existing private code bases. Even still, there should be a modest uptick in noticeably improved software deployment in the market, which does not appear to be there.

In my consulting practice I am seeing this phenomenon regularly, whereby new founders or stir-crazy CTOs push the use of AI and ultimately find that they're spending more time wrangling a spastic code base than they are building shared understanding and working together.

I have recently taken on advisory roles and retainers just to reinstill engineering best practices.

rglover 2025-09-03 22:11 UTC link
Most of it doesn't exist beyond videos of code spraying onto a screen alongside a claim that "juniors are dead."

I think the "why" for this is that the stakes are high. The economy is trembling. Tech jobs are evaporating. There's a high anxiety around AI being a savior, and so, a demi-religion is forming among the crowd that needs AI to be able to replace developers/competency.

That said: I personally have gotten impressive results with AI, but you still need to know what you're doing. Most people don't (beyond the beginner -> intermediate range), and so, it's no surprise that they're flooding social media with exaggerated claims.

If you didn't have a superpower before AI (writing code), then having that superpower as a perceived equalizer is something you will deploy all resources (material, psychological, etc.) to protect: ensuring that everyone else maintains the position that 1) the superpower is good, 2) the superpower cannot go away, and 3) the superpower being fallible should be ignored.

Like any other hype cycle, these people will flush out, the midpoint will be discovered, and we'll patiently await the next excuse to incinerate billions of dollars.

throwaway13337 2025-09-03 22:12 UTC link
Great angle to look at the releases of new software. I, too, thought we'd see a huge increase by now.

An alternative theory is that writing code was never the bottleneck of releasing software. The exploration of what it is you're building and getting it on a platform takes time and effort.

On the other hand, yeah, it's really easy to 'hold it wrong' with AI tools. Sometimes I have a great day and think I've figured it out. And then the next day, I realize that I'm still holding it wrong in some other way.

It is philosophically interesting that it is so hard to understand what makes building software products hard. And how to make it more productive. I can build software for 20 years and still feel like I don't really know.

benjiro 2025-09-03 22:17 UTC link
I need to agree with the author, with a caveat. He is a seasoned developer. For somebody like him, churning out good-quality code is probably easy.

Where I expect a lot of those metrics of feeling fast to come from is people who may have less coding experience and, with AI, are coding way above their level.

My brother-in-law asks for a nice product website; I just feed his business plan into an LLM, do some fine-tuning on the results, and have a good-looking website in an hour. If I did it myself manually, just take me behind a barn, as those jobs are so boring and take ages. But I know that website design is a weakness of mine.

That is the power of LLMs. Turn out quick code, maybe offer some suggestion you did not think about, but ... it also eats time! Making your prompts so that the LLM understands, waiting for the result, ... waiting ... OK, now check the result; can you use it? Oh no, it did X, Y, Z wrong. Prompt again ... and again. And this is where your productivity goes to die.

So when you compare a pool of developer feedback, you're going to get a broad mix of "it helps a lot", "some", "it's worse than my code", ... mixed in with the prompting, result delays, etc.

It gets even worse with agent/vibe coding, as you just tend to be waiting 5-10 minutes for changes to be done. You need to review them, test them, ... oh no, the LLM screwed something up again. Oh no, it removed 50% of my code. Hey, where did my comments go? And we are back to a loss of time.

LLMs are a tool... But after a lot of working with them, my opinion is to use them when needed but not depend on them for everything. I sometimes stare in disbelief when people say they are coding so much with LLMs and spending 200 or more bucks per month.

They can be powerful tools, but I feel that some folks become over-dependent on them. And worst is my feeling that our juniors are going to be in a world of hurt if their skills are more LLM monkey coding (or vibe coding) than actually understanding how to code (and the knowledge behind the actual programming languages and systems).

searls 2025-09-03 23:05 UTC link
The answer is that we're making it right now. AI didn't speed me up at all until agents got good enough, which was April/May of this year.

Just today I built a shovelware CLI that exports iMessage archives into a standalone website export. Would have taken me weeks. I'll probably have it out as a homebrew formula in a day or two.

I'm working on an iOS app as well that's MUCH further along than it would be if I hand-rolled it, but I'm intentionally taking my time with it.

Anyway, the post's data mostly ends in March/April which is when generative AI started being useful for coding at all (and I've had Copilot enabled since Nov 2022)

NathanKP 2025-09-03 23:07 UTC link
I think the explanation is simple: there is a direct correlation between being too lazy and demotivated to write your own code, and being too lazy and demotivated to actually finish a project and publish your work online.

The same people who are willing to go through all the steps to release an application online are also willing to go through the extra effort of writing their own code. The code is actually the easy part compared to the rest of it... always has been.

stillsut 2025-09-03 23:39 UTC link
Got your shovelware right here...with receipts.

Background: I'm building a python package side project which allows you to encode/decode messages into LLM output.

Receipts: the tool I'm using creates a markdown that displays every prompt typed, and every solution generated, along with summaries of the code diffs. You can check it out here: https://github.com/sutt/innocuous/blob/master/docs/dev-summa...

Specific example: I actually used a leetcode-style memoized algorithm for branching. This would have taken a couple of days to implement by hand, but it took about 20 minutes to write the spec and 20 minutes to review and merge the generated solution. If you're curious, you can see the generated diff here: https://github.com/sutt/innocuous/commit/cdabc98
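For readers unfamiliar with the pattern the commenter names: "memoization for branching" is a classic dynamic-programming idiom. The sketch below is purely illustrative (not code from the linked repo): it counts the ways an n-bit message could be split across encoding steps that can each carry 1..k bits, caching on the remaining length so the naive exponential recursion becomes linear.

```python
# Illustrative leetcode-style memoized DP over branching choices.
# Not taken from the sutt/innocuous repo; a generic sketch of the idiom.
from functools import lru_cache

def encodings(n: int, k: int) -> int:
    """Ways to split an n-bit message into chunks of 1..k bits each."""
    @lru_cache(maxsize=None)
    def ways(remaining: int) -> int:
        if remaining == 0:
            return 1  # one way to encode nothing
        return sum(ways(remaining - bits)
                   for bits in range(1, min(k, remaining) + 1))
    return ways(n)

print(encodings(10, 3))  # k-bonacci growth; memoization keeps it O(n*k)
```

Without `lru_cache` the same recursion revisits each suffix exponentially many times, which is why this kind of thing is tedious to get right by hand but easy to specify to an agent.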

m-hodges 2025-09-04 00:08 UTC link
This article reminds me of two recent observations by Paul Krugman about the internet:

"So, here’s labor productivity growth over the 25 years following each date on the horizontal axis [...] See the great productivity boom that followed the rise of the internet? Neither do I. [...] Maybe the key point is that nobody is arguing that the internet has been useless; surely, it has contributed to economic growth. The argument instead is that its benefits weren’t exceptionally large compared with those of earlier, less glamorous technologies."¹

"On the second, history suggests that large economic effects from A.I. will take longer to materialize than many people currently seem to expect [...] And even while it lasted, productivity growth during the I.T. boom was no higher than it was during the generation-long boom after World War II, which was notable in the fact that it didn’t seem to be driven by any radically new technology [...] That’s not to say that artificial intelligence won’t have huge economic impacts. But history suggests that they won’t come quickly. ChatGPT and whatever follows are probably an economic story for the 2030s, not for the next few years."²

¹ https://www.nytimes.com/2023/04/04/opinion/internet-economy....

² https://www.nytimes.com/2023/03/31/opinion/ai-chatgpt-jobs-e...

InCom-0 2025-09-04 00:39 UTC link
On one hand, I don't understand what all the fuss is about. LLMs are great at all kinds of things: searching for (good) information, summarizing existing text, conceptual discussions where they point you in the right direction very quickly, etc. They are just not great (some might say harmful) at straight-up non-trivial code generation or design of complex systems, with the added peculiarity that on the surface the models seem almost capable of doing it, but never quite ... which is sort of their central feature: producing text so that it seems correct from a statistical perspective, but without actual reasoning.

On the other hand, I do understand that the things LLMs are really great at are not actually all that spectacular to monetize ... and so as a result we have all these snake oil salesmen on every corner boasting about nonsensical vibecoding achievements, because that's where the real money would be ... if it were really true ... but it is not.

raylad 2025-09-04 02:53 UTC link
I used to be a full-time developer back in the day. Then I was a manager. Then I was a CTO. I stopped doing the day-to-day development and even stopped micro-managing the detailed design.

When I tried to code again, I found I didn't really have the patience for it -- having to learn new frameworks, APIs, languages, tricky little details, I used to find it engrossing: it had become annoying.

But with tools like Claude Code and my knowledge about how software should be designed and how things should work, I am able to develop big systems again.

I'm not 20% more productive than I was. I'm not 10x more productive than I was either. I'm infinity times more productive because I wouldn't be doing it at all otherwise, realistically: I'd either hire someone to do it, or not do it, if it wasn't important enough to go through the trouble to hire someone.

Sure, if you are a great developer and spend all day coding and love it, these tools may just be a hindrance. But if you otherwise wouldn't do it at all they are the opposite of that.

solatic 2025-09-04 06:00 UTC link
I'm not sure what to make of these takes because so many people are using such an enormous variety of LLM tooling in such a variety of ways, people are going to get a variety of results.

Let's take the following scenario for the sake of argument: a codebase with well-defined AGENTS.md, referencing good architecture, roadmap, and product documentation, and with good test coverage, much of which was written by an LLM and lightly reviewed and edited by a human. Let's say for the sake of argument that the human is not enjoying 10x productivity despite all this scaffolding.

Is it still worthwhile to use LLM tooling? You know what, I think a lot of companies would say yes. There are way too many companies whose codebases lack testing and documentation, that are too difficult to on-board new engineers into, and that carry too much risk if the original engineers are lost. The simple fact that LLMs, to be effective, force the adoption of proper testing and documentation is a huge win for corporate software.
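To make the scenario concrete, the scaffolding described above often lives in a repo-root instructions file. A minimal hypothetical skeleton (all paths, commands, and section names here are invented for illustration, not quoted from any real project) might look like:

```markdown
# AGENTS.md (hypothetical skeleton)

## Architecture
- `api/` holds HTTP handlers only; business logic lives in `core/`
- Read `docs/architecture.md` before touching cross-module interfaces

## Conventions
- Run `make test` and `make lint` before proposing a change
- New behavior requires a test; bug fixes require a regression test

## Out of scope for agents
- Schema migrations and anything under `infra/`
```

The point of the comment stands either way: even if such scaffolding doesn't yield 10x output, being forced to write it down is itself valuable documentation discipline.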

weweersdfsd 2025-09-04 09:16 UTC link
The problem with current GenAI is the same as with outsourcing to the lowest bidder in India or wherever. For any non-trivial project you'll get something that may appear to work, but for anything production-ready you'll most likely spend lots of time testing, verifying, cleaning up the code, and making changes to things the AI didn't catch. Then there's requirement gathering, discussing with stakeholders, gathering more feedback and so on, and debugging when things fail in production...

I believe it's a productivity boost, but only to a small part of my job. The boost would be larger if I only had to build proof-of-concepts or hobby projects that don't need to be reliable in prod and don't require feedback and requirements from many other people.

iainctduncan 2025-09-04 16:58 UTC link
This reminds me of something... I'm a jazz musician when not being a coder, and have studied and taught from/to a lot of players. One thing advanced improvisors notice is that the student is very frequently not a good judge – in the moment – of what is making them better. Doing long term analytics tests (as the author did) works, but knowing how well something is working while you're doing it? not so much. Very, very frequently that which feels productive isn't, and that which feels painful and slow is.

Just spit balling here, but it sure feels similar.

trenchpilgrim 2025-09-03 22:14 UTC link
Same. On many days 90% of my code output by lines is Claude generated and things that took me a day now take well under an hour.

Also, a good chunk of my personal OSS projects are AI assisted. You probably can't tell from looking at them, because I have strict style guides that suppress the "AI style", and I don't really talk about how I use AI in the READMEs. Do you also expect I mention that I used Intellisense and syntax highlighting too?

nicce 2025-09-03 22:16 UTC link
> implements some logic from scratch where there certainly is more than one suitable library, making this code we now own - is some behemoth of a PR trying to do all the things

Depending on the amount of code, I see this only as positive? Too often people pull huge libraries for 50 lines of code.

rglover 2025-09-03 22:19 UTC link
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".

Lord, forgive them, they know not what they do.

fennecbutt 2025-09-03 22:20 UTC link
I mean, the truth should be fairly obvious to people, given that a lot of the talk around AI rings very much like IFLS/mainstream-media-style "science" articles, which always make some outrageous "right around the corner" claim based off some small tidbit from a paper whose abstract they only skimmed.

fennecbutt 2025-09-03 22:24 UTC link
Granted, _discovery_ of such things is something I'm still trying to solve at my own job and potentially llms can at least be leveraged to analyse and search code(bases) rather than just write it.

It's difficult because you need team members to be able to work quite independently but knowledge of internal libraries can get so siloed.

balder1991 2025-09-03 22:28 UTC link
Also, when you create a product you can't speed up the iterative process of seeing how users want it, fixing edge cases that you only realize later, etc. These are the things that make a product good, and why there's that article about software taking 10 years to mature: https://www.joelonsoftware.com/2001/07/21/good-software-take...
quantumcotton 2025-09-03 22:36 UTC link
Today you will learn what diminishing returns are :)

You can only utilize so many people or so much action within a business or idea.

Essentially it's throwing more stupid at a problem.

The reason there are so many layoffs is because of AI creating efficiency. The thing people don't realize is that it's not that one AI robot or GPU is going to replace one human at a one-to-one ratio. It's going to replace the amount of workload one person can do, which in turn gets rid of one human employee. It's not that your job isn't taken by AI. It's started. But how much human is needed is where the new supply-demand balance lies, and how long the job lasts. There will always be more need for more creative minds. The issue is we are lacking them.

It's incredible how many software engineers I see walking around without jobs. Looking for a job making $100,000 to $200,000 a year. Meanwhile, they have no idea how much money they could save a business. Their creativity was killed by school.

They are relying on somebody to tell them what to do and when nobody's around to tell anybody what to do. They all get stuck. What you are seeing isn't a lack of capability. It's a lack of ability to control direction or create an idea worth following.

moduspol 2025-09-03 22:49 UTC link
A lot of these C-suite people also expect the remaining ones to be replaced by AI. They subscribe to the hockey-stick "AGI is around the corner" narrative.

I don't, but at least it is somewhat logical. If you truly believe that, you wouldn't necessarily want to hire more developers.

Nextgrid 2025-09-03 22:53 UTC link
This is the answer. Programming was never the bottleneck in delivering software, whether free-range, organic, grass-fed human-generated code or AI-assisted.

AI is just a convenient excuse to lay off many rounds of over-hiring while also keeping the door open for potential investors to throw more money into the incinerator since the company is now “AI-first”.

lumost 2025-09-03 22:58 UTC link
The experience in green-field development is very different. In the early days of a project, the LLM's opinion is about as good as that of the individuals starting the project. The coding standards and other items have not yet been established. Buggy, half-nonsense code still leaves the project demoable. Being able to explore 5 projects to demo status instead of 1 is a major boost.
heavyset_go 2025-09-03 22:59 UTC link
> LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.

I wax and wane on this one.

I've had the same feelings, but too often I've peeked behind the curtain, read the docs, got familiar with the external dependencies, and then realized that whatever the LLM responded with either wasn't following convention, tried to shoehorn my problem to fit code examples found online, used features inappropriately, took a long roundabout path to do something that can be done simply, etc.

It can feel like magic until you look too closely at it, and I worry that it'll make me complacent with the feeling of understanding without my actually coming away with an understanding.

SchemaLoad 2025-09-03 23:08 UTC link
At least in my experience, it excels in blank-canvas projects, where you've got nothing and want something pretty basic. The tools can probably set up a fresh React project faster than me. But every time I've tried them on an actual work repo, they've been reduced to almost useless.

Which is why they generate so much hype. They are perfect for tech demos, then management wonders why they aren't seeing results in the real world.

heavyset_go 2025-09-03 23:13 UTC link
> I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...

I've found this to be the case with most (if not all) skills, even riding a bike. Sure, you don't forget how to ride it, but your ability to expertly articulate with the bike in a synergistic and tool-like way atrophies.

If that's the case with engineering, and I believe it to be, it should serve as a real warning.

Seattle3503 2025-09-03 23:16 UTC link
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".

If we can delegate incident response to automated LLMs too, sure, why not. Let the CEO have his way and pay the reputational price. When it doesn't work, we can revert our git repos to the day LLMs didn't write all the code.

I'm only being 90% facetious.

nerevarthelame 2025-09-03 23:18 UTC link
How are you sure it's increasing your productivity if it "makes no sense" to even quantify that? What are the receipts you have?

heavyset_go 2025-09-03 23:20 UTC link
As the rate of profit drops, value needs to be squeezed out of somewhere and that will come from the hiring/firing and compensation of labor, hence a strong bias towards that outcome.

99% of the draw of AI is cutting labor costs, and hiring goes against that.

That said, I don't believe AI productivity claims, just pointing out a factor that could theoretically contribute to your hypothetical.

anp 2025-09-03 23:30 UTC link
FWIW this closely matches my experience. I’m pretty late to the AI hype train but my opinion changed specifically because of using combinations of models & tools that released right before the cut off date for the data here. My impression from friends is that it’s taken even longer for many companies to decide they’re OK with these tools being used at all, so I would expect a lot of hysteresis on outputs from that kind of adoption.

That said I’ve had similar misgivings about the METR study and I’m eager for there to be more aggregate study of the productivity outcomes.

ksenzee 2025-09-03 23:32 UTC link
> Stop using a hammer on nails.

sorry, what am I supposed to use on nails?

mvdtnz 2025-09-03 23:33 UTC link
> LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.

What is this supposed busy work that can be done in the background unsupervised?

I think it's about time for the AI pushers to be absolutely clear about the actual specific tasks they are having success with. We're all getting a bit tired of the vagueness and hand waving.

mvdtnz 2025-09-03 23:36 UTC link
> AI didn't speed me up at all until agents got good enough, which was April/May of this year.

That was 5 months ago, which is 6 years in 10x time.

sumeno 2025-09-03 23:38 UTC link
It's amazing how whenever criticisms pop up the responses for the last 3 years have been "well you aren't using <insert latest>, it's finally good!"
Noumenon72 2025-09-03 23:45 UTC link
You should have used the word "steganography" in this description like you did in your readme, makes it 100% more clear what it does.
noidesto 2025-09-04 00:22 UTC link
Agree. In the hands of a seasoned dev, not only does productivity improve but so does the quality of the output.

If I’m working against a deadline I feel more comfortable spending time on research and design knowing I can spend less time on implementation. In the end, it took the same amount of time, though hopefully with an increase in reliability, observability, and extensibility. None of these things show up in the author’s faulty dataset and experiment.

abathologist 2025-09-04 01:40 UTC link
My theory is that the digital revolution has mostly cancelled out potential productivity gains with its introduction of productivity sinks: the technology has tended to encourage less rigorous thinking, more distraction, more complexity; and even if you can do task T X times faster, most people are spending X * Y more time being distracted, overwhelmed, or just reflexively pushing buttons.

The ways AI is being used now will make this a lot worse on every front.
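The trade-off the commenter describes can be put into toy numbers. This is a back-of-envelope sketch, not a measurement: the function and every value in it are hypothetical, chosen only to illustrate the shape of the argument that per-task speedups get diluted by new overhead.

```python
def net_speedup(task_fraction: float, speedup: float, sink_overhead: float) -> float:
    """Toy model of net productivity.

    task_fraction: share of the workday spent on work the tool accelerates.
    speedup:       how many times faster that work becomes (the X above).
    sink_overhead: extra fraction of a day lost to new distraction/complexity.
    Returns the effective overall speedup of the whole day.
    """
    # Accelerated portion shrinks; the rest of the day is unchanged;
    # the sinks add time on top.
    new_time = task_fraction / speedup + (1 - task_fraction) + sink_overhead
    return 1.0 / new_time

# Hypothetical numbers: half the day is speedable work, the tool makes it
# 3x faster, but adds 20% of a day in new sinks.
print(round(net_speedup(0.5, 3.0, 0.2), 2))  # → 1.15, nowhere near 3x
```

Under these made-up inputs a headline "3x" on part of the work nets out to roughly a 15% overall gain, which is the commenter's point in miniature.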

bwfan123 2025-09-04 01:57 UTC link
> writing code was never the bottleneck

This is an insightful observation.

When working on anything, I am asked: what is the smallest "hard" problem that this is solving? I.e., in software, value is added by solving "hard" problems, not by solving easy problems. Another way to put it: hard problems are those that are not "templated", i.e., solved elsewhere and only needing to be copied.

LLMs are allowing the easy problems to be solved faster. But the real bottleneck is in solving the hard problems - and hard problems could be "hard" due to technical reasons, or business reasons or customer-adoption reasons. Hard problems are where value lies particularly when everyone has access to this tool, and everyone can equally well create or copy something using it.

In my experience, LLMs have not yet made a dent in solving the hard problems because they don't really have a theory of how something really works. On the other hand, they have really helped boost productivity on tasks that are templated.

rootusrootus 2025-09-04 01:59 UTC link
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate

That's insane. Who the hell pulls a number out of their ass and declares it the new reality? When it doesn't happen, he'll pin the blame on you, but everyone else above will pin the blame on him. He's the one who will get fired.

Laying off unnecessary developers is the answer if LLMs turn out to make us all so much more productive (assuming we don't just increase the amount of software written instead). But that happens after successful implementation of LLMs into the development process, not in advance.

Starting to think I should do the inadvisable and move my investments far far away from the S&P 500 and into something that will survive the hype crash that can't be too far off now.

dmonitor 2025-09-04 02:31 UTC link
By "knowing what you're doing" do you mean "have enough experience to do it by hand", "have experience with a specific AI tool and its limitations", or a combination?
ferrous69 2025-09-04 02:59 UTC link
my grand theory on AI coding tools is that they don't really save on time, but they massively save on annoyance. I can save my frustration budget for useful things instead of fiddling with syntax or compiler messages or repetitive tasks, and oftentimes this means I'll take on a task I would find too frustrating in an already frustrating world, or stay at my desk longer before needing to take a walk or ditch the office for the bar.
kobe_bryant 2025-09-04 03:23 UTC link
wow, not just one but multiple big systems? well, share the details with us
mildweed 2025-09-04 03:54 UTC link
Interested in this Homebrew. Share when ready?
Editorial Channel
What the content says
+0.80
Article 19 Freedom of Expression
High Advocacy Framing Coverage
Editorial
+0.80
SETL
+0.63

Article is a centerpiece exercise in free expression. Author publishes contrarian views challenging dominant industry narrative, demands evidence-based speech, quotes corporate claims to interrogate them, and calls for honest evaluation of AI productivity. Extensively engages freedom to express dissenting views.

+0.60
Article 23 Work & Equal Pay
High Advocacy Practice
Editorial
+0.60
SETL
ND

Article extensively advocates for labor rights and protection from unfair employment practices. Documents termination based on tool adoption, job insecurity, and pressure to adopt unproven tools. Advocates for workers' right to work without unreasonable pressure, fair compensation, and job security based on competence rather than tool adoption.

+0.50
Article 18 Freedom of Thought
High Advocacy Framing
Editorial
+0.50
SETL
ND

Article strongly advocates for intellectual freedom and autonomy of judgment. Explicitly tells developers to 'trust your gut' and resist conforming to industry pressure to adopt tools. Defends right to form own conclusions independent of hype.

+0.40
Article 22 Social Security
High Advocacy
Editorial
+0.40
SETL
ND

Article discusses social and economic rights extensively. Criticizes salary suppression based on false AI productivity assumptions, job insecurity, and economic pressure on workers. Advocates for economic justice and fair treatment of developers.

+0.30
Article 25 Standard of Living
High Advocacy
Editorial
+0.30
SETL
ND

Article advocates for adequate standard of living protection through job security and fair wages. Criticizes industry practices that threaten developers' economic security and ability to maintain adequate living standards. Advocates against unfounded salary reductions.

+0.20
Article 1 Freedom, Equality, Brotherhood
Medium Advocacy
Editorial
+0.20
SETL
ND

Article advocates that all developers deserve equal access to truthful information and should not be penalized for not adopting tools based on false claims. Opposes creation of false hierarchies (10xers vs others) based on misleading narratives.

+0.20
Article 7 Equality Before Law
Medium Advocacy
Editorial
+0.20
SETL
ND

Article advocates for equal access to truthful information and equal standing in the debate about AI tool productivity. Opposes creation of false hierarchies based on misleading narratives; calls for evidence to be presented equally to all.

+0.20
Article 21 Political Participation
Medium Advocacy
Editorial
+0.20
SETL
ND

Article advocates for transparency and data-driven decision-making in tech leadership and industry governance. Calls for tech leaders to base decisions affecting workers on evidence rather than hype. Advocates for developer voice in decisions affecting their careers.

+0.20
Article 27 Cultural Participation
Medium Advocacy
Editorial
+0.20
SETL
ND

Article defends intellectual integrity and participation in informed discourse about technology development. Advocates for honest evaluation of technological claims and scientific rigor in assessing tools.

+0.20
Article 28 Social & International Order
Medium Advocacy Framing
Editorial
+0.20
SETL
ND

Article advocates for more honest and just order in technology industry. Calls for practices based on evidence rather than hype, transparency rather than manipulation, and protection of workers from unfair practices.

+0.15
Preamble Preamble
Medium Advocacy
Editorial
+0.15
SETL
ND

Article invokes universal human dignity implicitly by defending developers against industry pressure and false narratives. Advocates for truthful information as essential to maintaining equal standing.

+0.15
Article 29 Duties to Community
Medium Advocacy Practice
Editorial
+0.15
SETL
ND

Article fulfills duty to inform community. Publishes research and data to protect developer community from manipulation. Exercises responsibility to expose false claims and support fellow developers.

+0.10
Article 10 Fair Hearing
Medium Advocacy Framing
Editorial
+0.10
SETL
ND

Article advocates for evidence-based claims and demands 'receipts' as basis for accepting assertions. Frames truthful claims as essential to fair judgment of industry practices affecting workers.

-0.15
Article 3 Life, Liberty, Security
Medium Practice
Editorial
-0.15
SETL
ND

Article discusses job insecurity and loss of employment, which threaten both liberty and security. Documents how false productivity claims create precarious conditions affecting developers' ability to maintain livelihood.

-0.15
Article 26 Education
Medium Practice
Editorial
-0.15
SETL
ND

Article documents failure of industry to provide adequate education and training. Notes that industry advice is 'figure it out yourself' with no formal training, contradicting best practices for helping workers master new tools.

-0.20
Article 2 Non-Discrimination
Medium Practice
Editorial
-0.20
SETL
ND

Article documents discriminatory employment practice: developers fired for not adopting specific tools quickly enough. This represents discrimination in hiring and employment based on adoption of particular tools rather than competence.

-0.20
Article 24 Rest & Leisure
Medium Practice
Editorial
-0.20
SETL
ND

Article documents pressure and anxiety affecting developers' rest and leisure. Notes people spending excessive time trying to master prompting, feeling bad about failing, and experiencing work-related stress due to industry pressure.


Structural Channel
What the site does
+0.30
Article 19 Freedom of Expression
High Advocacy Framing Coverage
Structural
+0.30
Context Modifier
ND
SETL
+0.63

Substack platform enables publication without corporate gatekeeping. Article demonstrates how independent publishing infrastructure allows contrarian speech to reach audience without editorial filtering.

Supplementary Signals
How this content communicates, beyond directional lean. Learn more
Epistemic Quality
How well-sourced and evidence-based is this content?
0.74 medium claims
Sources
0.8
Evidence
0.8
Uncertainty
0.7
Purpose
0.8
Propaganda Flags
1 manipulative rhetoric technique found
1 techniques detected
loaded language
Author uses strong language: 'This whole thing is bullshit,' 'I'm furious,' 'shut the fuck up.' Emotionally charged words that, while grounded in data analysis, are rhetorically provocative.
Emotional Tone
Emotional character: positive/negative, intensity, authority
confrontational
Valence
-0.3
Arousal
0.8
Dominance
0.7
Transparency
Does the content identify its author and disclose interests?
0.50
✓ Author ✗ Conflicts
More signals: context, framing & audience
Solution Orientation
Does this content offer solutions or only describe problems?
0.59 mixed
Reader Agency
0.7
Stakeholder Voice
Whose perspectives are represented in this content?
0.45 4 perspectives
Speaks: individuals, workers
About: corporation, institution
Temporal Framing
Is this content looking backward, at the present, or forward?
mixed medium term
Geographic Scope
What geographic area does this content cover?
global
United States
Complexity
How accessible is this content to a general audience?
moderate · medium jargon · domain-specific
Longitudinal · 5 evals
Audit Trail 25 entries
2026-02-28 13:20 eval Evaluated by claude-haiku-4-5-20251001: +0.22 (Mild positive) -0.11
2026-02-28 12:42 model_divergence Cross-model spread 0.33 exceeds threshold (3 models) - -
2026-02-28 12:42 eval Evaluated by claude-haiku-4-5-20251001: +0.33 (Moderate positive)
2026-02-28 08:42 eval_success Light evaluated: Mild positive (0.24) - -
2026-02-28 08:42 eval Evaluated by llama-4-scout-wai: +0.24 (Mild positive)
reasoning
ED, neutral with slight positive lean on tech critique
2026-02-28 08:42 rater_validation_warn Light validation warnings for model llama-4-scout-wai: 0W 1R - -
2026-02-28 08:38 rater_validation_warn Light validation warnings for model llama-3.3-70b-wai: 0W 1R - -
2026-02-28 08:38 eval_success Light evaluated: Neutral (0.00) - -
2026-02-28 08:38 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech editorial neutral
2026-02-28 08:33 eval_success Light evaluated: Neutral (0.00) - -
2026-02-28 08:33 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
reasoning
Tech editorial neutral
2026-02-28 08:33 rater_validation_warn Light validation warnings for model llama-3.3-70b-wai: 0W 1R - -
2026-02-26 18:44 rater_validation_fail Parse failure for model deepseek-v3.2: Error: Failed to parse OpenRouter JSON: SyntaxError: Expected double-quoted property name in JSON at position 6099 (line 203 column 6). Extracted text starts with: { "schema_version": "3.7", "eval - -
2026-02-26 12:20 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 12:18 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 12:17 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 12:16 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 10:13 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:11 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:11 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:10 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:08 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:06 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:03 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -
2026-02-26 10:03 dlq Dead-lettered after 1 attempts: Where's the shovelware? Why AI coding claims don't add up - -