+0.31 An AI agent coding skeptic tries AI agent coding, in excessive detail

Name: HRCB Evaluation: An AI agent coding skeptic tries AI agent coding, in excessive detail
Item: An AI agent coding skeptic tries AI agent coding, in excessive detail
Rating: 0.316
Author: HN HRCB

Model: deepseek/deepseek-v3.2-20251201 0.00 · @cf/meta/llama-4-scout-17b-16e-instruct lite 0.00 · @cf/meta/llama-3.3-70b-instruct-fp8-fast lite 0.00 · claude-haiku-4-5-20251001 +0.31

+0.31	An AI agent coding skeptic tries AI agent coding, in excessive detail (minimaxir.com S:+0.21 )
	56 points by minimaxir 2 days ago \| 9 comments on HN \| Moderate positive Contested Editorial · v3.7 · 2026-02-28 11:57:06 0

Summary Free Expression & Knowledge Access Advocates

A technical blog post documenting one developer's methodical evaluation of LLM agent capabilities, emphasizing skepticism toward automation hype and the importance of human critical evaluation. The content advocates for transparency in AI development, open knowledge sharing via open-source code, and preservation of human agency in technological augmentation, embodying values of free expression, access to scientific advancement, and worker dignity.

Article Heatmap (legend: Negative · Neutral · Positive · No Data)

Aggregates

Editorial Mean	+0.31	Structural Mean	+0.21
Weighted Mean	+0.32	Unweighted Mean	+0.27
Max	+0.50 Article 27	Min	+0.11 Article 23
Signal	5	No Data	26
Volatility	0.17 (Medium)
Negative	0	Channels	E: 0.6 S: 0.4
SETL ℹ	+0.14	Editorial-dominant
FW Ratio ℹ	55%	12 facts · 10 inferences
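
The aggregate rows above can be cross-checked against the per-article cards in this report. The following is a minimal sketch, not the site's actual implementation. It assumes each article's score is the channel-weighted blend of its Editorial and Structural scores (using the E: 0.6 / S: 0.4 weights shown), that the Unweighted Mean, Max and Min are taken over those blended scores, and that Volatility is their population standard deviation. Those assumptions happen to reproduce the displayed figures; the Weighted Mean is left out because its weighting scheme is not stated here.

```python
from statistics import mean, pstdev

# Per-article scores copied from the Editorial and Structural cards in this report.
editorial = {
    "Article 19": 0.50, "Article 27": 0.50,
    "Preamble": 0.20, "Article 1": 0.20, "Article 23": 0.15,
}
structural = {
    "Article 19": 0.40, "Article 27": 0.50,
    "Preamble": 0.10, "Article 1": 0.00, "Article 23": 0.05,
}

# Channel weights from the "Channels" row (E: 0.6, S: 0.4).
W_E, W_S = 0.6, 0.4


def blend(article):
    """Assumed per-article score: channel-weighted sum of both scores."""
    return W_E * editorial[article] + W_S * structural[article]


combined = {article: blend(article) for article in editorial}

editorial_mean = mean(editorial.values())
structural_mean = mean(structural.values())
unweighted_mean = mean(combined.values())
max_article = max(combined, key=combined.get)
min_article = min(combined, key=combined.get)
# Assumption: volatility is the population standard deviation of the blend.
volatility = pstdev(combined.values())

# FW Ratio row: 12 facts and 10 inferences.
facts, inferences = 12, 10
fw_ratio = facts / (facts + inferences)

print(f"Editorial Mean  {editorial_mean:+.2f}")
print(f"Structural Mean {structural_mean:+.2f}")
print(f"Unweighted Mean {unweighted_mean:+.2f}")
print(f"Max {combined[max_article]:+.2f} {max_article}")
print(f"Min {combined[min_article]:+.2f} {min_article}")
print(f"Volatility {volatility:.2f}")
print(f"FW Ratio {fw_ratio:.0%}")
```

Under these assumptions the sketch prints values that agree with the table: +0.31, +0.21, +0.27, a maximum of +0.50 for Article 27, a minimum of +0.11 for Article 23, a volatility of 0.17, and an FW Ratio of 55%.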

Evidence 11% coverage ℹ

2 High · 2 Medium · 1 Low · 26 No Data

HN Discussion 6 top-level · 1 replies

simonw 2026-02-27 20:51 UTC link

This is my favorite yet of the genre of "OK, coding agents got good in November" posts. It starts with relatively simple examples (YouTube metadata scraping) and by the end Max is rewriting Python's scikit-learn framework in Rust and making it way faster.

7777777phil 2026-02-27 21:18 UTC link

The sharp results all came from pairing domain expertise with detailed AGENTS.md files. The impressive Rust output happened because someone who knows Rust was steering it. Vague prompts got mediocre output. A model on its own converges to the mean of its training data, which is why the "vibe code everything" thesis keeps not holding up: https://philippdubach.com/posts/the-impossible-backhand/

busssard 2026-02-27 21:56 UTC link

this post reflects my experience with the model...

ej88 2026-02-27 21:58 UTC link

Thanks Max! This was a really interesting article and closely matches my own experience with how the agents have been progressing

one of the takeaways I get when reading skilled engineers' experiences with these tools is that they essentially offer leverage, and the more skill someone already has the higher their ceiling will be

verdverm 2026-02-27 22:12 UTC link

I second that spending effort on your AGENTS.md is game changing. Don't auto generate these, work with them and learn how to make them good (sparknotes and table of contents, keep minimal, distribute over dirs)

ivraatiems 2026-02-27 23:46 UTC link

This investigation aligns with my experience in a lot of ways. I'm con the influence and behavior of big AI companies, but lukewarm-to-pro the actual use of the technology itself.

I use Claude and other models frequently (mostly via Cursor, with a smattering of other tools) in my work now. It is not at the "I never write code myself" point, but the AI tools are absolutely capable of generating highly effective and usable code, usually nearly as good or as good as what I'd do myself, with guidance.

It hasn't eliminated the need for my existence as an engineer, but it has changed it drastically. It is much more like "tell the computer what I want and mostly get it" than it was a year ago.

And yet, I have friends and colleagues who reject it out of hand as useless, and are so skeptical of it that they suggest it must only be good because my skills are poor, or our codebase is bad, or I'm getting lucky.

I just can't totally credit any of those explanations anymore.

rudiksz 2026-02-28 09:19 UTC link

Claude isn't generating code that is "highly effective and usable code". I'm not your friend but I also reject your claims out of hand because I've also seen what Claude can and can't do.

Editorial Channel

What the content says

+0.50

Article 19 Freedom of Expression

High A: free expression and thought P: open knowledge dissemination

Editorial

+0.50

SETL

+0.22

Content is direct exercise of freedom of expression and thought. Author freely publishes opinion, analysis, and critique of dominant narratives without self-censorship. Shares detailed technical knowledge and code, facilitating others' access to information and ability to participate in scientific advancement.

+0.50

Article 27 Cultural Participation

High A: scientific advancement and knowledge access P: open authorship and intellectual property respect

Editorial

+0.50

SETL

0.00

Content directly promotes participation in scientific and technical advancement. Author shares detailed methodology, code, and results openly. Respects authorship by clearly attributing work and citing sources. Facilitates others' participation in technological development through open-source distribution.

+0.20

Preamble Preamble

Medium A: human dignity in human-AI collaboration F: skepticism toward technological determinism

Editorial

+0.20

SETL

+0.14

Content affirms human dignity and agency by critiquing hype narratives that position agents as replacing human judgment. Emphasizes human oversight ('manually reviewed'), critical evaluation, and sovereignty of human decision-making ('impugn the sovereignty of the human soul').

+0.20

Article 1 Freedom, Equality, Brotherhood

Medium A: human reason and critical thinking F: defense against technological determinism

Editorial

+0.20

SETL

+0.20

Content defends human reason and critical evaluation against passive acceptance of technological narratives. Author positions skepticism as intellectual integrity ('I actually don't use Generative LLMs often'), resisting social pressure to adopt tools uncritically.

+0.15

Article 23 Work & Equal Pay

Low F: worker dignity in human-agent collaboration

Editorial

+0.15

SETL

+0.12

Implicit engagement: Content discusses 'productivity gain' and optimization of human work through agents, framed within context of human oversight and skill maintenance. Author's concern that agents might lead to 'atrophy of programming skills' reflects attention to workers' right to develop abilities and maintain meaningful work.

Article 2 Non-Discrimination

No observable content regarding freedom from discrimination.

Article 3 Life, Liberty, Security

No observable content regarding right to life, liberty, personal security.

Article 4 No Slavery

No observable content regarding slavery.

Article 5 No Torture

No observable content regarding torture or cruel treatment.

Article 6 Legal Personhood

No observable content regarding right to recognition as person before law.

Article 7 Equality Before Law

No observable content regarding equality before law.

Article 8 Right to Remedy

No observable content regarding effective remedy.

Article 9 No Arbitrary Detention

No observable content regarding arrest and detention.

Article 10 Fair Hearing

No observable content regarding right to fair trial.

Article 11 Presumption of Innocence

No observable content regarding presumption of innocence.

Article 12 Privacy

No observable content regarding privacy.

Article 13 Freedom of Movement

No observable content regarding freedom of movement.

Article 14 Asylum

No observable content regarding asylum.

Article 15 Nationality

No observable content regarding nationality.

Article 16 Marriage & Family

No observable content regarding family and marriage.

Article 17 Property

No observable content regarding property.

Article 18 Freedom of Thought

No observable content regarding freedom of thought, conscience, religion.

Article 20 Assembly & Association

No observable content regarding assembly or association.

Article 21 Political Participation

No observable content regarding political participation.

Article 22 Social Security

No observable content regarding social security.

Article 24 Rest & Leisure

No observable content regarding rest and leisure.

Article 25 Standard of Living

No observable content regarding standard of living or health.

Article 26 Education

No observable content regarding education.

Article 28 Social & International Order

No observable content regarding social/international order.

Article 29 Duties to Community

No observable content regarding duties to community.

Article 30 No Destruction of Rights

No observable content regarding limitation of rights.

Structural Channel

What the site does

+0.50

Article 27 Cultural Participation

High A: scientific advancement and knowledge access P: open authorship and intellectual property respect

Structural

+0.50

Context Modifier

SETL

0.00

GitHub repositories provide open-source code under permissive licenses. Attribution clear throughout (author name, publication date, citations). Multiple links enable direct access to underlying work and allow reuse, modification, and redistribution by others.

+0.40

Article 19 Freedom of Expression

High A: free expression and thought P: open knowledge dissemination

Structural

+0.40

Context Modifier

SETL

+0.22

Blog platform provides unmediated publishing capability. GitHub links enable free access to open-source code. No paywalls, gating, or censorship signals. Content is discoverable and shareable.

+0.10

Preamble Preamble

Medium A: human dignity in human-AI collaboration F: skepticism toward technological determinism

Structural

+0.10

Context Modifier

SETL

+0.14

Open blog platform facilitates unmediated expression. No barriers to author's communication.

+0.05

Article 23 Work & Equal Pay

Low F: worker dignity in human-agent collaboration

Structural

+0.05

Context Modifier

SETL

+0.12

No direct structural signals regarding working conditions.

0.00

Article 1 Freedom, Equality, Brotherhood

Medium A: human reason and critical thinking F: defense against technological determinism

Structural

0.00

Context Modifier

SETL

+0.20

No structural signals specific to this article.

Article 2 Non-Discrimination

No structural signals.

Article 3 Life, Liberty, Security

No structural signals.

Article 4 No Slavery

No structural signals.

Article 5 No Torture

No structural signals.

Article 6 Legal Personhood

No structural signals.

Article 7 Equality Before Law

No structural signals.

Article 8 Right to Remedy

No structural signals.

Article 9 No Arbitrary Detention

No structural signals.

Article 10 Fair Hearing

No structural signals.

Article 11 Presumption of Innocence

No structural signals.

Article 12 Privacy

No structural signals.

Article 13 Freedom of Movement

No structural signals.

Article 14 Asylum

No structural signals.

Article 15 Nationality

No structural signals.

Article 16 Marriage & Family

No structural signals.

Article 17 Property

No structural signals.

Article 18 Freedom of Thought

No structural signals.

Article 20 Assembly & Association

No structural signals.

Article 21 Political Participation

No structural signals.

Article 22 Social Security

No structural signals.

Article 24 Rest & Leisure

No structural signals.

Article 25 Standard of Living

No structural signals.

Article 26 Education

No structural signals.

Article 28 Social & International Order

No structural signals.

Article 29 Duties to Community

No structural signals.

Article 30 No Destruction of Rights

No structural signals.
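
The report does not define how the SETL value on each card is derived. As a hedged observation only: one formula that happens to reproduce all five displayed per-article values, and the +0.14 aggregate as their mean, is the signed geometric mean of the Editorial/Structural gap and the larger of the two magnitudes. The function name `setl` and the formula itself are inferred from the numbers shown here, not taken from any documentation.

```python
import math

# (editorial, structural) score pairs copied from the cards in this report.
scores = {
    "Article 19": (0.50, 0.40),
    "Article 27": (0.50, 0.50),
    "Preamble": (0.20, 0.10),
    "Article 1": (0.20, 0.00),
    "Article 23": (0.15, 0.05),
}


def setl(e, s):
    """Inferred (unconfirmed) formula: sign(e - s) * sqrt(|e - s| * max(|e|, |s|))."""
    gap = e - s
    if gap == 0:
        return 0.0
    magnitude = math.sqrt(abs(gap) * max(abs(e), abs(s)))
    return math.copysign(magnitude, gap)


per_article = {article: setl(e, s) for article, (e, s) in scores.items()}
aggregate = sum(per_article.values()) / len(per_article)

for article, value in per_article.items():
    print(f"{article}: {value:+.2f}")
print(f"Aggregate: {aggregate:+.2f}")
```

A positive result means the editorial score exceeds the structural one, which matches the "Editorial-dominant" label on the aggregate row.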

Supplementary Signals

How this content communicates, beyond directional lean.

Epistemic Quality ℹ

How well-sourced and evidence-based is this content?

0.77 medium claims

Sources		0.8
Evidence		0.8
Uncertainty		0.7
Purpose		0.9

Propaganda Flags ℹ

No manipulative rhetoric detected

0 techniques detected

Emotional Tone ℹ

Emotional character: positive/negative, intensity, authority

measured

Valence		+0.5
Arousal		0.3
Dominance		0.6

Transparency ℹ

Does the content identify its author and disclose interests?

0.50

✓ Author

More signals: context, framing & audience

Solution Orientation ℹ

Does this content offer solutions or only describe problems?

0.85 solution oriented

Reader Agency

0.8

Stakeholder Voice ℹ

Whose perspectives are represented in this content?

0.42 3 perspectives

Speaks: individuals, corporation

About: workers, institution

Temporal Framing ℹ

Is this content looking backward, at the present, or forward?

mixed · long term

Geographic Scope ℹ

What geographic area does this content cover?

global

San Francisco, United States

Complexity ℹ

How accessible is this content to a general audience?

technical · high jargon · domain specific

Longitudinal 75 HN snapshots · 54 evals

Audit Trail 74 entries

2026-03-02 08:17	eval_success	Evaluated: Neutral (0.00)	- -
2026-03-02 08:17	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-03-02 08:17	eval	Evaluated by deepseek-v3.2: +0.00 (Neutral) 17,891 tokens -0.30
2026-03-01 18:48	eval_success	Evaluated: Moderate positive (0.30)	- -
2026-03-01 18:48	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-03-01 18:48	eval	Evaluated by deepseek-v3.2: +0.30 (Moderate positive) 12,717 tokens +0.20
2026-03-01 18:48	rater_validation_warn	Validation warnings for model deepseek-v3.2: 0W 2R	- -
2026-03-01 17:37	eval_success	Evaluated: Neutral (0.10)	- -
2026-03-01 17:37	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-03-01 17:37	eval	Evaluated by deepseek-v3.2: +0.10 (Neutral) 13,498 tokens -0.16
2026-02-28 17:11	eval_success	Lite evaluated: Neutral (0.00)	- -
2026-02-28 17:11	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-02-28 17:11	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 15:38	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-02-28 15:38	eval_success	Lite evaluated: Neutral (0.00)	- -
2026-02-28 15:38	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 15:26	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-02-28 15:26	eval_success	Lite evaluated: Neutral (0.00)	- -
2026-02-28 15:26	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 13:57	eval_success	Lite evaluated: Neutral (0.00)	- -
2026-02-28 13:57	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-02-28 13:57	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 13:26	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-02-28 13:26	eval_success	Lite evaluated: Neutral (0.00)	- -
2026-02-28 13:26	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 13:07	eval_success	Lite evaluated: Neutral (0.00)	- -
2026-02-28 13:07	rater_validation_warn	Lite validation warnings for model llama-3.3-70b-wai: 0W 1R	- -
2026-02-28 13:07	model_divergence	Cross-model spread 0.32 exceeds threshold (3 models)	- -
2026-02-28 13:07	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 13:02	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 11:57	eval	Evaluated by claude-haiku-4-5-20251001: +0.32 (Moderate positive)
2026-02-28 11:27	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 10:56	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 10:38	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 09:14	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 08:51	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 08:46	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 08:19	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 08:17	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 08:14	eval	Evaluated by deepseek-v3.2: +0.26 (Mild positive) 13,551 tokens +0.21
2026-02-28 07:30	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 07:02	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 07:00	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 06:43	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 06:42	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 06:29	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 06:11	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 05:48	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 05:41	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 05:26	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 05:24	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 04:30	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 04:18	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 04:17	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 03:48	eval	Evaluated by deepseek-v3.2: +0.05 (Neutral) 13,708 tokens
2026-02-28 03:41	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 03:36	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 03:14	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 03:12	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 03:02	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 02:40	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 02:38	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 02:36	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 02:36	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 02:13	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 01:58	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 01:54	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 01:51	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 01:38	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 01:30	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 01:30	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
	reasoning ED, neutral AI tech exploration
2026-02-28 01:29	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
	reasoning Technical blog post
2026-02-28 01:19	eval	Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
	reasoning Technical blog post
2026-02-28 00:44	eval	Evaluated by llama-4-scout-wai: 0.00 (Neutral)
	reasoning ED, neutral AI tech exploration
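
Two patterns in the audit trail can be illustrated with a short sketch. It assumes the "cross-model spread" is the gap between the highest and lowest per-model scores, and that the trailing number on each deepseek-v3.2 entry is the change from that model's previous evaluation. Both readings are consistent with the entries above but are not documented here. The threshold value and the names `SPREAD_THRESHOLD`, `cross_model_spread`, and `deltas` are hypothetical.

```python
# Most recent score per model, copied from the audit trail in this report.
latest_scores = {
    "claude-haiku-4-5-20251001": 0.32,
    "deepseek-v3.2": 0.00,
    "llama-4-scout-wai": 0.00,
    "llama-3.3-70b-wai": 0.00,
}

# Hypothetical threshold: the report only says the spread "exceeds threshold".
SPREAD_THRESHOLD = 0.25


def cross_model_spread(scores):
    """Assumed definition: highest model score minus lowest model score."""
    values = list(scores.values())
    return max(values) - min(values)


def deltas(history):
    """Change between consecutive evaluations, oldest first."""
    return [round(new - old, 2) for old, new in zip(history, history[1:])]


spread = round(cross_model_spread(latest_scores), 2)
diverged = spread > SPREAD_THRESHOLD

# deepseek-v3.2 scores in chronological order, taken from the entries above.
deepseek_history = [0.05, 0.26, 0.10, 0.30, 0.00]
deepseek_deltas = deltas(deepseek_history)

print(f"Cross-model spread {spread:.2f}, diverged: {diverged}")
print("deepseek-v3.2 deltas:", deepseek_deltas)
```

The computed spread of 0.32 matches the logged `model_divergence` entries, and the deltas (+0.21, -0.16, +0.20, -0.30) match the trailing figures on the deepseek-v3.2 rows.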

build 1ad9551+j7zs · deployed 2026-03-02 09:09 UTC · evaluated 2026-03-02 11:31:12 UTC