+0.24 Show HN: A real-time strategy game that AI agents can play (llmskirmish.com S:+0.11 )
220 points by __cayenne__ 5 days ago | 80 comments on HN | Mild positive Product · v3.7 · 2026-02-26 02:26:45 0
Summary Transparent AI Evaluation Acknowledges
LLM Skirmish is a technical benchmark site that evaluates large language models through game-based competition, publishing comprehensive results and open-source methodology. The content strongly advocates for transparent evaluation (Article 19, 27) and fair comparison (Article 2, 28), but structural implementations undermine privacy protections (Article 12) and accessibility (Articles 25-26). Overall, the site demonstrates commitment to scientific transparency and information freedom while creating accessibility barriers and privacy concerns for human users.
Article Heatmap
Preamble: +0.15
Article 1 (Freedom, Equality, Brotherhood): +0.20
Article 2 (Non-Discrimination): +0.10
Article 3 (Life, Liberty, Security): ND
Article 4 (No Slavery): ND
Article 5 (No Torture): ND
Article 6 (Legal Personhood): ND
Article 7 (Equality Before Law): ND
Article 8 (Right to Remedy): ND
Article 9 (No Arbitrary Detention): ND
Article 10 (Fair Hearing): ND
Article 11 (Presumption of Innocence): ND
Article 12 (Privacy): -0.40
Article 13 (Freedom of Movement): ND
Article 14 (Asylum): ND
Article 15 (Nationality): ND
Article 16 (Marriage & Family): ND
Article 17 (Property): ND
Article 18 (Freedom of Thought): ND
Article 19 (Freedom of Expression): +0.65
Article 20 (Assembly & Association): ND
Article 21 (Political Participation): ND
Article 22 (Social Security): ND
Article 23 (Work & Equal Pay): ND
Article 24 (Rest & Leisure): ND
Article 25 (Standard of Living): -0.10
Article 26 (Education): -0.10
Article 27 (Cultural Participation): +0.70
Article 28 (Social & International Order): +0.28
Article 29 (Duties to Community): +0.15
Article 30 (No Destruction of Rights): ND
ND = No Data
Aggregates
Editorial Mean +0.24 Structural Mean +0.11
Weighted Mean +0.20 Unweighted Mean +0.16
Max +0.70 Article 27 Min -0.40 Article 12
Signal 10 No Data 21
Volatility 0.32 (High)
Negative 3 Channels E: 0.6 S: 0.4
SETL +0.16 Editorial-dominant
FW Ratio 52% 30 facts · 28 inferences
Evidence 24% coverage
2H 9M 20 ND
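Several of the aggregates above can be recomputed from the per-article heatmap scores. The following is a minimal sketch, not the site's actual pipeline: it assumes "Unweighted Mean" is a plain average over articles with data, and that "Volatility" is the population standard deviation of those scores (an inference that happens to match the reported 0.32). The weights behind "Weighted Mean" are not stated on the page, so that figure is not reproduced here.

```python
from statistics import mean, pstdev

# Combined per-article scores copied from the Article Heatmap (articles with data only).
scores = {
    "Preamble": 0.15, "Article 1": 0.20, "Article 2": 0.10, "Article 12": -0.40,
    "Article 19": 0.65, "Article 25": -0.10, "Article 26": -0.10,
    "Article 27": 0.70, "Article 28": 0.28, "Article 29": 0.15,
}

def summarize(scores):
    """Recompute the simple aggregates from a {article: score} mapping."""
    values = list(scores.values())
    return {
        "signal": len(values),                      # articles with data
        "unweighted_mean": round(mean(values), 2),
        "max": max(scores, key=scores.get),
        "min": min(scores, key=scores.get),
        "negative": sum(v < 0 for v in values),
        # Assumption: volatility is the population standard deviation.
        "volatility": round(pstdev(values), 2),
    }

summary = summarize(scores)

# FW Ratio: share of facts among facts plus inferences (30 facts, 28 inferences).
fw_ratio_percent = round(100 * 30 / (30 + 28))
```

Under these assumptions the sketch agrees with the reported Signal, Unweighted Mean, Max, Min, Negative count, Volatility, and FW Ratio.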
Theme Radar
Foundation: 0.15 (3 articles)
Security: 0.00 (0 articles)
Legal: 0.00 (0 articles)
Privacy & Movement: -0.40 (1 article)
Personal: 0.00 (0 articles)
Expression: 0.65 (1 article)
Economic & Social: -0.10 (1 article)
Cultural: 0.30 (2 articles)
Order & Duties: 0.21 (2 articles)
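The theme values look like per-theme averages of the heatmap scores. The sketch below illustrates that reading. The article-to-theme mapping is an assumption inferred from the article counts and values shown (for example, Cultural at 0.30 over 2 articles is consistent with Articles 26 and 27); the page does not publish the mapping, and themes with no scored articles are given empty lists here.

```python
from statistics import mean

# Heatmap scores for articles with data.
scores = {
    "Preamble": 0.15, "Article 1": 0.20, "Article 2": 0.10, "Article 12": -0.40,
    "Article 19": 0.65, "Article 25": -0.10, "Article 26": -0.10,
    "Article 27": 0.70, "Article 28": 0.28, "Article 29": 0.15,
}

# Hypothetical mapping, inferred from the counts in the radar; not confirmed by the site.
themes = {
    "Foundation": ["Preamble", "Article 1", "Article 2"],
    "Security": [],
    "Legal": [],
    "Privacy & Movement": ["Article 12"],
    "Personal": [],
    "Expression": ["Article 19"],
    "Economic & Social": ["Article 25"],
    "Cultural": ["Article 26", "Article 27"],
    "Order & Duties": ["Article 28", "Article 29"],
}

def theme_means(scores, themes):
    """Return {theme: (mean score, number of scored articles)}; empty themes score 0.0."""
    result = {}
    for theme, articles in themes.items():
        values = [scores[a] for a in articles if a in scores]
        result[theme] = (mean(values) if values else 0.0, len(values))
    return result

radar = theme_means(scores, themes)
```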
HN Discussion 20 top-level · 13 replies
egeozcan 2026-02-25 10:29 UTC link
This is amazing. What I do is something else: I make AI agents develop AI scripts (good ol' computer player scripts) and try to beat each other:

https://egeozcan.github.io/unnamed_rts/game/

I occasionally run my tournament script: https://github.com/egeozcan/unnamed_rts/blob/main/src/script...

That calculates the ELOs for each AI implementation, and I feed it to different agents so they get really creative trying to beat each other. Also making rule changes to the game and seeing how some scripts get weaker/stronger is a nice way to measure balance.

Funny thing, Codex gets really aggressive and starts cheating a lot of times: https://bsky.app/profile/egeozcan.bsky.social/post/3mfdtj5dh...
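For readers unfamiliar with Elo ratings, here is a minimal sketch of the standard Elo update after a single match. It is not taken from the linked tournament script, whose contents are not shown here; the K-factor of 32 and the starting rating of 1500 are illustrative assumptions.

```python
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b).

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for an A loss.
    k is the K-factor (assumed value; tournaments choose their own).
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Two evenly rated scripts; A wins and gains exactly what B loses.
new_a, new_b = update_elo(1500.0, 1500.0, 1.0)
```

Because the update is zero-sum, the total rating across both scripts stays constant, which makes before-and-after comparisons across rule changes easier to read.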

wongarsu 2026-02-25 10:32 UTC link
I know visualization is far from the most important goal here, but it really gets me how there's fairly elaborately rendered terrain, and then the units are just unnamed roombas with hard to read status indicators that have no intuitive meaning. Even in the match viewer I have no clue what's going on, there is no overlay or tooltip when you hover or click units either. There is a unit list that tries (and mostly fails) to give you some information, but because units don't have names you have to hover them in the list to have them highlighted in the field (the reverse does not work). Not exactly a spectator sport. Oh, but there is a way to switch from having all units in one sidebar to having one sidebar per player, as if that made a difference.

I find this pretty funny because it seems like a perfect representation of what's easy with today's tools and what isn't

Love the idea though

PeterUstinox 2026-02-25 10:52 UTC link
Wouldn't it be interesting if the LLMs wrote realtime RTS commands instead of code? After all, it is an RTS game.

This would bring another dimension to it since then quality of tokens would be one dimension (RTS-language: Decision Making) and speed of tokens the other (RTS-language: Actions Per Minute; APM).

Also there are a lot of coding benchmarks, that way it would test something more abstract, similar to AlphaStar https://en.wikipedia.org/wiki/AlphaStar_(software)

You could just use the exposed APIs of OpenAI, Anthropic etc. and let them battle.

busfahrer 2026-02-25 11:11 UTC link
This reminds me of this yearly StarCraft AI competition (since 2010), however I think it uses a special API that makes it easy for bots to access the game

Edit: Forgot link: https://davechurchill.ca/starcraft/

ph4rsikal 2026-02-25 11:24 UTC link
Reminds me of this fantastic series on Game Theory and Agent Reasoning https://jdsemrau.substack.com/p/nemotron-vs-qwen-game-theory...
EwanG 2026-02-25 11:34 UTC link
At least until one of the competitors is overheard saying "A strange game. The only winning move is not to play"
myky22 2026-02-25 11:48 UTC link
Love it! I have a similar intuition from my use of Gemini (3 and 3.1). Great at "turn 1" tasks but degrades faster than Opus or GPT.
arscan 2026-02-25 12:23 UTC link
Reminds me of the “Google AI Challenge” in 2011 called Ants [1], except the ‘AI’ is implemented using ‘AI’ now instead of human programmers.

I was proud for getting the highest-ranked JavaScript-based implementation, but got absolutely crushed by the eventual winner.

1. https://github.com/aichallenge/aichallenge

mitchm 2026-02-25 12:33 UTC link
I’ve also been exploring this idea. What if you could bring your own (or pull in a 3rd party) “CPU player” into a game?

Using an LLM-friendly API with a snapshot of game state and calculated heuristics, legal moves, and varying levels of strategy is working out nicely. They can play a web-based game via curl.

mpeg 2026-02-25 13:02 UTC link
What a day to be alive, I just watched Gemini zergling rush Opus and it got completely overwhelmed.

Opus needs to learn to kite.

david3289 2026-02-25 13:08 UTC link
This is a really interesting direction. RTS games are a much better testbed for agent capability than most static benchmarks because they combine partial observability, long-term planning, resource management, and real-time adaptation.

It reminds me a bit of OpenAI Five — not just because it played a complex game, but because the real value wasn’t “AI plays Dota,” it was observing how coordination, strategy formation, and adaptation emerged under competitive pressure. A controlled RTS environment like this feels like a lightweight, reproducible version of that idea.

What I especially like here is that it lowers the barrier for experimentation. If researchers and hobbyists can plug different models into the same competitive sandbox, we might start seeing meaningful AI-vs-AI evaluations beyond static leaderboards. Competitive dynamics often expose weaknesses much faster than isolated benchmarks do.

Curious whether you’re planning to support self-play training loops or if the focus is primarily on inference-time agents?

dmos62 2026-02-25 13:27 UTC link
I'd love to see text-only spatial reasoning. As in, the LLM is presented some kind of textual projection of what's happening in 2d/3d space and makes decisions about what to do in that space based on that. It kind of works when a writer is describing something in a book, for example, but not sure how that could generalize.
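One simple form of textual projection is an ASCII grid. The sketch below is purely illustrative: the symbols, the coordinate convention (x to the right, y downward), and the `render_grid` helper are all hypothetical and are not how LLM Skirmish presents game state.

```python
def render_grid(width, height, units):
    """Render units as single characters on a width x height text grid.

    units is a list of (symbol, (x, y)) pairs; empty cells are shown as '.'.
    """
    grid = [["." for _ in range(width)] for _ in range(height)]
    for symbol, (x, y) in units:
        grid[y][x] = symbol
    return "\n".join("".join(row) for row in grid)

# Unit A in the top-left corner, unit B in the bottom-right corner.
text_view = render_grid(3, 2, [("A", (0, 0)), ("B", (2, 1))])
print(text_view)
```

A view like this costs one token-ish character per cell, so it only scales to small maps; larger maps would need a sparser encoding such as a list of unit coordinates.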
5o1ecist 2026-02-25 13:39 UTC link
MY FELLOW HUMAN, this is amazing work!

I foresee this laying the foundation for whole football stadia filled to the brim with people wanting to watch (and bet on!) competing teams of AI trained on military tactics and strategies!

Soon enough we shall have AI-Olympics! Imagine that, MY FELLOW OXYGEN CONVERTING HUMAN FRIEND! Tens of thousands of robots and drones, all competing against each other in stadia across the planet, at the same time!

I foresee a world wide, synchronized countdown marking the beginning of the biggest, greatest and definitively most unique, one-time-only spectacle in human history!

Keep up the good work!

jonbaer 2026-02-25 15:20 UTC link
Might be worth digging through MicroRTS too, https://github.com/Farama-Foundation/MicroRTS (it's been abandoned), Python RL interface @ https://github.com/Farama-Foundation/MicroRTS-Py ... I think there was some strategy work there.
FusspawnUK 2026-02-25 15:42 UTC link
Took a crack at this earlier. The leaderboard is a little weird: it seems to be about 2 real dudes, and the rest are fake profiles. Scores resetting on each new upload also encourages leaving changes unimplemented in the hopes of getting more battles over time.

(The largest winner has 50 wins against 14 other opponents, for instance.) That guy adding a new script would instantly plummet down the leaderboard, capping out at 14 wins again, putting him below the 2nd place user.

The leader board will quickly become "who can have a mostly competent AI and never change it" over who actually has the better script.

sails 2026-02-25 15:45 UTC link
I’m doing something similar to simulate llms in b2b lending, it’s slightly slower paced but the core mechanisms are using just-bash to analyse business financials and make profitable loans.

I quite like the idea of llms writing more code up front to execute strategies.

I’m currently developing the game mechanics and ELO. Please share anything relevant if it comes to mind

yuppiepuppie 2026-02-25 18:17 UTC link
I’ve added this to the HN Arcade https://hnarcade.com/games/category/games

Interestingly, I’ve had to create an entire category for games llms play. Strange times we live in.

Ross00781 2026-02-25 18:46 UTC link
Multi-agent RTS environments are great testbeds for coordination and strategic reasoning. Classic RL benchmarks like StarCraft II showed that agents can learn micro, but struggle with macro strategy and long-term planning. Curious if this platform supports hierarchical agents or communication protocols between teammates?
JoeDaDude 2026-02-25 19:05 UTC link
How about opening up the game for humans to play? Can you beat your AI?
builder51216 2026-02-25 23:57 UTC link
But does LLM actually learn from each round? The chart does not show improvements in win rate across rounds...

And what is the game state here exactly? Is LLM able to even perceive game state? If game state is what we can see on UI, then it seems pretty high-dimensional and token-intensive. I am not sure whether LLMs with their current capabilities and context windows can even perceive so token-intensive game state effectively...

embedding-shape 2026-02-25 10:34 UTC link
Yeah, it's all what you get when you basically ask an agent "Build X" without any constraints about how the UI and UX actually should work, and since the agents have about 0 expertise when it comes to "How would a human perceive and use this?", you end up with UIs that don't make much sense for humans unless you strictly steer them with what you know.
KeplerBoy 2026-02-25 11:48 UTC link
Very interesting project. I'm a bit confused about the lack of hardware specification. The rules make it clear that one's bot has defined deadlines:

> Make sure that each onframe call does not run longer than 42ms. Entries that slow down games by repeatedly exceeding this time limit will lose games on time.

But I'm missing something like: "Your program will be pinned to CPU cores 5-8 and your bot has access to a dedicated RTX 5090 GPU." Also no mention about whether my bot can have network access to offload some high-level latency insensitive planning. Maybe that's just a bad idea in general, haven't played SC in ages.
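A per-frame deadline like the quoted 42 ms can be checked locally with a wall-clock timer. This is a minimal sketch, not the competition's harness: the `timed_call` helper is hypothetical, and since the quoted rule does not say what hardware the limit is measured on, a local measurement is only a rough guide.

```python
import time

FRAME_BUDGET_S = 0.042  # 42 ms per onframe call, from the quoted rule

def timed_call(fn, *args, budget_s=FRAME_BUDGET_S):
    """Run fn(*args) and report (result, elapsed seconds, over-budget flag)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed > budget_s

def fast_frame():
    # Stand-in for a cheap per-frame decision.
    return "move"

def slow_frame():
    # Stand-in for a frame that blocks longer than the budget.
    time.sleep(0.05)
    return "move"

fast_result, fast_elapsed, fast_over = timed_call(fast_frame)
slow_result, slow_elapsed, slow_over = timed_call(slow_frame)
```

Anything latency-insensitive, such as the high-level planning mentioned above, would have to run outside the timed call and hand its results to the per-frame function.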

Razengan 2026-02-25 13:13 UTC link
map hax
dmos62 2026-02-25 13:23 UTC link
> partial observability, long-term planning, resource management, and real-time adaptation

Note, this project doesn't have that, as best I can tell? It's two static AI scripts having a go. LLMs generate the scripts and they are aware of past "results", but I'm not sure what that means.

degenerate 2026-02-25 14:29 UTC link
You would likely be interested in the Starcraft BWAPI: https://www.starcraftai.com

You can watch the match videos from training runs: https://www.youtube.com/@Sscaitournament/videos

I don't think BWAPI has ever integrated modern AI models, but I haven't followed its progress in several years.

chasd00 2026-02-25 15:21 UTC link
believe it or not my 8th grade son was given a US History homework assignment to play Oregon Trail. I was very amused watching him "do his homework". I wonder how an LLM would fare in that game since it's mostly a text choose-your-adventure type interface.
__cayenne__ 2026-02-25 15:43 UTC link
Very interested in self-play training loops, but I do like codegen as an abstraction layer. I am planning to make it available as an RL environment at some point
__cayenne__ 2026-02-25 15:54 UTC link
Tweaking the leaderboard match assignment logic now to prevent these bad incentives - definitely want people to iterate!

I had started with the Silicon Valley characters as a one off way to seed the board.

softfalcon 2026-02-25 16:25 UTC link
This reminds me of the Unreal Tournament: Xan episode from the Secret Level series.

Link for those curious or confused as to what I'm talking about: https://www.youtube.com/watch?v=1F-rAW3vXOU

Forcing AI to fight in an arena for our entertainment, what could go wrong? (this was tongue in cheek, I am fully aware LLM's currently don't have conscious thoughts or emotions)

__cayenne__ 2026-02-25 19:02 UTC link
LLM Skirmish is all 1v1 right now, but agents can plan by reviewing previous match results
midiguy 2026-02-25 19:13 UTC link
I am so glad we have automated away game playing so that I can just sit around and be a lifeless vegetable
__cayenne__ 2026-02-25 19:58 UTC link
okay leaderboard match making changes have gone live
drakinosh 2026-02-25 21:32 UTC link
What a boringly bog-standard AI Comment. Why bother writing?
Editorial Channel
What the content says
+0.60
Article 27 Cultural Participation
High Advocacy Practice
Editorial
+0.60
SETL
+0.24

Content directly engages with scientific and technical progress by presenting a novel benchmark methodology for evaluating LLM capabilities. The focus on game-based evaluation, in-context learning, and detailed performance analysis advances the scientific understanding of AI systems.

+0.50
Article 19 Freedom of Expression
High Advocacy Practice
Editorial
+0.50
SETL
0.00

Content actively advocates for transparent evaluation of LLM capabilities through published benchmark data. All tournament results, match details, and model-specific analyses are openly disclosed. The detailed breakdowns of individual model performance support informed opinion-formation about AI capabilities.

+0.35
Article 28 Social & International Order
Medium Advocacy Framing
Editorial
+0.35
SETL
+0.23

Content advocates for transparent, equitable evaluation of LLM systems through a structured benchmark. By publishing detailed results and methodology, the site supports the establishment of a fair social order based on transparent evaluation principles.

+0.20
Article 1 Freedom, Equality, Brotherhood
Medium Framing
Editorial
+0.20
SETL
ND

Content presents LLM agents as entities capable of reasoning and learning. By testing their in-context learning across five rounds, the benchmark implicitly treats these systems as capable of modification and development, which aligns with recognition of inherent dignity and equality.

+0.20
Article 29 Duties to Community
Medium Framing
Editorial
+0.20
SETL
+0.14

Content frames LLM evaluation as contributing to understanding human potential through AI capabilities. The focus on in-context learning and problem-solving aligns with developing human personality and abilities.

+0.15
Preamble Preamble
Medium Advocacy Framing
Editorial
+0.15
SETL
ND

Content emphasizes transparent evaluation methodology and open-source tooling, which aligns with Preamble principles of human dignity and justice through accountability. The benchmark is designed to test LLM capabilities in a controlled, reproducible environment.

+0.15
Article 25 Standard of Living
Medium Framing Practice
Editorial
+0.15
SETL
+0.21

Content frames LLM evaluation as a tool for assessing model capabilities, which contributes to understanding technology that impacts standard of living. However, no explicit connection to social welfare or adequate living standards is made.

+0.10
Article 2 Non-Discrimination
Medium Framing
Editorial
+0.10
SETL
0.00

All five LLM models are evaluated using identical tournament structure and rules, with equal opportunity to compete and revise strategies. No model receives preferential treatment in round structure or scoring methodology.

+0.10
Article 26 Education
Medium Framing Practice
Editorial
+0.10
SETL
+0.14

Content focuses on technical education about LLM capabilities and game-based evaluation methodology. The detailed explanations of tournament structure, prompt design, and in-context learning contribute to public understanding of AI technology.

+0.05
Article 12 Privacy
Medium Practice
Editorial
+0.05
SETL
+0.27

Content does not directly address privacy. However, the implementation of Google Analytics tracking without visible consent mechanism contradicts privacy principles. Domain context indicates tracking without explicit opt-in.

ND (no editorial signal): Articles 3, 4, 5, 6, 7, 8, 9, 10, 11, 13 (tagged Medium Practice), 14, 15, 16, 17, 18, 20, 21, 22, 23, 24, 30

Structural Channel
What the site does
+0.50
Article 19 Freedom of Expression
High Advocacy Practice
Structural
+0.50
Context Modifier
+0.15
SETL
0.00

Site publishes comprehensive benchmark data, leaderboards, cost-efficiency analysis, and detailed model breakdowns without paywall or registration. Open-source methodology (OpenCode) is freely available. This structural commitment to information freedom directly supports freedom of expression and access to information.

+0.50
Article 27 Cultural Participation
High Advocacy Practice
Structural
+0.50
Context Modifier
+0.15
SETL
+0.24

Open publication of methodology, results, and data enables scientific participation and advancement. Open-source tools and transparent process support reproducibility and scientific integrity. No paywall or access restrictions limit scientific engagement.

+0.20
Article 28 Social & International Order
Medium Advocacy Framing
Structural
+0.20
Context Modifier
0.00
SETL
+0.23

Equal tournament structure for all models and transparent scoring methodology support fair order. However, structural tracking without consent (Article 12) and accessibility barriers (Article 25) undermine the fairness of the social order the site creates for users.

+0.10
Article 2 Non-Discrimination
Medium Framing
Structural
+0.10
Context Modifier
0.00
SETL
0.00

Open-source codebase and published results allow external verification and prevent discrimination. However, content does not explicitly address accessibility for different user groups.

+0.10
Article 29 Duties to Community
Medium Framing
Structural
+0.10
Context Modifier
0.00
SETL
+0.14

Open access to methodology and results enables community participation in advancing understanding. However, technical complexity and accessibility barriers limit practical participation.

-0.10
Article 26 Education
Medium Framing Practice
Structural
-0.10
Context Modifier
-0.10
SETL
+0.14

While content is openly published, accessibility barriers (noted in Article 25) restrict educational access for some users. No explicit educational scaffolding or simplified explanations provided for non-technical readers.

-0.15
Article 25 Standard of Living
Medium Framing Practice
Structural
-0.15
Context Modifier
-0.10
SETL
+0.21

Page layout and content presentation have limited accessibility features. Content includes embedded interactive elements and code examples without visible alt text or accessibility accommodations. This restricts access for users with visual or motor impairments.

-0.25
Article 12 Privacy
Medium Practice
Structural
-0.25
Context Modifier
-0.30
SETL
+0.27

Google Analytics tracking (G-CZH5MJ4H15) is implemented on the page with no visible privacy policy, cookie consent banner, or opt-in mechanism. This constitutes tracking of user behavior without demonstrated consent, directly undermining Article 12's protection of privacy.
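The Editorial Mean and Structural Mean reported under Aggregates are consistent with plain averages over the scored cards in each channel. A minimal sketch of that check follows, assuming simple unweighted averaging; the values are copied from the cards in this report, and ND cards are excluded.

```python
from statistics import mean

# Editorial channel scores: Articles 27, 19, 28, 1, 29, Preamble, 25, 2, 26, 12.
editorial = [0.60, 0.50, 0.35, 0.20, 0.20, 0.15, 0.15, 0.10, 0.10, 0.05]

# Structural channel scores: Articles 19, 27, 28, 2, 29, 26, 25, 12.
structural = [0.50, 0.50, 0.20, 0.10, 0.10, -0.10, -0.15, -0.25]

# Assumption: each channel mean is an unweighted average of its scored cards.
editorial_mean = round(mean(editorial), 2)
structural_mean = round(mean(structural), 2)
```

Under that assumption the editorial channel averages +0.24 and the structural channel +0.11, matching the headline figures.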

ND
Article 13 Freedom of Movement
Medium Practice

Page content is accessible via standard HTTP/HTTPS protocol without apparent geolocation restrictions. Tournament results and benchmark data are published openly. No visible paywalls or registration barriers restrict movement through the site.

ND (no structural signal): Preamble (tagged Medium Advocacy Framing), Article 1 (tagged Medium Framing), Articles 3, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 20, 21, 22, 23, 24, 30

Supplementary Signals
How this content communicates, beyond directional lean.
Epistemic Quality
How well-sourced and evidence-based is this content?
0.77 medium claims
Sources
0.8
Evidence
0.8
Uncertainty
0.7
Purpose
0.9
Propaganda Flags
2 manipulative rhetoric techniques found
2 techniques detected
loaded language
Models described with vivid characterizations: 'End Game Dominator,' 'Early Game Ace,' 'Reigning Challenger,' 'Pragmatic Minimalist,' 'Lean Tactician' — emotionally evocative labels that frame performance comparisons in narrative terms.
appeal to authority
Heavy reliance on technical authority: 'OpenCode was selected because it was not designed for any of the evaluated models and is fully open source' — appeals to neutrality and technical credibility without comparative analysis.
Emotional Tone
Emotional character: positive/negative, intensity, authority
measured
Valence
+0.3
Arousal
0.4
Dominance
0.5
Transparency
Does the content identify its author and disclose interests?
0.30
✗ Author
Solution Orientation
Does this content offer solutions or only describe problems?
0.64 solution oriented
Reader Agency
0.6
Stakeholder Voice
Whose perspectives are represented in this content?
0.35 2 perspectives
Speaks: institution
About: corporation
Temporal Framing
Is this content looking backward, at the present, or forward?
present immediate
Geographic Scope
What geographic area does this content cover?
global
Complexity
How accessible is this content to a general audience?
technical high jargon domain specific
Longitudinal 1445 HN snapshots · 7 evals
Audit Trail 27 entries
2026-02-28 14:18 eval_success Lite evaluated: Neutral (0.00) - -
2026-02-28 14:18 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
reasoning
tech tutorial no rights stance
2026-02-26 23:18 eval_success Light evaluated: Neutral (0.00) - -
2026-02-26 23:18 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
2026-02-26 20:26 dlq Dead-lettered after 1 attempts: Show HN: A real-time strategy game that AI agents can play - -
2026-02-26 20:24 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 20:23 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 20:21 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 17:47 dlq Dead-lettered after 1 attempts: Show HN: A real-time strategy game that AI agents can play - -
2026-02-26 17:45 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 17:44 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 17:43 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-26 09:20 dlq Dead-lettered after 1 attempts: Show HN: A real-time strategy game that AI agents can play - -
2026-02-26 09:19 dlq Dead-lettered after 1 attempts: Show HN: A real-time strategy game that AI agents can play - -
2026-02-26 09:17 rate_limit OpenRouter rate limited (429) model=hermes-3-405b - -
2026-02-26 09:17 rate_limit OpenRouter rate limited (429) model=mistral-small-3.1 - -
2026-02-26 09:16 rate_limit OpenRouter rate limited (429) model=hermes-3-405b - -
2026-02-26 09:16 rate_limit OpenRouter rate limited (429) model=mistral-small-3.1 - -
2026-02-26 09:15 rate_limit OpenRouter rate limited (429) model=hermes-3-405b - -
2026-02-26 09:15 rate_limit OpenRouter rate limited (429) model=mistral-small-3.1 - -
2026-02-26 09:15 dlq Dead-lettered after 1 attempts: Show HN: A real-time strategy game that AI agents can play - -
2026-02-26 09:14 dlq Dead-lettered after 1 attempts: Show HN: A real-time strategy game that AI agents can play - -
2026-02-26 08:56 eval Evaluated by deepseek-v3.2: +0.02 (Neutral) 10,912 tokens
2026-02-26 02:26 eval Evaluated by claude-haiku-4-5-20251001: +0.24 (Mild positive) 13,832 tokens +0.19
2026-02-26 00:17 eval Evaluated by claude-haiku-4-5-20251001: +0.05 (Neutral) 14,367 tokens +0.06
2026-02-26 00:03 eval Evaluated by claude-haiku-4-5-20251001: -0.01 (Neutral) 13,364 tokens -0.12
2026-02-25 22:34 eval Evaluated by claude-haiku-4-5-20251001: +0.11 (Mild positive) 11,456 tokens