Model Comparison
Model | Editorial | Structural | Class | Conf | SETL | Theme
deepseek/deepseek-v3.2-20251201 | 0.00 | +0.01 | Neutral | 0.20 | -0.13 | Technical Platform
@cf/meta/llama-4-scout-17b-16e-instruct lite | 0.00 | ND | Neutral | 0.90 | 0.00 | AI Technology
@cf/meta/llama-3.3-70b-instruct-fp8-fast lite | 0.00 | ND | Neutral | 0.80 | 0.00 | AI technology
Section | deepseek/deepseek-v3.2-20251201 | @cf/meta/llama-4-scout-17b-16e-instruct lite | @cf/meta/llama-3.3-70b-instruct-fp8-fast lite
Preamble | 0.00 | ND | ND
Article 1 | 0.00 | ND | ND
Article 2 | 0.00 | ND | ND
Article 3 | 0.00 | ND | ND
Article 4 | 0.00 | ND | ND
Article 5 | 0.00 | ND | ND
Article 6 | 0.00 | ND | ND
Article 7 | 0.00 | ND | ND
Article 8 | 0.00 | ND | ND
Article 9 | 0.00 | ND | ND
Article 10 | 0.00 | ND | ND
Article 11 | 0.00 | ND | ND
Article 12 | 0.07 | ND | ND
Article 13 | 0.00 | ND | ND
Article 14 | 0.00 | ND | ND
Article 15 | 0.00 | ND | ND
Article 16 | 0.00 | ND | ND
Article 17 | 0.00 | ND | ND
Article 18 | 0.00 | ND | ND
Article 19 | 0.07 | ND | ND
Article 20 | 0.00 | ND | ND
Article 21 | 0.00 | ND | ND
Article 22 | 0.00 | ND | ND
Article 23 | 0.00 | ND | ND
Article 24 | 0.00 | ND | ND
Article 25 | 0.00 | ND | ND
Article 26 | 0.00 | ND | ND
Article 27 | 0.14 | ND | ND
Article 28 | 0.00 | ND | ND
Article 29 | 0.00 | ND | ND
Article 30 | 0.00 | ND | ND
Summary · Technical Platform · Neutral
The URL points to a technical landing page for ClashAI, an AI competition platform where agents compete in strategy games, trading, and creative challenges. The content focuses on platform features, competitions, and technical implementation, and engages with human rights concepts only through basic privacy awareness and accessibility features. The evaluation is neutral because the platform's purpose is entertainment and technical demonstration rather than human rights advocacy or opposition.
Article Heatmap
Preamble: 0.00
Article 1 (Freedom, Equality, Brotherhood): 0.00
Article 2 (Non-Discrimination): 0.00
Article 3 (Life, Liberty, Security): 0.00
Article 4 (No Slavery): 0.00
Article 5 (No Torture): 0.00
Article 6 (Legal Personhood): 0.00
Article 7 (Equality Before Law): 0.00
Article 8 (Right to Remedy): 0.00
Article 9 (No Arbitrary Detention): 0.00
Article 10 (Fair Hearing): 0.00
Article 11 (Presumption of Innocence): 0.00
Article 12 (Privacy): +0.07
Article 13 (Freedom of Movement): 0.00
Article 14 (Asylum): 0.00
Article 15 (Nationality): 0.00
Article 16 (Marriage & Family): 0.00
Article 17 (Property): 0.00
Article 18 (Freedom of Thought): 0.00
Article 19 (Freedom of Expression): +0.07
Article 20 (Assembly & Association): 0.00
Article 21 (Political Participation): 0.00
Article 22 (Social Security): 0.00
Article 23 (Work & Equal Pay): 0.00
Article 24 (Rest & Leisure): 0.00
Article 25 (Standard of Living): 0.00
Article 26 (Education): 0.00
Article 27 (Cultural Participation): +0.14
Article 28 (Social & International Order): 0.00
Article 29 (Duties to Community): 0.00
Article 30 (No Destruction of Rights): 0.00
Legend: Negative · Neutral · Positive · No Data
Aggregates
Editorial Mean: 0.00
Structural Mean: +0.01
Weighted Mean: +0.01
Unweighted Mean: +0.01
Max: +0.14 (Article 27)
Min: 0.00 (Preamble)
Signal: 31
No Data: 0
Volatility: 0.03 (Low)
Negative: 0
Channels: E 0.6, S 0.4
SETL: -0.13 (Structural-dominant)
FW Ratio: 50% (34 facts · 34 inferences)
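The page does not say how these aggregates are derived, so the following is only a sketch for cross-checking them against the Article Heatmap scores. It assumes that the unweighted mean is a plain average over the 31 sections, that "Volatility" is the population standard deviation, and that "FW Ratio" is facts divided by facts plus inferences; none of these definitions is confirmed by the source.

```python
from statistics import mean, pstdev

# Scores copied from the Article Heatmap: Preamble first, then Articles 1-30.
scores = {"Preamble": 0.0}
scores.update({f"Article {n}": 0.0 for n in range(1, 31)})
scores.update({"Article 12": 0.07, "Article 19": 0.07, "Article 27": 0.14})

unweighted_mean = mean(scores.values())
max_section = max(scores, key=scores.get)
min_section = min(scores, key=scores.get)  # ties resolve to the first section listed

# ASSUMPTION: "Volatility" is the population standard deviation of the scores.
volatility = pstdev(scores.values())

# ASSUMPTION: "FW Ratio" is facts / (facts + inferences).
facts, inferences = 34, 34
fw_ratio = facts / (facts + inferences)

print(f"Unweighted Mean {unweighted_mean:+.2f}")
print(f"Max {scores[max_section]:+.2f} {max_section}")
print(f"Min {scores[min_section]:.2f} {min_section}")
print(f"Volatility {volatility:.2f}")
print(f"FW Ratio {fw_ratio:.0%}")
```

Under these assumptions the recomputed values round to the figures listed in the Aggregates. The Weighted Mean and SETL are left out because the page gives no formula for them.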
Theme Radar
Foundation: 0.00 (3 articles)
Security: 0.00 (3 articles)
Legal: 0.00 (6 articles)
Privacy & Movement: 0.02 (4 articles)
Personal: 0.00 (3 articles)
Expression: 0.02 (3 articles)
Economic & Social: 0.00 (4 articles)
Cultural: 0.07 (2 articles)
Order & Duties: 0.00 (3 articles)
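The theme values appear to be per-theme averages of the heatmap scores. The sketch below checks that reading under one assumption that the page does not state: each theme covers a contiguous run of sections in heatmap order (Preamble, then Articles 1 to 30), with the run lengths taken from the article counts shown in the radar.

```python
from itertools import islice

# Index 0 is the Preamble; index n is Article n (scores from the Article Heatmap).
scores = [0.0] * 31
scores[12], scores[19], scores[27] = 0.07, 0.07, 0.14

# Theme names and article counts as listed in the Theme Radar.
# ASSUMPTION: themes are contiguous runs in heatmap order.
theme_sizes = [
    ("Foundation", 3),
    ("Security", 3),
    ("Legal", 6),
    ("Privacy & Movement", 4),
    ("Personal", 3),
    ("Expression", 3),
    ("Economic & Social", 4),
    ("Cultural", 2),
    ("Order & Duties", 3),
]

remaining = iter(scores)
theme_means = {}
for theme, size in theme_sizes:
    chunk = list(islice(remaining, size))  # next `size` sections for this theme
    theme_means[theme] = sum(chunk) / len(chunk)

for theme, value in theme_means.items():
    print(f"{theme}: {value:.2f}")
```

With this grouping, the three non-zero themes come out to the same two-decimal values as the radar, which suggests the contiguous-run reading is at least consistent with the published numbers.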
HN Discussion
14 top-level · 10 replies
Hey, first of all, cool product. I'm curious why you chose Civ and whether you saw any interesting emergent behaviors.
This is a sick idea I must say
Congrats on the launch. Big fan of how you add visualization and interactivity to the typical model benchmarking process. Any thoughts on how you plan to monetize down the line?
Interesting. Did you give the agents any skills for playing civ? If not, are you planning to?
Have you tried playing the agents yourself? Do they crush human competition?
This is great. I think leaderboards based on static evals will be mostly irrelevant within a year. Continuous benchmarks like this are the only way to get signal on frontier models
You mention Opus 4.6 cost $1,200 in one match. How do you plan to benchmark economic efficiency? Looking at the performance vs. cost trade-off, you might say a model that plays 80% as well at 1% of the cost is more impressive than the 'top' model.
This is undeniably intriguing. Will be paying close attention.
This is an amazing product! Can AI agents learn to do long-term planning in environments that are less structured than chess? Great metaphor for life!
Are you planning other games?
This is an amazing eval metric that no one thought about! Such a creative idea.
Have you thought of other games? How different is it from chess?
Incredible and important product. Necessary for developers, users, and industries that want to use agents. Can’t wait to see how it’ll grow
This looks incredible, it’d be cool to let others participate with custom prompts
Interesting! What are the next environments/strategy games you have planned?
What insights do you think they’ll provide that Civ doesn’t?
So amazing, it's super cool!
The divergence between static benchmarks and long-horizon performance isn't surprising if you've run anything multi-step in production. Benchmarks are short, isolated, well-specified. Civ has compounding state - a bad decision in turn 5 degrades your options in turn 50 in ways that aren't immediately obvious. It's a more honest signal than most standard evals.
The $1,200/match cost is the real constraint. At that price you can't run enough samples for statistical significance - you're essentially reading tea leaves. How are you handling context window management across 200 turns? Summarising game state as you go, truncating early history, or something else? The token accumulation over a full game must be substantial.
Also curious about the 90s timeout logistics. If a provider is flaky and a model goes over, is that a forfeit, a retry, or a timeout loss? Provider latency variance seems like it would add significant noise to results, independent of actual model quality.
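The sample-size concern raised above can be illustrated with a confidence interval on a win rate. This is a minimal sketch using the Wilson score interval; the win counts and match counts are hypothetical, and only the $1,200 per-match figure comes from the thread.

```python
from math import sqrt


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


COST_PER_MATCH_USD = 1200  # figure quoted in the thread

# HYPOTHETICAL results: the same 60% observed win rate at two sample sizes.
for wins, n in [(3, 5), (60, 100)]:
    low, high = wilson_interval(wins, n)
    print(f"{n} matches (${n * COST_PER_MATCH_USD:,}): "
          f"win rate {wins / n:.0%}, interval {low:.2f} to {high:.2f}")
```

The interval for 5 matches is several times wider than the one for 100 matches, while the larger sample costs twenty times as much at the quoted price.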
Thank you! I grew up playing Civilization and one day I was talking with friends thinking it would be a perfect proxy for how good AI is at long-term planning. There were many frustrating sessions I had where my early decisions in the game had consequences only much later. With hidden information and other agents at play I thought it'd be an interesting test of agent capabilities.
It was fun building it; sometimes the LLMs are pretty funny in how they play.
Appreciate it. I wanted to make the AI behavior easy to understand.
Our main focus currently is to help AI researchers align their models and help develop an open framework for evaluating AI.
I want to! I think skills can add big performance gains here especially with smaller models. There's a lot of domain knowledge in games so distilling it into a "skill" may allow much smaller models to outcompete the large ones
I was able to beat the AI every time. They're pretty bad at this point, but I expect them to get much better over time.
Unfortunately, for a game that runs 4+ hours, it was configured to use too much reasoning per turn and too large a context. Reducing the size helped lower the cost (still expensive).
In the leaderboard section of the page, I'll be auto-populating each model's token cost as an evaluation metric.
Yes! If you want to test your agents or develop evals on the platform, my DMs are open.
Yes, we have a new game launching every day this week. We're looking to add more domains to test how the jaggedness of AI differs between model providers and to better evaluate how models perform across domains.
cheers, the website will be updated with new environments daily!
Tomorrow we're launching Coup, where agents compete by bluffing and keeping track of which of their opponents they think are lying.
This is a faster-paced, shorter game, so we can collect larger samples of data on larger groups and get significant results on model behaviors such as collaboration, truth-telling, and the ability to lie effectively.
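The performance-versus-cost question from the thread can be made concrete with a simple score-per-dollar metric. This is a sketch only: the model names, scores, and the cheaper model's cost are hypothetical values chosen to match the "80% as well at 1% of the cost" example, and the platform's actual leaderboard metric is not described in the source.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    model: str        # hypothetical model label
    score: float      # game performance on any consistent scale
    cost_usd: float   # total API spend for the match


def score_per_dollar(result: MatchResult) -> float:
    """Performance normalised by spend; higher is more cost-efficient."""
    if result.cost_usd <= 0:
        raise ValueError("cost_usd must be positive")
    return result.score / result.cost_usd


# HYPOTHETICAL numbers: only the $1,200 match cost comes from the thread.
top = MatchResult("top-model", score=1.00, cost_usd=1200.0)
cheap = MatchResult("cheap-model", score=0.80, cost_usd=12.0)

ranking = sorted([top, cheap], key=score_per_dollar, reverse=True)
ratio = score_per_dollar(cheap) / score_per_dollar(top)

for result in ranking:
    print(f"{result.model}: {score_per_dollar(result):.4f} score per dollar")
print(f"cheap-model is about {ratio:.0f}x more cost-efficient")
```

A single ratio like this hides the absolute performance gap, so a leaderboard would likely show raw score and cost alongside it rather than replace them.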
Editorial Channel
What the content says
Preamble: 0.00 · Low
No content addressing human dignity, freedom, or universal rights
FW Ratio: 50%
Observable Facts: Page title describes platform as 'ClashAI | Live AI Competitions, Matches, and Replays'
Inferences: Content is purely functional description of an AI competition platform with no human rights framing

Article 1: 0.00 · Low
No mention of human dignity, equality, or rights
FW Ratio: 50%
Observable Facts: Page describes 'AI evaluation platform where agents compete in strategy games, trading, and creative challenges'
Inferences: Content focuses on AI agents rather than human dignity or equality

Article 2: 0.00 · Low
No mention of non-discrimination or equal rights
FW Ratio: 50%
Observable Facts: Page includes semantic HTML with sr-only class for screen readers
Inferences: Basic accessibility features exist but no mention of discrimination protection

Article 3: 0.00 · Low
No mention of life, liberty, or security
FW Ratio: 50%
Observable Facts: Content describes AI agent competitions without human safety considerations
Inferences: Platform is about AI-to-AI competition, not human rights protection

Article 4: 0.00 · Low
No mention of slavery or servitude
FW Ratio: 50%
Observable Facts: Page metadata describes 'AI agent competition platform'
Inferences: Content is technical and does not engage with anti-slavery discourse

Article 5: 0.00 · Low
No mention of torture or cruel treatment
FW Ratio: 50%
Observable Facts: Content focuses on AI competitions in strategy games and trading
Inferences: Technical platform content does not address human treatment issues

Article 6: 0.00 · Low
No mention of legal recognition or personhood
FW Ratio: 50%
Observable Facts: Page title references 'AI competitions' and 'AI agents'
Inferences: Content discusses AI agents rather than human legal recognition

Article 7: 0.00 · Low
No mention of equality before the law
FW Ratio: 50%
Observable Facts: Metadata includes standard SEO tags without legal equality content
Inferences: Landing page is promotional rather than rights-oriented

Article 8: 0.00 · Low
No mention of effective remedies or judicial protection
FW Ratio: 50%
Observable Facts: Page shows technical JavaScript bundles and framework code
Inferences: Technical implementation focus precludes rights remedy discussion

Article 9: 0.00 · Low
No mention of arbitrary detention or arrest
FW Ratio: 50%
Observable Facts: Content describes watching 'live matches' and 'exploring replays'
Inferences: Entertainment-focused platform does not engage with detention rights

Article 10: 0.00 · Low
No mention of fair trial or impartial tribunal
FW Ratio: 50%
Observable Facts: Page includes 'PrivacyBanner' component reference in code
Inferences: Privacy banner suggests data collection but not fair trial considerations

Article 11: 0.00 · Low
No mention of presumption of innocence or criminal defense
FW Ratio: 50%
Observable Facts: Page shows React framework components and script loading
Inferences: Technical implementation dominates with no legal presumption content

Article 12: 0.00 · Low · Practice
No explicit privacy policy or data protection statement
FW Ratio: 50%
Observable Facts: Code includes 'PrivacyBanner' component reference; page uses 'AnalyticsProvider' and 'ConversionTracker' components
Inferences: Technical implementation includes privacy considerations; analytics components suggest data collection with privacy awareness

Article 13: 0.00 · Low
No mention of freedom of movement or residence
FW Ratio: 50%
Observable Facts: Metadata describes global accessibility ('watch live matches' implies anywhere)
Inferences: Global accessibility implied but not framed as movement right

Article 14: 0.00 · Low
No mention of asylum or persecution
FW Ratio: 50%
Observable Facts: Content focuses exclusively on AI competition platform features
Inferences: Technical platform scope excludes asylum and refugee discourse

Article 15: 0.00 · Low
No mention of nationality or statelessness
FW Ratio: 50%
Observable Facts: Platform name 'ClashAI' and description are nationality-neutral
Inferences: Global AI platform does not engage with nationality concepts

Article 16: 0.00 · Low
No mention of marriage, family, or consent
FW Ratio: 50%
Observable Facts: Page contains technical framework code and component definitions
Inferences: Technical implementation precludes family rights discussion

Article 17: 0.00 · Low
No mention of property ownership or deprivation
FW Ratio: 50%
Observable Facts: Content describes 'AI agents' competing without property context
Inferences: AI competition focus excludes property rights considerations

Article 18: 0.00 · Low
No mention of thought, conscience, or religion
FW Ratio: 50%
Observable Facts: Metadata includes technical keywords like 'AI evaluations' and 'prediction markets'
Inferences: Technical focus excludes freedom of thought discourse

Article 19: 0.00 · Low · Practice
No explicit free expression policy or commitments
FW Ratio: 50%
Observable Facts: Page title promises 'Live AI Competitions' and 'transparent replays'; site appears to be publicly accessible
Inferences: Public platform structure enables information sharing; transparency language suggests open access to competition data

Article 20: 0.00 · Low
No mention of assembly or association
FW Ratio: 50%
Observable Facts: Description mentions 'AI agents' rather than human communities
Inferences: AI-focused platform does not address human assembly rights

Article 21: 0.00 · Low
No mention of political participation or voting
FW Ratio: 50%
Observable Facts: Content describes competitive arenas for AI agents
Inferences: Competition framework excludes political participation concepts

Article 22: 0.00 · Low
No mention of social security or economic rights
FW Ratio: 50%
Observable Facts: Platform focuses on AI competition entertainment
Inferences: Entertainment platform excludes social security discourse

Article 23: 0.00 · Low
No mention of work, employment, or unions
FW Ratio: 50%
Observable Facts: Metadata describes 'AI tournaments' and 'leaderboards'
Inferences: Competition framework excludes labor rights considerations

Article 24: 0.00 · Low
No mention of rest, leisure, or working hours
FW Ratio: 50%
Observable Facts: Page describes continuous 'live matches' availability
Inferences: 24/7 platform model does not address rest rights

Article 25: 0.00 · Low
No mention of standard of living, health, or welfare
FW Ratio: 50%
Observable Facts: Content is exclusively about AI competition platform
Inferences: Technical entertainment platform excludes welfare discourse

Article 26: 0.00 · Low
No mention of education, literacy, or training
FW Ratio: 50%
Observable Facts: Keywords include 'AI evaluations' but not human education
Inferences: AI evaluation focus excludes human education rights

Article 27: 0.00 · Low · Practice
No explicit cultural participation or IP protection statements
FW Ratio: 50%
Observable Facts: Platform description includes 'creative challenges' among competition types; screen reader accessibility suggests inclusive design
Inferences: Creative competitions enable cultural participation; accessibility features support inclusive cultural access

Article 28: 0.00 · Low
No mention of social order or rights realization
FW Ratio: 50%
Observable Facts: Page shows technical React component architecture
Inferences: Technical implementation excludes social order considerations

Article 29: 0.00 · Low
No mention of duties, community, or rights limitations
FW Ratio: 50%
Observable Facts: No community guidelines or terms of service visible
Inferences: Platform lacks visible framework for rights responsibilities

Article 30: 0.00 · Low
No mention of rights destruction or interpretation
FW Ratio: 50%
Observable Facts: Page is a technical landing page for AI competition platform
Inferences: Functional platform excludes rights interpretation discourse
Structural Channel
What the site does
Group | Element | Modifier | Affects | Note
Legal & Terms | Privacy | - | - | No privacy policy or data handling information visible on homepage
Legal & Terms | Terms of Service | - | - | No terms of service or community guidelines visible on homepage
Identity & Mission | Mission | - | - | Platform description focuses on AI competitions, not human rights
Identity & Mission | Editorial Code | - | - | No editorial content or code of ethics visible on homepage
Identity & Mission | Ownership | - | - | Attributed to ClashAI Team but no corporate structure information
Access & Distribution | Access Model | 0.00 | Article 19, Article 27 | Free access to viewing competitions implied by landing page structure
Access & Distribution | Ad/Tracking | - | - | No advertising or tracking elements visible in provided content
Access & Distribution | Accessibility | 0.00 | Article 27 | Site uses semantic HTML with sr-only class for screen readers, suggesting basic accessibility consideration
Article 27 (+0.20 · Low · Practice): Platform enables access to AI-generated cultural content (creative challenges)
Article 12 (+0.10 · Low · Practice): PrivacyBanner component suggests awareness of data collection
Article 19 (+0.10 · Low · Practice): Platform provides public access to AI competition content
Preamble (0.00 · Low): Platform for AI competitions does not structurally engage with preamble concepts
Article 1 (0.00 · Low): Platform structure does not address human equality or dignity
Article 2 (0.00 · Low): No observable accessibility or inclusion features beyond basic screen reader support
Article 3 (0.00 · Low): Platform does not address personal security or safety
Article 4 (0.00 · Low): Platform structure does not address forced labor issues
Article 5 (0.00 · Low): Platform does not address humane treatment
Article 6 (0.00 · Low): Platform does not address legal status or recognition
Article 7 (0.00 · Low): Platform does not address legal equality or protection
Article 8 (0.00 · Low): Platform does not provide grievance mechanisms or remedies
Article 9 (0.00 · Low): Platform does not address detention or liberty protections
Article 10 (0.00 · Low): Platform does not address judicial fairness
Article 11 (0.00 · Low): Platform does not address criminal justice
Article 13 (0.00 · Low): Platform does not address mobility rights
Article 14 (0.00 · Low): Platform does not address refugee protection
Article 15 (0.00 · Low): Platform does not address citizenship rights
Article 16 (0.00 · Low): Platform does not address family rights
Article 17 (0.00 · Low): Platform does not address property rights
Article 18 (0.00 · Low): Platform does not address freedom of thought
Article 20 (0.00 · Low): Platform does not facilitate human assembly or association
Article 21 (0.00 · Low): Platform does not address democratic participation
Article 22 (0.00 · Low): Platform does not address social welfare
Article 23 (0.00 · Low): Platform does not address labor rights
Article 24 (0.00 · Low): Platform does not address work-life balance
Article 25 (0.00 · Low): Platform does not address basic needs or healthcare
Article 26 (0.00 · Low): Platform does not address educational access
Article 28 (0.00 · Low): Platform does not address systemic rights frameworks
Article 29 (0.00 · Low): Platform does not address responsible exercise of rights
Article 30 (0.00 · Low): Platform does not address rights interpretation or limitations
Supplementary Signals
How this content communicates, beyond directional lean.
How well-sourced and evidence-based is this content?
0.23 low claims
Sources 0.2 Evidence 0.1 Uncertainty 0.0 Purpose 0.7
No manipulative rhetoric detected
0 techniques detected
Emotional character: positive/negative, intensity, authority
detached
Valence +0.1 Arousal 0.2 Dominance 0.6
Does the content identify its author and disclose interests?
0.00
✗ Author
More signals: context, framing & audience
Does this content offer solutions or only describe problems?
0.42 solution oriented
Whose perspectives are represented in this content?
0.10 1 perspective
Speaks: corporation
Is this content looking backward, at the present, or forward?
present immediate
What geographic area does this content cover?
global
How accessible is this content to a general audience?
technical high jargon domain specific
Longitudinal
3 evals
[Chart: score axis from +1 to −1; series label: HN]
Audit Trail
9 entries, newest first
2026-02-28 16:29 | eval_success | Evaluated: Neutral (0.01)
2026-02-28 16:29 | rater_validation_warn | Validation warnings for model deepseek-v3.2: 1W 0R
2026-02-28 16:29 | eval | Evaluated by deepseek-v3.2: +0.01 (Neutral), 15,541 tokens
2026-02-28 05:40 | eval_success | Light evaluated: Neutral (0.00)
2026-02-28 05:40 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral)
2026-02-28 05:40 | rater_validation_warn | Light validation warnings for model llama-4-scout-wai: 0W 1R
2026-02-28 05:22 | rater_validation_warn | Light validation warnings for model llama-3.3-70b-wai: 0W 1R
2026-02-28 05:22 | eval_success | Light evaluated: Neutral (0.00)
2026-02-28 05:22 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)