+0.09 I beat Grok 4 on ARC-AGI-2 using a CPU-only symbolic engine (18.1% score) (github.com S:+0.07 )
11 points by kofdai 5 days ago | 6 comments on HN | Mild positive Product · v3.7 · 2026-03-01 03:10:28 0
Summary Scientific Advancement Advocates
The page is a GitHub repository for 'verantyx-v6', an LLM-Free Symbolic Reasoning Engine. Its editorial content advocates for unbiased, structurally verified scientific tools for the 'Humanity's Last Exam' (HLE) benchmark. The evaluation finds mild positive advocacy themes related to scientific advancement, freedom of expression, and participation in cultural life, framed within the platform's structural support for open access and collaboration.
Article Heatmap
Preamble: +0.20
Article 17 (Property): -0.05
Article 19 (Freedom of Expression): +0.30
Article 26 (Education): +0.15
Article 27 (Cultural Participation): +0.35
Article 28 (Social & International Order): +0.10
Article 29 (Duties to Community): 0.00
No Data (ND): Articles 1-16, 18, 20-25, 30
Aggregates
Editorial Mean: +0.09
Structural Mean: +0.07
Weighted Mean: +0.15
Unweighted Mean: +0.15
Max: +0.35 (Article 27)
Min: -0.05 (Article 17)
Signal: 7
No Data: 24
Volatility: 0.14 (Medium)
Negative: 1
Channels: E 0.6, S 0.4
SETL: +0.07 (Editorial-dominant)
FW Ratio: 50% (15 facts, 15 inferences)
Evidence: 20% coverage
12M 24 ND
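The Unweighted Mean, Max, Min, and Volatility figures can be reproduced from the seven scored heatmap entries. The sketch below assumes the unweighted mean is a plain arithmetic mean and that Volatility is the population standard deviation; both assumptions happen to match the reported values, but the site's actual formulas are not stated on the page. The variable names are illustrative only.

```python
from statistics import fmean, pstdev

# Scores copied from the Article Heatmap (No Data entries omitted).
scores = {
    "Preamble": 0.20,
    "Article 17": -0.05,
    "Article 19": 0.30,
    "Article 26": 0.15,
    "Article 27": 0.35,
    "Article 28": 0.10,
    "Article 29": 0.00,
}

values = list(scores.values())
unweighted_mean = fmean(values)      # assumption: arithmetic mean
volatility = pstdev(values)          # assumption: population standard deviation
max_article = max(scores, key=scores.get)
min_article = min(scores, key=scores.get)
fw_ratio = 15 / (15 + 15)            # 15 facts, 15 inferences

print(f"mean={unweighted_mean:+.2f} volatility={volatility:.2f}")
print(f"max={max_article} min={min_article} fw_ratio={fw_ratio:.0%}")
```

The Weighted Mean and the per-article combination of editorial and structural scores are not reproduced here because the page does not give enough detail to do so reliably.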
Theme Radar
Foundation: 0.20 (1 article)
Security: 0.00 (0 articles)
Legal: 0.00 (0 articles)
Privacy & Movement: 0.00 (0 articles)
Personal: -0.05 (1 article)
Expression: 0.30 (1 article)
Economic & Social: 0.00 (0 articles)
Cultural: 0.25 (2 articles)
Order & Duties: 0.05 (2 articles)
HN Discussion 2 top-level · 1 reply
jaen 2026-02-25 12:42 UTC link
You used LLMs to generate code to beat ARC-AGI "without using LLMs"... Uhh, okay then.

LLMs generating code to solve ARC-AGI is literally what they do these days, so as far as I see, basically this entire exercise is equivalent to just running "Deep Think" test-time compute type models and committing their output to Github?

What exactly was the novel, un-LLMable human input here?

kofdai 2026-02-27 08:56 UTC link
Title: [Show HN] Verantyx Update: 22.7% on ARC-AGI-2 using Human-Logic + OpenClaw Loop

Body: Following up on my previous post (where I was at 18.1%), I’ve just reached 22.7% (227/1000) on the ARC-AGI-2 public evaluation set.

I want to address the skepticism regarding my development speed. As an undergraduate student in Japan, I have limited manual coding time. To overcome this, I’ve established a "Human-Architect / AI-Builder" research loop.

How the 24/7 loop works:

Human (Me): I analyze failed tasks to identify underlying geometric patterns and design new DSL primitives (e.g., the new gravity_solver and cross3d_geometry in v62).

AI Agent (OpenClaw/Claude Code): Based on my architectural design, the agent scaffolds the implementation, performs rigorous regression testing across all 1,000 tasks, and refines the code for performance.

This synergy allows for a high-frequency commit cycle that a single developer could never achieve alone, while ensuring the inference engine remains 100% symbolic and deterministic. At test-time, there are zero LLM calls; it's pure structural reasoning.

V62 Key Updates:

Gravity Solver: 4 distinct strategies for object sliding/gravity-based transformations.

Cross3D Geometry Engine: Improved handling of 3D-projected cross structures.

Score: 22.7% (monotonically increasing from 20.1% and 22.4% earlier this week).
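For readers unfamiliar with gravity-style grid transformations, here is a minimal sketch of one possible strategy: every non-background cell falls to the bottom of its column. It is not taken from the Verantyx source, and the comment above does not describe what the four strategies are; the function name `apply_gravity_down` and the choice of 0 as background are assumptions made for illustration. Real object-level sliding would move connected shapes as units rather than individual cells.

```python
def apply_gravity_down(grid, background=0):
    """Drop each non-background cell to the bottom of its column.

    Hypothetical illustration only: per-cell gravity, not object sliding.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[background] * cols for _ in range(rows)]
    for c in range(cols):
        # Collect the column's colored cells, preserving top-to-bottom order.
        cells = [grid[r][c] for r in range(rows) if grid[r][c] != background]
        start = rows - len(cells)
        for offset, value in enumerate(cells):
            out[start + offset][c] = value
    return out


grid = [
    [1, 0, 2],
    [0, 0, 0],
    [3, 0, 0],
]
result = apply_gravity_down(grid)
for row in result:
    print(row)
```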

I believe this hybrid development model—where human intuition drives logic and AI agents drive implementation—is the fastest path to 80%+ on the "Humanity's Last Exam".

I'm eager to hear your thoughts on this "System 2" approach and the role of AI agents in building symbolic AI.

GitHub: https://github.com/Ag3497120/verantyx-v6
Project Site: https://verantyx.ai

kofdai 2026-02-26 05:19 UTC link
I understand the skepticism—the line between "AI-generated" and "AI-assisted" has become incredibly blurry. Let me clarify the architectural distinction.

1. The Inference Engine is 100% Deterministic: The "solver" is a standalone Python program (26K lines + NumPy). At runtime, it has zero neural dependencies. It doesn't call an LLM, it doesn't load weights, and it doesn't "hallucinate." It performs a combinatorial search over a formal Domain Specific Language (DSL). You could run this on a legacy machine with no internet connection. This is fundamentally different from o1/o3 or Grok-Thinking, where the model is the solver at test-time.

2. The "Novel Human Input" is the DSL Design: Using an LLM to help write Python boilerplate is trivial. Using an LLM to design a 7-phase symbolic pipeline that solves ARC is currently impossible. My core contributions that an LLM could not "reason" out are:

The Cross DSL: The insight that ~57% of ARC transforms can be modeled by local 5-cell Von Neumann neighborhoods.

Iterative Residual Learning: A gradient-free strategy where the system synthesizes a transform, calculates the residual error on the grid, and iteratively synthesizes "correction" programs.

Pruning & Verification: Implementing a formal verification loop where every candidate solution is checked against the 3-5 training examples before being proposed.

3. Scaling through Logic, not Compute: While the industry spends millions on "Test-time Compute" (GPU-heavy CoT), Verantyx achieves 18.1% (and now 20% in v6) using Symbolic Synthesis on a single CPU. The 208 commits in the repo represent 208 iterations of staring at grid failures and manually expanding the primitive vocabulary to cover topological edge cases that LLMs consistently miss.
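The local-neighborhood and verification ideas in point 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the Verantyx implementation: it learns a lookup table from each cell's 5-cell Von Neumann neighborhood (center, up, down, left, right) to an output color, rejects the table if the training pairs contradict each other, and checks it against every training pair before applying it. All function names, the `-1` padding value, and the fallback of keeping the center color for unseen neighborhoods are hypothetical choices.

```python
def neighborhood(grid, r, c, pad=-1):
    """Return (center, up, down, left, right); out-of-grid cells become pad."""
    rows, cols = len(grid), len(grid[0])

    def at(i, j):
        return grid[i][j] if 0 <= i < rows and 0 <= j < cols else pad

    return (at(r, c), at(r - 1, c), at(r + 1, c), at(r, c - 1), at(r, c + 1))


def learn_local_rule(train_pairs):
    """Build a neighborhood -> color table, or None if the pairs conflict."""
    rule = {}
    for inp, out in train_pairs:
        if len(inp) != len(out) or len(inp[0]) != len(out[0]):
            return None  # this sketch only covers same-size transforms
        for r in range(len(inp)):
            for c in range(len(inp[0])):
                key = neighborhood(inp, r, c)
                if rule.setdefault(key, out[r][c]) != out[r][c]:
                    return None  # same neighborhood, different outputs
    return rule


def apply_rule(rule, grid):
    """Apply the table; unseen neighborhoods keep their center color."""
    return [
        [rule.get(neighborhood(grid, r, c), grid[r][c])
         for c in range(len(grid[0]))]
        for r in range(len(grid))
    ]


def verify(rule, train_pairs):
    """Accept a candidate only if it reproduces every training output."""
    return all(apply_rule(rule, inp) == out for inp, out in train_pairs)


# Toy task: a 0 cell with a 1 directly to its left becomes 2.
train_pairs = [
    ([[1, 0, 0]], [[1, 2, 0]]),
    ([[0, 1, 0]], [[0, 1, 2]]),
]
rule = learn_local_rule(train_pairs)
if rule is not None and verify(rule, train_pairs):
    print(apply_rule(rule, [[0, 0, 1, 0]]))
```

A table lookup like this is deterministic and needs no model weights at run time, which is the property the comment emphasizes; a full solver would search over many such candidate programs rather than a single rule family.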

If using Copilot to speed up the implementation of a deterministic search algorithm invalidates the algorithm, then we’d have to invalidate most modern OS kernels or compilers written today. The "intelligence" isn't in the typing; it's in the program synthesis architecture that does what pure LLM inference cannot.

I'd encourage you to check the source—it's just pure, brute-force symbolic logic: https://github.com/Ag3497120/verantyx-v6

Editorial Channel
What the content says
+0.20
Preamble Preamble
Medium Advocacy
Editorial
+0.20
SETL
ND

The repository is described as an 'LLM-Free Symbolic Reasoning Engine for Humanity's Last Exam (HLE)' — advocating for unbiased tools for humanity.

+0.20
Article 27 Cultural Participation
Medium Advocacy Practice
Editorial
+0.20
SETL
+0.14

The repository promotes '3.80% bias-free score via structural verification, not statistical guessing', advocating for unbiased scientific contribution.

+0.10
Article 19 Freedom of Expression
Medium Coverage Practice
Editorial
+0.10
SETL
0.00

Repository promotes 'LLM-Free Symbolic Reasoning', advocating a specific approach to information processing.

+0.10
Article 28 Social & International Order
Medium Framing
Editorial
+0.10
SETL
ND

The goal of an 'Engine for Humanity's Last Exam' frames a tool intended to operate within a just social order.

0.00
Article 17 Property
Medium Practice
Editorial
0.00
SETL
ND

The repository is presented as a public project; ownership is implied by the GitHub username.

0.00
Article 26 Education
Medium Practice
Editorial
0.00
SETL
ND

The repository contains a 'Symbolic Reasoning Engine', which is a form of educational/technical tool.

0.00
Article 29 Duties to Community
Medium Framing
Editorial
0.00
SETL
ND

The 'bias-free score' framing implicitly acknowledges a duty to the community to produce unbiased tools.

ND
Article 1 Freedom, Equality, Brotherhood
Medium Practice

Platform provides equal access for users to create repositories, supporting baseline equality but not directly engaging with concepts of dignity.

ND
Article 2 Non-Discrimination
Medium Practice

Platform does not visibly discriminate; repository access is non-discriminatory.

ND
Article 3 Life, Liberty, Security

ND
Article 4 No Slavery

ND
Article 5 No Torture

ND
Article 6 Legal Personhood

ND
Article 7 Equality Before Law
Medium Practice

Platform interface is equally available to all users, supporting equal protection of its features.

ND
Article 8 Right to Remedy

ND
Article 9 No Arbitrary Detention

ND
Article 10 Fair Hearing

ND
Article 11 Presumption of Innocence

ND
Article 12 Privacy
Medium Practice

Platform shows no direct interference with privacy on this page; standard platform privacy practices apply.

ND
Article 13 Freedom of Movement

ND
Article 14 Asylum

ND
Article 15 Nationality

ND
Article 16 Marriage & Family

ND
Article 18 Freedom of Thought

ND
Article 20 Assembly & Association

ND
Article 21 Political Participation

ND
Article 22 Social Security

ND
Article 23 Work & Equal Pay

ND
Article 24 Rest & Leisure

ND
Article 25 Standard of Living
Medium Practice

Page uses standard web accessibility features (e.g., ARIA labels, semantic HTML) supporting equitable access.

ND
Article 30 No Destruction of Rights

Structural Channel
What the site does
+0.10
Article 19 Freedom of Expression
Medium Coverage Practice
Structural
+0.10
Context Modifier
+0.20
SETL
0.00

Platform enables public sharing of this repository, supporting free expression.

+0.10
Article 27 Cultural Participation
Medium Advocacy Practice
Structural
+0.10
Context Modifier
+0.20
SETL
+0.14

Platform enables sharing and development of this reasoning engine, supporting cultural participation.

0.00
Article 26 Education
Medium Practice
Structural
0.00
Context Modifier
+0.15
SETL
ND

Public repository facilitates open access to educational code resources.

ND
Preamble Preamble
Medium Advocacy

The repository is described as an 'LLM-Free Symbolic Reasoning Engine for Humanity's Last Exam (HLE)' — advocating for unbiased tools for humanity.

ND
Article 1 Freedom, Equality, Brotherhood
Medium Practice

Platform provides equal access for users to create repositories, supporting baseline equality but not directly engaging with concepts of dignity.

ND
Article 2 Non-Discrimination
Medium Practice

Platform does not visibly discriminate; repository access is non-discriminatory.

ND
Article 3 Life, Liberty, Security

ND
Article 4 No Slavery

ND
Article 5 No Torture

ND
Article 6 Legal Personhood

ND
Article 7 Equality Before Law
Medium Practice

Platform interface is equally available to all users, supporting equal protection of its features.

ND
Article 8 Right to Remedy

ND
Article 9 No Arbitrary Detention

ND
Article 10 Fair Hearing

ND
Article 11 Presumption of Innocence

ND
Article 12 Privacy
Medium Practice

Platform shows no direct interference with privacy on this page; standard platform privacy practices apply.

ND
Article 13 Freedom of Movement

ND
Article 14 Asylum

ND
Article 15 Nationality

ND
Article 16 Marriage & Family

ND
Article 17 Property
Medium Practice

The repository is presented as a public project; ownership is implied by the GitHub username.

ND
Article 18 Freedom of Thought

ND
Article 20 Assembly & Association

ND
Article 21 Political Participation

ND
Article 22 Social Security

ND
Article 23 Work & Equal Pay

ND
Article 24 Rest & Leisure

ND
Article 25 Standard of Living
Medium Practice

Page uses standard web accessibility features (e.g., ARIA labels, semantic HTML) supporting equitable access.

ND
Article 28 Social & International Order
Medium Framing

The goal of an 'Engine for Humanity's Last Exam' frames a tool intended to operate within a just social order.

ND
Article 29 Duties to Community
Medium Framing

The 'bias-free score' framing implicitly acknowledges a duty to the community to produce unbiased tools.

ND
Article 30 No Destruction of Rights

Supplementary Signals
How this content communicates, beyond directional lean.
Epistemic Quality
How well-sourced and evidence-based is this content?
0.17 high claims
Sources
0.0
Evidence
0.0
Uncertainty
0.0
Purpose
0.8
Propaganda Flags
2 manipulative rhetoric techniques found
2 techniques detected
repetition
Repeated emphasis on 'bias-free' and 'structural verification' in the title and description.
loaded language
Use of phrases like 'Humanity's Last Exam' and 'bias-free score' which carry strong connotative weight.
Emotional Tone
Emotional character: positive/negative, intensity, authority
urgent
Valence
+0.6
Arousal
0.8
Dominance
0.9
Transparency
Does the content identify its author and disclose interests?
0.00
✗ Author
Solution Orientation
Does this content offer solutions or only describe problems?
0.46 solution oriented
Reader Agency
0.1
Stakeholder Voice
Whose perspectives are represented in this content?
0.40 1 perspective
Speaks: individuals
Temporal Framing
Is this content looking backward, at the present, or forward?
prospective long term
Geographic Scope
What geographic area does this content cover?
global
Complexity
How accessible is this content to a general audience?
technical medium jargon domain specific
Longitudinal · 3 evals
Audit Trail 9 entries
2026-03-01 03:10 eval_success Evaluated: Mild positive (0.15)
2026-03-01 03:10 eval Evaluated by deepseek-v3.2: +0.15 (Mild positive), 10,036 tokens
2026-03-01 03:10 rater_validation_warn Validation warnings for model deepseek-v3.2: 0W 52R
2026-02-28 06:10 eval_success Light evaluated: Neutral (0.00)
2026-02-28 06:10 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral). Reasoning: GitHub repository page, no explicit human rights discussion
2026-02-28 06:10 rater_validation_warn Light validation warnings for model llama-4-scout-wai: 0W 1R
2026-02-28 05:54 eval_success Light evaluated: Neutral (0.00)
2026-02-28 05:54 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
2026-02-28 05:54 rater_validation_warn Light validation warnings for model llama-3.3-70b-wai: 0W 1R