Model Comparison
Model | Editorial | Structural | Class | Conf | SETL | Theme
@cf/meta/llama-4-scout-17b-16e-instruct (lite) | +0.38 | ND | Moderate positive | 0.80 | 0.00 | open source
@cf/meta/llama-3.3-70b-instruct-fp8-fast (lite) | 0.00 | ND | Neutral | 0.80 | 0.00 | Open Source
claude-haiku-4-5-20251001 | -0.01 | ND | Neutral | 0.11 | Intellectual Property & Access
meta-llama/llama-3.3-70b-instruct:free | ND | ND
Section | @cf/meta/llama-4-scout-17b-16e-instruct (lite) | @cf/meta/llama-3.3-70b-instruct-fp8-fast (lite) | claude-haiku-4-5-20251001 | meta-llama/llama-3.3-70b-instruct:free
Preamble | ND | ND | -0.20 | ND
Article 1 | ND | ND | -0.15 | ND
Article 2 | ND | ND | -0.10 | ND
Article 3 | ND | ND | 0.00 | ND
Article 4 | ND | ND | 0.00 | ND
Article 5 | ND | ND | 0.00 | ND
Article 6 | ND | ND | 0.00 | ND
Article 7 | ND | ND | 0.00 | ND
Article 8 | ND | ND | 0.00 | ND
Article 9 | ND | ND | 0.00 | ND
Article 10 | ND | ND | 0.00 | ND
Article 11 | ND | ND | 0.00 | ND
Article 12 | ND | ND | 0.00 | ND
Article 13 | ND | ND | 0.00 | ND
Article 14 | ND | ND | 0.00 | ND
Article 15 | ND | ND | 0.00 | ND
Article 16 | ND | ND | 0.00 | ND
Article 17 | ND | ND | 0.15 | ND
Article 18 | ND | ND | 0.00 | ND
Article 19 | ND | ND | 0.25 | ND
Article 20 | ND | ND | 0.00 | ND
Article 21 | ND | ND | 0.00 | ND
Article 22 | ND | ND | -0.15 | ND
Article 23 | ND | ND | 0.00 | ND
Article 24 | ND | ND | 0.00 | ND
Article 25 | ND | ND | 0.00 | ND
Article 26 | ND | ND | 0.00 | ND
Article 27 | ND | ND | 0.00 | ND
Article 28 | ND | ND | 0.00 | ND
Article 29 | ND | ND | -0.10 | ND
Article 30 | ND | ND | 0.00 | ND
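The per-section scores for claude-haiku-4-5-20251001 can be related to the -0.01 editorial score listed for that model in the comparison table. The following is a minimal sketch, assuming the aggregate is an unweighted mean over the Preamble and Articles 1 to 30; the source does not state how the aggregate is actually computed, and the names `nonzero_scores`, `section_scores`, and `mean_score` are hypothetical.

```python
# Per-section scores for claude-haiku-4-5-20251001, copied from the table above.
# Every section not listed here scored 0.00.
nonzero_scores = {
    "Preamble": -0.20,
    "Article 1": -0.15,
    "Article 2": -0.10,
    "Article 17": 0.15,
    "Article 19": 0.25,
    "Article 22": -0.15,
    "Article 29": -0.10,
}

# Preamble plus Articles 1..30 gives 31 sections in total.
sections = ["Preamble"] + [f"Article {n}" for n in range(1, 31)]
section_scores = [nonzero_scores.get(name, 0.0) for name in sections]

# ASSUMPTION: the aggregate is a plain unweighted mean (not confirmed by the source).
mean_score = sum(section_scores) / len(section_scores)

print(f"{len(section_scores)} sections, mean {mean_score:+.2f}")
```

Under that assumption the mean is -0.30 / 31, which rounds to -0.01 and agrees with the reported editorial score. The agreement is suggestive only; a weighted scheme could produce the same rounded value.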
+0.38 Tests Are the New Moat (saewitz.com)
11 points by taubek 4 days ago | 4 comments on HN | Moderate positive ~lite vlite-1.4
Summary ~lite open source Acknowledges
Discusses commercial open-source incentives and AI's impact
EQ 0.60
SO 0.70
TD 0.50
Lite evaluation by llama-4-scout-wai · editorial channel only · no per-section breakdown available
Longitudinal · 26 evals [chart omitted; score axis +1 to −1; legend: HN]
Audit Trail 46 entries
2026-03-02 16:36 model_divergence Cross-model spread 0.39 exceeds threshold (2 models) - -
2026-03-02 16:36 eval_success Lite evaluated: Moderate positive (0.38) - -
2026-03-02 16:36 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-02 16:26 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-02 16:26 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 17:51 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 17:51 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 16:56 model_divergence Cross-model spread 0.39 exceeds threshold (2 models) - -
2026-03-01 16:56 eval_success Lite evaluated: Moderate positive (0.38) - -
2026-03-01 16:56 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-01 16:33 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 16:33 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 15:00 model_divergence Cross-model spread 0.39 exceeds threshold (3 models) - -
2026-03-01 15:00 eval_success Lite evaluated: Moderate positive (0.38) - -
2026-03-01 15:00 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-01 14:59 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 14:59 model_divergence Cross-model spread 0.39 exceeds threshold (3 models) - -
2026-03-01 14:59 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 14:55 eval_success Lite evaluated: Moderate positive (0.38) - -
2026-03-01 14:55 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-01 14:55 model_divergence Cross-model spread 0.39 exceeds threshold (3 models) - -
2026-03-01 14:54 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 14:54 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 14:11 model_divergence Cross-model spread 0.39 exceeds threshold (3 models) - -
2026-03-01 14:11 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 14:11 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 14:10 eval_success Lite evaluated: Moderate positive (0.38) - -
2026-03-01 14:10 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-01 14:10 model_divergence Cross-model spread 0.39 exceeds threshold (2 models) - -
2026-03-01 14:06 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 14:05 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 13:20 model_divergence Cross-model spread 0.39 exceeds threshold (2 models) - -
2026-03-01 13:20 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-01 13:18 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-03-01 12:41 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-03-01 12:40 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) -0.04
  reasoning: Neutral tech editorial
2026-02-28 16:25 eval Evaluated by llama-3.3-70b-wai: +0.04 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-02-28 16:20 eval Evaluated by llama-3.3-70b-wai: +0.04 (Neutral) 0.00
  reasoning: Neutral tech editorial
2026-02-28 15:20 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) 0.00
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-02-28 15:15 eval Evaluated by llama-4-scout-wai: +0.38 (Moderate positive) +0.38
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-02-28 13:59 eval Evaluated by llama-3.3-70b-wai: +0.04 (Neutral)
  reasoning: Neutral tech editorial
2026-02-26 23:06 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
  reasoning: Editorial on commercial open-source incentives and AI impact
2026-02-26 09:40 eval Evaluated by deepseek-v3.2: +0.25 (Mild positive) 12,021 tokens
2026-02-26 04:20 eval Evaluated by claude-haiku-4-5-20251001: -0.01 (Neutral) 13,811 tokens -0.11
2026-02-26 03:50 eval Evaluated by claude-haiku-4-5-20251001: +0.10 (Mild positive) 13,755 tokens +0.12
2026-02-26 03:49 eval Evaluated by claude-haiku-4-5-20251001: -0.02 (Neutral) 14,164 tokens
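The model_divergence entries in the audit trail report a cross-model spread of 0.39. The following is a minimal sketch of one way such a check could work, assuming the spread is the highest minus the lowest of each model's most recent editorial score. That reading reproduces 0.39 from the scores in the comparison table (+0.38, 0.00, -0.01), but the source does not confirm the formula. The threshold value is not stated anywhere in the source, so `DIVERGENCE_THRESHOLD` is a hypothetical placeholder, as are the function and variable names.

```python
# Most recent editorial score per model, taken from the comparison table.
latest_editorial = {
    "llama-4-scout-wai": 0.38,
    "llama-3.3-70b-wai": 0.00,
    "claude-haiku-4-5-20251001": -0.01,
}

# HYPOTHETICAL: the real threshold is not given in the source.
DIVERGENCE_THRESHOLD = 0.30


def cross_model_spread(scores):
    """Return max minus min across models (ASSUMED definition of spread)."""
    values = list(scores.values())
    return max(values) - min(values)


spread = cross_model_spread(latest_editorial)

if spread > DIVERGENCE_THRESHOLD:
    # Mirrors the wording of the audit-trail entries.
    print(
        f"model_divergence: Cross-model spread {spread:.2f} "
        f"exceeds threshold ({len(latest_editorial)} models)"
    )
```

With these three scores the spread is 0.38 minus -0.01, or 0.39. Note that some audit entries list 2 models with the same 0.39 spread, so the actual set of scores compared may differ from the one assumed here.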