Model Comparison
Model | Editorial | Structural | Class | Conf | SETL | Theme
@cf/meta/llama-4-scout-17b-16e-instruct lite | +0.70 | ND | Strong positive | 0.90 | 0.00 | Investigative Journalism
@cf/meta/llama-3.3-70b-instruct-fp8-fast lite | +0.70 | ND | Strong positive | 0.90 | 0.00 | Investigative Journalism
claude-haiku-4-5 lite | +0.46 | ND | Moderate positive | 0.77 | 0.00 | Investigative abuse exposure
deepseek/deepseek-v3.2-20251201 | +0.80 | +0.24 | Strong positive | 0.09 | 0.65 | Free Expression
meta-llama/llama-3.3-70b-instruct:free | ND | ND | | | |
Section | @cf/meta/llama-4-scout-17b-16e-instruct lite | @cf/meta/llama-3.3-70b-instruct-fp8-fast lite | claude-haiku-4-5 lite | deepseek/deepseek-v3.2-20251201 | meta-llama/llama-3.3-70b-instruct:free
Preamble | ND | ND | ND | 0.90 | ND
Article 1 | ND | ND | ND | ND | ND
Article 2 | ND | ND | ND | ND | ND
Article 3 | ND | ND | ND | ND | ND
Article 4 | ND | ND | ND | ND | ND
Article 5 | ND | ND | ND | ND | ND
Article 6 | ND | ND | ND | ND | ND
Article 7 | ND | ND | ND | ND | ND
Article 8 | ND | ND | ND | ND | ND
Article 9 | ND | ND | ND | ND | ND
Article 10 | ND | ND | ND | ND | ND
Article 11 | ND | ND | ND | ND | ND
Article 12 | ND | ND | ND | ND | ND
Article 13 | ND | ND | ND | ND | ND
Article 14 | ND | ND | ND | ND | ND
Article 15 | ND | ND | ND | ND | ND
Article 16 | ND | ND | ND | ND | ND
Article 17 | ND | ND | ND | ND | ND
Article 18 | ND | ND | ND | ND | ND
Article 19 | ND | ND | ND | 1.00 | ND
Article 20 | ND | ND | ND | 0.95 | ND
Article 21 | ND | ND | ND | ND | ND
Article 22 | ND | ND | ND | ND | ND
Article 23 | ND | ND | ND | ND | ND
Article 24 | ND | ND | ND | ND | ND
Article 25 | ND | ND | ND | ND | ND
Article 26 | ND | ND | ND | ND | ND
Article 27 | ND | ND | ND | ND | ND
Article 28 | ND | ND | ND | ND | ND
Article 29 | ND | ND | ND | ND | ND
Article 30 | ND | ND | ND | ND | ND
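The model_divergence entries in the audit trail report a cross-model spread of 0.30 across 4 models. The page does not say how that spread is computed. A minimal sketch, assuming it is the maximum minus the minimum of each model's most recent score (an assumption that happens to reproduce 0.30 from the latest audit-trail scores); the function name and the threshold value are hypothetical:

```python
def cross_model_spread(latest_scores: dict[str, float]) -> float:
    """Return max minus min over each model's most recent score.

    Assumption: the dashboard's "spread" is a simple range. The source
    does not confirm this definition.
    """
    values = list(latest_scores.values())
    if len(values) < 2:
        return 0.0
    # Round to two decimals to avoid floating-point noise such as 0.30000000000000004.
    return round(max(values) - min(values), 2)


# Most recent score per model, copied from the audit trail.
latest = {
    "llama-4-scout-wai": 0.70,
    "llama-3.3-70b-wai": 0.70,
    "claude-haiku-4-5": 0.46,
    "deepseek-v3.2": 0.76,
}

# Hypothetical threshold: the log only says the spread "exceeds threshold".
HYPOTHETICAL_THRESHOLD = 0.25

spread = cross_model_spread(latest)
print(spread, spread > HYPOTHETICAL_THRESHOLD)  # prints: 0.3 True
```

Under this reading, the divergence is driven entirely by the gap between deepseek-v3.2 (+0.76) and claude-haiku-4-5 (+0.46).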
+0.70 [CAL-LITE] ProPublica (EP-4) (www.propublica.org)
0 points 3 days ago | 0 comments on HN | Strong positive ~lite vlite-1.4
Summary ~lite Investigative Journalism Advocates
ProPublica is an independent, non-profit newsroom producing investigative journalism in the public interest.
EQ 0.80
SO 0.70
TD 0.90
Lite evaluation by llama-4-scout-wai · editorial channel only · no per-section breakdown available
Longitudinal · 19 evals
Audit Trail 39 entries
2026-02-28 11:17 model_divergence Cross-model spread 0.30 exceeds threshold (4 models) - -
2026-02-28 11:17 eval_success Lite evaluated: Strong positive (0.70) - -
2026-02-28 11:17 eval Evaluated by llama-4-scout-wai: +0.70 (Strong positive) 0.00
reasoning
Investigative journalism site, implies rights-focused content
2026-02-28 11:17 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 0W 1R - -
2026-02-28 11:12 model_divergence Cross-model spread 0.30 exceeds threshold (4 models) - -
2026-02-28 11:12 eval_success Lite evaluated: Strong positive (0.70) - -
2026-02-28 11:12 eval Evaluated by llama-4-scout-wai: +0.70 (Strong positive) -0.10
reasoning
Investigative journalism site, implies rights-focused content
2026-02-28 11:12 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 0W 1R - -
2026-02-28 09:03 model_divergence Cross-model spread 0.30 exceeds threshold (3 models) - -
2026-02-28 09:03 eval_success Light evaluated: Strong positive (0.70) - -
2026-02-28 09:03 eval Evaluated by llama-3.3-70b-wai: +0.70 (Strong positive) -0.10
reasoning
Investigative journalism site
2026-02-28 09:03 rater_validation_warn Light validation warnings for model llama-3.3-70b-wai: 0W 1R - -
2026-02-28 05:16 eval Evaluated by claude-haiku-4-5: +0.46 (Moderate positive) -0.16
2026-02-28 01:41 dlq Dead-lettered after 1 attempts: [CAL-LIGHT] ProPublica (EP-4) - -
2026-02-28 01:39 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-28 01:38 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-28 01:36 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b - -
2026-02-28 01:36 dlq_replay DLQ message 97625 replayed to LLAMA_QUEUE: [CAL-LIGHT] ProPublica (EP-4) - -
2026-02-28 00:52 eval_success Light evaluated: Strong positive (0.80) - -
2026-02-28 00:52 eval Evaluated by llama-4-scout-wai: +0.80 (Strong positive) 0.00
reasoning
Investigative journalism site, implies rights-focused content
2026-02-28 00:44 eval Evaluated by claude-haiku-4-5: +0.62 (Strong positive) -0.10
2026-02-28 00:41 eval_success Light evaluated: Strong positive (0.80) - -
2026-02-28 00:41 eval Evaluated by llama-3.3-70b-wai: +0.80 (Strong positive) 0.00
reasoning
Investigative journalism site
2026-02-28 00:28 eval Evaluated by claude-haiku-4-5: +0.72 (Strong positive) +0.02
2026-02-28 00:12 eval_success Light evaluated: Strong positive (0.80) - -
2026-02-28 00:12 eval Evaluated by llama-4-scout-wai: +0.80 (Strong positive) 0.00
reasoning
Investigative journalism site, implies rights-focused content
2026-02-28 00:12 eval_success Light evaluated: Strong positive (0.80) - -
2026-02-28 00:12 eval Evaluated by llama-3.3-70b-wai: +0.80 (Strong positive)
reasoning
Investigative journalism site
2026-02-28 00:01 eval Evaluated by claude-haiku-4-5: +0.70 (Strong positive) 0.00
2026-02-27 21:51 eval_success Light evaluated: Strong positive (0.80) - -
2026-02-27 21:51 eval Evaluated by llama-4-scout-wai: +0.80 (Strong positive)
reasoning
Investigative journalism site, implies rights-focused content
2026-02-27 21:47 eval Evaluated by claude-haiku-4-5: +0.70 (Strong positive) 0.00
2026-02-27 21:36 rater_validation_fail Light parse failure for model llama-4-scout-wai: SyntaxError: Unexpected token '+', ..."itorial": +0.8, "... is not valid JSON - -
2026-02-27 21:32 eval Evaluated by claude-haiku-4-5: +0.70 (Strong positive) +0.05
2026-02-27 21:10 eval Evaluated by claude-haiku-4-5: +0.65 (Strong positive) -0.03
2026-02-27 21:01 eval Evaluated by claude-haiku-4-5: +0.68 (Strong positive) +0.13
2026-02-27 21:01 eval Evaluated by claude-haiku-4-5: +0.55 (Moderate positive) -0.17
2026-02-27 15:17 eval Evaluated by deepseek-v3.2: +0.76 (Strong positive) 14,237 tokens
2026-02-27 13:01 eval Evaluated by claude-haiku-4-5: +0.72 (Strong positive)