Longitudinal · 32 evals
Audit Trail 52 entries
2026-03-01 16:27 rater_validation_fail Parse failure for model deepseek-v3.2: Error: Failed to parse OpenRouter JSON: SyntaxError: Expected double-quoted property name in JSON at position 13416 (line 310 column 5). Extracted text starts with: { "schema_version": "3.7", "eva - -
2026-03-01 08:29 eval_success Evaluated: Neutral (0.00) - -
2026-03-01 08:29 eval Evaluated by deepseek-v3.2: 0.00 (Neutral) 7,970 tokens -0.20
2026-02-28 22:58 dlq Dead-lettered after 1 attempts: Google outage – resolved - -
2026-02-28 22:58 eval_failure Evaluation failed: AbortError: The operation was aborted - -
2026-02-28 22:39 eval_failure Evaluation failed: AbortError: The operation was aborted - -
2026-02-28 22:06 dlq Dead-lettered after 1 attempts: Google outage – resolved - -
2026-02-28 22:06 eval_failure Evaluation failed: AbortError: The operation was aborted - -
2026-02-28 21:51 eval_failure Evaluation failed: AbortError: The operation was aborted - -
2026-02-28 15:21 eval_success Lite evaluated: Neutral (0.00) - -
2026-02-28 15:21 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 13:40 eval_success Lite evaluated: Neutral (0.00) - -
2026-02-28 13:40 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 12:17 eval_success Evaluated: Mild positive (0.20) - -
2026-02-28 12:17 eval Evaluated by deepseek-v3.2: +0.20 (Mild positive) 8,009 tokens -0.22
2026-02-28 12:17 rater_validation_warn Validation warnings for model deepseek-v3.2: 1W 0R - -
2026-02-28 12:07 model_divergence Cross-model spread 0.42 exceeds threshold (4 models) - -
2026-02-28 12:07 eval_success Evaluated: Moderate positive (0.42) - -
2026-02-28 12:07 eval Evaluated by deepseek-v3.2: +0.42 (Moderate positive) 8,070 tokens
2026-02-28 12:07 rater_validation_warn Validation warnings for model deepseek-v3.2: 1W 2R - -
2026-02-28 11:40 eval_success Lite evaluated: Neutral (0.00) - -
2026-02-28 11:40 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 11:40 rater_validation_warn Lite validation warnings for model llama-3.3-70b-wai: 0W 1R - -
2026-02-28 11:39 eval_success Lite evaluated: Neutral (0.00) - -
2026-02-28 11:39 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 11:39 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 0W 1R - -
2026-02-28 11:00 eval_success Lite evaluated: Neutral (0.00) - -
2026-02-28 11:00 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 10:10 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 09:53 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 09:12 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 08:05 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 07:46 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 07:06 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 06:50 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 06:13 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 06:00 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 05:30 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 05:21 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 05:11 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 05:06 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 04:43 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 04:34 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 04:25 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 04:08 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 03:52 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 03:49 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 03:13 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning
Tech issue reporting
2026-02-28 02:42 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 02:33 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Tech news site, neutral stance on human rights
2026-02-28 01:43 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
reasoning
Tech news site, neutral stance on human rights
2026-02-28 01:31 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
reasoning
Tech issue reporting