Evaluation Failed: claude-p failed (exit 1)
Longitudinal
9 HN snapshots · 4 evals
Audit Trail
10 entries
2026-02-28 15:39 | eval_success | Lite evaluated: Neutral (0.00) | - -
2026-02-28 15:39 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | reasoning: ED neutral scientific content no rights stance
2026-02-28 15:28 | eval_success | Lite evaluated: Neutral (0.00) | - -
2026-02-28 15:28 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | reasoning: Neutral medical content
2026-02-28 09:21 | eval_success | Light evaluated: Neutral (0.00) | - -
2026-02-28 09:21 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) | reasoning: ED neutral scientific content no rights stance
2026-02-28 09:21 | rater_validation_warn | Light validation warnings for model llama-4-scout-wai: 0W 1R | - -
2026-02-28 09:20 | eval_success | Light evaluated: Neutral (0.00) | - -
2026-02-28 09:20 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) | reasoning: Neutral medical content
2026-02-28 09:20 | rater_validation_warn | Light validation warnings for model llama-3.3-70b-wai: 0W 1R | - -