0.00 Study identifies weaknesses in how AI systems are evaluated (www.oii.ox.ac.uk)
416 points by pseudolus 114 days ago | 192 comments on HN | Neutral ~lite vlite-1.4
Summary ~lite AI ethics Neutral
AI evaluation study
EQ 0.50
SO 0.50
TD 0.50
Lite evaluation by llama-3.3-70b-wai · editorial channel only · no per-section breakdown available
Longitudinal · 8 evals
Audit Trail 18 entries
2026-03-01 18:10 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 18:10 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning: Tech news no rights stance
2026-03-01 17:08 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 17:08 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning: ED neutral tech article no rights stance
2026-03-01 16:53 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 16:53 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning: Tech news no rights stance
2026-03-01 15:27 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 15:27 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning: ED neutral tech article no rights stance
2026-03-01 15:23 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 15:23 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning: Tech news no rights stance
2026-03-01 15:19 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-01 15:19 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00
reasoning: Tech news no rights stance
2026-02-28 08:03 eval_success Light evaluated: Neutral (0.00) - -
2026-02-28 08:03 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
reasoning: ED neutral tech article no rights stance
2026-02-28 08:03 rater_validation_warn Light validation warnings for model llama-4-scout-wai: 0W 1R - -
2026-02-28 07:52 eval_success Light evaluated: Neutral (0.00) - -
2026-02-28 07:52 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
reasoning: Tech news no rights stance
2026-02-28 07:52 rater_validation_warn Light validation warnings for model llama-3.3-70b-wai: 0W 1R - -