| 2026-03-06 03:55 | eval_success | PSQ evaluated: g-PSQ=0.280 (3 dims) | - - |
| 2026-03-06 03:55 | eval | Evaluated by llama-4-scout-wai-psq: +0.28 (Mild positive) 0.00 | |
| 2026-03-06 03:42 | eval_success | PSQ evaluated: g-PSQ=0.120 (3 dims) | - - |
| 2026-03-06 03:42 | eval | Evaluated by llama-3.3-70b-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-05 07:04 | eval_success | PSQ evaluated: g-PSQ=0.280 (3 dims) | - - |
| 2026-03-05 07:04 | eval | Evaluated by llama-4-scout-wai-psq: +0.28 (Mild positive) | |
| 2026-03-05 06:57 | eval_success | PSQ evaluated: g-PSQ=0.120 (3 dims) | - - |
| 2026-03-05 06:57 | eval | Evaluated by llama-3.3-70b-wai-psq: +0.12 (Mild positive) | |
| 2026-03-04 20:11 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-03-04 20:11 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) +0.14 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 20:11 | rater_validation_warn | Lite validation warnings for model llama-3.3-70b-wai: 1W 0R | - - |
| 2026-03-04 20:04 | eval_success | Lite evaluated: Neutral (-0.08) | - - |
| 2026-03-04 20:04 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 20:04 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-04 19:21 | eval_success | Lite evaluated: Mild negative (-0.14) | - - |
| 2026-03-04 19:21 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 19:21 | rater_validation_warn | Lite validation warnings for model llama-3.3-70b-wai: 1W 0R | - - |
| 2026-03-04 19:16 | eval_success | Lite evaluated: Mild negative (-0.14) | - - |
| 2026-03-04 19:15 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 19:15 | rater_validation_warn | Lite validation warnings for model llama-3.3-70b-wai: 1W 0R | - - |
| 2026-03-04 19:11 | eval_success | Lite evaluated: Neutral (-0.08) | - - |
| 2026-03-04 19:11 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 19:11 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-04 18:12 | eval_success | Lite evaluated: Mild negative (-0.14) | - - |
| 2026-03-04 18:12 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 18:12 | rater_validation_warn | Lite validation warnings for model llama-3.3-70b-wai: 1W 0R | - - |
| 2026-03-04 18:08 | eval_success | Lite evaluated: Neutral (-0.08) | - - |
| 2026-03-04 18:08 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 18:08 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-04 18:03 | eval_success | Lite evaluated: Neutral (-0.08) | - - |
| 2026-03-04 18:03 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-04 18:03 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 16:42 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 16:33 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 16:03 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 15:58 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 15:49 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 15:19 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 15:11 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 15:06 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 14:39 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 14:34 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 14:24 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 14:18 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 13:55 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 13:36 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 13:18 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 13:02 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 12:42 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 12:19 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 12:13 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 12:05 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 11:34 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 11:22 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 10:48 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 10:36 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 10:10 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 10:05 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 09:30 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 09:23 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 08:51 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 08:48 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 08:45 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 08:13 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 08:09 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 08:04 | eval | Evaluated by llama-4-scout-wai: -0.08 (Neutral) -0.14 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 07:41 | eval | Evaluated by llama-3.3-70b-wai: -0.14 (Mild negative) -0.21 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 07:25 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 06:59 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 06:16 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 06:11 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 05:50 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 05:34 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 05:30 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 05:06 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 04:40 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 04:24 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 04:19 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 04:04 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 03:59 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 03:49 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 03:20 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 03:12 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 02:40 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 02:32 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 02:05 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 02:00 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 01:57 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 01:18 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 01:17 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 00:41 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) +0.04 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 00:40 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-04 00:05 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-04 00:02 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 23:57 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 23:20 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 23:18 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 23:13 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) -0.04 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 22:49 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 22:43 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) +0.04 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 22:21 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 22:16 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 22:11 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 21:22 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 21:18 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 21:13 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 20:45 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 20:41 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 20:13 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 20:08 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 20:03 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 19:30 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 19:18 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 18:45 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 18:32 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) -0.04 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 18:09 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 18:03 | eval | Evaluated by llama-3.3-70b-wai: +0.07 (Neutral) +0.04 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 17:26 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) 0.00 | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 17:22 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) 0.00 | |
| reasoning Math blog post, no rights discussion |
| 2026-03-03 16:51 | eval | Evaluated by llama-4-scout-wai: +0.06 (Neutral) | |
| reasoning Math blog post, no explicit human rights discussion, transparent about sources and methods |
| 2026-03-03 16:50 | eval | Evaluated by llama-3.3-70b-wai: +0.03 (Neutral) | |
| reasoning Math blog post, no rights discussion |