| 2026-03-16 02:09 | eval_success | PSQ evaluated: g-PSQ=0.120 (3 dims) | - - |
| 2026-03-16 02:09 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-16 01:12 | eval_success | Lite evaluated: Mild negative (-0.24) | - - |
| 2026-03-16 01:12 | model_divergence | Cross-model spread 0.31 exceeds threshold (2 models) | - - |
| 2026-03-16 01:12 | eval | Evaluated by llama-4-scout-wai: -0.24 (Mild negative) +0.01 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-16 01:12 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-15 23:59 | eval_success | Evaluated: Neutral (0.07) | - - |
| 2026-03-15 23:59 | model_divergence | Cross-model spread 0.32 exceeds threshold (2 models) | - - |
| 2026-03-15 23:59 | eval | Evaluated by claude-haiku-4-5-20251001: +0.07 (Neutral) 14,189 tokens +0.05 | |
| 2026-03-15 23:59 | rater_validation_warn | Validation warnings for model claude-haiku-4-5-20251001: 0W 3R | - - |
| 2026-03-15 23:22 | eval_success | Evaluated: Neutral (0.02) | - - |
| 2026-03-15 23:22 | model_divergence | Cross-model spread 0.27 exceeds threshold (2 models) | - - |
| 2026-03-15 23:21 | eval | Evaluated by claude-haiku-4-5-20251001: +0.02 (Neutral) 14,205 tokens | |
| 2026-03-15 23:21 | rater_validation_warn | Validation warnings for model claude-haiku-4-5-20251001: 0W 5R | - - |
| 2026-03-15 23:00 | eval_success | PSQ evaluated: g-PSQ=0.120 (3 dims) | - - |
| 2026-03-15 23:00 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-15 22:12 | eval_success | Lite evaluated: Mild negative (-0.25) | - - |
| 2026-03-15 22:12 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-15 22:12 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-15 18:12 | eval_success | PSQ evaluated: g-PSQ=0.120 (3 dims) | - - |
| 2026-03-15 18:12 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-15 17:44 | eval_success | Lite evaluated: Mild negative (-0.25) | - - |
| 2026-03-15 17:44 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-15 17:44 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-15 16:49 | eval_success | PSQ evaluated: g-PSQ=0.120 (3 dims) | - - |
| 2026-03-15 16:49 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-15 16:31 | eval_success | Lite evaluated: Mild negative (-0.25) | - - |
| 2026-03-15 16:31 | rater_validation_warn | Lite validation warnings for model llama-4-scout-wai: 1W 0R | - - |
| 2026-03-15 16:31 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-15 00:30 | eval_success | PSQ evaluated: g-PSQ=-0.400 (3 dims) | - - |
| 2026-03-15 00:30 | eval | Evaluated by llama-3.3-70b-wai-psq: -0.40 (Moderate negative) | |
| 2026-03-15 00:27 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) | |
| reasoning Financial news, neutral rights stance |
| 2026-03-14 23:31 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-14 23:26 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-14 22:51 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-14 22:46 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-14 21:48 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-14 21:43 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-14 20:31 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-14 20:25 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-14 19:31 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-14 19:26 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-14 18:27 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) 0.00 | |
| 2026-03-14 18:26 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) 0.00 | |
| reasoning Financial news with no explicit human rights discussion |
| 2026-03-14 16:49 | eval | Evaluated by llama-4-scout-wai-psq: +0.12 (Mild positive) | |
| 2026-03-14 16:49 | eval | Evaluated by llama-4-scout-wai: -0.25 (Mild negative) | |
| reasoning Financial news with no explicit human rights discussion |