| 2026-03-02 08:17 | eval_success | Evaluated: Neutral (0.00) | - - |
| 2026-03-02 08:17 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-03-02 08:17 | eval | Evaluated by deepseek-v3.2: +0.00 (Neutral) 17,891 tokens -0.30 | |
| 2026-03-01 18:48 | eval_success | Evaluated: Moderate positive (0.30) | - - |
| 2026-03-01 18:48 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-03-01 18:48 | eval | Evaluated by deepseek-v3.2: +0.30 (Moderate positive) 12,717 tokens +0.20 | |
| 2026-03-01 18:48 | rater_validation_warn | Validation warnings for model deepseek-v3.2: 0W 2R | - - |
| 2026-03-01 17:37 | eval_success | Evaluated: Neutral (0.10) | - - |
| 2026-03-01 17:37 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-03-01 17:37 | eval | Evaluated by deepseek-v3.2: +0.10 (Neutral) 13,498 tokens -0.16 | |
| 2026-02-28 17:11 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-02-28 17:11 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-02-28 17:11 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 15:38 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-02-28 15:38 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-02-28 15:38 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 15:26 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-02-28 15:26 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-02-28 15:26 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 13:57 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-02-28 13:57 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-02-28 13:57 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 13:26 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-02-28 13:26 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-02-28 13:26 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 13:07 | eval_success | Lite evaluated: Neutral (0.00) | - - |
| 2026-02-28 13:07 | rater_validation_warn | Lite validation warnings for model llama-3.3-70b-wai: 0W 1R | - - |
| 2026-02-28 13:07 | model_divergence | Cross-model spread 0.32 exceeds threshold (3 models) | - - |
| 2026-02-28 13:07 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 13:02 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 11:57 | eval | Evaluated by claude-haiku-4-5-20251001: +0.32 (Moderate positive) | |
| 2026-02-28 11:27 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 10:56 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 10:38 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 09:14 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 08:51 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 08:46 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 08:19 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 08:17 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 08:14 | eval | Evaluated by deepseek-v3.2: +0.26 (Mild positive) 13,551 tokens +0.21 | |
| 2026-02-28 07:30 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 07:02 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 07:00 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 06:43 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 06:42 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 06:29 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 06:11 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 05:48 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 05:41 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 05:26 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 05:24 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 04:30 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 04:18 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 04:17 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 03:48 | eval | Evaluated by deepseek-v3.2: +0.05 (Neutral) 13,708 tokens | |
| 2026-02-28 03:41 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 03:36 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 03:14 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 03:12 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 03:02 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 02:40 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 02:38 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 02:36 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 02:36 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 02:13 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 01:58 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 01:54 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 01:51 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 01:38 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 01:30 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 01:30 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00 | |
| reasoning ED, neutral AI tech exploration |
| 2026-02-28 01:29 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) 0.00 | |
| reasoning Technical blog post |
| 2026-02-28 01:19 | eval | Evaluated by llama-3.3-70b-wai: 0.00 (Neutral) | |
| reasoning Technical blog post |
| 2026-02-28 00:44 | eval | Evaluated by llama-4-scout-wai: 0.00 (Neutral) | |
| reasoning ED, neutral AI tech exploration |