| 2026-03-02 13:52 | eval_success | Lite evaluated: Mild positive (0.20) | - - |
| 2026-03-02 13:52 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) -0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 13:29 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 13:29 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 13:24 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 13:24 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 13:08 | eval_success | Lite evaluated: Moderate positive (0.30) | - - |
| 2026-03-02 13:08 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) +0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 12:34 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 12:34 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 12:21 | eval_success | Lite evaluated: Mild positive (0.20) | - - |
| 2026-03-02 12:21 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) -0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 12:17 | eval_success | Lite evaluated: Moderate positive (0.30) | - - |
| 2026-03-02 12:17 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) +0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 12:02 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 12:02 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 11:44 | eval_success | Lite evaluated: Mild positive (0.20) | - - |
| 2026-03-02 11:44 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 11:39 | eval_success | Lite evaluated: Mild positive (0.20) | - - |
| 2026-03-02 11:39 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 11:21 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 11:21 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 11:04 | eval_success | Lite evaluated: Mild positive (0.20) | - - |
| 2026-03-02 11:04 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 10:31 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 10:31 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 10:14 | eval_success | Lite evaluated: Mild positive (0.20) | - - |
| 2026-03-02 10:14 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) -0.20 | reasoning: News article on EMS blockage |
| 2026-03-02 09:57 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 09:57 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 09:34 | eval_success | Lite evaluated: Moderate positive (0.40) | - - |
| 2026-03-02 09:34 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 09:07 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 09:07 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 08:45 | eval_success | Lite evaluated: Moderate positive (0.40) | - - |
| 2026-03-02 08:45 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) +0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 08:12 | eval_success | Lite evaluated: Mild positive (0.10) | - - |
| 2026-03-02 08:12 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 08:02 | eval_success | Lite evaluated: Moderate positive (0.30) | - - |
| 2026-03-02 08:02 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 07:57 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) +0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 07:28 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 07:22 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 07:10 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) -0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 06:47 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 06:30 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) +0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 06:06 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 05:59 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 05:26 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 05:17 | eval | Evaluated by llama-3.3-70b-wai: +0.20 (Mild positive) -0.10 | reasoning: News article on EMS blockage |
| 2026-03-02 04:50 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 04:44 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 04:09 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 04:02 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) 0.00 | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 04:01 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) 0.00 | reasoning: News article on EMS blockage |
| 2026-03-02 03:18 | eval | Evaluated by llama-4-scout-wai: +0.10 (Mild positive) | reasoning: ED, reporting on tech issue during mass shooting |
| 2026-03-02 03:16 | eval | Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive) | reasoning: News article on EMS blockage |