ND An FAQ on Reinforcement Learning Environments (epoch.ai)
37 points by dcre 9 days ago | 7 comments on HN ~lite vlite-2.0
Summary ~lite
A technical FAQ on reinforcement learning environments; neutral in both tone and content.
Lite evaluation by llama-4-scout-wai-psq · editorial channel only · no per-section breakdown available
Longitudinal 359 HN snapshots · 71 evals
Audit Trail 91 entries
2026-03-22 02:22 eval_success PSQ evaluated: g-PSQ=-0.040 (3 dims) - -
2026-03-22 02:22 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-22 02:04 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-22 02:04 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-22 02:04 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-22 01:02 eval_success PSQ evaluated: g-PSQ=-0.040 (3 dims) - -
2026-03-22 01:02 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-22 00:58 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-22 00:58 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-22 00:58 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-21 23:35 eval_success PSQ evaluated: g-PSQ=-0.040 (3 dims) - -
2026-03-21 23:35 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 23:20 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-21 23:20 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 23:20 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-21 22:20 eval_success PSQ evaluated: g-PSQ=-0.040 (3 dims) - -
2026-03-21 22:20 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 22:00 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-21 22:00 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 22:00 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-21 20:54 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-21 20:54 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 20:54 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-21 20:09 eval_success PSQ evaluated: g-PSQ=-0.040 (3 dims) - -
2026-03-21 20:09 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 19:39 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-21 19:39 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 19:39 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-21 19:25 eval_success PSQ evaluated: g-PSQ=-0.040 (3 dims) - -
2026-03-21 19:25 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 19:00 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-21 19:00 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 19:00 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-21 18:43 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 18:20 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 18:04 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 17:42 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 17:25 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 17:03 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 16:40 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 16:24 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 15:58 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 15:42 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 15:20 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 15:03 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 14:36 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 14:23 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 13:52 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 13:41 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 13:12 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 13:02 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 12:32 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 12:22 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 11:54 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 11:46 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 11:12 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 11:06 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 10:29 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 10:24 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 09:47 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 09:43 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 09:07 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 09:04 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 08:30 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 08:26 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 07:48 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 07:44 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 07:10 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 07:06 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 06:32 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 06:30 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 05:52 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 05:51 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 05:16 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 05:15 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 04:36 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 04:35 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 04:00 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 03:59 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 03:21 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 03:21 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 02:45 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 02:45 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 02:09 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 02:08 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 01:30 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral) 0.00
2026-03-21 01:30 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 00:41 eval Evaluated by llama-4-scout-wai-psq: -0.04 (Neutral)
2026-03-21 00:41 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
reasoning
Technical content on Reinforcement Learning Environments, no explicit human rights discussion
2026-03-21 00:05 eval Evaluated by llama-3.3-70b-wai-psq: 0.00 (Neutral)
2026-03-21 00:01 eval Evaluated by llama-3.3-70b-wai: 0.00 (Neutral)
reasoning
Technical content, zero rights discussion