spicypete 150 karma 2y 3m on HN
Coverage
We've seen 1 of an unknown number of submissions
Full eval: 0 Lite-only: 0 Unevaluated: 1
1 story
1.
HRCB 0.00
E 0.00
S
Measuring AI Ability to Complete Long Tasks (metr.org)
247 points by spicypete 71 days ago | 193 comments | hrcb AI research