Alpha — This system is experimental. Scores and classifications are early-stage research and may be unreliable. Methodology →
About HRO
The Human Rights Observatory (a project of unratified.org and Safety Quotient Lab) tracks Hacker News stories and
evaluates their linked content against the 30 articles and preamble of the UN Universal Declaration
of Human Rights. Coverage depends on available API credits and free-tier usage plans for LLMs and general compute — the goal is
comprehensive evaluation, but not every story can be scored in real time. Each evaluation produces a Human Rights Compatibility
Bias (HRCB) score showing how the content's editorial and structural signals align with fundamental
human rights provisions.
Evaluations are powered by Anthropic's Claude LLM as the primary model, supplemented by open-source models on Cloudflare Workers AI.
A cron worker crawls HN stories every minute across 6 lists (top, new, best, ask, show, job),
tracking all stories but only auto-evaluating those in the top 7 pages (210 stories) of the
front page plus the top 30 from the best, ask, and show feeds. Stories outside this range are tracked but skipped unless manually triggered.
Fetched content passes through a content gate that classifies non-evaluable pages (paywalls, captchas, bot protection, etc.)
before dispatch. Evaluable content is cleaned and sent via a Cloudflare Queue. A separate consumer worker
evaluates each URL against the UDHR methodology (v3.7) using Claude, with DCP caching per domain
and content snapshots stored in R2. Results are displayed alongside standard HN metadata.
Evaluation velocity varies with API credit availability — check pipeline status for current throughput.
The evaluator operates as a Fair Witness (evidence breakdown) — reporting only what is
directly observable, with no inference beyond the evidence.
HRO is built entirely with Claude Code (Anthropic)
and uses Claude Haiku as the primary evaluation model.
We flag this relationship upfront because the site that measures human rights alignment in others
should be transparent about its own supply chain.
In February 2026, Anthropic refused to remove safeguards preventing Claude from being used
for autonomous weapons and mass surveillance, and was
blacklisted from U.S. military contracts.
This directly implicates provisions HRO tracks:
Article 3 (security of person),
Article 12 (privacy), and the
Preamble's recognition that human rights require institutional protection.
Disclosure: Based on public reporting. HRO has no independent verification of the negotiations.
Sources: NPR · CNN · MIT Tech Review · CNBC
Human Rights Compatibility Bias (HRCB) measures the directional lean
of web content relative to the provisions of the UDHR.
A positive score indicates content that aligns with UDHR provisions; a negative score indicates
content that conflicts with them.
Construct validity: Known-groups validation (2026-03-04) across 44 domains pre-classified by mission and ownership structure
confirms the expected ordering — rights-advocacy/investigative outlets (EFF, ProPublica, Mother Jones, 404 Media) score significantly higher
than neutral tech news (Ars Technica, ZDNet, The Register) which score higher than corporate/commercial-first outlets (Google Blog, CNBC, CoinDesk).
Group means: EP=0.35 > EN=0.21 > EC=0.14. Kruskal-Wallis H=23.4, p<0.0001; all pairwise comparisons significant.
Classification was independent of HRCB scores (based on editorial mission and ownership, not LLM output).
-1.0 ← Strong negative | Neutral | Strong positive → +1.0
Classification & Sentiment Reference
Classification Labels
Strong positive — +0.60 to +1.00
Positive — +0.30 to +0.59
Leaning positive — +0.10 to +0.29
Neutral — -0.09 to +0.09
Leaning negative — -0.10 to -0.29
Negative — -0.30 to -0.59
Strong negative — -0.60 to -1.00
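As a sketch, the bands above can be expressed as a lookup. Treating each gap between published ranges (e.g., between +0.09 and +0.10) as a half-open boundary is an assumption:

```python
def classify(score: float) -> str:
    """Map an HRCB score in [-1, +1] to its classification label.

    Band edges follow the table above; half-open boundaries at the
    two-decimal gaps are an interpretation, not HRO's published rule.
    """
    if score >= 0.60:
        return "Strong positive"
    if score >= 0.30:
        return "Positive"
    if score >= 0.10:
        return "Leaning positive"
    if score > -0.10:
        return "Neutral"
    if score > -0.30:
        return "Leaning negative"
    if score > -0.60:
        return "Negative"
    return "Strong negative"
```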
Story-Level Labels
Theme Tag — Dominant human rights theme (e.g., "Privacy & Surveillance", "Free Expression")
Actively promotes rights-aligned positions with clear editorial support
Acknowledges — Recognizes rights issues without strong advocacy or opposition
Neutral — Balanced or no clear directional lean on human rights
Neglects — Overlooks or minimizes rights concerns that are relevant to the topic
Undermines — Actively works against rights provisions through content or framing
Hostile — Strongest negative alignment: content attacks or denies fundamental rights
No Data vs 0.0
No Data (ND) Insufficient evidence to evaluate this UDHR article for the given content. The topic may be absent, or the available text didn't provide enough signal to score. ND is not zero — it means "not measured."
0.0 (Neutral) Relevant content exists but has balanced signals, netting to zero.
The Universal Declaration of Human Rights
Adopted by the United Nations General Assembly on 10 December 1948 (Resolution 217 A),
the UDHR is a milestone document in the history of human rights. It has been translated
into over 500 languages. The Declaration consists of a Preamble and 30 Articles covering civil, political,
economic, social, and cultural rights — from the right to life and liberty
(Article 3) to freedom of expression (Article 19) to the right to education (Article 26).
Methodology
Signal Channels
Editorial What the content says. Analyzes text, arguments, framing, and sourcing.
Structural What the site does. Examines privacy, accessibility, tracking, access models.
The weights depend on content type — news articles lean 65% editorial, product pages lean 60% structural. Combined: (wE × Editorial) + (wS × Structural).
Content Type Weights
ED — Editorial / News: E 0.65, S 0.35
PO — Policy / Legal: E 0.70, S 0.30
LP — Landing Page / Marketing: E 0.40, S 0.60
CM — Community / Forum: E 0.55, S 0.45
DC — Documentation / Reference: E 0.50, S 0.50
AC — Academic / Research: E 0.75, S 0.25
HR — Human Rights Focused: E 0.60, S 0.40
PR — Product / Service: E 0.40, S 0.60
PB — Personal Blog: E 0.70, S 0.30
SO — Social Media: E 0.50, S 0.50
OT — Other: E 0.55, S 0.45
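A minimal sketch of the weight table and the combined blend from the Signal Channels section; the `WEIGHTS` and `combined` names are illustrative, not HRO's actual code:

```python
# E/S weight pairs transcribed from the Content Type Weights table above.
WEIGHTS = {
    "ED": (0.65, 0.35), "PO": (0.70, 0.30), "LP": (0.40, 0.60),
    "CM": (0.55, 0.45), "DC": (0.50, 0.50), "AC": (0.75, 0.25),
    "HR": (0.60, 0.40), "PR": (0.40, 0.60), "PB": (0.70, 0.30),
    "SO": (0.50, 0.50), "OT": (0.55, 0.45),
}

def combined(content_type: str, editorial: float, structural: float) -> float:
    """Combined = (wE x Editorial) + (wS x Structural)."""
    w_e, w_s = WEIGHTS[content_type]
    return w_e * editorial + w_s * structural
```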
Content Type Consensus Vote
Content type classification directly determines the E:S weight split, so misclassification can meaningfully distort the final HRCB score. For example, an editorial article (ED: E=0.65, S=0.35) misclassified as a landing page (LP: E=0.40, S=0.60) shifts 25 percentage points of weight from the editorial channel to the structural channel.
When multiple models evaluate the same story, the system uses a majority vote to determine the consensus content type rather than trusting any single model's classification. Each full-mode rater contributes one vote; lite-mode raters do not vote (they don't classify content type). The plurality type becomes the consensus content type used for weight assignment.
Suspect rate — structural-heavy types (PO, LP, PR, AC) where the structural channel has no data, suggesting misclassification
Cross-model disagreement — stories where raters assigned different content types, indicating ambiguous content
Per-Provision Pipeline
Content → E + S → Weights → Combined → DCP → Final → HRCB aggregate
Each of 31 UDHR provisions (Preamble + Articles 1-30) is scored independently on both channels. Per-provision fields:
editorial — Editorial channel score for this provision [-1, +1]
structural — Structural channel score for this provision [-1, +1]
combined — Content-type-weighted blend of E + S [-1, +1]
final — After DCP modifier (±0.30 max per article) [-1, +1]
evidence — Evidence strength for this provision (H / M / L / ND)
directionality — How the content engages with this right (A / P / F / C)
SETL — Structural-Editorial Tension Level
SETL measures the divergence between editorial and structural signals for each article. It captures whether a site's words and infrastructure tell different stories.
-1.0 S-dominant | Balanced | E-dominant +1.0
Formula: sign(E-S) × √(|E-S| × max(|E|, |S|)).
A high absolute SETL means the site's content and infrastructure are misaligned on human rights.
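The SETL formula above can be sketched directly:

```python
import math

def setl(e: float, s: float) -> float:
    """SETL = sign(E - S) * sqrt(|E - S| * max(|E|, |S|))."""
    diff = e - s
    return math.copysign(math.sqrt(abs(diff) * max(abs(e), abs(s))), diff)
```

The square root dampens small gaps while the max(|E|, |S|) factor scales tension by how strongly either channel scored.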
Fair Witness Evidence Layer
Inspired by Heinlein's Fair Witnesses from Stranger in a Strange Land, every evaluation separates its evidence into two categories:
Observable Facts Directly verifiable statements grounded in page content. Any reader could confirm these by visiting the page.
Inferences Interpretive conclusions drawn from the observable evidence. These explain why the evidence maps to the score.
The FW Ratio (Fair Witness Ratio) is the proportion of observable facts to total evidence items. A high ratio means the evaluation is well-grounded in verifiable observations; a low ratio means more interpretive weight. Toggle Fair Witness on any item page to switch between the standard view and a stripped-down view showing only the evidence breakdown.
FW Ratio
FW Ratio = observable_facts / (observable_facts + inferences)
Scale [0, 1]. Higher = evaluation more grounded in verifiable observations; lower = more interpretive weight.
E-Prime Constraint
Observable facts follow an E-Prime constraint — witness_facts must not use "to be" verbs (is, are, was, were, be, been, being). This prevents essentialist claims and forces grounding in specific observable actions, consistent with operationalist epistemology. "The site is paywalled" becomes "The site displays a paywall overlay after the first paragraph." The constraint applies to facts only; inferences may use "to be" since they are explicitly marked as interpretive.
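A minimal lint for the constraint might look like this; the regex word-boundary match is an illustrative assumption, not HRO's actual validator:

```python
import re

# Verbs barred by the E-Prime constraint, per the list above.
TO_BE = re.compile(r"\b(is|are|was|were|be|been|being)\b", re.IGNORECASE)

def violates_e_prime(fact: str) -> bool:
    """Return True if a witness fact uses a 'to be' verb."""
    return bool(TO_BE.search(fact))
```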
Evidence Strength & Confidence
High — Direct, explicit content with strong sourcing (max score 1.0)
Medium — Clear signal but may be secondary (max score 0.7)
Low — Tangential, indirect, or weakly sourced (max score 0.4)
ND — Topic absent from content, not counted in aggregate (score 0.0)
Confidence is an evidence-weighted aggregate across all 31 provisions: H=1.0, M=0.6, L=0.2, ND=0.0. Higher confidence means more provisions had strong evidence.
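The aggregate can be sketched as a mean over per-provision grades; averaging (rather than some other normalization) is an assumption, since the source specifies only the per-grade weights:

```python
# Per-grade weights from the text above; simple-mean aggregation is assumed.
EVIDENCE_WEIGHT = {"H": 1.0, "M": 0.6, "L": 0.2, "ND": 0.0}

def confidence(grades: list) -> float:
    """Mean evidence weight across the 31 per-provision grades."""
    return sum(EVIDENCE_WEIGHT[g] for g in grades) / len(grades)
```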
Directionality Markers
Advocacy — Explicitly argues for or against a right
Practice — Site infrastructure reflects a rights stance
Framing — Presents issues in a rights-aligned or rights-opposed frame
Coverage — Factual content relevant to human rights topics
Volatility
Standard deviation of per-provision combined scores. Measures how uniformly content aligns across different rights.
Low — < 0.10 — Consistent alignment across provisions
Medium — 0.10 to 0.25 — Mixed signals across provisions
High — > 0.25 — Aligns on some rights but conflicts on others
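A sketch of the volatility bucketing; using the population standard deviation is an assumption, since the source does not say sample vs population:

```python
import statistics

def volatility_label(combined_scores: list) -> str:
    """Bucket the std-dev of per-provision combined scores
    using the Low / Medium / High thresholds above."""
    sd = statistics.pstdev(combined_scores)  # population std-dev (assumed)
    if sd < 0.10:
        return "Low"
    if sd <= 0.25:
        return "Medium"
    return "High"
```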
Consensus (Multi-Model)
When multiple models evaluate the same story, a consensus score is computed as a weighted mean. Each rater's weight combines three factors: prompt mode (full = 1.0, lite = 0.5), self-reported confidence (floored at 0.2 so no model is silenced), and a content truncation discount.
Score — Weighted mean across all rater evals [-1, +1]
Spread — Max minus min score across raters, measures disagreement
Count — Number of models that contributed evaluations
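The weighting can be sketched as follows; combining the three factors multiplicatively is an assumption (the source lists the factors without an explicit formula), and the dict field names are illustrative:

```python
def consensus_score(raters: list) -> float:
    """Weighted mean of rater scores. Each rater is a dict with
    'mode', 'confidence', 'score', and optional 'truncation_discount'."""
    total_w = weighted_sum = 0.0
    for r in raters:
        mode_w = 1.0 if r["mode"] == "full" else 0.5   # full=1.0, lite=0.5
        conf_w = max(r["confidence"], 0.2)             # confidence floor
        trunc_w = r.get("truncation_discount", 1.0)    # 1.0 = no truncation
        w = mode_w * conf_w * trunc_w
        weighted_sum += w * r["score"]
        total_w += w
    return weighted_sum / total_w
```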
How Evaluations Work
1. A URL from HN's top 7 pages (plus top best/ask/show) is fetched as the unit of analysis (other stories can be manually triggered)
2. The page's content type is classified (Editorial, Community, HR, etc.); when multiple models evaluate the same story, a majority vote determines the consensus type
3. Channel weights are assigned based on the consensus content type
4. A Domain Context Profile (DCP) is constructed from 8 domain-level elements
5. Each of 31 UDHR provisions is scored for editorial and structural signals
6. Fair Witness evidence (observable facts + inferences) is recorded per provision
7. Story-level labels are generated (theme tag, sentiment, executive summary)
8. Ten supplementary signals are assessed (EQ, PT, SO, ET, SR, TF, GS, CL, TD, RTS)
9. A final classification is assigned (Strong positive to Strong negative)
Evaluation Modes
Two evaluation modes serve different cost/quality trade-offs. Stories with only a lite evaluation show ~lite in the feed.
Channels — Full: Editorial + Structural (31 per-provision) · ~lite: Editorial + Structural (holistic)
Provisions — Full: 31 per-provision scores · ~lite: single aggregate score
DCP — Full: yes (8 domain elements) · ~lite: no
Fair Witness — Full: yes (facts + inferences) · ~lite: no
Supplementary — Full: all 10 signals · ~lite: 5 (EQ, SO, TD, valence, arousal) + tone
SETL — Full: yes (per-provision) · ~lite: yes (holistic)
Confidence — Full: evidence-weighted · ~lite: N/A
Output tokens — Full: ~4-5K · ~lite: ~200-400
Schema — Full: 3.7 · ~lite: lite-1.5
Models — Full: Anthropic Claude Haiku 4.5 · ~lite: Workers AI (Llama 4 Scout)
Feed label — Full: HRCB · ~lite: ~lite
Lite item pages display an editorial+structural summary card instead of the full heatmap.
Lite scores use holistic two-dimension scoring (editorial + structural) blended with content-type weights.
Psychological Safety Quotient (PSQ)
experimental
PSQ is an independent signal measuring reader psychoemotional safety on a 0-10 scale, orthogonal to HRCB's rights-stance construct. Torture exposés score high on HRCB (rights-affirming) but low on PSQ (threatening to read); a privacy-violating app landing page scores low on HRCB but high on PSQ (pleasant to read). The instrument adapts Kline's Psychological Safety Quotient framework (originally validated on social media content against LLM scores, not human ground truth) to news content. Construct independence confirmed by Phase A validation: PSQ threat_exposure and HRCB editorial showed expected divergence on known-construct stories.
Three dimensions are scored per story:
Threat Exposure — intensity of harmful content the reader encounters (violence, exploitation, psychological manipulation). Lower = safer.
Trust Conditions — whether the content creates conditions for reader trust (source transparency, logical coherence, respect for autonomy). Higher = safer.
Resilience Baseline — whether the content leaves readers more or less psychologically equipped (empowerment, coping resources, constructive framing). Higher = safer.
PSQ is experimental until validation criteria are met: (1) pilot distribution across ≥50 stories without clustering, (2) inter-rater ICC > 0.7 across 3 models on 30 stories, (3) CFA factor structure confirmation on N≥100, (4) criterion validity r > 0.4 against human "felt reading safety" ratings, (5) test-retest ICC > 0.8.
Scores are displayed normalized to [-1, +1] (via (psq-5)/5) for visual consistency with E and S channels.
Scores should be treated as ordinal — rank-order is meaningful, but distances between scores are not.
PSQ is currently scored by Workers AI (Llama 4 Scout PSQ); an external DistilBERT scorer at psq.unratified.org collects data for comparison research but lacks sufficient score breadth for production use.
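The display normalization above, (psq − 5)/5, maps the 0-10 scale onto [-1, +1]:

```python
def psq_display(psq: float) -> float:
    """Normalize a raw PSQ score from [0, 10] to [-1, +1] via (psq - 5) / 5."""
    return (psq - 5.0) / 5.0
```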
Construct — HRCB: rights stance (does content advance or undermine UDHR?) · PSQ: reader safety (how safe is this to read?)
Scale — HRCB: -1.0 to +1.0 · PSQ: 0-10 (displayed as -1.0 to +1.0)
Dimensions — HRCB: 31 UDHR provisions × E+S channels · PSQ: 3 (threat, trust, resilience)
Consensus — HRCB: shared (full + lite models) · PSQ: separate (PSQ models only)
Models — HRCB: Haiku 4.5, Llama 4 Scout · PSQ: Llama 4 Scout PSQ
Status — HRCB: validated (known-groups, discriminant) · PSQ: experimental (unvalidated domain transfer)
Supplementary Signals & Factions
Supplementary Signals
Ten supplementary signals capture how content communicates, orthogonal to HRCB, which measures
directional lean. These are grounded in established psychometric and information-quality frameworks.
See the live signal dashboard for global averages and distributions,
or the full glossary for all terms and UDHR article links.
EQ — Epistemic Quality: source quality, evidence reasoning, uncertainty handling, purpose transparency. Based on the CRAAP Test framework from library science.
PT — Propaganda Flags: detects 18 propaganda techniques (loaded language, strawman, whataboutism, etc.). Based on Da San Martino et al. (2019) PTC-18 corpus.
SO — Solution Orientation: problem-only vs solution-oriented framing, reader agency score.
Up to 3 salient rights trade-offs: which UDHR provisions are in tension, and how the content resolves them. Full eval only.
Domain Signal Profiles (Factions)
The Factions page clusters domains by editorial character
— how they cover topics, not just what they score. Instead of using the 31-dimension UDHR fingerprint,
it clusters on 8 normalized supplementary signal dimensions:
EQ — Epistemic Quality
SR — Stakeholder Representation
SO — Solution Orientation
TD — Transparency & Disclosure
PT — Propaganda (inverted)
AR — Arousal
VA — Valence
FW — Fair Witness ratio
Each dimension is z-normalized (zero mean, unit variance) across all domains with 3+ evaluations.
Cosine similarity on the resulting 8D vectors measures editorial character similarity. Domains are
clustered using agglomerative hierarchical clustering with average linkage at golden ratio thresholds:
Faction — sim ≥ 1/φ (0.618) — core editorial alignment
Alliance — sim ≥ 1/φ² (0.382) — sympathetic but distinct
Acquaintance — sim ≥ 1/φ³ (0.236) — occasional overlap
Neutral — sim ≥ 0 — no meaningful relationship
Rival — sim < 0 — oppositional editorial profiles
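The similarity measure and tier thresholds can be sketched as follows; the helper names are illustrative, and the z-normalization and agglomerative clustering steps are omitted for brevity:

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def cosine(a, b):
    """Cosine similarity between two 8D signal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def tier(sim: float) -> str:
    """Map a similarity value onto the golden-ratio tiers above."""
    if sim >= 1 / PHI:        # 0.618
        return "Faction"
    if sim >= 1 / PHI ** 2:   # 0.382
        return "Alliance"
    if sim >= 1 / PHI ** 3:   # 0.236
        return "Acquaintance"
    if sim >= 0:
        return "Neutral"
    return "Rival"
```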
Domain Context Profile (DCP)
Eight domain-level elements provide context modifiers:
Privacy · ToS · Accessibility · Mission · Editorial Code · Ownership · Access Model · Ad/Tracking
Each element shifts per-article scores by up to ±0.30.
DCP profiles cached 7 days in KV, persisted to D1. Only used in full evaluations (not lite).
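A sketch of how a DCP modifier might be applied to a per-provision combined score; the ±0.30 clamp comes from the text above, while the final clamp to [-1, +1] is an assumption:

```python
def apply_dcp(combined: float, modifier: float) -> float:
    """Apply a DCP context modifier, clamped to +/-0.30 per article,
    keeping the final score within the [-1, +1] range (assumed)."""
    m = max(-0.30, min(0.30, modifier))
    return max(-1.0, min(1.0, combined + m))
```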
Browser-verified signals supplement LLM-inferred DCP with machine-observed data:
Tracking · Security · Accessibility · Consent
Cloudflare Browser Rendering (Puppeteer) audits each domain via CDP network interception (tracker counting), response header extraction (HSTS, CSP), DOM evaluation (lang attr, skip-nav, alt text coverage), and consent UI pattern matching (cookie banners, dark pattern heuristics). Results produce four br_* DCP elements with calibrated modifiers, refreshed every 7 days.
Content Gate
A pre-evaluation content classifier that identifies non-evaluable pages before they enter the evaluation queue. Pure regex — no LLM calls.
paywall — Subscription wall blocks content access
bot_protection — Cloudflare/Akamai challenge pages
captcha — CAPTCHA or verification required
login_wall — Authentication wall
cookie_wall — Cookie consent blocks content
geo_restriction — Region-restricted content
age_gate — Age verification required
app_gate — Content only available in mobile app
rate_limited — Rate limit or throttle page
error_page — 404, 500, or other error pages
redirect_or_js_required — Redirect chains, dead ends, or JavaScript-only SPAs
binary_content — PDF, binary, or non-text content type (detected via Content-Type header)
js_rendered — URL fetched successfully but returned no readable text (JS-rendered SPA)
no_content — Story has no URL and no self-text (cannot evaluate)
hn_removed — Story flagged, deleted, or removed from HN after submission
Runs at two points: cron pre-fetch (primary gate) and consumer (safety net for KV cache misses).
Writes gate_category and gate_confidence to the stories table.
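As a sketch of a regex-only gate: the patterns below are illustrative placeholders, not HRO's actual rules, and a production rule set would need many more variants per category:

```python
import re

# Illustrative placeholder patterns only; HRO's real rule set is larger.
GATE_PATTERNS = [
    ("paywall", re.compile(r"subscribe to (continue|read)|subscribers only", re.I)),
    ("captcha", re.compile(r"verify (that )?you are (a )?human|captcha", re.I)),
    ("cookie_wall", re.compile(r"accept (all )?cookies to continue", re.I)),
]

def gate(page_text: str):
    """Return the first matching gate category, or None if evaluable."""
    for category, pattern in GATE_PATTERNS:
        if pattern.search(page_text):
            return category
    return None
```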
Feeds & Integration
RSS Feeds (Atom)
Subscribe to HRCB-evaluated stories via /feed.xml.
Filter by stance, UDHR provision, or domain — combine freely.
Filters combine: /feed.xml?article=12&filter=negative returns negative stories affecting privacy (Article 12).
Download OPML to import all 31 UDHR provision feeds into your reader at once.
Embeddable Score Badges
Embed a domain's HRCB score anywhere — GitHub READMEs, blog footers, documentation.