+0.48 OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us

Name: HRCB Evaluation: OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us
Item: OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us
Rating: 0.452
Author: HN HRCB

Model: claude-haiku-4-5-20251001 +0.48 · @cf/meta/llama-3.3-70b-instruct-fp8-fast lite +0.50 · @cf/meta/llama-4-scout-17b-16e-instruct lite +0.80 · deepseek/deepseek-v3.2-20251201 +0.30 · claude-haiku-4-5 lite +0.68 · meta-llama/llama-3.3-70b-instruct:free lite ND

+0.48	OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us (www.404media.co S:+0.31 )
	1361 points by latexr 396 days ago | 16 comments on HN | Moderate positive Contested Editorial · v3.7 · 2026-02-28 10:42:19

Summary Intellectual Property & Equitable Compensation Advocates

404 Media investigates OpenAI's corporate hypocrisy: the company complains about DeepSeek's alleged unauthorized use of its training data while OpenAI itself built its systems through extensive, largely uncompensated scraping of creator-generated content. The article strongly advocates for intellectual property rights, fair compensation for creative work, and uniform corporate accountability, positioning data and creative labor as property deserving legal protection and economic return.

Article Heatmap (legend: Negative, Neutral, Positive, No Data)

Aggregates

Editorial Mean	+0.48	Structural Mean	+0.31
Weighted Mean	+0.45	Unweighted Mean	+0.41
Max	+0.64 Article 16	Min	+0.20 Article 7
Signal	13	No Data	18
Volatility	0.14 (Medium)
Negative	0	Channels	E: 0.6 S: 0.4
SETL	+0.47	Editorial-dominant
FW Ratio	53%	26 facts · 23 inferences
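
Several of the aggregates above can be cross-checked from the per-article scores listed under Editorial Channel and Structural Channel. The sketch below is a reconstruction, not the evaluator's actual code. It assumes that an article's combined score blends the two channels with the 0.6 / 0.4 weights shown under Channels, and that an article with no structural score falls back to its editorial score. Under those assumptions the figures for Editorial Mean, Structural Mean, Unweighted Mean, Max, Min and FW Ratio are reproduced. The population standard deviation of the combined scores also rounds to 0.14, although the page does not say how Volatility is defined. The Weighted Mean is not reproduced because the per-article weights are not shown.

```python
from statistics import mean, pstdev

# Per-article scores copied from the Editorial and Structural Channel listings.
editorial = {
    "Article 16": 0.80, "Article 17": 0.70, "Article 19": 0.70,
    "Article 27": 0.70, "Article 8": 0.60, "Article 23": 0.60,
    "Article 30": 0.50, "Article 29": 0.40, "Preamble": 0.30,
    "Article 22": 0.30, "Article 12": 0.25, "Article 7": 0.20,
    "Article 21": 0.20,
}
structural = {
    "Article 19": 0.50, "Article 16": 0.40, "Article 17": 0.35,
    "Article 8": 0.30, "Article 23": 0.20, "Article 27": 0.20,
    "Article 30": 0.20,
}

# Channel weights taken from the "Channels E: 0.6 S: 0.4" row.
E_WEIGHT, S_WEIGHT = 0.6, 0.4


def combined(article):
    """Assumed blend: weighted when both channels have a score, else editorial only."""
    e = editorial[article]
    s = structural.get(article)
    return e if s is None else E_WEIGHT * e + S_WEIGHT * s


scores = {article: combined(article) for article in editorial}

editorial_mean = mean(editorial.values())
structural_mean = mean(structural.values())
unweighted_mean = mean(scores.values())
top = max(scores, key=scores.get)
bottom = min(scores, key=scores.get)

# FW Ratio read as facts / (facts + inferences); counts from the FW Ratio row.
fw_ratio = 26 / (26 + 23)

# Assumption: Volatility behaves like a population standard deviation.
volatility = pstdev(scores.values())

print(f"Editorial mean  {editorial_mean:+.2f}")
print(f"Structural mean {structural_mean:+.2f}")
print(f"Unweighted mean {unweighted_mean:+.2f}")
print(f"Max {scores[top]:+.2f} {top} / Min {scores[bottom]:+.2f} {bottom}")
print(f"FW ratio {fw_ratio:.0%}, volatility {volatility:.2f}")
```

Articles 7 and 21 tie at +0.20. The sketch reports Article 7 only because it comes first in the dictionary, which happens to match the Min row.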

Evidence 20% coverage

 2H  5M  6L  18 ND 
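
The evidence tally can be recounted from the strength label (High, Medium or Low) attached to each scored article in the Editorial Channel listing. The sketch below assumes that H, M and L abbreviate those labels and that the total of 31 provisions is the Preamble plus Articles 1 to 30. It does not try to derive the 20% coverage figure, whose definition is not given on the page.

```python
from collections import Counter

# Evidence-strength labels copied from the Editorial Channel listing.
strength = {
    "Article 16": "H", "Article 19": "H",
    "Article 17": "M", "Article 27": "M", "Article 8": "M",
    "Article 23": "M", "Article 30": "M",
    "Article 29": "L", "Preamble": "L", "Article 22": "L",
    "Article 12": "L", "Article 7": "L", "Article 21": "L",
}

TOTAL_PROVISIONS = 31  # assumption: Preamble plus Articles 1-30

counts = Counter(strength.values())
signal = sum(counts.values())
no_data = TOTAL_PROVISIONS - signal

print(f"{counts['H']}H {counts['M']}M {counts['L']}L {no_data} ND (signal: {signal})")
```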

Theme Radar

HN Discussion 8 top-level · 4 replies

ChrisArchitect 2025-01-29 15:18 UTC link

Earlier: https://news.ycombinator.com/item?id=42861150

https://news.ycombinator.com/item?id=42860888

roshin 2025-01-29 15:43 UTC link

I hate clickbait articles that try to make the bad guys seem like they're angry.

> Both Bloomberg and the Financial Times are reporting that Microsoft and OpenAI have been probing whether DeepSeek improperly trained the R1 model

The company openai is not angry, or furious, or enraged. They simply suspect that deepseek broke their usage agreement and are trying to verify that.

csallen 2025-01-29 16:54 UTC link

There is nothing in this article to suggest that OpenAI is "furious" or even upset. Zero evidence. It's total clickbait.

And it's embarrassing that so many commenters on Hacker News who want to believe this storyline are just pretending that it's true despite the lack of evidence.

dang 2025-01-29 18:14 UTC link

Comments moved to https://news.ycombinator.com/item?id=42861475, which has the more informative of the two articles that this one was lifted from.

Submitters: "Please submit the original source. If a post reports on something found on another site, submit the latter." - https://news.ycombinator.com/newsguidelines.html

Please especially don't submit knock-off articles that jack up the linkbait and indignation. That's what we're trying to avoid on Hacker News. There are enough places to get that hit elsewhere.

SergeAx 2025-01-30 05:38 UTC link

https://archive.is/87Qy9

randalflagged 2025-01-30 06:42 UTC link

[flagged] of course. This place is starting to drift in one direction.

AlpineG 2025-01-30 08:01 UTC link

OpenAI's model is closed source. IDK if distilling can be done via the API effectively? DeepSeek already has distilled models from other open source models like Qwen which have been done by 3rd party researchers, and I assumed that happened rapidly because they are all open source.

adultSwim 2025-01-31 07:00 UTC link

Spot on headline. OpenAI itself uses distillation to launder its own ill-gotten data.

csallen 2025-01-29 16:53 UTC link

I came here to say just this.

Can we change the headline of this article to something more accurate and less clickbaity?

The article unjustifiably labels OpenAI as "furious" despite surfacing zero evidence that that's how they actually feel, obviously in an attempt to paint them as hypocrites who are okay with copying others but are upset at being copied.

This is a very dishonestly-framed and -advertised story.

sebastiennight 2025-01-29 16:59 UTC link

Such a perfect article title, but wasted on clickbait.

ZeroTalent 2025-01-29 19:24 UTC link

As I understand from Twitter, the issue explained in this article is not the actual case at hand. The issue is that they suspect them of stealing the o1 model with the weights via corporate espionage and optimizing it with Matrix Multiplication and other upgrades. That would explain why the outputs are nearly identical in some cases.

I don't know how much of any of this is true. This is what I'm reading on Twitter today.

tim333 2025-01-29 22:05 UTC link

It's funny though. There seem to be a lot of commenters on Hacker News who don't really get the sense of humor thing.

Editorial Channel

What the content says

+0.80

Article 16 Marriage & Family

High Advocacy Framing Coverage

Editorial

+0.80

SETL

+0.57

CENTRAL FOCUS: Article strongly advocates for intellectual property rights of data creators; frames unauthorized use as property violation; emphasizes lack of compensation as core injustice; well-sourced critical coverage

+0.70

Article 17 Property

Medium Advocacy Framing

Editorial

+0.70

SETL

+0.49

Article advocates for creators' right to own and control intellectual property; frames data and model outputs as property subject to ownership disputes and legal protection

+0.70

Article 19 Freedom of Expression

High Advocacy Practice Coverage

Editorial

+0.70

SETL

+0.37

Article exercises free expression through critical journalism; publicly investigates and critiques major corporations and government officials without apparent fear; demonstrates robust press freedom

+0.70

Article 27 Cultural Participation

Medium Advocacy Framing

Editorial

+0.70

SETL

+0.59

Article advocates for creators' rights to share in benefits of their scientific and creative contributions; frames fair compensation as essential protection of intellectual rights

+0.60

Article 8 Right to Remedy

Medium Advocacy Framing

Editorial

+0.60

SETL

+0.42

Article advocates for effective remedy and accountability; critiques systemic gap where OpenAI has faced no remedies despite similar violations; frames investigation as holding power accountable

+0.60

Article 23 Work & Equal Pay

Medium Advocacy Framing

Editorial

+0.60

SETL

+0.49

Article frames data creation and intellectual work as labor deserving just compensation; advocates that creators should be economically compensated for work underlying AI systems

+0.50

Article 30 No Destruction of Rights

Medium Advocacy Framing

Editorial

+0.50

SETL

+0.39

Article opposes corporate practices that violate creators' rights; advocates for prevention of abuse through transparency and public accountability

+0.40

Article 29 Duties to Community

Low Framing

Editorial

+0.40

SETL

Article frames corporate duties and community obligations; implies companies have responsibilities toward communities and individuals whose work/data they use

+0.30

Preamble

Low Framing

Editorial

+0.30

SETL

Article implicitly affirms human dignity of creators; frames data exploitation as disrespect for rights-holders' intellectual contributions

+0.30

Article 22 Social Security

Low Framing

Editorial

+0.30

SETL

Implicitly addresses economic and social rights; frames lack of compensation for data creation as denial of economic rights to creators

+0.25

Article 12 Privacy

Low Framing

Editorial

+0.25

SETL

Implicitly addresses privacy and informational integrity; frames unauthorized data collection as violation of creators' privacy interests

+0.20

Article 7 Equality Before Law

Low Framing

Editorial

+0.20

SETL

Implicitly addresses equal protection; suggests corporate data practices should be governed fairly regardless of which company commits them

+0.20

Article 21 Political Participation

Low Framing

Editorial

+0.20

SETL

Implicitly addresses democratic participation; journalism enables informed public discourse on corporate governance and AI regulation

Article 1 Freedom, Equality, Brotherhood

Not addressed

Article 2 Non-Discrimination

Not addressed

Article 3 Life, Liberty, Security

Not addressed

Article 4 No Slavery

Not addressed

Article 5 No Torture

Not addressed

Article 6 Legal Personhood

Not addressed

Article 9 No Arbitrary Detention

Not addressed

Article 10 Fair Hearing

Not addressed

Article 11 Presumption of Innocence

Not addressed

Article 13 Freedom of Movement

Not addressed

Article 14 Asylum

Not addressed

Article 15 Nationality

Not addressed

Article 18 Freedom of Thought

Not addressed

Article 20 Assembly & Association

Not addressed

Article 24 Rest & Leisure

Not addressed

Article 25 Standard of Living

Not addressed

Article 26 Education

Not addressed

Article 28 Social & International Order

Not addressed

Structural Channel

What the site does

+0.50

Article 19 Freedom of Expression

High Advocacy Practice Coverage

Structural

+0.50

Context Modifier

SETL

+0.37

404 Media platform enables free expression; independent ownership and investigative mission demonstrate structural commitment to public speech and press freedom

+0.40

Article 16 Marriage & Family

High Advocacy Framing Coverage

Structural

+0.40

Context Modifier

SETL

+0.57

404 Media investigative practice demonstrates structural commitment to protecting IP rights through public accountability journalism

+0.35

Article 17 Property

Medium Advocacy Framing

Structural

+0.35

Context Modifier

SETL

+0.49

404 Media platform protects property rights through investigative accountability journalism

+0.30

Article 8 Right to Remedy

Medium Advocacy Framing

Structural

+0.30

Context Modifier

SETL

+0.42

404 Media's investigative journalism contributes to accountability mechanisms by exposing corporate violations and forcing public reckoning

+0.20

Article 23 Work & Equal Pay

Medium Advocacy Framing

Structural

+0.20

Context Modifier

SETL

+0.49

404 Media's investigative labor exemplifies recognition of creative work's value

+0.20

Article 27 Cultural Participation

Medium Advocacy Framing

Structural

+0.20

Context Modifier

SETL

+0.59

404 Media investigative practice protects this right through accountability journalism

+0.20

Article 30 No Destruction of Rights

Medium Advocacy Framing

Structural

+0.20

Context Modifier

SETL

+0.39

404 Media's investigative practice functions as abuse prevention mechanism through exposure and public reckoning

Preamble

Low Framing

No structural signals

Article 1 Freedom, Equality, Brotherhood

Not applicable

Article 2 Non-Discrimination

Not applicable

Article 3 Life, Liberty, Security

Not applicable

Article 4 No Slavery

Not applicable

Article 5 No Torture

Not applicable

Article 6 Legal Personhood

Not applicable

Article 7 Equality Before Law

Low Framing

No structural signals

Article 9 No Arbitrary Detention

Not applicable

Article 10 Fair Hearing

Not applicable

Article 11 Presumption of Innocence

Not applicable

Article 12 Privacy

Low Framing

No structural signals

Article 13 Freedom of Movement

Not applicable

Article 14 Asylum

Not applicable

Article 15 Nationality

Not applicable

Article 18 Freedom of Thought

Not applicable

Article 20 Assembly & Association

Not applicable

Article 21 Political Participation

Low Framing

No structural signals

Article 22 Social Security

Low Framing

No structural signals

Article 24 Rest & Leisure

Not applicable

Article 25 Standard of Living

Not applicable

Article 26 Education

Not applicable

Article 28 Social & International Order

Not applicable

Article 29 Duties to Community

Low Framing

No structural signals

Supplementary Signals

How this content communicates, beyond directional lean.

Epistemic Quality

How well-sourced and evidence-based is this content?

0.74 medium claims

Sources		0.8
Evidence		0.8
Uncertainty		0.6
Purpose		0.8

Propaganda Flags

3 manipulative rhetoric techniques found

loaded language

Use of strong language: 'surreptitiously and indiscriminately sucking up whatever data it can find' to describe data harvesting practices

repetition

Repeated emphasis on 'without permission or compensation' throughout article for rhetorical reinforcement

exaggeration

Extended satirical 'Hahahaha' passage exaggerates corporate hypocrisy through laughter for emotional emphasis

Emotional Tone

Emotional character: positive/negative, intensity, authority

cynical

Valence		-0.3
Arousal		0.7
Dominance		0.8

Transparency

Does the content identify its author and disclose interests?

0.92

✓ Author

Solution Orientation

Does this content offer solutions or only describe problems?

0.31 problem only

Reader Agency

0.3

Stakeholder Voice

Whose perspectives are represented in this content?

0.35 3 perspectives

Speaks: journalists, government_official

About: corporation, individuals, workers

Temporal Framing

Is this content looking backward, at the present, or forward?

present · short term

Geographic Scope

What geographic area does this content cover?

global

United States, China

Complexity

How accessible is this content to a general audience?

moderate · medium jargon · general

Longitudinal · 6 evals

Audit Trail 19 entries

2026-02-28 10:42	model_divergence	Cross-model spread 0.63 exceeds threshold (5 models)	- -
2026-02-28 10:42	eval	Evaluated by claude-haiku-4-5-20251001: +0.45 (Moderate positive) +0.16
2026-02-28 07:27	model_divergence	Cross-model spread 0.63 exceeds threshold (5 models)	- -
2026-02-28 07:27	eval	Evaluated by claude-haiku-4-5-20251001: +0.29 (Mild positive)
2026-02-28 01:40	dlq	Dead-lettered after 1 attempts: OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us	- -
2026-02-28 01:38	rate_limit	OpenRouter rate limited (429) model=llama-3.3-70b	- -
2026-02-28 01:37	rate_limit	OpenRouter rate limited (429) model=llama-3.3-70b	- -
2026-02-28 01:36	dlq_replay	DLQ message 97673 replayed to LLAMA_QUEUE: OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us	- -
2026-02-28 00:04	eval_success	Light evaluated: Moderate positive (0.50)	- -
2026-02-28 00:04	eval	Evaluated by llama-3.3-70b-wai: +0.50 (Moderate positive)
2026-02-27 21:32	eval_success	Light evaluated: Strong positive (0.80)	- -
2026-02-27 21:32	eval	Evaluated by llama-4-scout-wai: +0.80 (Strong positive)
2026-02-27 21:30	eval_success	Evaluated: Mild positive (0.17)	- -
2026-02-27 21:30	eval	Evaluated by deepseek-v3.2: +0.17 (Mild positive) 10,901 tokens
2026-02-27 21:09	dlq	Dead-lettered after 1 attempts: OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us	- -
2026-02-27 21:07	rate_limit	OpenRouter rate limited (429) model=llama-3.3-70b	- -
2026-02-27 21:05	rate_limit	OpenRouter rate limited (429) model=llama-3.3-70b	- -
2026-02-27 21:04	rate_limit	OpenRouter rate limited (429) model=llama-3.3-70b	- -
2026-02-27 20:54	eval	Evaluated by claude-haiku-4-5: +0.68 (Strong positive)
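
The model_divergence entries report a cross-model spread of 0.63 across 5 models. Reading "spread" as the highest score minus the lowest, the figure can be reproduced from the most recent eval entry logged for each model above. This is a sketch under that reading only. The page says just that the spread "exceeds threshold", so the threshold value used below is hypothetical.

```python
# Most recent score per model, copied from the "eval" entries in the audit trail.
latest_scores = {
    "claude-haiku-4-5-20251001": 0.45,
    "llama-3.3-70b-wai": 0.50,
    "llama-4-scout-wai": 0.80,
    "deepseek-v3.2": 0.17,
    "claude-haiku-4-5": 0.68,
}

# Hypothetical threshold: the actual value is not shown on the page.
DIVERGENCE_THRESHOLD = 0.5


def cross_model_spread(scores):
    """Assumed definition: highest score minus lowest score."""
    return max(scores.values()) - min(scores.values())


spread = cross_model_spread(latest_scores)
diverged = spread > DIVERGENCE_THRESHOLD

print(f"Cross-model spread {spread:.2f} ({len(latest_scores)} models), diverged={diverged}")
```

The spread is set by llama-4-scout-wai (+0.80) and deepseek-v3.2 (+0.17). That explains why the 07:27 entry, logged when claude-haiku-4-5-20251001 scored +0.29, reports the same 0.63.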

build 1ad9551+j7zs · deployed 2026-03-02 09:09 UTC · evaluated 2026-03-02 11:31:12 UTC