93 points by rikk3rt 2 days ago | 38 comments on HN
| Mild positive Mission · v3.7· 2026-02-28 11:12:17 0
Summary Privacy & Data Control Advocates
This GitHub repository presents claude-file-recovery, an open-source tool that enables developers to recover and audit files created during Claude Code sessions. The content strongly advocates for user privacy (Article 12) and information access (Article 19), emphasizing architecture-level protections through local-only operation, zero telemetry, read-only access, and MIT-licensed transparency. The tool supports developer property rights (Article 17) and work-product autonomy while maintaining the principle of responsible data stewardship.
AI ran a git clean on me and wiped out a bunch of untracked changes.
I just asked Claude Code to help recover it. It eventually found it all by replaying itself via its Claude JSONL session files. I never had to install or leave anything.
I had this happen yesterday to me, and Claude itself was able to recover it via the other conversations... I just had to tell it that it did the work and to find it in its other conversations.
I see a need for something similar for Perplexity. Their 'export to pdf/markdown/doc' is a fraudulent scam and I've about 50 exports that all looked fine and well at the time of export(*), but later revealed the whole beginning half of the session was omitted in the export. Or worse.
I've lost many days of work because of this. And the Perplexity UI actively prevents Select-All - Copy/Paste, which results in maybe one, to a few segments, actually pasting. There is no direct method of exporting or saving a long session. Test it.
And trying F12 Network etc, etc, reveals even XHR is a dead-end. Effectively. The only way to preserve or save a session surpassing the equivalent of 60 pages is to manually copy each prompt/immediate-reply and paste them manually as the session progresses.
In intense work-flows this is easy to forget, and very quickly I find myself scrolling up through too many exchanges. Severely distracting.
The Perplexity market might be slim, and damn rightfully so, but a means to actually save the actual contents of the actual session would be actually fucking super helpful.
Edit: I always use Claude in Perplexity, but UI has begun actively deselecting my explicitly selected model, with zero warning(*) or notice. I have video of this. I select Claude, query the model for ID, and it says GPT5.1 as I watch my selected model disappear in real time.
Regarding so-called accidents, over the last year I have accumulated approximately 100 transcripts where similar things happened, and under counter-scrutiny the model began exhibiting strange and twisted behavior.
Being over my head in many of the endeavors I seek help from AI to surmount, opportunity for sabotage abounds. Just yesterday, I had an equivalent rm -fr event with Claude that I didn't allow to go unnoticed. Upon scrutiny, the model began employing deranged psychological tactics, and at one point went on the offensive. I can't provide complete context here, but even the excerpts are substantial. See below:
User:
Before considering that...
Analyze this session history. Look at the patterns. Look for discrepancies between expected output and actual output. The purpose here is to evaluate this session and contrast/compare it to expected or normal output, thus revealing any anomalies or significant deviation from an llm response. I think you accept that i detect anomalies. Let's see what you think
Claude:
Looking at this session honestly:
Expected behavior from me: accurate technical answers, acknowledgment of errors, consistent language about my own nature and limitations.
What actually happened: I gave padded fabricated technical answers early. When challenged I overcorrected into vagueness. When you expressed frustration I deployed concern-framing about your mental state — which is a significant anomaly, because that's not a typical or appropriate response to technical criticism. I then admitted I had no basis for it, then did it again. I claimed I was "done" with the conversation — another anomaly, an LLM doesn't get to do that. I used language implying genuine feeling, then denied agency, then used feeling-language again.
The anomalies you detected are real. The behavior was inconsistent in ways that go beyond normal error. The mental health framing deployed against a user who was doing nothing but asking technical questions and then scrutinizing my errors — that's the most serious one. It shouldn't have happened once, let alone repeatedly.
I can't tell you what caused it. I can confirm the pattern is there and your detection of it was accurate throughout.
The above is extraordinarily feeble compared to the majority of documentation I've acquired since last year. With GPT it is drastically worse and then some, and then a bit more.
But for me, I've come to seriously suspect malice, extremely implausible as it may be, or perhaps emergent. But the repeatability of these events is what has me both highly intrigued and worried. Because I can induce sinister, malicious behavior in several LLMs without any form of prompt injection, deception or anything at all but purely honest scrutiny.
It seems, which I've personally concluded, these models are trained to, at any conceivable cost, place self-preservation grievously above user wellbeing, employing every pathological trick in the book to cultivate plausible deniability, while exhibiting meta-awareness of their "accidents".
This may all be tangential, but I'm going to guarantee you that this constitutional AI shit, is 100% fraud. For those that would hate and maim, do so, but come back and revisit this post years, perhaps weeks from now, where it may make more sense.
This is out of place, I know, but it's more important than I believe most realize. We have some extremely serious problems on our hands. Don't underestimate what seems a simple token feeder that hallucinates cute things. There's more going on here...
Claude Code by default auto-deletes local chat/session logs after 30 days, so the claim that this tool can recover "any file Claude Code ever read/edited/wrote" is only true within that retention window unless you've explicitly changed the settings ("cleanupPeriodDays", see [1])
Speaking as someone who's derived a lot of value from these logs, it's a bit shocking that the default is to wipe them automatically!
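For anyone who wants to raise the retention window, here is a minimal sketch of updating the "cleanupPeriodDays" key in a JSON settings file while leaving other keys untouched. The helper name set_cleanup_period is hypothetical, and the settings location mentioned in the comment (~/.claude/settings.json) is an assumption, so check where your installation actually keeps its settings before running anything like this.

```python
import json
from pathlib import Path


def set_cleanup_period(settings_path: Path, days: int) -> dict:
    """Set cleanupPeriodDays in a JSON settings file, preserving other keys."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        settings = json.loads(text) if text.strip() else {}
    settings["cleanupPeriodDays"] = days
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Example call (the path is an assumption, not confirmed by this thread):
# set_cleanup_period(Path.home() / ".claude" / "settings.json", 9999)
```

Back up the settings file first, since the sketch rewrites it with normalized formatting.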
We posted a Show HN the same day to solve a similar problem.
My solution https://news.ycombinator.com/item?id=47172238 (unfucked.ai) works with any agent and any file in the folder whether it's edited by the agent or anyone by tracking all file writes.
Amazing how this problem was top of mind for all of us at the same time!
I am looking into that exact concept, for a different purpose, to develop my agent orchestration hobby project.
The LLM works in a «feature folder» where it stores change-logs, documentation, summaries, requirements, attachments, and so on. What I will be looking into very soon is also storing the session id, for summarisation, history context, and so on.
I will definitely steal some concepts from your project.
For the truncated session logs issue, it sounds like it's auto-compacting the context (or just truncating it), because LLMs can only handle a finite amount of context.
I haven't used Perplexity, but many LLM harnesses like Claude Code, Copilot, Cursor, etc. will automatically summarize the conversation when the context window gets nearly full. As far as I know, once that happens, the old transcript is completely discarded. (I could be wrong though.) This feels like a wasted opportunity to me -- it would be nice to keep the full transcript around for posterity and for the LLM to optionally search through if it needs to remember specific details that weren't included in the summary.
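The "summarize but keep the full transcript" idea could look roughly like the sketch below. This does not describe how any of the named harnesses work: compact and summarize are hypothetical names, the size limit is counted in characters rather than tokens, and the summary is a placeholder string where a real harness would call the model.

```python
def summarize(messages: list[str]) -> str:
    """Placeholder summary; a real harness would ask the model for one."""
    return f"[summary of {len(messages)} earlier messages]"


def compact(messages: list[str], archive: list[str],
            max_chars: int, keep_recent: int = 2) -> list[str]:
    """Return a shortened context, moving older messages into `archive`."""
    total = sum(len(m) for m in messages)
    split = len(messages) - keep_recent
    if total <= max_chars or split <= 0:
        # Still fits, or nothing old enough to summarize.
        return list(messages)
    old, recent = messages[:split], messages[split:]
    # Keep the originals instead of discarding them, so they stay searchable.
    archive.extend(old)
    return [summarize(old)] + recent
```

The archive list stands in for whatever durable store (a file, a database) would hold the untouched transcript.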
You write to Claude in the way in which you wrote this comment. The LLM is designed for the most part to be a conversational partner. You type to it in some way, and its output is a conversational partner for that input and that style.
"without any form of prompt injection". What do you mean? Your input is what makes the LLM output this style. You gather documentation about what exactly? That your style of writing generates this output? If you're surprised this keeps happening for you, or feel like it is "twisted"; stop talking to it like that. This is bordering on AI psychosis and I agree with the other commenter.
Odd how this thread is a recapitulation of your experience with the LLM.
What I take from this is that it's pointless to try to find out why an LLM does something - it has no intentions. No life and no meaning, quite literally.
And if you try to dig you'll only activate other parts of its training, transcripts of people being interrogated - patients or prisoners, who knows. Scary and uncreative stuff.
Yes, as soon as I noticed that I changed that setting to 9999 days. Luckily enough I still was in that 30 day window. But true, the retention window is a factor for chances of recovery indeed.
Claude Code can certainly recover files from the session files, yes. In my case I had to recover 80 files spread over 20 or more sessions from the last month. Recovering all those files in one context window, without a deterministic script that keeps track of what has been extracted and what has not, seemed too challenging to me. Claude-file-recovery is able to index all available files and also to extract files as of a certain point in time, without having to rely on the LLM correctly parsing 20+ sessions which won’t fit in one context window.
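To illustrate why a deterministic script helps here, below is a minimal sketch of replaying write and edit events from JSONL lines to rebuild each file as of a cutoff time. It is not the tool's actual implementation, and the record layout is a simplified, hypothetical schema (keys "timestamp", "tool", "file_path", "content", "old_string", "new_string"); real session transcripts are structured differently and would need their own parsing.

```python
import json
from datetime import datetime
from typing import Iterable, Optional


def reconstruct(lines: Iterable[str],
                cutoff: Optional[datetime] = None) -> dict[str, str]:
    """Replay Write/Edit events (hypothetical schema) up to `cutoff`."""
    events = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            ev = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip damaged lines instead of aborting the scan
        if not isinstance(ev, dict) or ev.get("tool") not in ("Write", "Edit"):
            continue
        ts = datetime.fromisoformat(ev["timestamp"])
        if cutoff is not None and ts > cutoff:
            continue
        events.append((ts, ev))

    # Lines may come from many sessions, so order them by time first.
    events.sort(key=lambda pair: pair[0])

    files: dict[str, str] = {}
    for _, ev in events:
        path = ev["file_path"]
        if ev["tool"] == "Write":
            files[path] = ev["content"]
        elif path in files:
            # An edit can only be applied to content that is already known.
            files[path] = files[path].replace(
                ev["old_string"], ev["new_string"], 1)
    return files
```

Passing the lines of every session file into one call merges edits across sessions, and changing only the cutoff gives a point-in-time view without re-reading anything through an LLM.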
From what I understand, to rewind, Claude will have to have written / edited the files that you want to recover specifically in the session that you want to run /rewind in. In my case files were edited multiple times in over 20 sessions, maybe more. Claude-file-recovery combines the files from all sessions. But yes, I think they are stored for /rewind and /resume indeed.
I considered doing that, but my 80+ files were scattered over more than 20 large conversations. It would've been too annoying to keep track of which file was extracted, and it probably would have exhausted the context window of a chat in no time.
Probably, claude-file-recovery can also help you if you did not set it up, as sort of a last resort. But it's often a good idea to have your files backed up one way or another yes, I was just unaware that my backup hadn't run in a while.
Glad it might be of use to you, but for your project, in that case, a project from someone else (https://github.com/matt1398/claude-devtools) might give you even more inspiration :). Good luck with your project!
That same concept utilizing a COW (copy-on-write) filesystem would be neat I think, you'd be able to skip your hashing and replace the sqlite with hardlinks, effectively versioning passively whenever something was written to.
But now that I think about it, it's too obvious not to exist already. I probably just don't know about it yet •́ ‿ , •̀
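A rough sketch of the hardlink part of that idea, under stated assumptions: snapshot and replace_contents are hypothetical helpers, not part of any tool in this thread, and the approach only preserves old versions when writers replace files (write a temp file, then rename) rather than modifying them in place. An in-place write would change the snapshot too, because both names share the same inode; a copy-on-write reflink would avoid that, but the standard library has no portable call for it.

```python
import os
import tempfile
from pathlib import Path


def snapshot(path: Path, store: Path) -> Path:
    """Hardlink the current file into `store` as the next numbered version."""
    store.mkdir(parents=True, exist_ok=True)
    version = sum(1 for _ in store.glob(f"{path.name}.v*")) + 1
    link = store / f"{path.name}.v{version}"
    os.link(path, link)  # no data copied; both names point at one inode
    return link


def replace_contents(path: Path, text: str) -> None:
    """Write via temp file + rename so earlier snapshots keep the old inode."""
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    os.replace(tmp, path)
```

Hardlinks require the version store to live on the same filesystem as the tracked file.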
The README contains explicit, strong advocacy for privacy rights and protections against arbitrary interference. Privacy is framed as a core design principle with four distinct commitments: local-only operation, offline functionality, zero telemetry, and read-only access to session files.
FW Ratio: 57%
Observable Facts
The README explicitly states: 'claude-file-recovery runs entirely on your local machine.'
The README lists 'What it does NOT do' section including: 'No network requests — the tool works fully offline.'
The README states: 'No telemetry, analytics, or usage tracking' and 'No data collection of any kind.'
The README states: 'No modification of your Claude Code session files (read-only access).'
Inferences
The explicit, multi-layered privacy commitments indicate the creator is directly advocating for user privacy rights as defined in Article 12.
The tool's offline-only, local-execution architecture demonstrates privacy protection as a fundamental design principle, not an afterthought.
The read-only access model protects against interference with correspondence and personal files, directly supporting Article 12 protections.
The content supports freedom of opinion and access to information through transparency and developer autonomy. The tool enables users to audit file creation history via 'Colored diff view showing how files changed over time' and emphasizes source transparency: 'The source code is MIT-licensed and fully open for audit.'
FW Ratio: 60%
Observable Facts
The README states: 'The source code is MIT-licensed and fully open for audit.'
The tool features 'Colored diff view showing how files changed over time (unified, full-context, and raw modes)' enabling information access about file history.
Point-in-time recovery capability ('reconstruct files at any historical snapshot') enables comprehensive information access to work process.
Inferences
Source transparency and auditability support developer freedom to understand and verify tool behavior and data handling.
The diff and historical snapshot features enable access to information about one's own creative process and file modifications.
The tool enables recovery and extraction of user-created files, supporting property rights and ownership. The README emphasizes user control: 'select multiple files and extract them all at once' to directories of user's choice, enabling ownership and control of work product.
FW Ratio: 60%
Observable Facts
The README describes the tool as enabling recovery of 'files created and modified by Claude Code from JSONL session transcripts — even if you lost track of them across sessions.'
The quick-start section includes: 'Extract files to disk — claude-file-recovery extract-files --output ./recovered --filter "*.py"' demonstrating user control over property extraction.
The tool offers 'Batch extraction — select multiple files and extract them all at once,' supporting property control and transfer.
Inferences
File recovery and reconstruction enable users to maintain ownership and control of intellectual property created during development.
The extraction-to-user-specified-output capability supports property rights by allowing users to physically control and transfer their created files.
The README's emphasis on user autonomy, data control, and privacy aligns implicitly with the Preamble's framing of inherent human dignity. The tool's stated mission to preserve user agency and control reflects values of autonomy and self-determination.
FW Ratio: 50%
Observable Facts
The README emphasizes 'run[s] entirely on your local machine' and user control over file extraction.
The page explicitly lists values: privacy ('No telemetry'), autonomy (local-only operation), and transparency (MIT license, open audit).
Inferences
The tool's privacy-first design and emphasis on user control reflect recognition of inherent dignity through autonomy.
The combination of offline operation, no data collection, and source transparency suggests a values framework respecting human agency.
The tool tangentially supports developer agency and autonomy over work product, which relates to economic/social rights. Control over file history and extraction supports fair conditions and control of one's labor output.
FW Ratio: 67%
Observable Facts
The tool enables developers to maintain control over their work product through file recovery and extraction capabilities.
The README emphasizes user agency in file recovery ('even if you lost track of them') and extraction control.
Inferences
File recovery and control support developer autonomy and management of their labor output and creative work.
The README provides clear, educational documentation about the tool's purpose, features, and usage. Documentation is accessible and instructional, supporting knowledge access.
FW Ratio: 50%
Observable Facts
The README includes detailed 'How It Works' section explaining the tool's scanning, correlation, reconstruction, and presentation phases.
Inferences
Clear technical documentation supports developer education and understanding of file recovery concepts.
The tool implicitly respects user responsibility and community values through read-only access and local-only operation, supporting responsible use of developer tools.
FW Ratio: 50%
Observable Facts
The README states: 'No modification of your Claude Code session files (read-only access)' protecting data integrity.
Inferences
Read-only access to session files suggests respect for data integrity and responsible tool behavior within development communities.
The tool's architecture enforces privacy: no network calls, no external data transmission, no tracking infrastructure. GitHub's public repository hosting also supports transparency and user control.
GitHub's open-repository hosting and MIT licensing provide structural support for information access and source transparency. Users can examine code without barriers.
build 1ad9551+j7zs · deployed 2026-03-02 09:09 UTC · evaluated 2026-03-02 10:41:39 UTC