Recreating Epstein PDFs from raw encoded attachments

dperfect · 2026-02-06T01:26:50 1770341210

Nerdsnipe confirmed :)

Claude Opus came up with this script:

https://pastebin.com/ntE50PkZ

It produces a somewhat-readable PDF (first page at least) with this text output:

https://pastebin.com/SADsJZHd

(I used the cleaned output at https://pastebin.com/UXRAJdKJ mentioned in a comment by Joe on the blog page)

pests · 2026-02-06T02:53:17 1770346397

So it was a public event attended by 450 people:

https://www.mountsinai.org/about/newsroom/2012/dubin-breast-...

https://www.businessinsider.com/dubin-breast-center-benefit-...

Even names match up, but oddly the date is different.

elmomle · 2026-02-06T03:04:26 1770347066

Your links are for the inaugural (first) ball in December 2011; OP's text referred to a second annual ball in December 2012.

pests · 2026-02-06T05:16:02 1770354962

You are right, my first is incorrect, but the second does seem to be from 2012.

sorbus-25 · 2026-02-06T06:32:27 1770359547

DUBIN BREAST CENTER SECOND ANNUAL BENEFIT MONDAY, DECEMBER 10, 2012 HONORING ELISA PORT, MD, FACS AND THE RUTTENBERG FAMILY HOST CYNTHIA MCFADDEN SPECIAL MUSICAL PERFORMANCES CAROLINE JONES, K'NAAN, HALEY REINHART, THALIA, EMILY WARREN MANDARIN ORIENTAL 7:00PM COCKTAILS LOBBY LOUNGE 8:00PM DINNER AND ENTERTAINMENT MANDARIN BALLROOM FESTIVE ATTIRE

Groxx · 2026-02-07T23:39:07 1770507547

Since it looks like this got flagged (probably because out of context at a glance it looks like insane babble that somewhat frequently occurs here), some context: this appears to be text recovered from the pdf, in the links up-thread. Though there's more text than that link shows, and I'm not entirely sure why it's posted in this specific thread, though it's relevant-ish at least.

sorbus-25 · 2026-02-08T03:07:57 1770520077

It's from a contemporaneous reference to the very same event listed in the PDF. I found it online and archived it: https://web.archive.org/web/20260206040716/https://what2wear...

It includes screenshots of what looks like an expanded document for the event.

Why relevant? I found it by searching the archive for "DBC". There were references to "Dubin", then I found the rest online easily. All that extra text could have helped with decoding the base64 text

turtlesdown11 · 2026-02-06T18:58:21 1770404301

interesting, Eva Dubin was highlighted today for offering Epstein her 15 year old daughter and her friends.

She's a medical doctor, who became amnesic when on the stand for Maxwell's case

>Pressed about gaps in her memory, Dubin told the court: "It's very hard for me to remember anything far back and sometimes I can't remember things from last month. My family notices it. I notice it."

nialv7 · 2026-02-06T04:23:55 1770351835

looks like we have it. in the end it's pretty mundane...

JKCalhoun · 2026-02-06T13:53:00 1770385980

There are plenty of other PDFs with Base64 encoded attachments.

klustregrif · 2026-02-06T07:31:07 1770363067

Which begs the question why was it censored?

nickthegreek · 2026-02-06T21:57:13 1770415033

They censored "Dont" in one location. The current thinking is that they were redacting mentions of "Don T".

pbhjpbhj · 2026-02-06T15:18:18 1770391098

Might be they censored all pages mentioning keywords, this one says "Breast" ... perhaps they censored all sexual content?

klustregrif · 2026-02-06T15:31:24 1770391884

At the risk of repeating myself. Which begs the question why?

mmastrac · 2026-02-06T15:53:25 1770393205

To protect the people in power, as always.

redeeman · 2026-02-06T19:26:21 1770405981

what is insane is that everyone just accepts it, knows that this happens, and don't go lynch the ones in charge immediately.

There was a time when the guy making the cannon had to sit on top of it for the first shot. Perhaps this kind of policy could be adapted to other situations as well.

Take the job to guard epstein? take the consequences when things go wrong.

Protect criminals? take the very real consequences if found out

ben_w · 2026-02-06T20:21:18 1770409278

> what is insane is that everyone just accepts it, knows that this happens, and don't go lynch the ones in charge immediately.

For a while, my pet conspiracy theory was that this was Epstein's real cause of death: a lynching by a prison guard made to look like suicide.

I never took it too seriously, because no actual evidence; now I'm more inclined to think it was a coconspirator hoping it would mean no more evidence getting out.

quickthrowman · 2026-02-06T23:15:40 1770419740

Epstein being murdered is the one conspiracy that I personally still think may be possible/probable.

All it takes is a single actor paying off some guards to ‘fall asleep’, a camera to be disabled, and a 15 minute window of opportunity. It’s much more probable than something like the US Government planning 9/11 and somehow keeping thousands of co-conspirators silent.

I don’t really spend a whole lot of time thinking about it since as you said, we’ll never know for sure. It just seems at least probable if he actually did have kompromat on powerful people.

QuercusMax · 2026-02-07T03:57:49 1770436669

Did you see this? https://www.cbsnews.com/news/epstein-files-jail-cell-death-v...

The noose they found in his cell was not the thing that strangled him. If he wasn't murdered then they faked his death.

mikeyouse · 2026-02-06T14:07:48 1770386868

Likely because the named list is a bunch of Trump appointees and mega donors and they're illegally trying to spare them the embarrassment.

lanyard-textile · 2026-02-06T08:55:00 1770368100

Distraction.

notpushkin · 2026-02-06T04:12:19 1770351139

> It produces a somewhat-readable PDF (first page at least) with this text output

Any chance you could share a screenshot / re-export it as a (normalized) PDF? I’m curious about what’s in there, but all of my readers refuse to open it.

dperfect · 2026-02-06T04:25:19 1770351919

Screenshot: https://imgur.com/eWCfYYd

dperfect · 2026-02-06T18:06:12 1770401172

Letting Claude work a little longer produced this behemoth of a script (which is supposed to be somewhat universal in correcting similar OCR'd PDFs - not yet tested on any others though): https://pastebin.com/PsaFhSP1

which uses this Rust zlib stream fixer: https://pastebin.com/iy69HWXC

and gives the best output I've seen it produce: https://imgur.com/itYWblh

This is using the same OCR'd text posted by commenter Joe.

daveguy · 2026-02-06T18:58:11 1770404291

> which is supposed to be somewhat universal in correcting similar OCR'd PDFs

Xerox would like a word.

https://news.ycombinator.com/item?id=29223815

Point being, "correcting" to "correct looking" may be worse than just accepting errors. Errors are often clearly identified by humans as a nonsense word. "Correcting" OCR can result in plausible, but wrong results that are more difficult for the human in the loop to identify.

dperfect · 2026-02-06T20:04:22 1770408262

That's true if we're correcting OCR of actual output text. In this case, it's operating on the base 64 text, trying to produce chunks that form valid zlib streams and PDF syntax so the file can be intact enough to be opened. "Just accepting errors" would mean not seeing any content in the file because it cannot be read.

So yes, the "fixed" output has errors, but it’s not hallucinating details like an LLM, nor is it trying to produce output that conforms to any linguistic or stylistic heuristics.

The phrase "correcting similar OCR'd PDFs" should have been "correcting similar OCR'd base 64 representations of PDFs".

the_real_cher · 2026-02-06T12:24:02 1770380642

This is cool!

bawolff · 2026-02-05T23:52:43 1770335563

Tesseract supports being trained for specific fonts, that would probably be a good starting point

https://pretius.com/blog/ocr-tesseract-training-data

pyrolistical · 2026-02-05T23:24:51 1770333891

It decodes to binary pdf and there are only so many valid encodings. So this is how I would solve it.

1. Get an open source pdf decoder

2. Decode bytes up to first ambiguous char

3. See if next bits are valid with a 1, if not it’s an l

4. Might need to backtrack if both 1 and l were valid

By being able to quickly try each char in the middle of the decoding process you cut out the start time. This makes it feasible to test all permutations automatically and linearly
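The steps above can be sketched as a depth-first search with pruning. This is a toy, not the script linked up-thread: it assumes the decoded bytes are a plaintext PDF section, so the hypothetical validity check `group_ok` simply requires printable ASCII. For compressed sections a real checker would have to hook into the decompressor instead, and the recursion here would need to become iterative for page-sized inputs.

```python
import base64
import binascii
import string

# Characters the OCR cannot tell apart (assumption: only the 1/l pair).
AMBIGUOUS = {"1": "1l", "l": "1l"}
PRINTABLE = set(string.printable.encode())


def group_ok(group: str) -> bool:
    """Hypothetical validity check: a 4-char base64 group must decode to printable ASCII."""
    try:
        decoded = base64.b64decode(group)
    except binascii.Error:
        return False
    return all(byte in PRINTABLE for byte in decoded)


def recover(ocr_text: str) -> list[str]:
    """Try both readings of every ambiguous character, backtracking on invalid groups."""
    results: list[str] = []
    chosen: list[str] = []

    def walk(pos: int) -> None:
        if pos == len(ocr_text):
            results.append("".join(chosen))
            return
        for candidate in AMBIGUOUS.get(ocr_text[pos], ocr_text[pos]):
            chosen.append(candidate)
            # Base64 groups decode independently, so only the newest
            # complete group needs to be re-checked.
            if len(chosen) % 4 != 0 or group_ok("".join(chosen[-4:])):
                walk(pos + 1)
            chosen.pop()  # backtrack

    walk(0)
    return results
```

If more than one candidate survives, both readings were valid under this check, which is the case step 4 warns about; a stricter validator or a human has to pick between them.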

pletnes · 2026-02-06T06:29:43 1770359383

You might need to backtrack a lot more, due to the intermediate compression step?

bawolff · 2026-02-05T23:45:05 1770335105

Sounds like a job for afl

percentcer · 2026-02-05T23:25:44 1770333944

This is one of those things that seems like a nerd snipe but would be more easily accomplished through brute forcing it. Just get 76 people to manually type out one page each, you'd be done before the blog post was written.

jjwiseman · 2026-02-06T00:57:06 1770339426

Or one person types 76 pages. This is a thing people used to do, not all that infrequently. Or maybe you have one friend who will help–cool, you just cut the time in half.

wildzzz · 2026-02-06T03:00:51 1770346851

Typing 76 pages is easy when it's words in a language you understand. WPM is going to be incredibly slow when you actually have to read every character. On top of that, no spaces and no spellcheck so hopefully you didn't miss a character.

ryanSrich · 2026-02-06T03:24:42 1770348282

Seems like a job for an LLM

Forgeties79 · 2026-02-06T12:58:53 1770382733

Quite the opposite if you want to trust the results

sjducb · 2026-02-06T20:06:39 1770408399

The first week of my PhD was accurately copying DNA sequences from an old paper into a computer file. 10 pages in total. I used OCR to make an initial version then text to speech to check it

76 pages is a couple of months of work

quuxplusone · 2026-02-06T21:06:07 1770411967

As TFA says, the hard part is that "1" and "l" look the same in the selected typeface. Whether your OCR is done by computers or humans, you still have to deal with that problem somehow. You still need to do the part sketched out e.g. by pyrolistical in [1] and implemented by dperfect in [2].

[1] - https://news.ycombinator.com/item?id=46906897

[2] - https://news.ycombinator.com/item?id=46916065

fragmede · 2026-02-05T23:31:07 1770334267

> Just get 76 people

I consider myself fairly normal in this regard, but I don't have 76 friends to ask to do this, so I don't know how I'd go about doing this. Post an ad on craigslist? Fiverr? Seems like a lot to manage.

jazzyjackson · 2026-02-06T03:54:25 1770350065

First, build a fanbase by streaming on Twitch.

Krutonium · 2026-02-05T23:55:28 1770335728

Amazon Mechanical Turk?

subscribed · 2026-02-06T11:01:18 1770375678

No, I don't think so :) -- https://techcrunch.com/2023/06/14/mechanical-turk-workers-ar...

WolfeReader · 2026-02-05T23:28:24 1770334104

You think compelling 76 people to honestly and accurately transcribe files is something that's easy and quick to accomplish.

altairprime · 2026-02-06T07:14:17 1770362057

Non-engineers are perfectly willing to volunteer their time to do drudgery. It's one of my opseng career's distinguishing specialties: I'll do drudgery rather than code when appropriate, rather than avoiding it or sulking about it (as was a common response at work for some number of decades!). Learned that lesson when I was 18 from an internship (where I completely failed to deliver any work product due to trying to code around the work). It's part of why I'm going into accounting: apparently having the stamina for dreary work is rare?!

Also look up double/triple data-entry systems, where you have multiple people enter the data and then flag and resolve differences. Won't protect you from your staff banding together to fuck you over with maliciously bad data, but it's incredibly effective to ensure people were Actually Working Their Blocks under healthy circumstances.
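A minimal sketch of the double-entry idea, assuming two people typed the same base64 line independently. The function name `flag_differences` is made up for illustration, and the comparison is purely positional, so a dropped or inserted character would misalign everything after it; a real system would likely use a proper diff.

```python
def flag_differences(entry_a: str, entry_b: str) -> list[int]:
    """Return the character positions where two independent transcriptions disagree."""
    positions = [
        i for i, (a, b) in enumerate(zip(entry_a, entry_b)) if a != b
    ]
    # If the lengths differ, every trailing position also needs review.
    shorter = min(len(entry_a), len(entry_b))
    longer = max(len(entry_a), len(entry_b))
    positions.extend(range(shorter, longer))
    return positions


# Two typists disagree on the last character (1 vs l).
print(flag_differences("JVBERi0x1", "JVBERi0xl"))
```

Only the flagged positions then go to a third person or back to the scan, which is what keeps the approach cheap.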

pbhjpbhj · 2026-02-06T15:14:16 1770390856

Captcha!

estimator7292 · 2026-02-06T15:35:55 1770392155

Friend, have you ever heard of secretaries?

legitster · 2026-02-06T00:30:28 1770337828

Given how much of a hot mess PDFs are in general, it seems like it would behoove the government to just develop a new, actually safe format to standardize around for government releases and make it open source.

Unlike every other PDF format that has been attempted, the federal government doesn't have to worry about adoption.

gucci-on-fleek · 2026-02-06T09:51:54 1770371514

XPS [0] seems to meet these criteria. It supports most of the features of PDF, is an "official" standard, has decent software support (including lots of open source programs), and uses a standard file format (XML). But the tooling is quite a bit worse than it is for PDF, and the file format is still complex enough that redaction would probably be just as hard.

DjVu [1] would be another option. It has really good open source tooling available, but it supports substantially fewer features than PDF, making it not really suitable as a drop-in replacement. The format is relatively simple though, so redaction should be fairly doable.

TIFF [2] is already occasionally used for government documents, but it's arguably more complex than PDF, so probably not a good choice for this.

[0]: https://en.wikipedia.org/wiki/Open_XML_Paper_Specification

[1]: https://en.wikipedia.org/wiki/DjVu

[2]: https://en.wikipedia.org/wiki/TIFF

Spooky23 · 2026-02-06T01:39:18 1770341958

You’re thinking about this as a nerd.

It’s not a tools problem, it’s a problem of malicious compliance and contempt for the law.

legitster · 2026-02-06T04:19:42 1770351582

Even the previous justice departments struggled with PDFs. The way they handled it was scrubbing all possible metadata and uploading it as images.

For example, when the Mueller reports were released with redactions, they had no searchable text or metadata because they were worried about this exact kind of data leak.

However, vast troves of unsearchable text is not a huge win for transparency.

PDFs are just a garbage format and even good administrations struggle.

Ekaros · 2026-02-06T08:35:11 1770366911

I give any new document format 3 to 5 years until it ends up with similar mess. And that is if it starts well designed and limited.

derwiki · 2026-02-06T00:41:02 1770338462

JPEG?

legitster · 2026-02-06T00:50:29 1770339029

That's not really comparable - It needs to be editable and searchable.

charcircuit · 2026-02-06T10:53:29 1770375209

Photoshop and Google Images show it can be done.

recursive · 2026-02-06T02:50:32 1770346232

Lossy

iberator · 2026-02-06T06:50:42 1770360642

ChocMontePy · 2026-02-06T01:48:09 1770342489

You can use the justice.gov search box to find several different copies of that same email.

The copy linked in the post:

https://www.justice.gov/epstein/files/DataSet%209/EFTA004004...

Three more copies:

https://www.justice.gov/epstein/files/DataSet%2010/EFTA02153...

https://www.justice.gov/epstein/files/DataSet%2010/EFTA02154...

Perhaps having several different versions might make it easier.

ChocMontePy · 2026-02-06T06:22:56 1770358976

Also, I found a different base64 encoding with a different font here:

https://www.justice.gov/epstein/files/DataSet%209/EFTA007755...

This doesn't solve the "1 & l" problem for the pdf you are looking at, but it could be useful anyway.

ChocMontePy · 2026-02-06T07:18:07 1770362287

And this might be a copy of the original pdf:

https://www.justice.gov/epstein/files/DataSet%2011/EFTA02702...

Aloisius · 2026-02-07T02:34:27 1770431667

I checked and that's definitely the black and white version of the one encoded in the file.

Someone built a very simple OCR tool that successfully extracted the base64[1]. The only difference besides the lack of the document tracking ids at the bottom is the original was pink on blue for the first page and has some pink text on the second.

[1] https://github.com/KoKuToru/extract_attachment_EFTA00400459

JKCalhoun · 2026-02-07T00:10:00 1770423000

File is gone now, hmmm…

ChocMontePy · 2026-02-07T09:59:19 1770458359

Mirror:

https://web.archive.org/web/20260131153720/https://www.justi...

tcgv · 2026-02-06T16:42:44 1770396164

> Then my mom wrote the following: “be careful not to get sucked up in the slime-machine going on here! Since you don’t care that much about money, they can’t buy you at least.”

I'm lucky to have parents with strong values. My whole life they've given me advice, on the small stuff and the big decisions. I didn't always want to hear it when I was younger, but now in my late thirties, I'm really glad they kept sharing it. In hindsight I can see the life-experience / wisdom in it, and how it's helped and shaped me.

pavel_lishin · 2026-02-06T19:24:58 1770405898

I think this was meant to be a reply to https://news.ycombinator.com/item?id=46903929 ?

tcgv · 2026-02-06T20:40:45 1770410445

Indeed! Thanks for pointing that out. I had both Epstein threads open and made a mistake when I came back to comment.

pimlottc · 2026-02-05T23:08:58 1770332938

Why not just try every permutation of (1,l)? Let’s see, 76 pages, approx 69 lines per page, say there’s one instance of [1l] per line, that’s only… uh… 2^5244 possibilities…

Hmm. Anyone got some spare CPU time?
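The back-of-envelope figure checks out under the stated guesses (76 pages, about 69 lines per page, one ambiguous character per line). A quick sketch of the arithmetic:

```python
import math

pages = 76
lines_per_page = 69        # rough estimate from the comment
ambiguous_per_line = 1     # assumed: one 1/l character per line

ambiguous = pages * lines_per_page * ambiguous_per_line
combinations = 2 ** ambiguous

# Number of decimal digits in 2**ambiguous, computed two ways.
digits_exact = len(str(combinations))
digits_estimate = math.floor(ambiguous * math.log10(2)) + 1

print(ambiguous, digits_exact, digits_estimate)
```

So naive enumeration is hopeless, which is why the replies focus on pruning candidates as early as possible.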

wahern · 2026-02-05T23:25:46 1770333946

It should be much easier than that. You should be able to serially test if each edit decodes to a sane PDF structure, reducing the cost similar to how you can crack passwords when the server doesn't use a constant-time memcmp. Are PDFs typically compressed by default? If so that makes it even easier given built-in checksums. But it's just not something you can do by throwing data at existing tools. You'll need to build a testing harness with instrumentation deep in the bowels of the decoders. This kind of work is the polar opposite of what AI code generators or naive scripting can accomplish.
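On the checksum point: PDF streams that use FlateDecode are normally zlib streams, and a zlib stream ends with an Adler-32 checksum of the uncompressed data. A rough sketch of using that as a pass/fail oracle follows. The helper name `stream_is_valid` and the sample content are made up for illustration, and this only tells you a whole stream is wrong, not which character caused it.

```python
import zlib


def stream_is_valid(candidate: bytes) -> bool:
    """True if candidate is a complete zlib stream whose Adler-32 checksum matches."""
    try:
        zlib.decompress(candidate)
        return True
    except zlib.error:
        return False


# A stand-in for one compressed PDF content stream.
good = zlib.compress(b"BT /F1 12 Tf (Hello) Tj ET")

# Simulate a single wrong guess that lands in the trailing checksum bytes.
bad = good[:-1] + bytes([good[-1] ^ 0x01])

print(stream_is_valid(good), stream_is_valid(bad))
```

Errors earlier in the stream usually surface as invalid deflate data before the checksum is even reached, but not necessarily right at the bad byte, which is the backtracking concern raised elsewhere in the thread.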

JKCalhoun · 2026-02-06T13:57:06 1770386226

Not necessarily a PDF attachment?

Someone who made some progress on one Base64 attachment got some XMP metadata that suggested a photo from an iPhone. Now I don't know if that photo was itself embedded in a PDF, but perhaps getting at least the first few hundred bytes decoded (even if it had to be done manually) would hint at the file-type of the attachment. Then you could run your tests for file fidelity.

swsieber · 2026-02-06T14:39:46 1770388786

I'd say 99% of the time, the first 10 bytes would be enough to know the file type.
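As a sketch of that idea, the first few base64 characters can be decoded and compared against well-known magic numbers. The table below is a small illustrative subset, and `sniff` is a made-up helper name, not a library function.

```python
import base64

# A few well-known file signatures (illustrative subset only).
MAGIC = {
    b"%PDF-": "pdf",
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"PK\x03\x04": "zip",
    b"GIF8": "gif",
}


def sniff(b64_head: str) -> str:
    """Guess a file type from the first characters of a base64 attachment."""
    usable = len(b64_head) - len(b64_head) % 4  # only whole groups decode
    head = base64.b64decode(b64_head[:usable])
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return "unknown"


print(sniff("JVBERi0xLjUK"))  # the start of a PDF header, base64-encoded
```

In practice this means an attachment beginning with `JVBERi0` is a PDF and one beginning with `/9j/` is a JPEG, so a handful of manually verified characters is enough to choose the right validator.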

cluckindan · 2026-02-05T23:57:12 1770335832

On the contrary, that kind of one-off tooling seems a great fit for AI. Just specify the desired inputs, outputs and behavior as accurately as possible.

m000 · 2026-02-06T10:48:32 1770374912

You might be taking the "I" in AI too literally.

sznio · 2026-02-06T09:56:55 1770371815

>It should be much easier than that. You should be able to serially test if each edit decodes to a sane PDF structure

that's pointed out in the article. It's easy for plaintext sections, but not for compressed sections. Didn't notice any mention of checksums.

pimlottc · 2026-02-06T01:40:44 1770342044

I wonder if you could leverage some of the fuzzing frameworks tools like Jepsen rely on. I’m sure there’s got to be one for PDF generation.

kalleboo · 2026-02-06T04:47:58 1770353278

Easy, just start a crypto currency (Epsteincoin?) based on solving these base64 scans and you'll have all the compute you could ever want just lining up

yatopifo · 2026-02-06T20:17:30 1770409050

Please don’t give ideas to Nvidia.

kevin_thibedeau · 2026-02-05T23:53:30 1770335610

pdftoppm and Ghostscript (invoked via Imagemagick) re-rasterize full pages to generate their output. That's why it was slow. Even worse with a Q16 build of Imagemagick. Better to extract the scanned page images directly with pdfimages or mutool.

Followup: pdfimages is 13x faster than pdftoppm

masfuerte · 2026-02-06T15:04:30 1770390270

This. Not only is it faster, the images are likely to be of better quality. If you rasterize the pages then the images will be scaled, unless you get very lucky.

chrisjj · 2026-02-05T23:15:34 1770333334

> it’s safe to say that Pam Bondi’s DoJ did not put its best and brightest on this

Or worse. She did.

winddude · 2026-02-06T04:06:40 1770350800

there are a few messaging conversations between FBI agents early on that are kind of interesting. It would be very interesting to see them about the releases. I sometimes wonder if some was malicious compliance... ie, do a shitty job so the info gets out before it gets re-redacted... we can hope...

krupan · 2026-02-06T15:40:19 1770392419

I am in no way a republican apologist, but how many people were clamoring for the immediate release of these documents, saying it "should be easy" and all that? Laws were passed ordering their sudden speedy disclosure. How would you have handled this?

deadbabe · 2026-02-06T16:36:13 1770395773

Released all files as is, no redactions.

chrisjj · 2026-02-06T17:36:38 1770399398

Sudden speedy immediate didn't happen.

If I was Pam? I wouldn't have been.

If she was me, start earlier, hire better, end later.

eek2121 · 2026-02-05T23:42:32 1770334952

I mean, the internet is finding all her mistakes for her. She is actually doing alright with this. Crowdsource everything, fix the mistakes. lol.

TSiege · 2026-02-06T00:15:48 1770336948

This would be funnier if it wasn’t child porn being unredacted by our government

lazide · 2026-02-06T11:49:18 1770378558

If you think the child porn is the worst part of this mess, I’ve got news for you.

We’d all be lucky if it was just distributing child porn.

block_dagger · 2026-02-06T08:17:49 1770365869

Weren’t. Subjunctive mood.

direwolf20 · 2026-02-06T12:00:16 1770379216

Language is whatever people think it is, and "it wasn't" has plurality agreement which "it weren't" does not

867-5309 · 2026-02-06T09:24:04 1770369844

rubn't. conjunctivitis.

PetriCasserole · 2026-02-06T02:21:29 1770344489

[flagged]

nixosbestos · 2026-02-06T03:22:55 1770348175

Every second of my political consciousness in the United States has been acutely tinged with the awareness that a bunch of people, across most of the political spectrum live in a constant state of denial. Denial of personal responsibility or culpability. Denial of cognitive dissonance. Denial of any distinct, self-informed morals. Denial of anything but a fear of others. Denial of anything that makes them fearful or uncomfortable or might invite confrontation.

I've known from the second I started doing debate and FX/DX in highschool, well, let's just say I never thought that the majority of the 2FA-folks would be worth a damn when tyranny really came knocking. Fear of the other as a form of manipulation, and a distraction from class consciousness, has been their literal raison d'état since decades before I was born.

I guess I was shocked that the President being a convicted rapist and documented child predator would be a bridge too far. But then we re-elected him.

I believe it. We voted for this. We do nothing in the face of zero actual justice. This is exactly as good as we deserve. And best of all, it certainly doesn't stop here. This is what they chose to not redact. When we know they spent enormous tax-payer hundreds-of-people hours redacting the documents.

I don't think it's even conspiratorial to say they left stuff in, so they could use it as justification for not releasing the other HALF of the files that haven't been released, even overly censored.

We deserve this, and the much worse that our apathy has invited.

hsuduebc2 · 2026-02-06T03:39:08 1770349148

I will certainly feel less confident ridiculing conspiracy theories.

I’d never believe Bill Gates would secretly slip antibiotics into his wife’s cocktail to treat an STI he got from a Russian prostitute on a convicted pedophile's estate.

But here we are.

MadnessASAP · 2026-02-06T05:58:26 1770357506

I wish I could believe in more conspiracy theories. At least then I might believe there was some sort of master plan, that some individual or group had some image of a better world (to them) and that the world was being steered somewhere.

Unfortunately no, it just seems to be greed, incompetence, and incompetent greed. At least when a tank drives over a protestor somebody gets to be on the side of the tank. When the bus goes off a cliff because the driver sold the steering wheel everybody dies.

hsuduebc2 · 2026-02-06T21:16:52 1770412612

Absolutely. It’s not some grand replacement theory. It’s not an intellectual master plan. It’s mostly plain greed and cynicism from the powerful, plus ignorance or a resigned belief that people cannot be changed from everyone else.

I’m in the second group. When a majority of people miss the basics, when a large chunk treat internet content as daily reality rather than algorithmically served rage bait, it feels like there’s nothing you can do.

A friend once told me, “I wish I were more schizo like before, it was much more fun,” and in a bleak way, I get it. I’d almost prefer it if there really were a coherent plan, some deliberate attempt by the mighty to steer civilization. But right now it mostly looks like greed and cynicism. These days, a lot of it seems to be coming out of Silicon Valley but it will change as it always does like it did before.

direwolf20 · 2026-02-06T10:06:23 1770372383

The owner of 4chan met with an Epstein associate 3 days before reinstating /pol/ which led to the destruction of America.

Epstein was trying to remove tax on banker bonuses in the UK for some reason.

There might not be a single master plan but holy hell is this stuff intertwined with everything that happens.

hsuduebc2 · 2026-02-06T21:04:30 1770411870

Schizos would be schizos anywhere else. Widely available access to information that is biased towards your own bias mostly did that. Most of the people don't understand technology in general nor the algorithmic content suggestion. That is what the real problem is.

balamatom · 2026-02-07T11:23:51 1770463431

>Schizos would be schizos anywhere else.

May I introduce you to https://en.wikipedia.org/wiki/Sluggish_schizophrenia

>Most of the people don't understand technology in general nor the algorithmic content suggestion. That is what the real problem is.

There isn't a common understanding of these mechanisms, because the first thing they were used for, was to brand as "defective" anyone pursuing such understanding on their own terms.

Of course you could always do it by the book i.e. go in blind and debt-enslave yourself until loss of capacity for disentanglement. A small number of such functionaries are indeed required to maintain a colony; and then some surplus ones to keep the first one in their place.

Is that a "conspiracy"? In the sense that you're stuck breathing in sync with a lot of strangers, sure. In the sense of secret master plan? Nah bruh, it's all been out in the open all along. Just mindkillingly terrifying to most of yall. Hence all the phatics.

balamatom · 2026-02-06T11:00:17 1770375617

>I wish I could believe in more conspiracy theories.

Username checks out... well, I can help ya.

You start out easy, like "who invented all those damn conspiracy theories and introduced them into the public culture, anyway?"

direwolf20 · 2026-02-06T10:04:23 1770372263

Epstein was involved in a UK corruption plot to reduce taxes on banker's bonuses. He was involved with insider trading around 9/11. This net is far reaching.

ranger_danger · 2026-02-16T22:53:02 1771282382

> Denial of personal responsibility or culpability

What are you doing differently that has a better effect on the situation?

modo_mario · 2026-02-06T12:35:52 1770381352

>and a distraction from class consciousness

As a non-American looking in I feel like that applies to the other side as well and is how you ended up here.

Having paid a bit of attention during the election seeing bernie and trump at least in terms of rhetoric more in line with each other on the same trade agreements, migration, etc whilst also both outperforming Hillary in the same swing states, etc is not some coincidence.

And given that you live in a 2 party state it's always going to swing at some point eventually. No matter how depraved someone like trump is. If the next one is just as bad and they sit it out long enough they will get their turn.

queenkjuul · 2026-02-06T03:03:47 1770347027

Become?

yieldcrv · 2026-02-06T07:40:54 1770363654

> become

the mascot of 4chan was literally pedobear, what time frame are you referring to?

direwolf20 · 2026-02-06T10:05:41 1770372341

The owner of 4chan met with an Epstein associate 3 days before reinstating /pol/ which led to the destruction of America.

helterskelter · 2026-02-06T01:16:46 1770340606

I wonder if this could be intentional. If the datasets are contaminated with CSAM, anybody with a copy is liable to be arrested for possession.

More likely it's just an oversight, but it could also be CYA for dragging their feet, like "you rushed us, and look at these victims you've retraumatized". There are software solutions to find nudity and they're quite effective.

adaml_623 · 2026-02-06T07:38:20 1770363500

Or it's distraction. Leave nudity in to use up attention that should be turning to analysis of what's been redacted.

There's redaction to protect victims and there's redaction to protect specific co-conspirators in Epstein's spy ring

SketchySeaBeast · 2026-02-06T14:09:23 1770386963

It's hilariously revealing that it keeps redacting "Don't".

chrisjj · 2026-02-06T17:20:07 1770398407

Odd indeed. The President's name contains no apostrophe :)

lukifer · 2026-02-06T17:42:39 1770399759

The emails are bizarrely sloppy with spelling and punctuation, perhaps many usages of "don't" ended up being typed as "don t", triggering an automated find-and-replace.

SketchySeaBeast · 2026-02-06T17:58:12 1770400692

The export itself is also sloppy, with characters like equal signs being added in weird places. Seems like they have it set to cast a wide and poorly set up net.

chrisjj · 2026-02-07T09:45:09 1770457509

Equals signs substituting in some places.

Looks like the result of quoted printable decoding done by inept regex.
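To illustrate that guess (and it is only a guess about what the export pipeline did), here is a sketch comparing Python's standard `quopri` decoder with a hypothetical sloppy regex. The sample text and the `naive_decode` function are invented for the example.

```python
import quopri
import re

# Invented sample: a curly apostrophe as =E2=80=99 plus a soft line break (=\n).
sample = b"I don=E2=80=99t know what=\n you mean"


def naive_decode(raw: bytes) -> bytes:
    """Hypothetical sloppy approach: drop =XX escapes, ignore soft line breaks."""
    return re.sub(rb"=[0-9A-F]{2}", b"", raw)


proper = quopri.decodestring(sample).decode("utf-8")
sloppy = naive_decode(sample).decode("ascii")

print(repr(proper))
print(repr(sloppy))
```

The proper decoder restores the apostrophe and joins the wrapped line, while the regex leaves a stray equals sign at the line break and loses the apostrophe entirely.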

JKCalhoun · 2026-02-06T13:50:34 1770385834

I'll take Hanlon’s Razor for 500, Alex.

dagi3d · 2026-02-06T01:02:49 1770339769

the issue is that mistakes can't be fixed, in the sense that once they are discovered, it doesn't matter if they are eventually redacted

chrisjj · 2026-02-06T00:05:11 1770336311

Let's see her sued for leaking PII. Here in Europe, she'd be mincemeat.

ISL · 2026-02-06T01:19:33 1770340773

The US administration is, at present, regularly violating the law and ignoring court orders. Indeed, these very releases are patently in violation of multiple federal laws -- they're simultaneously insufficiently-responsive to meet the requirements of the law requiring the release of the files and fall afoul of CSAM laws by being incompletely redacted.

The challenge, as we're all experiencing together, is that the law is not inherently self-enforcing.

typeofhuman · 2026-02-06T01:39:05 1770341945

Can you provide a couple examples of the laws they're violating?

roywiggins · 2026-02-06T02:55:03 1770346503

How about court orders?

https://www.cbsnews.com/minnesota/news/ice-violations-judge-...

> "ICE has likely violated more court orders in January 2026 than some federal agencies have violated in their entire existence," Schiltz said, adding that he counted 96 court orders that ICE has violated in 74 cases.

https://www.cbsnews.com/news/frustrations-from-judge-prosecu...

typeofhuman · 2026-02-06T03:38:05 1770349085

[flagged]

roywiggins · 2026-02-06T03:43:56 1770349436

"Allegations" from the exact judges whose orders aren't being enacted? The orders in question are pretty simple: release this guy. Don't take this guy out of state. It's pretty clear when they're not being followed. This guy is not a slouch:

https://www.politico.com/news/2026/01/27/patrick-schiltz-jud...

https://storage.courtlistener.com/recap/gov.uscourts.mnd.230...

Did you notice that one article I linked involved a DoJ lawyer admitting that she couldn't convince ICE to obey court orders that she was trying to transmit to them? That's beyond an allegation and into admission. How is that not evidence?

More on these ignored court orders:

https://www.mprnews.org/story/2026/01/28/ice-illegally-detai...

roywiggins · 2026-02-06T14:56:16 1770389776

A specific case:

https://www.startribune.com/judge-orders-detainee-returned-m...

subscribed · 2026-02-06T10:25:34 1770373534

At this point you're taking a piss, this is not an honest discussion stance.

Judges themselves complained about their own orders being violated/ignored. Repeatedly.

lproven · 2026-02-06T11:45:48 1770378348

> you're taking a piss

"You are taking a piss" -- you are currently urinating.

"You are taking the piss" -- you are mocking me or this.

subscribed · 2026-02-06T17:19:54 1770398394

Thank you. Sadly can't edit it anymore but I'll remember it next time.

brabel · 2026-02-06T11:21:00 1770376860

If someone violates a court order don’t they get arrested?? Can’t the judge pronounce the perpetrators should be arrested instead of just complaining?

rcxdude · 2026-02-06T12:37:56 1770381476

This is exactly the breakdown of the system that people are sounding the alarm about.

roywiggins · 2026-02-06T14:50:11 1770389411

The problem is that it's always specific to a particular case. So, if one guy isn't being released according to court order, they could order someone held in the courthouse jail until he is, and probably just the threat will get him released. But then 1) nobody ends up in jail, because they're not in contempt anymore and 2) it doesn't do anything for any other cases, and there are so many other cases. This sort of contempt where a judge can just order it is "civil contempt" and is meant to convince someone to comply with the court order, it can't be used to punish someone longer than that (criminal contempt can, but you need an actual prosecution, trial, etc).

You might think "ok can't they be held in contempt for the pattern of ignoring court orders" and, well, you'd think so. But that looks a lot like a universal injunction or a class action and SCOTUS has deliberately been nerfing those.

If they've simply been committing crimes then judges don't have anything to do- they'd have to be prosecuted by someone, or I guess sued civilly, but that won't put them in jail either and takes forever.

hobs · 2026-02-06T13:04:15 1770383055

There's no one in 2026 honestly saying "But what crimes has he committed???"; it's just concern trolls, sealions, bots, and some nazis.

ISL · 2026-02-06T01:54:23 1770342863

As noted above:

https://www.govinfo.gov/content/pkg/PLAW-119publ38/pdf/PLAW-... : the Attorney General was to have produced the entirety of the Epstein files, with very narrowly-enumerated redactions, in December. She has not done so.

Furthermore, there are numerous allegations that the documents that have been released contain CSAM, which (referencing the PDF above) may fall afoul of 18 U.S.C. 2252–2252A.

In addition, one need only glance at the action in US courts to see egregious violations of the Constitution and valid court orders playing out daily.

https://www.documentcloud.org/documents/26513988-trorder0128...

https://storage.courtlistener.com/recap/gov.uscourts.mnd.230...

typeofhuman · 2026-02-06T03:39:58 1770349198

Allegations aren't evidence. Has the Administration actually been found guilty of violating the law - if that is even possible.

jcranmer · 2026-02-06T04:05:28 1770350728

Yes, the Abrego Garcia and Öztürk detentions are two very newsworthy cases that have actually reached the point of a final judgement in the district courts, as opposed to "merely" preliminary injunctions against the government.

(It's also worth noting that almost none of the government's appeals to their losses in preliminary injunctions have been on the merits as to whether or not their actions were legal, but rather on the grounds of "no one should be allowed to challenge our actions," which has also been a fairly losing argument for everybody except SCOTUS.)

bryceacc · 2026-02-06T06:54:46 1770360886

>if that is even possible

yes.... any administration can be found guilty of violating law, and should be dealt with accordingly.

paulryanrogers · 2026-02-06T12:52:51 1770382371

> Has the Administration actually been found guilty of violating the law - if that is even possible.

Obviously administrations can violate the law. Otherwise this is just an autocracy with term limits.

542354234235 · 2026-02-06T13:18:48 1770383928

>Allegations aren't evidence

Allegations are literally evidence. "He attacked me" is an allegation of a crime and is evidence that would be used in conjunction with other evidence to prosecute said crime.

rockskon · 2026-02-06T05:13:17 1770354797

Evidence is evidence - of which there are enormous amounts.

anon84873628 · 2026-02-06T05:21:02 1770355262

Are you expecting the administration to prosecute itself?

phorkyas82 · 2026-02-06T07:51:11 1770364271

That's why there is separation of powers or ought to be.

mschuster91 · 2026-02-06T01:47:06 1770342426

There's more than enough credible reports of CSAM in the Epstein Files dump - more than enough for me to not go and download even a single file of them myself, simply because German law does not care about why you are in the possession of CSAM, even if you took the picture yourself.

The legal situation regarding CSAM is very strict no matter which country, and I better hope no one here will actually be dumb enough to provide actual links.

chrisjj · 2026-02-06T10:08:58 1770372538

If those reports are true then what we have is not just an effective deterrent for download and distribution of the set, but legally prosecutable malware targeting anyone who does, empowered by the Interpol CSAM database to which the DOJ should probably already have released the offending material.

direwolf20 · 2026-02-06T10:08:34 1770372514

Use encryption

> even if you took the picture yourself.

I'd hope the punishment is more severe in that case!

simonh · 2026-02-06T10:33:07 1770373987

It's a tricky issue. In many countries it's not illegal and quite common for children to run around naked in public, during the summer on beaches for example, and so millions of people have holiday photos that are technically CSAM in their possession that they don't even know they have.

direwolf20 · 2026-02-06T10:41:22 1770374482

CSAM must be for sexual gratification usually. A medical anatomy textbook isn't CSAM.

woooooo · 2026-02-06T10:50:28 1770375028

And now you're in court strenuously arguing that you weren't sexually gratified by the photo of your kid in the tub.

Obviously most people are sensible most of the time but sometimes they are not.

chrisjj · 2026-02-06T13:01:29 1770382889

More than that. CSAM is evidence of abuse. Hence the "A".

And nudity is not required.

direwolf20 · 2026-02-06T13:35:25 1770384925

CSAM has a meaning identical to child porn but doesn't make that meaning explicit. Drawn or generated depictions of child nudity can be considered CSAM in some jurisdictions.

chrisjj · 2026-02-06T15:54:04 1770393244

"CSAM isn’t pornography—it’s evidence of criminal exploitation of kids."

That's from RAINN, the US's largest anti-sexual violence organisation.

mschuster91 · 2026-02-06T15:16:34 1770390994

Yep. Germany is very very strict for example. Even textual descriptions fall under that law.

mschuster91 · 2026-02-06T15:15:58 1770390958

> I'd hope the punishment is more severe in that case!

I'm talking about kids making photos of themselves. Which has been an issue multiple times in the past.

subscribed · 2026-02-06T10:31:25 1770373885

That might be intentional tbh, to make the database toxic to limit the spread.

mikeyouse · 2026-02-06T03:29:41 1770348581

They illegally fired the IGs responsible for whistleblowers and fraud in every department; https://www.nycbar.org/press-releases/firings-of-inspectors-...

They illegally withheld funds (impoundment) from congressionally authorized/mandated expenditures and relied on pocket rescissions to defund programs they didn't like: https://www.cbpp.org/research/federal-budget/pocket-rescissi...

They keep illegally appointing unqualified hacks as US attorney in defiance of the mandate they're approved by the Senate (Essayli, Habba, Halligan, Sarcone, Chattah) - judges have found at least five of the appointments illegal. As one example: https://www.politico.com/news/2025/10/28/judge-los-angeles-t...

They've repeatedly violated court orders to either return immigrant detainees or release them. "This is one of dozens of court orders with which respondents have failed to comply in recent weeks.": https://www.cnn.com/2026/01/27/politics/patrick-schiltz-judg...

The EPA illegally convened a secret panel of climate deniers to issue a sham report in order to repeal the endangerment finding: https://www.nytimes.com/2026/01/30/climate/energy-department...

His targeting and shakedowns of Universities, law firms, and media companies is transparently illegal jawboning.

Everything about the tariffs is obviously illegal which he confirms every time he opens his mouth since he's relying on 'national security' justifications to issue them without Congress and he keeps insisting they're punishment for some random perceived slight.

His illegal firing of Federal workers without the notice required: https://www.npr.org/2025/09/25/nx-s1-5544317/federal-probati...

Some sillier things like renaming the Kennedy Center -- the law that established it literally said that it couldn't be renamed without Congress -- so Trump firing everyone on the board and then appointing a bunch of his flunkies to vote for the name change doesn't cut it: https://beatty.house.gov/sites/evo-subsites/beatty.house.gov...

It's a literal onslaught of illegality so I can't tell if you haven't read a news article since 2025 or if you're trolling.

k33n · 2026-02-06T06:51:18 1770360678

[flagged]

chrisjj · 2026-02-06T10:01:29 1770372089

How can illegal firings be not illegal?

k33n · 2026-02-06T22:59:22 1770418762

How can legal firings be illegal?

chrisjj · 2026-02-06T23:23:28 1770420208

No way I can see!

rockskon · 2026-02-06T00:42:58 1770338578

Yeah - they'll take these lessons learned for future batches of releases.

rcakebread · 2026-02-06T10:07:35 1770372455

Sicko.

bushbaba · 2026-02-06T02:29:13 1770344953

This proves my paranoia that you should print and rescan redactions. That or do screenshots of the pdf redacted and convert back to a pdf

Snoozus · 2026-02-06T04:46:52 1770353212

this would not have helped here

phanimahesh · 2026-02-06T05:34:39 1770356079

How would that help in this case?

velaia · 2026-02-06T00:13:41 1770336821

Bummer that it's not December - the https://www.reddit.com/r/adventofcode/ crowd would love this puzzle

nubg · 2026-02-06T00:50:29 1770339029

Wait would this give us the unredacted PDFs?

ryanSrich · 2026-02-06T03:26:33 1770348393

That's the idea yeah. There are other people actively working on this. You can follow vx-underground on twitter. They're tracking it.

poyu · 2026-02-06T01:15:12 1770340512

I think it's the PDF files that were attached to the emails, since they're base64 encoded.
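For a clean (non-OCR'd) message, pulling such an attachment back out is routine. Here is a minimal sketch using Python's standard `email` package; the message, boundary, filename, and payload are all made-up stand-ins, and real recovered text would need the OCR errors fixed before anything like this works.

```python
import base64
from email import message_from_string

# Hypothetical stand-in for a PDF body; a real attachment would be much longer.
FAKE_PDF = b"%PDF-1.4\n% hypothetical stand-in\n"
payload = base64.b64encode(FAKE_PDF).decode("ascii")

# Hypothetical raw message with one text part and one base64 PDF part.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "See attached.\n"
    "--XYZ\n"
    'Content-Type: application/pdf; name="example.pdf"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    f"{payload}\n"
    "--XYZ--\n"
)


def pdf_attachments(raw):
    """Return (filename, decoded bytes) for every application/pdf part."""
    msg = message_from_string(raw)
    found = []
    for part in msg.walk():
        if part.get_content_type() == "application/pdf":
            # decode=True undoes the Content-Transfer-Encoding (base64 here).
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found


attachments = pdf_attachments(RAW)
print(attachments[0][0], attachments[0][1][:5])
```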

sznio · 2026-02-06T09:59:36 1770371976

From the unredacted attachments you could figure out what the redacted content most likely contains. Just like the other sloppy redactions that sometimes hide one party of the conversation, sometimes the other, so you can easily figure out both sides.

alhamdulillah23 · 2026-02-07T01:35:15 1770428115

Got it.

Page 1: https://imgur.com/a/jwgu9uH

Page 2: https://imgur.com/a/4Zi3bkk

Use this: https://github.com/KoKuToru/extract_attachment_EFTA00400459

iwontberude · 2026-02-05T23:01:52 1770332512

This one is irresistible to play with. Indeed a nerd snipe.

netsharc · 2026-02-05T23:22:25 1770333745

I doubt the PDF would be very interesting. There are enough clues in the human-readable parts: it's an invite to a benefit event in New York (filename calls it DBC12) that's scheduled on December 10, 2012, 8pm... Good old-fashioned searching could probably uncover what DBC12 was, although maybe not, it probably wasn't a public event.

The recipient is also named in there...

RajT88 · 2026-02-05T23:41:44 1770334904

There's potentially a lot of files attached and printed out in this fashion.

The search on the DOJ website (which we shouldn't trust), given the query: "Content-Type: application/pdf; name=", yields maybe a half dozen or so similarly printed BASE64 attachments.

There's probably lots of images as well attached in the same way (probably mostly junk). I deleted all my archived copies recently once I learned about how not-quite-redacted they were. I will leave that exercise to someone else.

notenlish · 2026-02-06T06:53:42 1770360822

There's 70 results that come out when searching for "application/pdf" on the doj website

netsharc · 2026-02-06T08:30:46 1770366646

OK, but if the solution is to brute-force them, there's probably a need to choose which files to focus on.

Of course there are other content-types, e.g. searching for "Content-Type: image/jpeg" gets hits as well. But only a few of them actually have the base64 data; mostly there are just the MIME headers. I tried looking for "/9j/" (which is Base64 for FF D8 FF, the header for JPEG files). The Trumpian justice.gov website ignores "/" and shows results case-insensitively, but there are 4 or 5 base64'ed JPEG images in there.
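The "/9j/" trick generalizes: Base64-encode a format's magic bytes and search for the resulting prefix. A small sketch follows (the `sniff` helper is a hypothetical name, not part of any library). For PDF only the first six characters are kept, because the seventh also depends on the byte after `%PDF-`.

```python
import base64

JPEG_MAGIC = bytes([0xFF, 0xD8, 0xFF])  # start of a JPEG file
PDF_MAGIC = b"%PDF-"                    # start of a PDF file

# 3 bytes map exactly onto 4 Base64 characters.
jpeg_prefix = base64.b64encode(JPEG_MAGIC).decode("ascii")  # "/9j/"

# 5 bytes fully determine only the first 6 Base64 characters.
pdf_prefix = base64.b64encode(PDF_MAGIC).decode("ascii")[:6]  # "JVBERi"


def sniff(b64_text):
    """Guess the attachment type from the start of a Base64 body."""
    head = "".join(b64_text.split())[:8]  # drop whitespace and line breaks
    if head.startswith(jpeg_prefix):
        return "jpeg"
    if head.startswith(pdf_prefix):
        return "pdf"
    return None


print(jpeg_prefix, pdf_prefix)
```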

I also saw that the page is vulnerable to code injection: somehow garbage in one search result preview was OCRed as "<s [lots of garbage]>", and the rest of the search results were struck through because "<s>" is the HTML to do that.
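The usual fix for that kind of injection is to escape text before putting it into a page. A minimal sketch with Python's standard `html` module; the snippet string is a made-up stand-in for an OCR'd preview, and nothing here is known about how the site is actually built.

```python
import html

# Hypothetical OCR'd preview text that happens to contain an opening tag.
snippet = "<s lots of garbage>rest of the preview"

# Escaping turns the angle brackets into entities, so a browser shows them
# as literal characters instead of treating "<s ...>" as a strike-through tag.
safe = html.escape(snippet)

print(safe)
```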

linuxguy2 · 2026-02-05T22:47:55 1770331675

Love this, absolutely looking forward to some results.

Evidlo · 2026-02-06T02:47:31 1770346051

I took a stab at training Tesseract and holy jeebus is their CLI awful. Just an insanely complicated configuration procedure.

subscribed · 2026-02-06T11:23:16 1770376996

Gods, I had a flashback just from you mentioning that.

I had a reasonably simple problem to solve, slightly weird font and some 10 words in English (I actually only missed one or two blocks for missing letters to cover all I needed).

After a couple of days having almost everything (?) I just surrendered. This seems to be intentionally hostile. All the docs scattered across several repositories, no comprehensive examples, etc.

Absolutely awful piece of software from this end (training the last gen).

queenkjuul · 2026-02-06T03:03:17 1770346997

I'm only here to shout out fish shell, a shell finally designed for the modern world of the 90s

FarmerPotato · 2026-02-05T23:01:14 1770332474

If only Base64 had used a checksum.

zahlman · 2026-02-05T23:26:54 1770334014

"had used"? Base64 is still in very common use, specifically embedded within JSON and in "data URLs" on the Web.

bahmboo · 2026-02-06T00:10:28 1770336628

"had" in the sense of when it was designed and introduced as a standard

ks2048 · 2026-02-06T06:22:51 1770358971

I wonder if jmail (https://www.jmail.world/) has worked on this?

I tried to find the message in this blog post, but couldn't. (don't see how to search by date).

blindriver · 2026-02-06T00:15:23 1770336923

On one hand, the DOJ gets shit because it was taking too long to produce the documents, and then on another, they get shit because there are mistakes in the redacting because there are 3 million pages of documents.

tclancy · 2026-02-06T12:56:44 1770382604

It really doesn’t matter which foot you use to step on your own dick. This could not have been more mishandled if they gave it to an actual snake.

rexpop · 2026-02-06T05:45:07 1770356707

"On the one hand the chef gets shit for taking too long, and then on another for undercooked, badly plated dishes."

Incompetence is incompetence.

rapind · 2026-02-06T02:27:19 1770344839

What they are redacting is pretty questionable though. Entire pages being suspiciously redacted with no explanation (which they are supposed to provide). This is just my opinion, but I think it's pretty hard to defend them as making an honest and best effort here. Remember they all lied about and changed their story on the Epstein "files" several times now (by all I mean Bondi, Patel, Bongino, and Trump).

It's really really hard to give them the benefit of the doubt at this point.

Rebelgecko · 2026-02-06T05:28:19 1770355699

My favorite is that sometimes they redact the word "don't". Not only does it totally change the meaning of whatever sentence it's in, the conspiracy theory is that they had a Big Dumb Regex for redacting /Don\W+T/i to remove Trump references
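For the curious, here is what that rumored pattern does when run in Python. This only demonstrates the regex as quoted above; whether anything like it was actually used is unconfirmed, and the `redact` helper and sample sentences are hypothetical.

```python
import re

# The pattern as quoted: "Don", one or more non-word characters, then "T",
# matched case-insensitively.
PATTERN = re.compile(r"Don\W+T", re.IGNORECASE)


def redact(text):
    """Replace every match of the rumored pattern with a marker."""
    return PATTERN.sub("[REDACTED]", text)


print(redact("I don't know"))        # the apostrophe counts as \W
print(redact("Don T. called"))       # the space counts as \W
print(redact("Donald Trump called")) # no match: "a" after "Don" is a word char
```

Note that the pattern as written would catch "don't" and "Don T" but not the full name "Donald Trump", since `\W+` needs a non-word character directly after "Don".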

thereisnospork · 2026-02-06T00:45:12 1770338712

Considering the justice to document ratio that's kind of on them regardless.

subscribed · 2026-02-06T11:30:14 1770377414

It's pretty clear who they should be redacting (victims/minors) and who they shouldn't (perpetrators).

They wasted months erasing Trump from that instead. So it's on them.

krupan · 2026-02-06T15:44:52 1770392692

Government is bad at stuff, and more news at 11

hypeatei · 2026-02-06T10:20:35 1770373235

The zeitgeist around the files started with MAGA and their QAnon conspiracy. All the right wing podcasters were pushing a narrative that Trump was secretly working to expose and take down a global child sex trafficking ring. Well, it turns out, unsurprisingly, that Trump was implicated too and that's when they started to do a 180. You can't have your cake and eat it too.

zahlman · 2026-02-05T23:26:07 1770333967

> …but good luck getting that to work once you get to the flate-compressed sections of the PDF.

A dynamic programming type approach might still be helpful. One version or other of the character might produce invalid flate data while the other is valid, or might give an implausible result.
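A rough sketch of the validity-filter idea, as plain brute force rather than a real dynamic-programming search: enumerate the readings of ambiguous characters and keep only those whose bytes inflate cleanly. The confusable set ("1" vs "l") and all names are assumptions, and it pretends the whole Base64 text is one zlib stream, which a real PDF is not.

```python
import base64
import itertools
import zlib

# Assumed confusable glyphs for the OCR'd font; a real set would be larger.
AMBIGUOUS = "1l"


def candidates(text, limit=16):
    """Yield every reading of text with each ambiguous character swapped."""
    positions = [i for i, ch in enumerate(text) if ch in AMBIGUOUS]
    if len(positions) > limit:
        raise ValueError("too many ambiguous characters for brute force")
    for combo in itertools.product(AMBIGUOUS, repeat=len(positions)):
        chars = list(text)
        for pos, ch in zip(positions, combo):
            chars[pos] = ch
        yield "".join(chars)


def recover(text):
    """Return the decompressed bytes of every candidate that inflates cleanly."""
    results = []
    for cand in candidates(text):
        try:
            raw = base64.b64decode(cand, validate=True)
            # zlib streams end with an Adler-32 checksum, so a wrong guess
            # almost always raises zlib.error here.
            results.append(zlib.decompress(raw))
        except (ValueError, zlib.error):
            continue
    return results


original = b"hello hello hello"
encoded = base64.b64encode(zlib.compress(original)).decode("ascii")
garbled = encoded.replace("1", "l")  # simulate the OCR confusion
print(recover(garbled))
```

The cost doubles with every ambiguous character, so anything practical would have to work on small windows and prune early, which is where the dynamic-programming framing would come in.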

yunnpp · 2026-02-06T02:40:30 1770345630

Time to flex those Leetcode skills.

winddude · 2026-02-06T04:28:20 1770352100

here's another few to decode,

https://www.justice.gov/epstein/files/DataSet%2010/EFTA01804...

https://www.justice.gov/epstein/files/DataSet%209/EFTA007755...

https://www.justice.gov/epstein/files/DataSet%209/EFTA004349...

and then this one judging by the name of the file (hanna something) and content of the email:

"Here is my girl, sweet sparkling Hanna=E2=80=A6! I am sure she is on Skype "

maybe more sinister (so be careful, I have no idea what the laws are if you uncover you know what trump and Epstein were into)...

https://www.justice.gov/epstein/files/DataSet%2011/EFTA02715...

[Above is probably a legit modeling CV for HANNA BOUVENG, based on, https://www.justice.gov/epstein/files/DataSet%209/EFTA011204..., but still creepy, and doesn't seem like there's evidence of her being a victim]

Enhaj12 · 2026-02-07T18:01:05 1770487265

Regarding EFTA00434905

I tried and got a lot of errors; can't seem to fix it, due to corruption.

https://www.docfly.com/editor/fa3bcb1fa9e8d2629b32/v9r21qsju...

Tried to get AI to guess the remaining text: https://pastebin.com/Z9X2d510

netsharc · 2026-02-06T08:50:57 1770367857

Geezus, with the short CV in your profile, you couldn't tell an LLM to decode "filename=utf-8"CV%5F%5F%5FHanna%5FTr%C3%A4ff%5F.pdf"? That's not "Bouveng".
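Percent-encoding like that is a one-liner to undo: `%5F` is an underscore and `%C3%A4` is the UTF-8 encoding of "ä". A minimal sketch with Python's standard library, using a made-up filename rather than the one from the files:

```python
from urllib.parse import unquote

# Hypothetical percent-encoded filename, in the same style as the header above.
encoded = "R%C3%A9sum%C3%A9%5F2012.pdf"

# unquote() decodes %XX escapes and interprets the bytes as UTF-8 by default.
decoded = unquote(encoded)

print(decoded)
```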

Anyway searching for the email sender's name, there's a screenshot of an email of hers in English offering him a girl as an assistant who is "in top physical shape" (probably not this Hanna girl). That's fucking creepy: https://www.expressen.se/nyheter/varlden/epsteins-lofte-till...

winddude · 2026-02-06T15:54:49 1770393289

not sure how I missed the url encoding. yea, fuck, not sure I want to decode that PDF, and there's a high probability that that's a victim's name.

Wonder why there's so many random case files in the files.

Snoozus · 2026-02-06T04:49:56 1770353396

this one has a better font, might be a simple copy&paste job

winddude · 2026-02-06T05:14:29 1770354869

I've checked for copy and paste, there's so many character flaws, their OCR must have sucked really bad, I may try with deepseekOCR or something. I mean the database would probably be more searchable if someone ran every file through a better OCR.

eek2121 · 2026-02-05T23:41:19 1770334879

Honestly, this is something that should've been kept private, until each and every single one of the files is out in the open. Sure, mistakes are being made, but if you blast them onto the internet, they WILL eventually get fixed.

Cool article, however.

misja111 · 2026-02-06T12:58:40 1770382720

Won't that entire DOJ archive already be downloaded for backup by several people? If I were a journalist working on those files, this is the very first thing I would do as soon as those files were published. Just to make sure you have the originals before DOJ can start adding more redactions.

SomaticPirate · 2026-02-06T04:18:44 1770351524

Are there archives of this? I have no doubt after this post goes viral some of these files might go “missing”. Having a large number of conspiracies validated has led me to firmly plant my aluminum hat.

direwolf20 · 2026-02-06T10:13:57 1770372837

https://github.com/yung-megafone/Epstein-Files

IshKebab · 2026-02-06T14:22:19 1770387739

Disappointing how terrible open source OCR still is.

sorbus-25 · 2026-02-06T04:17:15 1770351435

Event details: https://web.archive.org/web/20260206040716/https://what2wear...

sorbus-25 · 2026-02-06T04:27:02 1770352022

DUBIN BREAST CENTER SECOND ANNUAL BENEFIT MONDAY, DECEMBER 10, 2012 HONORING ELISA PORT, MD, FACS AND THE RUTTENBERG FAMILY HOST CYNTHIA MCFADDEN SPECIAL MUSICAL PERFORMANCES CAROLINE JONES, K'NAAN, HALEY REINHART, THALIA, EMILY WARREN MANDARIN ORIENTAL 7:00PM COCKTAILS LOBBY LOUNGE 8:00PM DINNER AND ENTERTAINMENT MANDARIN BALLROOM FESTIVE ATTIRE