The short version
- The hardest tax documents are where extraction accuracy gets tested. For us, that means brokerage statements that run hundreds of pages, cover multiple accounts, or arrive as scanned paper. If a system handles those, the cleaner documents are covered.
- On our internal benchmark of those hardest documents, accuracy went from 98.6% to 99.9%+. That matches the level we've held for some time on the brokerage statements firms see day-to-day. The last mile of accuracy is the hardest mile.
- The improvement isn't a new AI model. We built a self-correcting verification layer on top of extraction. Juno cross-references multiple data points within the same document, catches discrepancies, and fixes them before the data lands in your tax software.
- The rules behind that check came from the patterns hundreds of firms flagged in real returns last tax season. We don't train models on your client data.
Why I'm writing this
I'm Jack Flitcroft. I'm Head of AI at Juno and one of the co-founders. The most common question I get from tax pros evaluating Juno: how accurate is it, actually? Below is the methodology behind our latest accuracy work: what changed, how we verify, what the test dataset is, and where the limits are.
A note up front. This post is about brokerage statement accuracy specifically, but the work behind it is about Juno's accuracy overall. Brokerage statements are among the hardest documents in tax: hundreds of pages, multiple accounts inside one PDF, scanned paper formats from older firms. If our accuracy holds there, it holds everywhere. We use brokerage statements as our stress test.
The last mile: closing the gap on our hardest documents
For some time, our brokerage statement extraction has been at 99.9%+ on the majority of brokerage statements firms see day-to-day. Those are the cleaner, shorter documents from the major brokerages.
The accuracy gap was on a specific subset I'd call ultra-complex: PDFs that run hundreds of pages, contain multiple distinct accounts inside the same document, include thousands of individual transactions, or arrive as scanned images of paper rather than digital exports. On that subset, our internal benchmark was at 98.6%. That is very high, but a 1.4% error rate on documents this large creates a needle-in-a-haystack problem that our customers had to solve manually.
This release closes that gap. The ultra-complex subset is now at 99.9%+, matching the rest of our accuracy.
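To make the haystack concrete, here is a rough back-of-the-envelope calculation. It assumes the benchmark figure can be read as a per-value accuracy and that a large statement yields 3,000 extracted values. Both are illustrative assumptions, not a description of how the benchmark metric is defined.

```python
def expected_errors(num_values: int, accuracy: float) -> float:
    """Expected count of wrong values if each value is right with probability `accuracy`."""
    return num_values * (1.0 - accuracy)


# Hypothetical statement size, chosen only for illustration.
NUM_VALUES = 3_000

before = expected_errors(NUM_VALUES, 0.986)
after = expected_errors(NUM_VALUES, 0.999)

print(f"at 98.6%: about {before:.0f} values to find and fix")
print(f"at 99.9%: about {after:.0f} values to find and fix")
```

Under those assumptions, the reviewer goes from hunting for roughly 42 wrong values to roughly 3 in the same document.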
The last mile (going from 98.6% to 99.9% on the hardest documents) is harder than the first mile (going from 80% to 90%). It takes handling edge cases that don't show up in most documents but show up consistently in the hardest ones. Hundreds of formats. Multiple accounts inside one statement. Items buried in footnotes, reported in non-obvious locations, or formatted in ways that look like something else entirely.
A note on what this doesn't cover. Accuracy depends on document quality. If a brokerage statement arrives as a badly scanned photocopy (pages cut off, text illegible, images rotated), no extraction system will produce reliable output, including ours. We flag low-confidence reads so your team knows where to look, but we can't invent data that isn't on the page.
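As a minimal sketch of what flagging low-confidence reads can look like, the snippet below sorts extracted fields so a reviewer sees the shakiest ones first. The `ExtractedField` type, the per-field confidence score, and the 0.90 cutoff are all hypothetical names and values chosen for illustration, not Juno's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float  # hypothetical per-field score between 0.0 and 1.0
    page: int


# Hypothetical cutoff; a real threshold would likely be tuned per field type.
REVIEW_THRESHOLD = 0.90


def fields_needing_review(fields, threshold=REVIEW_THRESHOLD):
    """Return the fields below the threshold, least confident first."""
    flagged = [f for f in fields if f.confidence < threshold]
    return sorted(flagged, key=lambda f: f.confidence)


fields = [
    ExtractedField("ordinary_dividends", "1204.17", 0.99, page=3),
    ExtractedField("federal_tax_withheld", "88.00", 0.62, page=3),
    ExtractedField("tax_exempt_interest", "310.45", 0.84, page=87),
]

for f in fields_needing_review(fields):
    print(f"check {f.name} on page {f.page} (confidence {f.confidence:.2f})")
```

The point of the design is that a bad scan produces a short list of places to look rather than a silent guess.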
How the self-correction works
The brokerage statement is one of the few tax source documents with a built-in self-check: its own figures have to be internally consistent. Most extraction systems ignore that. We use it now.
⚙️ The mechanism in one sentence: Juno cross-references multiple extracted values from the same document to verify accuracy, and self-corrects when it finds a discrepancy.
Our extraction accuracy is now high enough to make this cross-check viable. When the verification layer finds a discrepancy, Juno goes back to identify the mistake. Issues surface inside Juno before the data gets to your tax software, instead of surfacing when a reviewer or the IRS catches them later.
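One way to picture the cross-check is a comparison between a stated summary total and the sum of the extracted line items. The sketch below is an illustration under assumptions, not the production logic: the function names, the one-cent tolerance, and the "row read twice" heuristic for locating the mistake are all hypothetical.

```python
from decimal import Decimal

TOLERANCE = Decimal("0.01")  # hypothetical rounding allowance


def discrepancy(summary_total: Decimal, line_items: list[Decimal]) -> Decimal:
    """Extracted line items minus the stated summary total (zero if they agree)."""
    diff = sum(line_items, Decimal("0")) - summary_total
    return Decimal("0") if abs(diff) <= TOLERANCE else diff


def suspect_rows(summary_total: Decimal, line_items: list[Decimal]) -> list[int]:
    """Indices of rows whose removal would make the totals agree, e.g. a row read twice."""
    diff = discrepancy(summary_total, line_items)
    if diff == 0:
        return []
    return [i for i, amount in enumerate(line_items) if abs(amount - diff) <= TOLERANCE]


stated_total = Decimal("1500.00")
extracted_rows = [
    Decimal("1000.00"),
    Decimal("250.00"),
    Decimal("250.00"),
    Decimal("250.00"),  # imagine this row was picked up twice across a page break
]

gap = discrepancy(stated_total, extracted_rows)
if gap != 0:
    rows = suspect_rows(stated_total, extracted_rows)
    print(f"totals differ by {gap}; re-check rows {rows}")
```

Using `Decimal` rather than floats keeps the comparison exact to the cent, which matters when the whole check is whether two dollar amounts agree.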
What we built on top of extraction
I want to be precise about what this release is.
The extraction pipeline that gets Juno to 99.9%+ on most brokerage statements is the same one we've been running. That pipeline is unchanged. What's new is a deterministic, rule-based correction layer that sits on top of it.
🔁 What's new this release: A tax-logic verification layer on top of our existing extraction. AI extracts the data. Tax logic checks the work.
The change matters because it gives us a verification step on the hardest documents. AI does the extraction. Tax logic verifies the results using the document's own internal data, and triggers a correction if anything doesn't reconcile.
We could only build this because Juno is in use across hundreds of firms. We iterated on the cases our users flagged during tax season and built them into the verification layer. Hundreds of formats. Edge cases that extraction alone wasn't always catching are now caught.
🔒 Important: we do not train models on user data. We built a verification system informed by hundreds of formats, scenarios, and edge cases identified by our user base over tax season. The corrections come from the patterns users flagged, not from any training on the underlying client data.
The accuracy improvement comes from tax-rule-based checks layered on top of strong extraction, not from a model upgrade.
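To show what "deterministic and rule-based" means in practice, here is a small sketch of rules expressed as plain data: each rule has a name and a yes/no test, and the same input always produces the same findings. The `Rule` type, the field names, and the two example rules are hypothetical illustrations; the first relies on the fact that qualified dividends on a 1099-DIV are a subset of total ordinary dividends.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable

Values = dict[str, Decimal]


@dataclass(frozen=True)
class Rule:
    name: str
    holds: Callable[[Values], bool]


# Illustrative rules only; a real rule set would be far larger.
RULES = [
    Rule(
        "qualified dividends do not exceed ordinary dividends",
        lambda v: v["qualified_dividends"] <= v["ordinary_dividends"],
    ),
    Rule(
        "federal tax withheld is not negative",
        lambda v: v["federal_tax_withheld"] >= 0,
    ),
]


def failed_rules(values: Values, rules=RULES) -> list[str]:
    """Names of every rule the extracted values violate, in rule order."""
    return [rule.name for rule in rules if not rule.holds(values)]


extracted = {
    "ordinary_dividends": Decimal("1204.17"),
    "qualified_dividends": Decimal("1240.17"),  # two digits transposed in the read
    "federal_tax_withheld": Decimal("88.00"),
}

for name in failed_rules(extracted):
    print(f"rule failed: {name}")
```

Nothing in a check like this is learned from client data. Adding coverage for a newly flagged edge case means adding a rule, not retraining a model.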
What the test dataset actually covers
Our test dataset contains a wide array of brokerage document types. The 99.9%+ accuracy figure on the hardest subset is on a benchmark that includes:
- PDF documents that aren't digital exports. Scans of paper statements. Image-based PDFs without selectable text. This is the hardest format because OCR errors compound everything downstream.
- Multi-hundred-page documents with thousands of transactions. The kind of document that's painful to even open in a PDF reader. Heavy-trading clients, hedge-fund-like activity, retail day traders.
- Statements with multiple distinct accounts. Juno handles each account separately, even when they're combined in one document.
- A significant diversity of brokerage types. Different formats from different custodians, because no two brokerages report the same way.
- Normal 1099 forms hidden inside the brokerage statement. 1099-DIV, 1099-NEC, 1099-OID, 1099-INT, and 1099-MISC. These are often buried throughout the brokerage statement and easy to miss if you're only reading the main report.
- Account fees, municipal bond interest, and other items often hidden throughout the document. Items that previous versions of Juno didn't always pick up because they weren't on the summary page. The new correction layer catches them.
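Handling each account separately can be pictured as tagging every extracted row with the account it belongs to and totaling per account, so one account's activity never leaks into another's. The sketch below assumes rows arrive as hypothetical `(account_id, amount)` pairs; the account labels are made up for illustration.

```python
from collections import defaultdict
from decimal import Decimal


def totals_by_account(rows):
    """Sum amounts per account for rows given as (account_id, amount) pairs."""
    totals = defaultdict(lambda: Decimal("0"))
    for account_id, amount in rows:
        totals[account_id] += amount
    return dict(totals)


# Rows from one combined statement, interleaved the way they might appear in the PDF.
rows = [
    ("JOINT-1", Decimal("100.00")),
    ("IRA-2", Decimal("40.00")),
    ("JOINT-1", Decimal("25.50")),
]

for account_id, total in totals_by_account(rows).items():
    print(f"{account_id}: {total}")
```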
If you're evaluating Juno on your own brokerage statements, the test that matters is what happens on your real documents, not on our benchmark. I'd rather you run a free trial on five real client returns than take our number on faith.
What this changes for your firm
A few practical implications:
- Hidden 1099s inside brokerage statements get caught. If you've ever had a client send you a fat brokerage statement and discovered a 1099-OID buried on page 87 after you'd already filed, this update is for you.
- Multi-account statements separate cleanly. Joint, IRA, and managed accounts inside the same statement get identified and assigned correctly on the return.
- Scanned PDFs from paper statements still work. Older clients who keep paper statements aren't a workflow exception anymore.
- Reconciliation is built in. The summary-against-transactions check happens before the data lands in your tax software, not after a reviewer finds the gap.
- Your review step still matters. We built Juno to handle the extraction so your team can focus on the judgment calls, not to eliminate review entirely. The validation screen flags anything Juno isn't confident about, so you know exactly where to spend your time.
Accuracy FAQ
What does "99.9%+" mean here?
It means our internal benchmark for ultra-complex brokerage statements is now scoring 99.9% or higher on the metric we use to measure extraction accuracy. The benchmark includes the document types described above. On the brokerage statements that most firms see day-to-day, our accuracy has been at the same 99.9%+ level for some time. This release closes the gap on the hardest subset.
Are you training AI models on my client data?
No. Our correction logic is deterministic and rule-based. We built the rules from the patterns and edge cases users flagged in their own returns. We did not fine-tune or otherwise train any model on the underlying client data in those returns.
How is this different from OCR?
OCR turns an image of text into machine-readable characters. It doesn't understand what those characters mean or check whether they reconcile to anything else. What we're describing is the layer above OCR: AI extracting structured tax data from documents, plus a deterministic tax-logic check that compares extracted values against other extracted values from the same document.
Does this approach work on other tax documents?
The specific verification approach varies by document type. For brokerage statements, the document's own structure gives us a built-in cross-check. For other documents (W-2s, 1099 forms, K-1s, business returns), we apply different verification methods appropriate to each format. The broader pattern (extract with AI, verify with tax logic) is the methodology we apply across all of Juno's extraction.
What's the right way for me to evaluate accuracy on my own returns?
Run a trial on real client returns. Pick the messiest brokerage statements in your book and see what comes through. If anything's off, the validation screen will flag low-confidence reads, and you'll know exactly what to check. That's a better measure than any vendor benchmark.
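If you want a number out of that trial, one simple approach is to hand-verify a set of fields and count how many extracted values match. The sketch below is only one way to score it; the field names and values are made up, and it compares values as exact strings, so you would want to normalize formatting (commas, trailing zeros) first.

```python
def field_accuracy(extracted: dict[str, str], verified: dict[str, str]) -> tuple[float, list[str]]:
    """Share of hand-verified fields that match, plus the names that are wrong or missing."""
    if not verified:
        raise ValueError("need at least one hand-verified field")
    wrong = sorted(name for name, truth in verified.items() if extracted.get(name) != truth)
    return 1 - len(wrong) / len(verified), wrong


# Values you confirmed by hand against the source document (illustrative).
verified = {
    "ordinary_dividends": "1204.17",
    "qualified_dividends": "980.00",
    "tax_exempt_interest": "310.45",
    "federal_tax_withheld": "88.00",
}
# Values the tool produced for the same document (illustrative).
extracted = {
    "ordinary_dividends": "1204.17",
    "qualified_dividends": "980.00",
    "tax_exempt_interest": "301.45",
    "federal_tax_withheld": "88.00",
}

accuracy, wrong = field_accuracy(extracted, verified)
print(f"accuracy: {accuracy:.1%}; check: {wrong}")
```

Scoring against your own verified values keeps the comparison on your documents and your definition of "correct," which is the measure that matters for your firm.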
Ready to test it on your own returns?
Start a free trial of AI tax preparation software built by a CPA and trusted by thousands of tax pros.