A common reason taxpayers delay filing missing returns is the belief that HMRC will not notice. That belief made sense twenty years ago. It does not now. HMRC operates one of the more sophisticated data matching environments in the public sector, supplemented by machine learning risk models and a steady inflow of offshore data from more than 100 jurisdictions. This article describes what HMRC actually sees, how that visibility shapes any multi-year disclosure, and why reconciling to HMRC's data is the safeguard rather than the threat.
Read this alongside the Multi-Year Tax Arrears Roadmap pillar, the operational walkthrough of filing five or more years of missing returns, and the procedural piece on the Digital Disclosure Service. The data picture in this article is the input to all three.
What HMRC Connect is
HMRC Connect is the data matching and analytics system that cross references taxpayer records against information feeds from a long list of third party sources. It has been in place since around 2010 and has expanded steadily in coverage. Connect does not make tax decisions; it surfaces discrepancies and risk patterns for HMRC caseworkers to act on.
The data feeds Connect draws on include:
- PAYE returns filed by employers, with payroll detail on every employee in the country.
- Bank and building society interest reports under domestic information powers.
- Land Registry property ownership and transfer records.
- Companies House director and shareholding information.
- Letting and short term rental platform data (Airbnb, Booking.com and others, under the OECD model rules now adopted in the UK).
- Payment processor data (PayPal, Stripe and others, under the digital platform reporting rules).
- DVLA records for high value vehicles.
- Insurance policy information for specific risks.
- Common Reporting Standard offshore account data from over 100 participating jurisdictions.
The Connect environment combines these feeds with HMRC's own data on filed returns, payments and notices. The output is a risk profile for each taxpayer that is continuously updated.
Where machine learning fits in
HMRC publicly describes its use of advanced analytics and machine learning in compliance risk identification. The models flag patterns that look anomalous against peer benchmarks: an undeclared income stream that produces deposits inconsistent with declared earnings, expense ratios outside the band for the trade, or property transactions that do not appear on a return where they would be expected to.
The models do not by themselves trigger enforcement. They generate the case allocations that HMRC compliance teams work through. The practical implication for a taxpayer with a multi-year disclosure is that the disclosure will be cross checked against the same data HMRC already holds. A disclosure that reconciles cleanly closes the risk profile; one that does not invites a deeper look.
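As a rough illustration of what "outside the band for the trade" can mean, the sketch below flags an expense ratio that sits more than a chosen number of standard deviations from a peer average. It is a teaching example only: HMRC does not publish its models, and the function name, the z-score approach, the threshold of 2.0 and the peer figures are all assumptions made here.

```python
from statistics import mean, stdev


def is_outside_peer_band(ratio, peer_ratios, z_threshold=2.0):
    """Return True if `ratio` is more than `z_threshold` sample standard
    deviations from the peer mean. Hypothetical helper, not HMRC's method."""
    mu = mean(peer_ratios)
    sigma = stdev(peer_ratios)  # needs at least two peer values
    if sigma == 0:
        # All peers identical: any departure from the peer value stands out.
        return ratio != mu
    return abs(ratio - mu) / sigma > z_threshold


# Made-up expense-to-turnover ratios for one trade (illustrative only).
peers = [0.30, 0.32, 0.28, 0.31, 0.29]
print(is_outside_peer_band(0.65, peers))  # far above the peer average
print(is_outside_peer_band(0.31, peers))  # close to the peer average
```

A taxpayer preparing a disclosure can use the same idea in reverse: where a figure is legitimately unusual for the trade, it is worth explaining it up front.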
The Common Reporting Standard and offshore data
The Common Reporting Standard (CRS) is the OECD framework under which over 100 participating jurisdictions automatically exchange financial account information with the tax authority of each jurisdiction where an account holder is tax resident. The UK has been participating since 2017. The data exchanged includes account balances, interest, dividends, sale proceeds and other income paid on financial accounts.
For a UK tax resident with a foreign bank account, a foreign brokerage, a foreign pension or a foreign rental property held through a financial account, the assumption should be that HMRC already has the basic information. The CRS feed is not perfect (timing lags, scope gaps, jurisdictions not yet onboarded), but it is comprehensive enough that an offshore disclosure that omits a CRS reported account is almost certain to produce a query.
Offshore loading on penalties
Where undeclared tax involves offshore income or assets, Finance Act penalty regimes apply higher percentage bands than for the equivalent domestic underdisclosure. The Worldwide Disclosure Facility is the standard route. A general DDS disclosure is not the right vehicle for offshore positions.
What HMRC visibility means for any disclosure
The practical implication of HMRC's data position is that a multi-year disclosure has to reconcile to what HMRC can already see. Three failure patterns recur:
1. The disclosure declares income for some years but not others, while HMRC's third party data shows a continuous income stream across the entire period.
2. The declared figures sit consistently below what HMRC sees on the same source (for example bank interest substantially below the figure the bank or building society has reported).
3. A specific income stream HMRC has data on (rental income from a platform, dividends from a known shareholding, property sale proceeds) is absent from the disclosure altogether.
Each of these triggers the same response: HMRC counters with the data it has, and the unprompted disclosure framing becomes harder to defend. The safeguard is to build the disclosure on the same data HMRC already holds, and to address any gaps explicitly in the methodology note.
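The three failure patterns above can be checked mechanically before submission. The sketch below is a minimal example, assuming the working file and the third party figures have each been reduced to a mapping of (source, tax year) to an amount; the function name, the data layout and the sample figures are hypothetical.

```python
def find_failure_patterns(declared, third_party, tolerance=0.0):
    """Compare declared figures with third party figures.

    Both arguments map (source, tax_year) -> amount. Returns a list of
    (source, tax_year, pattern) tuples. Hypothetical helper for illustration.
    """
    findings = []
    declared_sources = {source for source, _ in declared}
    for (source, year), seen in sorted(third_party.items()):
        if source not in declared_sources:
            # Pattern 3: the income stream never appears in the disclosure.
            pattern = "source absent from disclosure"
        elif (source, year) not in declared:
            # Pattern 1: declared in some years but missing in this one.
            pattern = "year missing from disclosure"
        elif declared[(source, year)] + tolerance < seen:
            # Pattern 2: declared figure sits below the third party figure.
            pattern = "declared below third party figure"
        else:
            continue
        findings.append((source, year, pattern))
    return findings


# Made-up figures for illustration.
declared = {
    ("bank_interest", "2020-21"): 100,
    ("bank_interest", "2021-22"): 50,
}
third_party = {
    ("bank_interest", "2020-21"): 100,
    ("bank_interest", "2021-22"): 400,
    ("bank_interest", "2022-23"): 380,
    ("platform_rent", "2021-22"): 6000,
}
for finding in find_failure_patterns(declared, third_party):
    print(finding)
```

Each finding is a prompt either to correct the figure or to explain the gap in the methodology note.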
The reconciliation checklist
Before submitting a multi-year disclosure, the working file should be reconciled against the following HMRC visible sources:
- Employment income on each year against P60 and P45 records pulled from the HMRC personal tax account.
- Bank interest on each year against the bank and building society interest figures HMRC receives directly.
- Rental income on each year against any letting platform feeds, plus Land Registry records of properties held in the taxpayer's name.
- Dividend income on each year against Companies House shareholding records for any company the taxpayer is or has been a director of.
- Self-employment receipts against payment processor exports.
- Offshore receipts against CRS reported accounts.
Where the disclosure figures and HMRC visible figures match, no further work is needed. Where they diverge, explain the divergence in the methodology note and state the basis for the disclosure figure openly.
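One way to keep the checklist honest is to hold each reconciled line in the working file with both figures side by side, so that every variance is visible in either direction. The following is a sketch only: the class, field names and the 1.00 rounding tolerance are assumptions, not a prescribed format.

```python
from dataclasses import dataclass


@dataclass
class ReconciliationLine:
    """One row of a hypothetical reconciliation schedule."""
    source: str
    tax_year: str
    disclosed: float
    hmrc_visible: float

    @property
    def variance(self) -> float:
        # Positive means the disclosure is higher than the visible figure.
        return round(self.disclosed - self.hmrc_visible, 2)


def lines_needing_explanation(lines, tolerance=1.0):
    """Return the rows whose variance exceeds the rounding tolerance."""
    return [line for line in lines if abs(line.variance) > tolerance]


# Made-up figures for illustration.
schedule = [
    ReconciliationLine("employment", "2021-22", 32000.0, 32000.0),
    ReconciliationLine("bank_interest", "2021-22", 50.0, 400.0),
]
for line in lines_needing_explanation(schedule):
    print(line.source, line.tax_year, line.variance)
```

The rows returned are the ones the methodology note has to address.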
Where the data also drives failure to notify
HMRC's data position also affects the failure to notify regime under Finance Act 2008 Schedule 41. Where HMRC can show that an income source was visible (through a third party report) and the taxpayer did not notify chargeability by the 5 October deadline following the relevant tax year, the failure to notify penalty bites in addition to the late filing penalty on any return that was due.
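An unprompted disclosure made before HMRC has used that data is the lever that pulls the failure to notify penalty down from the prompted minimum toward the unprompted minimum. For a non-deliberate failure, Schedule 41 sets the prompted minimum at 10% and the unprompted minimum at 0% where HMRC becomes aware of the failure within 12 months of the tax first becoming unpaid; after 12 months the minimums are 20% and 10% respectively. A disclosure made after HMRC has issued a check letter that references the same data is by definition prompted, and the lower band is no longer available.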
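The 5 October notification deadline is mechanical once the tax year is known. The sketch below assumes the standard UK tax year running from 6 April to the following 5 April; the function names are illustrative.

```python
from datetime import date


def tax_year_start_for(d: date) -> int:
    """Calendar year in which the tax year containing `d` began (6 April)."""
    return d.year if d >= date(d.year, 4, 6) else d.year - 1


def notification_deadline(tax_year_start: int) -> date:
    """5 October following the end of the tax year that began on
    6 April `tax_year_start`."""
    return date(tax_year_start + 1, 10, 5)


# Income first received on 10 January 2023 falls in the 2022-23 tax year.
start = tax_year_start_for(date(2023, 1, 10))
print(start, notification_deadline(start))
```

Running this for each year in which a new income source began shows which years carry a failure to notify exposure.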
The interaction with the assessment time limits
HMRC's data visibility interacts with the assessment time limits in the Taxes Management Act 1970. The four year window applies where conduct was not careless; the six year window applies to careless conduct; the twenty year window applies to deliberate conduct; the twelve year window is the offshore baseline. Where HMRC has data on an income stream and the taxpayer made no disclosure, the case for careless or deliberate conduct (and therefore the longer window) becomes easier for HMRC to argue.
For a taxpayer with several years of undisclosed UK income visible to HMRC through Connect, the practical exposure is usually six years rather than four. For one with undisclosed offshore income, the baseline is twelve years. These are the windows the disclosure needs to cover.
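The windows described above can be turned into a quick scoping aid. This is a simplified sketch under stated assumptions: the offshore twelve year figure is treated as a floor on top of the behaviour based window, a tax year is treated as in scope while the current date is no later than 5 April the relevant number of years after that tax year ended, and transitional rules for older years are ignored. The names are hypothetical.

```python
from datetime import date

# Windows as described in this article, in years.
WINDOW_YEARS = {"not_careless": 4, "careless": 6, "deliberate": 20}
OFFSHORE_BASELINE_YEARS = 12


def assessment_window(behaviour: str, offshore: bool = False) -> int:
    """Number of years the disclosure is assumed to need to cover."""
    years = WINDOW_YEARS[behaviour]
    if offshore:
        # Assumption: the offshore baseline acts as a minimum.
        years = max(years, OFFSHORE_BASELINE_YEARS)
    return years


def earliest_year_end_in_scope(today: date, window_years: int) -> int:
    """Calendar year in which the earliest in-scope tax year ended (5 April)."""
    year_end = today.year - window_years
    if today > date(today.year, 4, 5):
        # Past 5 April this year, so the oldest candidate year has dropped out.
        year_end += 1
    return year_end


window = assessment_window("careless")
print(window, earliest_year_end_in_scope(date(2025, 6, 1), window))
```

The output is a starting point for scoping, not a substitute for checking the time limit that applies to each year on its facts.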
Where the data is patchy, and what that means
HMRC's data position is comprehensive but it is not omniscient. Domestic cash income with no banking trail, very small platform earnings under the de minimis reporting thresholds, and certain older years where third party reporting was less developed all sit in gaps. Some offshore jurisdictions are not yet full CRS participants.
The practical implication is that an unprompted disclosure can sometimes get ahead of HMRC even on income streams HMRC has not yet seen, by declaring them voluntarily before the data feed catches up. This is what makes the unprompted route worth using even for situations where HMRC visibility today looks incomplete: the feeds expand year on year, and the cost of being on the wrong side of that expansion is significantly higher than the cost of declaring early.
What to do with this picture
The point of describing HMRC's data position is not to alarm but to ground the disclosure strategy in what HMRC actually sees. Three practical takeaways:
1. Assume HMRC already has visibility on the major income streams. Build the disclosure to reconcile to that assumption.
2. Get the disclosure in before HMRC moves on it. The unprompted position is the variable that produces the biggest single change in the eventual bill.
3. Use the methodology note to address any gap between the disclosure figures and what HMRC sees, openly. Defensive vagueness loses; transparent explanation works.
Common questions about HMRC data and disclosures
Can I check what HMRC already knows about me before I disclose?
The HMRC personal tax account gives a partial view: employment history, PAYE position, state benefits, and some interest information. It does not show the Connect risk profile or the third party data Connect draws on. A specialist can review the personal tax account alongside the bank statements, platform exports and Companies House records to reconstruct the picture HMRC is likely working from.
Does HMRC really process Airbnb and similar platform data?
Yes. The UK has implemented the OECD digital platform reporting rules, under which platforms report seller information to HMRC. Letting platforms, online marketplaces and gig economy platforms are all in scope. The data feed is established and used in compliance risk allocation.
How accurate is the Common Reporting Standard data?
CRS data is comprehensive for participating jurisdictions and has expanded steadily since 2017. Coverage gaps exist for jurisdictions that have not yet onboarded, for certain account types and for timing. The practical baseline is that for any major participating jurisdiction, HMRC has the account level data.
Will HMRC tell me what they have before I disclose?
Generally no. HMRC does not preview its risk position to a taxpayer who has not yet disclosed. Once a compliance check is opened, HMRC may share specific data points that prompted the check. By that stage the disclosure is prompted, with the higher penalty bands engaged.

