The knowledge platform for the financial technology industry

A-Team Insight Blogs

FCA and Turing Institute Collaborate on Synthetic Data to Advance AML Detection

The Financial Conduct Authority has published a research note from its synthetic data anti-money laundering project, an initiative that began in autumn 2024 and was developed with the Alan Turing Institute, Plenitude Consulting, and Napier AI to create a synthetic dataset for AML detection testing. The paper marks the culmination of that work to date and sets out how the dataset will be used in the FCA’s upcoming AML detection sprint, giving firms a controlled environment in which to test transaction monitoring approaches without relying on live customer data. The project adds to the FCA’s use of shared testing environments for regulatory innovation and industry collaboration.

Firms need rich transactional data to test whether models can identify suspicious behaviour that unfolds across multiple accounts, entities, and payment flows, yet access to that data is restricted by privacy, legal and confidentiality concerns. The FCA notes that criminals are estimated to launder between 2% and 5% of global GDP each year, or about $800 billion to $2 trillion, underscoring both the scale of the threat and the pressure on financial institutions to invest in more effective detection and prevention tools.

The regulator worked with the Alan Turing Institute, Plenitude Consulting and Napier AI, each contributing a different layer of expertise. The FCA’s role covered regulatory leadership, oversight, and technical skills. The Alan Turing Institute brought synthetic data research and technical capability. Plenitude contributed financial crime and industry expertise. Napier AI supplied applied technology and product experience in financial crime detection.

Methodology

The dataset was created in three stages:

  • The project began with real banking data that had already been anonymised at source, with no personal data included in the initial request. That provided the statistical base for the exercise while limiting privacy risk from the outset.
  • The dataset was then enriched with synthetic money laundering typologies designed to reflect recognisable real-world behaviours associated with illicit finance, so the resulting data could be used to test whether detection tools can identify suspicious patterns rather than only normal transaction activity.
  • Using the anonymised source data and the embedded typologies, the team generated fully synthetic datasets with the Adaptive and Iterative Mechanism (AIM), a privacy-focused method that introduces controlled randomness to prevent reverse engineering of individual customers or transactions while preserving enough structure for meaningful analysis. Access to the final dataset is limited to participating firms in the data sprint under contractual and control measures.
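The FCA paper does not publish code, but the typology-enrichment stage can be sketched in miniature. The snippet below is an illustrative assumption, not the project's implementation: the `Txn` record, the `inject_smurfing` helper, and the account names are all hypothetical, and "smurfing" (splitting a large sum into many small deposits through mule accounts) is just one recognisable real-world typology of the kind the paper describes embedding.

```python
import random
from dataclasses import dataclass

random.seed(7)  # reproducible illustration

@dataclass
class Txn:
    sender: str
    receiver: str
    amount: float
    label: str = "normal"  # ground-truth tag used later to score detectors

def inject_smurfing(txns, mule_accounts, total, n_splits):
    """Embed a simple 'smurfing' typology: one large sum broken into
    many smaller deposits routed through mule accounts."""
    per_txn = total / n_splits
    for i in range(n_splits):
        mule = mule_accounts[i % len(mule_accounts)]
        txns.append(Txn(sender=mule, receiver="TARGET", amount=per_txn,
                        label="smurfing"))
    return txns

# Background 'normal' activity standing in for the anonymised source data,
# plus one embedded typology with known ground truth.
background = [Txn(f"A{i}", f"B{i}", random.uniform(10, 500)) for i in range(100)]
enriched = inject_smurfing(background, ["M1", "M2", "M3"],
                           total=45_000, n_splits=9)

flagged = [t for t in enriched if t.label != "normal"]
print(len(enriched), len(flagged), flagged[0].amount)  # 109 9 5000.0
```

Because each injected transaction carries a ground-truth label, a detection tool's hit rate on the embedded patterns can be scored exactly, which is what makes this kind of enriched dataset useful for testing.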

The testing phase showed that the dataset is usable for controlled AML testing. The team found little statistical divergence from the anonymised source data while maintaining privacy safeguards through the generation process, and tests on the embedded typologies produced a useful range of detectability rather than patterns that were either too obvious or too obscure.
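A "little statistical divergence" claim of this kind is typically checked marginal by marginal. The sketch below is an assumed illustration, not the FCA team's actual evaluation: two lognormal samples stand in for an anonymised source column and its synthetic counterpart, and the Jensen–Shannon distance between their binned distributions serves as the divergence measure (zero means identical marginals; larger means more divergent).

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)

# Stand-ins for one column of the anonymised source data and its
# synthetic counterpart, e.g. transaction amounts.
source = rng.lognormal(mean=4.0, sigma=1.0, size=50_000)
synthetic = rng.lognormal(mean=4.0, sigma=1.05, size=50_000)

# Compare the two marginal distributions over shared histogram bins.
bins = np.histogram_bin_edges(np.concatenate([source, synthetic]), bins=50)
p, _ = np.histogram(source, bins=bins)
q, _ = np.histogram(synthetic, bins=bins)

# jensenshannon normalises the histograms to probability vectors internally.
jsd = float(jensenshannon(p, q))
print(round(jsd, 3))  # small value: the marginals closely agree
```

In practice a utility evaluation would repeat this across many columns and joint marginals, since per-column agreement alone does not guarantee that relationships between fields survive the generation process.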

The next phase is the Synthetic Data AML Solution Sprint, which the FCA will run through its Digital Sandbox. The regulator says participating firms will use the dataset to test transaction monitoring approaches and then reconvene to share findings. Applications are open until 26 April. The stated aim is to create a setting in which new detection techniques can be demonstrated and challenged without exposing real customer data or requiring privileged access to bank datasets.

The report is also careful about the limits of what synthetic data can do:

  • The dataset can only capture known typologies and cannot reflect laundering methods that have not yet been identified or codified. It also notes internal coherence challenges, including the difficulty of preserving realistic relationships between customers, accounts and transactions, and the limitations of modelling time-based behaviour when transactions are generated independently. In some cases, the project chose to document anomalies rather than sanitise the data, on the basis that an unrealistically clean dataset would be less useful for AML testing.
  • Synthetic data can contain emergent artefacts that arise from privacy processing, typology injection, and modelling choices rather than genuine risk signals. Firms could optimise their systems to the typologies embedded in the dataset without improving broader detection capability, or place too much confidence in results derived from synthetic rather than live operational data. The FCA’s conclusion is that synthetic data should complement rather than replace real-world calibration and validation.
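The overfitting risk the FCA flags can be made concrete with a deliberately tiny example. Everything here is a hypothetical illustration, not material from the paper: a detection rule tuned to the amount band of an embedded "structuring" typology scores perfectly on the patterns it was calibrated against, yet misses an unseen typology entirely, which is exactly the failure mode the report warns about.

```python
# Embedded typology: amounts structured just under a 10,000 reporting
# threshold (the pattern the rule below is tuned to).
amounts_known = [9_500, 9_700, 9_900]
# Hypothetical unseen typology: low-value layering the rule never saw.
amounts_unseen = [2_000, 2_100, 1_950]

def flags(lo, hi, amounts):
    """A naive band-pass rule: flag any amount inside [lo, hi]."""
    return [a for a in amounts if lo <= a <= hi]

# Rule over-fitted to the embedded typology's amount band:
print(len(flags(9_000, 10_000, amounts_known)))   # 3 -> perfect on known
print(len(flags(9_000, 10_000, amounts_unseen)))  # 0 -> blind to unseen
```

Holding some typologies out of calibration and scoring against them separately is one simple guard against mistaking memorisation of the embedded patterns for genuine detection capability.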

After the data sprint, the FCA will use participant feedback to refine the dataset so that it continues to reflect the complexity of financial crime, and will consider whether access could eventually extend beyond the sandbox. Any broader rollout would depend on stronger governance, privacy safeguards, and alignment with international standards, supported by clearer technical documentation and disclosure of limitations. The next phase will also need to address which additional typologies should be added, how access should be managed beyond the sandbox, and what evaluation standards are needed so results can be compared and trusted across participants.

The project shows the FCA using collaborative testing infrastructure to address a persistent financial crime risk. By bringing together public sector oversight, academic research, and industry expertise to build a usable synthetic AML testing environment, the regulator is showing how structured collaboration can strengthen detection capabilities while preserving the confidentiality and controlled handling of sensitive financial data.
