The knowledge platform for the financial technology industry

A-Team Insight Blogs

Clean Data Is Not Enough to Power AI


By Shai Popat, managing director, product and commercial strategy, financial information, SIX.

Agentic AI projects are beginning to roll out across the financial industry. Many firms are testing AI’s feasibility by assigning it relatively simple tasks, such as summarising information or retrieving data and documents from internal databases.

Two maxims are often cited when discussing AI adoption. One is “Garbage in, garbage out”, and the other is “Trust but verify”. Both are important principles, but neither fully explains how to solve one of AI’s biggest problems: hallucinations.

“Garbage in, garbage out” relates to data quality. If the data used to train or feed an AI model is poor, the insights it generates will inevitably suffer. Clean, sanitised and optimised data inputs are critically important, but they are not enough on their own.

This becomes especially important in complex financial datasets such as corporate actions, symbology, issuer hierarchies and regulatory reporting, where even a small inconsistency in interpretation can materially affect multiple teams and workflows across a financial institution.

Hallucinations Persist

Many firms experimenting with agentic AI will already have encountered hallucinations despite using high-quality data inputs. For agentic AI to meet the high standards required for safe use in financial institutions, robust governance frameworks are essential. Firms are increasingly navigating the balance between the benefits of speed and the risks posed by “acceptable” hallucinations.

Legacy infrastructure is also ill-equipped to handle the rapid inflow of data that AI systems require. Even the highest-quality datasets will occasionally contain errors. Previously, if something looked wrong, a human could intervene and correct it. And even if an issue was missed initially, the pace of human decision-making often meant it would be caught before spreading further.

AI operates at a very different speed. A single incorrect input can be processed and reused multiple times before anyone notices. Consider a corporate action example. If an AI agent hallucinates an incorrect coupon payment from a bond prospectus, that error could flow into cashflow projections, distorting portfolio valuations and bond pricing. Firms therefore need clear frameworks governing how data is used across AI systems to ensure everyone is working from the same accurate information.
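To make that propagation concrete, here is a minimal sketch in Python. It assumes a plain fixed-rate bond with annual coupons and a flat discount rate. The figures (a 4.0% coupon misread as 4.5%, a 3% yield, a 10 million nominal position) and the function names are invented purely for illustration.

```python
def cashflows(face, coupon_rate, years):
    """Annual coupon payments, with the face value repaid in the final year."""
    flows = [face * coupon_rate] * years
    flows[-1] += face
    return flows


def present_value(flows, discount_rate):
    """Discount each cashflow back to today at a flat annual rate."""
    return sum(cf / (1 + discount_rate) ** t for t, cf in enumerate(flows, start=1))


# Illustrative assumptions, not market data.
FACE = 100.0
YEARS = 5
YIELD = 0.03

correct = present_value(cashflows(FACE, 0.040, YEARS), YIELD)
hallucinated = present_value(cashflows(FACE, 0.045, YEARS), YIELD)

# The same wrong coupon now sits inside every downstream figure.
price_error = hallucinated - correct
position_error = price_error / 100 * 10_000_000  # 10m nominal position

print(f"price per 100 nominal: {correct:.2f} vs {hallucinated:.2f}")
print(f"valuation error on the position: {position_error:,.0f}")
```

Under these assumptions, a half-point error in the coupon moves the price by a little over two points per 100 nominal. Every projection or valuation built on that price inherits the error.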

Trace and Validate

This is where the second adage, “trust but verify”, becomes equally important. If an AI-generated output appears questionable, the person overseeing it must be able to trace and validate the result rather than simply accept it at face value.

That requires governance around the verification process itself. If AI remains a black box, determining why a hallucination occurred, or even whether an output is wrong, becomes extremely time-consuming. But with the right controls in place, firms can understand how the AI moved from A to B. That makes it far easier to identify the root cause of a hallucination and verify whether the output is correct.
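One way to picture such controls is a recorded trail of each step an agent takes, checked against an independently sourced value. The Python sketch below is an illustration under stated assumptions, not a description of any particular product. The class names, the `verify` function, the file name and the figures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One recorded step on the agent's path from input to output."""
    name: str
    source: str  # where the input came from (document, page, API call)
    output: str  # what the step produced


@dataclass
class TracedAnswer:
    """An AI output bundled with the steps that produced it."""
    value: float
    steps: list = field(default_factory=list)

    def record(self, name, source, output):
        self.steps.append(Step(name, source, output))


def verify(answer, reference_value, tolerance=1e-9):
    """Compare the AI output with an independently sourced reference value."""
    return abs(answer.value - reference_value) <= tolerance


# Hypothetical run: the agent extracted a 4.5% coupon from a prospectus.
answer = TracedAnswer(value=0.045)
answer.record("retrieve", "prospectus.pdf, page 12", "coupon clause text")
answer.record("extract", "coupon clause text", "coupon_rate=0.045")

if not verify(answer, reference_value=0.040):
    # The trail shows the reviewer where to start looking.
    for step in answer.steps:
        print(f"{step.name}: {step.source} -> {step.output}")
```

The point of the design is that a failed check arrives with its history attached, so the reviewer does not have to reconstruct how the output was produced.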

For decades, financial firms have developed governance procedures to minimise the impact of human error, whether fat-finger trades or an extra zero entered into a system. The same discipline is now needed for AI, ensuring outputs move from plausible to defensible and from interesting to usable.

Semantic Layer

As AI adoption becomes more widespread, increasing volumes of data are being transmitted through APIs designed specifically for AI use cases. This simplifies access and removes translation layers that can introduce additional errors. Fewer layers also improve accountability, as there are fewer points at which errors can enter the process, making audit trails easier to manage.

The crucial ingredient that makes AI usable within this streamlined model is the semantic layer. Sitting above raw data and APIs, the semantic layer provides business meaning, enabling AI to translate a question into the correct data calls, joins and calculations using consistent definitions. APIs and Model Context Protocol (MCP) servers make data accessible to AI agents, but they do not provide meaning on their own.

Even with high-quality data, AI struggles without that context. It may not understand what “Swiss-domiciled” refers to, how identifiers such as Valor, ISIN and LEI connect to one another, or how ratings should be standardised. The result is often a fragmented set of outputs that users must piece together manually, undermining the very purpose of agentic AI.
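A toy version of that context might look like the Python sketch below. It assumes a business term can be expressed as a rule over raw fields and that identifiers are linked through a cross-reference table. The field names, functions and identifier values are all hypothetical placeholders, not real codes or a real vendor schema.

```python
# Hypothetical business definitions: each term maps to a rule over raw fields.
DEFINITIONS = {
    "swiss-domiciled": lambda rec: rec["domicile_country"] == "CH",
}

# Hypothetical cross-reference linking instrument and issuer identifiers.
# Every identifier value here is a placeholder, not a real code.
INSTRUMENTS = [
    {"valor": "VALOR-001", "isin": "ISIN-AAA", "issuer_lei": "LEI-X", "domicile_country": "CH"},
    {"valor": "VALOR-002", "isin": "ISIN-BBB", "issuer_lei": "LEI-Y", "domicile_country": "DE"},
    {"valor": "VALOR-003", "isin": "ISIN-CCC", "issuer_lei": "LEI-X", "domicile_country": "CH"},
]


def resolve(identifier):
    """Find an instrument record by either its Valor or its ISIN."""
    for rec in INSTRUMENTS:
        if identifier in (rec["valor"], rec["isin"]):
            return rec
    return None


def select(term):
    """Translate a business term into a filter and return matching ISINs."""
    rule = DEFINITIONS[term.lower()]
    return [rec["isin"] for rec in INSTRUMENTS if rule(rec)]


def same_issuer(id_a, id_b):
    """Check whether two instruments share an issuer, via the LEI link."""
    a, b = resolve(id_a), resolve(id_b)
    return a is not None and b is not None and a["issuer_lei"] == b["issuer_lei"]


print(select("Swiss-domiciled"))
print(same_issuer("VALOR-001", "ISIN-CCC"))
```

With definitions and mappings held in one place, an agent asked about “Swiss-domiciled” instruments applies the same rule every time instead of guessing at the meaning.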

For data providers, which play a critical role in enabling the adoption of agentic AI, the responsibility extends beyond simply supplying high-quality data. They must also help build the foundations of the semantic layer through standardised datasets and identifier mapping. Increasingly, this data is delivered through cloud-enabled architectures, reducing fragmentation, improving consistency and enhancing timeliness. This is particularly valuable when querying large volumes of information. Cloud infrastructure, combined with APIs exposed through the Model Context Protocol, creates a scalable bridge between AI agents and enterprise data, replacing one-off queries with continuous, structured access.

The firms that succeed with agentic AI will be those with the strongest data foundations and governance frameworks, capable of combining advanced AI models with interoperability and contextual understanding across reference, regulatory and event-driven datasets. That is what will allow high-quality data to be used to its fullest extent without introducing additional risk.
