The knowledge platform for the financial technology industry

A-Team Insight Blogs

Opinion: Dangerous Data


By Philippe Verriest, Director, Euroclear

Data is a vital component of our capital markets. Ensure this data is accurate, and you have a solid basis from which to make considered business decisions. Get it wrong and you risk contaminating many parts of the transaction chain.

If there is one lesson that the global financial crisis has taught us, it is that poor data quality can have a devastating impact on risk management and a company’s bottom line. Directly linked to the crisis is a swathe of new regulations aimed at improving transparency. As a result, data management is under watchdog scrutiny like never before.

The core thrust of the data governance agenda is twofold. First, the aim is to remedy the apparent frailties of data management, especially during times of extreme volatility. The second aim is to ensure that firms can cope with ever-increasing data capture and regulatory reporting requirements. The capture of data, its use, and subsequent reporting to industry supervisors in the form of trade reporting, all come with sizeable costs. Furthermore, market watchdogs in different markets should be wary of adding extra costs through the implementation of new rules that are in effect duplicate reporting requirements.

Indeed, firms can broadly classify data costs along three lines. First come feeds from vendors. Second, more feeds mean more data management and operational processing, again at a cost. And third are the indirect costs that go hand-in-hand with fixing trade breaks and the additional data validation effort in risk and finance as a result of faulty securities reference data.

Golden reference data, and why it matters, is a familiar topic. In the boom years, it would be fair to say that raw data quality was less of a concern – money was abundant and firms tolerated the costs of acquiring multiple data feeds as well as the outsourcing or installation of specialised software to cleanse the data. Firms could also afford to take on extra staff to deal with operational breaks as a result of poor reference data. When operational burdens became overwhelming, firms would more often than not take the simple option and offshore such processing activities to low-labour cost countries rather than tackle the root cause of the faulty data.

Regulatory Clout

Today, with the regulatory pendulum firmly swung towards prescriptive oversight, those sceptics who have previously argued that supervisors are good at talking tough while less good at acting on their words are in for a shock.

Mainland European regulators, as well as the Prudential Regulation Authority (PRA) and Financial Conduct Authority (FCA) in the UK, now wield powers that can oblige firms to validate the data they use for corporate actions and trade reporting, among other activities. In the UK, the Bank of England has declared that the PRA can subject a firm’s data to onsite checks without prior warning. What penalties will be meted out to non-compliant firms caught during dawn raids remains to be seen.

However, for firms to treat correct financial data as a ‘check-box process’ to appease national and regional watchdogs would be missing the point. Cross-checking data feeds and improving quality can and does significantly improve a firm’s business activities and bolster its bottom line.

The cost to buy reference data from vendors and the ultimate cost savings associated with rationalising this data remain of critical importance. While bottom-line cost savings have undoubtedly triggered the hunt for more accurate data, there has been a marked shift in focus from the direct costs of procurement to the indirect costs of fixing trade breaks and errors caused by contaminated or incomplete data.

Data Overload

Recent market research suggests that the cost of repairing a single broken trade due to bad data ranges from 7 euros to as much as 50 euros. Because data repositories are rarely centralised within a firm, inaccuracies can be replicated in many different places within the same firm. A Tier 1 investment bank, with a multi-market strategy across time zones, may well conduct up to 250 million trades a year. On average, 8%, or 20 million trades, experience trade breaks. Of these 20 million failed trades, 30% are directly linked to poor reference data used for the trade, which equates to at least 42 million euros per year in costs that could be avoided.
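The figures above can be put into a back-of-the-envelope model. This is an illustrative sketch only: the break rate, bad-data share and per-repair costs are the article's estimates, not measured data.

```python
def avoidable_repair_cost(trades_per_year, break_rate, bad_data_share,
                          cost_per_repair_eur):
    """Yearly repair cost attributable to faulty reference data."""
    breaks = trades_per_year * break_rate          # trades that break
    bad_data_breaks = breaks * bad_data_share      # breaks caused by bad data
    return bad_data_breaks * cost_per_repair_eur

# Tier 1 bank example: 250m trades, 8% break rate, 30% caused by bad data,
# repairs costed at both ends of the 7-50 euro range.
low = avoidable_repair_cost(250_000_000, 0.08, 0.30, 7)
high = avoidable_repair_cost(250_000_000, 0.08, 0.30, 50)
print(f"avoidable cost: {low / 1e6:.0f}m - {high / 1e6:.0f}m euros per year")
```

At the low end of the per-repair cost range, the model reproduces the figure of roughly 42 million euros a year; at the high end the exposure is far larger.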

Surely, there are enough data vendors out there with sufficiently high levels of quality data. And there are specialist firms armed with tools to tackle data cleansing to an appropriate level of satisfaction. Unfortunately, mainstream data vendors typically provide data accuracy levels of only 75%-80%. What about the other 20%-25% of faulty information? Furthermore, in the reference data industry it is commonplace for vendors to buy data from one another, reformat and rebrand it, thereby propagating errors or causing ‘false positives’ in data cleansing.

When a big Tier 1 bank takes more than 100 different data feeds, you could argue that there should be no data quality issue. They simply apply the ‘majority rules’ approach and use the information that is being repeated by the lion’s share of data vendors. Wrong. This approach is not only dangerous but can be really expensive.

Indeed, most off-the-shelf solutions are based on the premise that ‘quality’ data is produced by comparing data across multiple sources and taking a consensus. Data is also chosen on the basis of preconceived notions, such as Bloomberg being the best data provider for US corporates and Thomson Reuters being the hallmark provider for FX prices. To further obscure the picture, it is commonplace for data vendors to provide ‘raw’ data containing erroneous, exceptionally high or low values, which are too often digested in the data mix without further thought.
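The flaw in the ‘majority rules’ approach is easy to demonstrate. In the hypothetical sketch below, the vendor names and values are invented: two vendors resell a third vendor's feed, so a single upstream typo outvotes the one independently sourced, correct value.

```python
from collections import Counter

# Suppose the true coupon for a bond is 4.25, but vendors B and C
# rebrand vendor A's feed, so A's typo appears three times.
feeds = {
    "vendor_A": 4.52,   # original typo
    "vendor_B": 4.52,   # rebranded copy of A
    "vendor_C": 4.52,   # rebranded copy of A
    "vendor_D": 4.25,   # independently sourced, correct
}

def consensus(values):
    """Pick the value reported by the largest number of feeds."""
    value, _count = Counter(values).most_common(1)[0]
    return value

print(consensus(feeds.values()))  # the typo wins, three votes to one
```

Because consensus weights feeds equally regardless of provenance, a widely resold error looks like high-confidence data.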

The Utility Approach

In response to client needs, Euroclear and SmartStream partnered in 2012 to provide a solution that accentuates good data and improves the quality of questionable securities reference data through a centralised data cleansing utility known as the CDU (Central Data Utility). The CDU takes the unconventional approach of cleansing securities reference data by independently validating the quality of each feed without directly comparing the data between two feeds. One common data model is used across all data vendors to enrich the initial feed. What is more, the CDU does not limit its service to the cleansing of data vendor feeds; it also accesses primary data sources, such as Euroclear’s own comprehensive settlement data, and that of stock exchanges, issuer agents and lead managers. Data accuracy results are above 98%.

The utility has built-in tolerance-checking features to warn clients if, for example, an interest rate value for an FRN were suddenly to spike. Furthermore, basic checks, which until now have gone undetected in some back offices, come as standard. Business rule checks form part of the initial diagnostics; a basic example: a zero-coupon bond should never have coupon information attached to it in any data system.
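The two kinds of check described above can be sketched as simple validation rules. This is a minimal illustration only: the record fields, ISIN and thresholds are assumptions for the example, not the CDU's actual data model or rules.

```python
def validate(record, history, spike_factor=3.0):
    """Return a list of issues found in a securities reference record."""
    issues = []

    # Business rule: a zero-coupon bond must carry no coupon information.
    if record.get("instrument_type") == "zero_coupon_bond" and record.get("coupon_rate"):
        issues.append("zero-coupon bond has coupon information attached")

    # Tolerance check: flag a rate that suddenly spikes versus its history.
    prev = history.get(record["isin"])
    rate = record.get("coupon_rate")
    if prev and rate and rate > spike_factor * prev:
        issues.append(f"rate {rate} spiked versus previous value {prev}")

    return issues

history = {"XS0000000001": 1.2}  # last known rate for a hypothetical FRN
record = {"isin": "XS0000000001", "instrument_type": "frn", "coupon_rate": 9.6}
print(validate(record, history))  # the sudden jump from 1.2 to 9.6 is flagged
```

The point of running such rules per feed is that each source is judged on its own plausibility, rather than by comparison against other, possibly equally flawed, feeds.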

Mistakes will happen in the management of reference data, but they must be minimised or at least contained. When mistakes are uncovered, the CDU creates a ‘ticket’ and the error is investigated by an assigned person. As well as correcting any potential errors, it is equally important to understand how the errors occurred. Taking this a step further, the CDU also provides a full audit trail of changes made to the original data and a reason as to why a data value has changed. As an industry utility with a common rationale, the error is fixed once for the entire user community. Making data retrieval and usage as easy as possible, clients can elect to receive their refined data in a variety of universal formats: XML, CSV and file transfer, as well as proprietary reporting formats.

What is more, does it not make good sense for a neutral, user-owned part of the capital market infrastructure to run such a cleaning platform? A commercial provider could well look at its own commercial interests first, rather than strive for the best possible data quality. Bad data does unfortunately translate as business opportunities for firms that patch and bill, patch and bill without addressing the root cause.

Streamlining for the Future

Why does data management matter more today than ever before? It is primarily a combination of reducing cost and rebuilding trust with clients and regulators.

Regulatory initiatives, such as the EU’s Capital Requirements Directive (CRD IV), will require firms to comply with strict rules concerning data accuracy, integrity and high levels of data quality to be used for risk assessments. Banks estimate that compliance with CRD IV will cost north of 3 billion euros, with data collection, quality assurance and reporting among the main cost drivers.

Services that enable market participants to reduce their capital costs to manage operational risks will be welcome. Data quality improvements will lead to fewer trade repairs, and in turn this should drastically lower the estimated $10 billion yearly spend by the industry on reference data. Avoiding bad data, or at least detecting it early, is the other side of the same coin.
