Azdex has risen to prominence over the past year or so on the strength of its activities in the hot space of business entity data. The vendor is in the process of securing financial backing from Deutsche Börse, and recently held a well-attended event in London focused on ‘benchmarking business entity reference data quality’.
The event featured speakers from a range of institutions, including Barclays Capital, Citigroup, CSFB, HBOS, and Standard Bank, as well as a presentation from Azdex’s Steve French.
French said, “As consultants we were asked: how good is my data? How much will my data improve if I centralise operations?” Comparing data to an external source may tell a client that their data is 75% accurate, but Azdex believes this data needs to be triangulated.
Azdex’s analysts look at four key attributes of the data to assess its overall quality. First are formats. Said French, “Poorly formatted data usually indicates bad data.” Here, the best match rate Azdex has seen with a client, compared against Azdex’s own data, was 95%, and the worst was 33%, although the average tends to fall between 35% and 45%.
The second attribute is data population, where poorly populated fields drag down quality; the highest match rate was 97%, the lowest 66% and the average 81%. The third is duplication, which usually indicates higher rates of decay: the highest rate was 99%, the lowest 10% and the average 80%. The final attribute is accuracy, where the highest match rate was 95%, the lowest 58% and the average 78%.
This information, when collated, provides data coefficients (see diagram, Page 8). Two organizations, said French, could have the same accuracy level but very different coefficients if one has less duplication and/or better formatted data. “This coefficient approach works over time as you can’t keep the accuracy levels going if the formats are wrong.”
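Azdex does not publish how the four attribute scores are collated into a coefficient, but the idea can be sketched as follows. The combination function here (a geometric mean) and all of the inputs are assumptions for illustration only; the point is simply that two firms with identical accuracy can land on different coefficients once formats and duplication are factored in.

```python
def quality_coefficient(formats, population, duplication, accuracy):
    """Hypothetical coefficient: geometric mean of the four attribute
    match rates, each expressed as a fraction in [0, 1]."""
    return (formats * population * duplication * accuracy) ** 0.25

# Two illustrative firms with identical 78% accuracy, but firm A has
# well-formatted, de-duplicated data while firm B does not.
firm_a = quality_coefficient(0.95, 0.81, 0.99, 0.78)
firm_b = quality_coefficient(0.40, 0.81, 0.80, 0.78)
assert firm_a > firm_b  # same accuracy, different coefficients
```

Because every attribute multiplies into the score, sustained weakness in one area (such as formats) caps the coefficient regardless of accuracy, which is consistent with French’s remark that accuracy levels cannot be maintained if the formats are wrong.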
Azdex then maps out the data stratification, which bases ratings of data quality on tiers of data importance. For example, if you have millions of dollars invested in a particular security, you would expect the data referencing that security to be very good, compared to data for a security you very rarely trade. As a result, each tier would have different benchmark coefficients.
The charted line representing the link between the tier and the coefficient, according to French, has been “a lot flatter than most organizations expect as the good data is not as good as they thought and the bad data is better than they thought”. These findings could influence an organization’s spend on, and emphasis within, its data efforts, and provide measures of their efficiency.
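The stratification idea can be made concrete with a small sketch. The tier cut-offs and the per-tier benchmark coefficients below are invented for illustration; Azdex’s actual tiers and thresholds are not published. The sketch only shows the mechanism: the larger the exposure to a security, the stricter the quality benchmark its reference data must meet.

```python
# Hypothetical per-tier benchmark coefficients (tier 1 = most important data).
TIER_BENCHMARKS = {1: 0.90, 2: 0.80, 3: 0.70}

def tier_for_exposure(exposure_usd):
    """Assign a security to an importance tier by the size of the
    position held in it (cut-offs are illustrative assumptions)."""
    if exposure_usd >= 10_000_000:
        return 1   # heavily held: strictest benchmark
    if exposure_usd >= 100_000:
        return 2
    return 3       # rarely traded: most lenient benchmark

def meets_benchmark(exposure_usd, coefficient):
    """Does this security's data coefficient clear its tier's benchmark?"""
    return coefficient >= TIER_BENCHMARKS[tier_for_exposure(exposure_usd)]
```

French’s observation that the tier-versus-coefficient line is flatter than expected would show up here as tier 1 data failing its strict benchmark more often than firms assume, while tier 3 data clears its lenient one more often.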
Azdex also performs mathematical analysis based on organizational benchmarking. A series of questions about the type of organization being profiled moves the respondent up or down a set of positions according to its answers, similar to the game Snakes and Ladders.
Questions include whether the organization has centralised data management operations: if yes, the organization stays in the top quartile; if no, it moves down to the mid quartile. French said, “We’ve never seen better than middle of the pack for any operations that are siloed.” Another question is whether the front office is involved in data quality, with a positive answer knocking the respondent down a peg. He continued, “Does the company follow scrupulous standards formats with documentation that we can see as proof? If yes, they move up a notch. Is it a large acquisitive organization? If yes, they move back, as it makes it very difficult to maintain data quality.”
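The Snakes-and-Ladders movement described above can be sketched as follows. The four-step quartile scale, the size of each move, and the order in which rules are applied are assumptions; only the direction of each move comes from French’s description. The siloed-operations cap is applied last to reflect his remark that such firms are never seen better than middle of the pack.

```python
QUARTILES = ["bottom", "lower-mid", "upper-mid", "top"]

def profile(centralised, front_office_involved,
            documented_standards, large_acquisitive):
    """Illustrative quartile placement from the four profiling questions."""
    pos = 3  # start in the top quartile
    if front_office_involved:
        pos -= 1   # knocked down a peg
    if documented_standards:
        pos += 1   # documented standards move you up a notch
    if large_acquisitive:
        pos -= 1   # acquisitions make data quality hard to maintain
    pos = max(0, min(pos, 3))
    if not centralised:
        pos = min(pos, 2)  # siloed ops: never better than mid-pack
    return QUARTILES[pos]
```

For example, under these assumptions a centralised firm with documented standards and no negatives stays in the top quartile, while the same firm with siloed operations is capped below it.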
The obvious question, then, is why Azdex performs such profiling at all. Said French, “We need to be able to compare organizations within the same groupings to properly benchmark them and to fit the right solution to the problem.”
He continued, “Movement between the segments is harder than we thought and we have zero examples of large organizations doing this. Improvements can, however, be made to the data itself in short time scales and at a low level of risk. We have seen one large firm improve their data by several quartiles although their environment remains in the lower quartile.”
Data from 25 financial institutions was used in compiling these benchmarking metrics.
Another presentation came from Simon Leighton-Porter, who highlighted some of the impact that the upcoming MiFID regulations could have on business entity data. The impact of MiFID is, of course, even more wide-ranging, and is a topic Reference Data Review has started to cover and intends to cover more fully as institutions begin to realise what a large-scale issue MiFID truly is.