Following its publication of a report on the data discovery vendor community, consultancy firm Bloor Research has produced three more reports on other areas of the data management spectrum: data profiling, data cleansing and data quality. The smaller vendors in the reports seem to appear in the most favourable light due to their targeted coverage of their respective areas.
The data profiling report indicates that this area is distinct from data discovery because of its close association with data cleansing. Vendor solutions in the profiling space attempt to discover relationships between data elements (much the same as data discovery solutions) but also perform statistical analysis of that data to determine whether it is in line with expectations.
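As a rough illustration of the statistical side of profiling described above, the following minimal Python sketch summarises a single column and compares the result with expected thresholds. It is not drawn from the Bloor report or from any vendor's product: the function names, the sample `country` column and the threshold values are all hypothetical, and relationship discovery between data elements is not covered.

```python
from collections import Counter


def profile_column(values):
    """Return simple summary statistics for one column of raw values."""
    # Treat None and empty strings as missing (an assumption for this sketch).
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "null_rate": 1 - len(non_null) / len(values) if values else 0.0,
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


def check_expectations(profile, max_null_rate, max_distinct):
    """Compare a profile with hypothetical expectations; list what fails."""
    issues = []
    if profile["null_rate"] > max_null_rate:
        issues.append("null_rate")
    if profile["distinct"] > max_distinct:
        issues.append("distinct")
    return issues


# Hypothetical sample column with missing values and inconsistent casing.
country = ["UK", "UK", "US", "", "uk", None, "UK", "DE"]
profile = profile_column(country)
issues = check_expectations(profile, max_null_rate=0.1, max_distinct=3)
print(profile)
print(issues)
```

In this sample both checks fail: a quarter of the values are missing, and the inconsistent casing of "uk" inflates the distinct count. These are the kinds of findings that would typically be handed on to a cleansing step.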
The report indicates that the similarity of data profiling to data discovery has been a hindrance for many of the vendors in the space because they have not fully exploited its potential. Philip Howard, author of the reports, examines four key aspects of the solutions in the space against which to judge their success: scalability and support for multiple, heterogeneous data sources; the data discovery facilities provided; support for collaboration between data users and stewards; and the level of drill-down available into statistical data.
“Flexibility will mean that the tool is more suitable for a wider range of tasks. If you are going to use data profiling as a part of broader data quality initiatives then you should be able to run data cleansing and matching routines without having to re-parse the information that you have already parsed for profiling purposes,” the report contends.
Bloor examines 18 of the 20 main vendors in the data profiling space and places them into three camps: those that only offer data profiling; those that focus on data quality; and those that offer a broader set of capabilities. The first category includes BDQ, Datiris, Exeros, Sypherlink and x88; in the second are Datactics, Datamentors and Trillium; and in the third group are Ataccama, DataFlux, Global IDs, IBM, Informatica, Microsoft, Pervasive, SAP Business Objects and Talend. It also highlights the partnerships that have sprung up between the vendors, including CA and Exeros, BDQ and Datactics, and Ataccama and iWay.
The report contends that the smaller vendors, which have tended to place more emphasis on profiling, are a better bet than the larger, more generally focused players in the market. Of the big-name providers, Bloor highlights Trillium as an example of a vendor “some way ahead of its major competitors in terms of data discovery”. It also references IBM and CA: “In the latter case thanks to its partnership with Exeros, though CA will be focused on data discovery to augment data modelling rather than for other purposes.”
Of the smaller players in the market, the firm singles out Exeros, Global IDs, Sypherlink and x88 as the leading innovators in this market, along with Ataccama, “especially when used in conjunction with iWay’s Integration Server”.
The data cleansing vendor report includes many of the same players but examines their capabilities for matching, standardisation and data enrichment. The report is largely focused on name and address cleansing and highlights SAP Business Objects, IBM, DataFlux, Trillium, Informatica, Microsoft and Oracle as the biggest players in this market. For the financial services community in particular, Bloor spotlights Silver Creek as a smaller, semantically focused vendor that is worthy of note.
Instead of merely profiling the vendor landscape, the data quality report asks whether firms should opt for a platform at all and, if so, how broad it should be. It discusses the pros and cons of an all-in-one data management approach versus a best-of-breed approach, which may entail integration complications. Bloor also discusses the influence that geographic presence and coverage may have on vendor choice.
“As may be imagined there are fewer vendors in this report than in the preceding ones, primarily because of the number of vendors who specialise in only one part of the market. We also have three notable omissions that declined to participate in one or more parts of this series and therefore could not be included here: Microsoft, Oracle and Pervasive,” the report says.
Of those included in the survey, it cites Global IDs, Trillium and Informatica as the “leading products” on an all-round basis. It also highlights DataFlux and Datactics as “worth consideration” and gives Datanomic a mention for its “ease of use”.