Consultancy firm Bloor Research has released the first of four market updates profiling the data management vendor community. The report profiles the ‘data discovery’ vendor community, which it indicates is a new sector that has distinguished itself from data profiling. Data discovery deals with “the discovery of relationships between data elements, regardless of where the data is stored,” says the report.
Data discovery is not as closely linked to data quality as data profiling, according to Bloor, and is more important in the implementation of master data management (MDM) solutions. “It can be used to complement data modelling tools, it may be employed for business intelligence purposes; and it has a significant role to play in supporting data migrations, data archival and data governance,” the report explains.
The discovery of data relationships can be achieved via four different approaches, says Bloor: data modelling; data profiling; structured search; and an approach based on a model driven architecture. Each of these approaches has its benefits and drawbacks, and the 30 main vendors in the space have each adopted one or more of them. Bloor spoke to 21 of the vendors and their feedback is incorporated into the report.
The data modelling approach, which Bloor describes as the “oldest and least efficient”, involves reverse engineering of existing database schemas. According to the report, this approach is not sufficient to fully support data discovery because it analyses metadata rather than the data itself.
Data profiling tools, which are specifically designed to assist data cleansing processes, perform statistical analysis against data sources. The vendors that have adopted this approach have often failed to view data discovery as significant on its own terms and have instead seen it as a function of data quality, says Bloor, and this is a drawback for the area. The approach is also heavily manual and carries all the issues associated with manual processing, says the report.
Structured search uses a federated platform that enables vendors to query multiple, heterogeneous data sources via virtual views. Indexes are built via an automated process against the tables that the firm is interested in and then a search is implemented on the front end. “This sort of approach has the advantage that it can be used for both business intelligence and data discovery purposes. Similarly, it can be employed by both business analysts and data management personnel,” says the report.
Only one vendor, Rever, uses a model driven architecture approach, says Bloor. This involves the reverse engineering of databases using physical, logical and conceptual data models. Rever's approach is useful for migration projects between databases and it allows for the comparison of hierarchical and network databases with relational ones.
Three of the main vendors in the space, Sybase, Oracle and Human Inference, did not agree to participate in the report, says Bloor. Of those that participated: CA and Embarcadero are in the data modelling space; Composite Software is in the structured search space; Rever uses the model driven architecture approach; and the rest all use data profiling. This last group comprises: BDQ, Datiris, Exeros, Sypherlink, x88, Ataccama, Datactics, Datamentors, Trillium, DataFlux, Global IDs, IBM, Informatica, Microsoft, Pervasive, SAP Business Objects and Talend.
The report concludes: “Most of the leading products in this update, from a technical perspective, are offered by smaller vendors. However, such suppliers have obvious drawbacks such as limited geographic coverage, as a result of which many users will continue to prefer a big name provider.”
Bloor highlights a number of the larger vendors to watch in the space, including Trillium, which it describes as “some way ahead of its major competitors”. It also points to Informatica as a possible contender to consider, along with CA, which it describes as a “clear leader”. “Of the smaller players in the data discovery space we would single out Exeros, Global IDs, Sypherlink and x88 (which is very new and therefore has the potential for greater things) amongst the data profiling vendors, as well as Ataccama, particularly when used in conjunction with iWay’s Integration Server for real-time processing,” the report adds.
It also singles out Rever as an important vendor to watch due to its innovative approach.
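To illustrate what the report means by discovering relationships from the data itself rather than from metadata, the following is a minimal sketch and not any vendor's method. It compares the values held in columns from two separately stored tables and flags pairs whose values largely coincide as candidate relationships. The table names, column names, sample values and the `min_overlap` threshold are all hypothetical.

```python
def candidate_relationships(tables, min_overlap=0.8):
    """Find column pairs in different tables whose values largely coincide.

    `tables` maps a table name to {column name: list of values}.
    Overlap is the share of the left column's distinct values that also
    appear in the right column (an illustrative measure, assumed here).
    """
    # Flatten to (table, column, distinct values) so every column can be compared.
    columns = [
        (table, col, set(values))
        for table, cols in tables.items()
        for col, values in cols.items()
    ]
    found = []
    for left_table, left_col, left_vals in columns:
        for right_table, right_col, right_vals in columns:
            # Only compare columns held in different tables; skip empty columns.
            if left_table == right_table or not left_vals:
                continue
            overlap = len(left_vals & right_vals) / len(left_vals)
            if overlap >= min_overlap:
                found.append(
                    (f"{left_table}.{left_col}", f"{right_table}.{right_col}", overlap)
                )
    return found


# Hypothetical data: the two tables share no column names, so schema
# metadata alone would not reveal that `account` refers to `cust_no`.
tables = {
    "crm_customers": {
        "cust_no": [101, 102, 103, 104],
        "region": ["N", "S", "N", "E"],
    },
    "billing_invoices": {
        "account": [101, 102, 102, 104, 101],
        "amount": [250, 90, 40, 310, 75],
    },
}

for left, right, overlap in candidate_relationships(tables):
    print(f"{left} -> {right} (overlap {overlap:.0%})")
```

In this sample every distinct value of `billing_invoices.account` also appears in `crm_customers.cust_no`, so that pair is reported even though the column names differ. A real tool would also need to handle data types, sampling and coincidental matches.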