
The data ecosystem has been awash with references to “artificial intelligence readiness” in the past few months, a reflection of the importance being placed on the technology within capital and private markets.
The term is generally used in calls for institutions to upgrade their data management systems to ensure their data is of sufficient quality to be utilised by the AI models they hope to deploy.
The concern around data quality was illustrated in a recent survey by Semarchy, a master data management (MDM) and data ingestion specialist, which found that while 74 per cent of businesses are investing in AI this year, 98 per cent say poor AI data quality is undermining success.
The discussion around data quality improvement for AI can be simplistic, however. Within it lies a welter of different, often conflicting implications. What, for instance, does “good quality” mean when it comes to the data? What role does a firm’s existing technology play in determining that? And how can data chiefs establish how to make the necessary changes to their own systems? These are just a few of the questions that Craig Gravina, chief technology officer at Semarchy, took on during a conversation with Data Management Insight. Here, Craig also argues that another critical question mark hangs over the discussion: how much data autonomy will be lost in changing data to suit a specific use?
Data Management Insight: What do you see as the key problem with the way firms approach their preparation for AI integration?
Craig Gravina: The way that I think this is evolving is that if all you’re doing from a data platform perspective is consolidating high-quality data and then letting the data science teams or others move it outside of the system, then you’ve eliminated all of the controls and governance that ensure data trust, lineage and explainability of AI. It all goes out the window.
DMI: What is at stake for organisations that don’t grasp this principle?
CG: AI is here to stay. Models are becoming commoditised and organisations are focusing on the impact their data has on the inferences that AI makes. Mitigating risk related to inaccurate data, data privacy, and emerging regulation and compliance requirements is becoming just as important as the innovation itself.
This responsibility is shifting to the data professionals that have traditionally managed and governed the data assets of the company.
While the emphasis for these professionals has been on delivering AI-ready data, without the ability to manage and govern how that data is being used, the genie is out of the bottle.
DMI: What is the general solution to this challenge?
CG: Rather than thinking of your master data management or data platform as a way to produce better data to be used somewhere else, you need to consider it a platform for delivering assets that are leverageable by data science teams and by agentic AI directly.
DMI: What should organisations be striving for as they take this approach?
CG: The need for high-quality, trusted data for AI is becoming more and more prevalent. Organisations are aggregating data from lots of different source systems and consolidating it somewhere – in our case, the MDM – where they’re reconciling duplicates and delivering certified data.
This certainly addresses a number of use cases for the organisation, but without considering the data science and AI use cases, those teams will say ‘I need data to train an AI model or generate vector databases’, and they’ll basically just take extracts of the data assets. How can data teams be responsible for the quality of the data and the impact in the AI ecosystem once it’s left the system?
The thought process is: how does your MDM or your data platform treat the data science teams, and the AI itself, as just another consumer, another persona that you serve in the business?
DMI: Semarchy offers clients the Semarchy Data Platform (SDP), which gives them an enterprise-wide platform on which they can deliver data across users, automate workflows through AI-native applications and accelerate time-to-market for innovations using DataOps tools. How does it help clients approach AI readiness?
CG: Semarchy’s approach allows the data professional, who’s being told they now have responsibility for the data that AI is consuming, to serve these personas while still maintaining governance, quality and trust. So if you look at it through the lens of how AI and data platforms are converging, it means extending the MDM use cases to not only deliver golden data records, but also to be equally responsible for curating data assets in a consumable form for data science and AI ecosystems.
This means delivering vector data, RAG and MCP directly from the MDM, and not just thinking about delivering data sets. If you take this convergence even further, core aspects of MDM, data governance and data integration are just as relevant to your AI models and agentic resources.
It’s not a far leap to think of data platforms extending to master model management and AI governance as well.
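The idea of treating data science teams and AI as just another governed consumer of the MDM, rather than letting extracts leave the platform, can be sketched in miniature. The snippet below is purely illustrative (none of these class or function names are Semarchy APIs): one governed data product serves both classic certified records and a vector-style view for a RAG pipeline, while every access is logged and lineage travels with each record.

```python
# Hypothetical sketch: a governed data product that serves AI consumers
# directly instead of handing out ungoverned extracts. All names here are
# illustrative assumptions, not Semarchy (or any vendor's) APIs.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class GoldenRecord:
    record_id: str
    payload: dict
    source_systems: list  # lineage: systems the record was consolidated from

@dataclass
class DataProduct:
    name: str
    records: list
    access_log: list = field(default_factory=list)  # governance trail

    def golden_records(self, consumer: str) -> list:
        """Classic MDM consumer: certified, de-duplicated records."""
        self.access_log.append((consumer, "golden_records"))
        return self.records

    def vector_view(self, consumer: str, embed: Callable[[dict], list]) -> list:
        """AI consumer: the same governed records, exposed as
        (vector, record id, lineage) tuples for a vector store or RAG
        pipeline; no raw extract ever leaves the platform."""
        self.access_log.append((consumer, "vector_view"))
        return [(embed(r.payload), r.record_id, r.source_systems)
                for r in self.records]

# Toy embedding function standing in for a real model.
def toy_embed(payload: dict) -> list:
    return [float(len(str(v))) for v in payload.values()]

product = DataProduct("customers", [
    GoldenRecord("c1", {"name": "Acme Ltd", "country": "UK"}, ["crm", "erp"]),
])
vectors = product.vector_view("rag-pipeline", toy_embed)
```

The point of the sketch is the shape, not the detail: the AI pipeline is one more persona on the access log, and the lineage that underpins trust and explainability stays attached to the data it consumes.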
DMI: You place a lot of store in data products. Can you explain why?
CG: We think of data products as much more than just data sets or AI-ready data: they also expose vector data, RAG, and MCP servers to satisfy the needs of data science and agentic ecosystems. This approach enables data teams to serve the teams charged with innovation and AI advancement with AI-native data from the platform they already work with.
And by exposing these AI-native data assets through self-discoverable and governed data product catalogues, these teams can self-serve their individual needs and focus on innovation rather than massaging data or injecting risk into the data pipelines that traditionally seed their work.