The knowledge platform for the financial technology industry

A-Team Insight Blogs

Data Integration Tools: Now, Where Did I leave it…?

By Nick Jones, Senior Consultant, Citisoft

Have you ever tried to do a simple job around the house and spent several times longer looking for the right tools and materials than it took to actually do the job? Have you ever wanted to make a change to a system and been frustrated by the time and cost of checking what impact, if any, the change would have on ‘upstream’ and/or ‘downstream’ systems?

The Problem with Data Integration Tools

There is a good selection of feature-rich ‘data management’ tools in the investment management market, each being sold on the promise that it will ‘generate consistency and coherence in all your data’. Indeed, the best integration tools excel in moving data from A to B in an organised and controlled manner.

However, the ensuing integration projects frequently fail to deliver their full benefits on schedule or on budget. When moving data from A to B expands to include moving some of it from B to C as well, and then some similar data has to be moved from D to C, and from D to E, and finally from C to F, there can be a proliferation of ad-hoc point-to-point data flows. Individually, the flows all make perfect sense, and individually they may be developed and deployed efficiently. But the task of managing the data becomes more and more challenging as the number of point-to-point interfaces grows.
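
To make the difficulty concrete, the flows named above can be written down as a small graph. The sketch below is illustrative only: the system names A to F come from the paragraph, while the `FLOWS` list and the `downstream` helper are hypothetical names introduced here. It shows that even five simple flows give a change to system A an indirect reach that no single interface reveals.

```python
from collections import deque

# The point-to-point flows named in the paragraph above, as (source, target) pairs.
FLOWS = [("A", "B"), ("B", "C"), ("D", "C"), ("D", "E"), ("C", "F")]


def downstream(system, flows=FLOWS):
    """Return every system that directly or indirectly receives data from `system`."""
    # Index the flows by source so each system's direct targets are easy to find.
    targets = {}
    for src, dst in flows:
        targets.setdefault(src, set()).add(dst)

    # Breadth-first walk along the flows, collecting each system reached.
    seen = set()
    queue = deque([system])
    while queue:
        current = queue.popleft()
        for nxt in targets.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(downstream("A")))  # ['B', 'C', 'F']
print(sorted(downstream("D")))  # ['C', 'E', 'F']
```

Without a record like `FLOWS`, answering the same question means inspecting every interface by hand, which is where the time and cost of impact checking comes from.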

This is not always the fault of the tools themselves. It is at least partly because the tools are being used incorrectly. What is often not appreciated – and what vendors have little incentive to highlight – is that data integration tools are not, in themselves, a sufficient data management solution, though they do form an essential element of a broader solution.

In an information-centric business, data is one of the main materials used to produce our products. And data integration tools can do a superb job of moving it around … so that we never quite know where we should be looking to find it, in the correct state, when we most need it!  (Any analogy with domestic partners tidying things up should be studiously avoided in this context.)

So, in order to manage data effectively, and enable us to get the maximum benefit from it and from the products it supports, we need a data map, a form of indexing (or metadata) that tells us where we can find any particular piece of data in any specific state.

Why Metadata?

At its core, this data map, or metadata, is a logical model of the data we are using, expressed and described in business terms. As well as telling us what the data means, it can tell us where it is stored, how it reached its location, and where it will be used. As soon as this information is available, it begins to provide benefits: faster requirements definition, more consistent use of data, and easier change impact analysis.
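
As a rough illustration of what such a data map might hold, the sketch below records, for one business term, its meaning, where it is stored in each state, how it arrived, and who uses it. Every name here (`DataElement`, `DATA_MAP`, `where_is`, the `closing_price` entry and its locations) is a hypothetical example, not a description of any particular product or of the author's own model.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One entry in the data map, described in business terms."""
    name: str                                         # business term
    meaning: str                                      # what the data means
    locations: dict = field(default_factory=dict)     # state -> where it is stored
    arrived_via: list = field(default_factory=list)   # flows that delivered it
    used_by: list = field(default_factory=list)       # systems and reports that consume it


DATA_MAP = {}


def register(element):
    """Add an element to the data map, keyed by its business name."""
    DATA_MAP[element.name] = element


def where_is(name, state):
    """Return the location of an element in a given state, or None if unknown."""
    return DATA_MAP[name].locations.get(state)


def impact_of_change(name):
    """Return the consumers that should be checked before changing an element."""
    return list(DATA_MAP[name].used_by)


# Hypothetical entry, for illustration only.
register(DataElement(
    name="closing_price",
    meaning="End-of-day price used for valuation",
    locations={"raw": "vendor_feed_staging", "validated": "warehouse.prices"},
    arrived_via=["vendor_feed -> staging", "staging -> warehouse"],
    used_by=["valuation_report", "performance_system"],
))

print(where_is("closing_price", "validated"))  # warehouse.prices
print(impact_of_change("closing_price"))       # ['valuation_report', 'performance_system']
```

The same structure answers both of the questions raised earlier: where a piece of data can be found in a specific state, and what a change to it would affect.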

Once the data from disparate sources can all be addressed as if it were in a single place, with known and consistent forms and structures, it becomes possible to standardise procedures for operating on items, such as moving them between locations or performing calculations and transformations. A data warehouse can be a good starting point for this metadata structure, if the data model is appropriate.

Once a structure is in place, it is feasible to use it to support automated generation of new, self-documenting data flows, including data interfaces and outbound reports. (Metadata-aware tools not only move the data, they record where they have put it too.)
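
A minimal sketch of the "metadata-aware" idea follows, assuming in-memory dictionaries stand in for real data stores. The `stores`, `lineage` and `move` names are hypothetical. The point is only that the operation that moves the data is the same one that records where the data went, so the documentation cannot drift from the flow.

```python
from datetime import datetime, timezone

# Stand-ins for real data stores; the contents are illustrative.
stores = {
    "staging": {"closing_price": "101.5"},
    "warehouse": {},
}

# The self-documenting part: one record per movement of data.
lineage = []


def move(element, source, target, transform=None):
    """Copy an element between stores and record the movement as metadata."""
    value = stores[source][element]
    if transform is not None:
        value = transform(value)
    stores[target][element] = value  # the source copy is left in place

    lineage.append({
        "element": element,
        "from": source,
        "to": target,
        "transform": getattr(transform, "__name__", None),
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return value


move("closing_price", "staging", "warehouse", transform=float)

print(stores["warehouse"])  # {'closing_price': 101.5}
print(lineage[-1]["from"], "->", lineage[-1]["to"])  # staging -> warehouse
```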

This is not just a theoretical possibility. A Citisoft client is already using this metadata-driven development approach to save 50-75% of the previous development time for new client inbound data flows. A lengthy specification, development and testing cycle has been replaced by an iterative ‘map, run, check’ process, performed by (technically aware) Business Analysts. This allows much closer client and business involvement in the process, and much quicker identification (and resolution) of problems prior to deployment.
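
The article does not describe how the client's ‘map, run, check’ process is implemented, so the following is only a guess at its general shape. It assumes the mapping is a simple declarative table from inbound field names to model field names. `MAPPING`, `REQUIRED`, `run`, `check` and the sample rows are all hypothetical.

```python
# Map: a declarative table an analyst can edit (inbound field -> model field).
MAPPING = {"SecId": "security_id", "Ccy": "currency", "Px": "price"}

# Fields the target model requires to be populated.
REQUIRED = ("security_id", "currency", "price")


def run(mapping, rows):
    """Run: apply the mapping to each inbound row."""
    return [
        {target: row.get(source) for source, target in mapping.items()}
        for row in rows
    ]


def check(rows, required=REQUIRED):
    """Check: list (row index, field) pairs where a required value is missing."""
    problems = []
    for index, row in enumerate(rows):
        for name in required:
            if row.get(name) in (None, ""):
                problems.append((index, name))
    return problems


inbound = [
    {"SecId": "XS001", "Ccy": "GBP", "Px": "101.5"},
    {"SecId": "XS002", "Ccy": "", "Px": "99.0"},
]

loaded = run(MAPPING, inbound)
issues = check(loaded)
print(issues)  # [(1, 'currency')]
```

Because the mapping is data rather than code, an iteration consists of editing the table, re-running, and reviewing the reported issues, which is the kind of loop a technically aware analyst could carry out with the client present.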

Cleaning Up: Enhanced Visibility and Usability

Extracting maximum value from data integration tools requires that they be used in the context of a logical data model. Creating this logical structure is simpler if there is a physical data model (e.g. a data warehouse) underpinning it.

With the metadata structure in place it is simpler to: co-ordinate multiple data integrations and extracts, ensure consistency of mapping, avoid multiple ad-hoc point-to-point data flows, and massively increase the visibility and usability of data.
