Data Governance & Lineage

Data Management Insight Knowledge Hub

Data Lineage

In a nutshell: Data lineage traces data from source to destination, noting every move the data makes and taking into account any changes to the data during its journey for full traceability. It is critical to regulatory compliance and offers numerous business and operational benefits.

Read on in our Knowledge Hub ‘Everything you need to know’ section to understand the full details of what data lineage is all about, who it impacts, the key requirements, the technical and data challenges it presents, and the outlook.

You can also take a look at all the latest content we have related to data lineage. And you can see a listing of key vendors delivering solutions to this data and technological challenge.

Our Data Lineage Knowledge Hub delivers everything you need to know about data lineage, with a full overview, key resources from across the A-Team Insight platform, and a list of solution providers.

Key resources

Blogs

Data Management Insight

Data Fabric vs. Data Mesh: 10 Companies Provisioning Modern Data Architectures for Enterprise AI

As institutions absorb ever greater volumes of data to meet their increasingly complex operational needs and those of regulators, they face a dilemma of how to store and distribute that critical information. Fragmented legacy systems have long been an impediment to the smooth management of data and now corralling multiple-cloud configurations can be added to…

03 February 2026

Data Management Insight

Agentic AI Deployment Presents Potentially Dangerous Data ‘Trust Paradox’

Artificial intelligence deployment in capital markets’ data processes may be approaching an inflection point that, if not managed properly, could introduce dangerous risks to institutions’ operations. The growing deployment of anonymous agents has the potential to hardwire data errors into workflows, magnifying data weaknesses as the automating technology scales processes, according to Informatica from Salesforce. The…

03 February 2026

Data Management Insight

7Rivers Q&A: Enabling Modern Data Processing

Milwaukee, Wisconsin-based 7Rivers gives its clients the tools to draw actionable insights and real-world applications from their data. A-Team Group Data Management Insight spoke to Jessica Emhoff, Vice President of Marketing, about the company and how it is empowering financial institutions. Data Management Insight: Hello Jessica. Can you tell us a bit about how 7Rivers…

03 February 2026

White papers

The Transparency Imperative: Forging Competitive Advantage in Private Markets Through Next-Generation Data Management

Private assets markets are growing in importance to the global economy, with assets under management at their highest on record. However, these markets – which comprise a growing list of asset classes that include private equity and hedge fund holdings, private credit exposure, property and other alternatives – are founded on outdated and high-risk data…

01 December 2025

Data Management Insight

Dispelling Myths About North American Securities Class Actions

Securities class actions serve an important role in protecting and enhancing shareholder value. However, recovering settlement proceeds can sometimes be overlooked due to the perceived complexity, limited resources, or lack of in-house expertise. In particular, North America has one of the most extensive and developed class action frameworks in the world. But despite the system…

13 November 2025

Data Management Insight RegTech Insight

The Data Transformation Imperative: From Operational Burden to Strategic Advantage

A new report by SimCorp examines the investment management industry’s data opportunities – and challenges – in a fast-changing financial landscape, offering guidance on how companies can plot a course towards future-proofing their data capabilities and making immediate operational improvements. New data complexity challenges are presenting themselves to investment managers at an immense scale, the…

21 October 2025

Data Management Insight

Webinars

Recorded Webinar: Unpacking Stablecoin Challenges for Financial Institutions

The stablecoin market is experiencing unprecedented growth, driven by emerging regulatory clarity, technological maturity, and rising global demand for a faster, more secure financial infrastructure. But with opportunity comes complexity, and a host of challenges that financial institutions need to address before they can unlock the promise of a more streamlined financial transaction ecosystem. These…

18 November 2025

Data Management Insight TradingTech Insight

Recorded Webinar: Unlocking Transparency in Private Markets: Data-Driven Strategies in Asset Management

As asset managers continue to increase their allocations in private assets, the demand for greater transparency, risk oversight, and operational efficiency is growing rapidly. Managing private markets data presents its own set of unique challenges due to a lack of transparency, disparate sources and lack of standardization. Without reliable access, your firm may face inefficiencies,…

15 October 2025

Data Management Insight

Recorded Webinar: End-to-End Lineage for Financial Services: The Missing Link for Both Compliance and AI Readiness

The importance of complete, robust end-to-end data lineage in financial services and capital markets cannot be overstated. Without the ability to trace and verify data across its lifecycle, many critical workflows – from trade reconciliation to risk management – cannot be executed effectively. At the top of the list is regulatory compliance. Regulators demand a…

08 October 2025

Data Management Insight

Guides

Data Lineage Handbook

Data lineage has become a critical concern for data managers in capital markets as it is key to both regulatory compliance and business opportunity. The regulatory requirement for data lineage kicked in with BCBS 239 in 2016 and has since been extended to many other regulations that oblige firms to provide transparency and a data…

12 October 2018

Data Management Insight RegTech Insight

Regulatory Data Handbook 2025 – Thirteenth Edition

Welcome to the thirteenth edition of A-Team Group’s Regulatory Data Handbook, a unique and practical guide to capital markets regulation, regulatory change, and the data and data management requirements of compliance across Europe, the UK, US and Asia-Pacific. This year’s edition lands at a moment of accelerating regulatory divergence and intensifying data-focused supervision. Inside,…

16 September 2025

Data Management Insight RegTech Insight

AI in Capital Markets: Practical Insight for a Transforming Industry – Free Handbook

AI is no longer on the horizon – it’s embedded in the infrastructure of modern capital markets. But separating real impact from inflated promises requires a grounded, practical understanding. The AI in Capital Markets Handbook 2025 provides exactly that. Designed for data-driven professionals across the trade life-cycle, compliance, infrastructure, and strategy, this handbook goes beyond…

15 April 2025

Data Management Insight RegTech Insight TradingTech Insight

Everything you need to know about: Data Governance & Lineage

What is data lineage?

Data lineage covers the lifecycle of data, from its origins, through to what happens to the data when it is processed by different systems, and where it moves from and to over time. It can be applied to most types of data and systems, and is particularly valuable in complex, high volume data environments. It is also a key element of data governance, providing an understanding of where data comes from, how systems process the data, how it is used and by whom.

The importance of data lineage has escalated in recent years as regulators demand full transparency and audit trails of the data behind all trading decisions.

Over time, firms have also come to understand the value and benefits it can deliver, and the acceleration of automation has advanced use cases. Beyond compliance, extensive data lineage can provide operational transparency and reduce risk and costs. From a business perspective, data lineage can improve data quality and allow the business to make better decisions and spot new business opportunities and strategies.

Data lineage is often represented visually to show the movement of data from source to destination, how the data is changed and transformed by processes or users as it moves from one system to another across an enterprise, and how it splits or converges after each move. Visualisation can demonstrate data lineage at different levels of granularity. At a high level, it might show which systems data interacts with before it reaches its destination. As granularity increases, it becomes possible to provide detail around the particular data, such as its attributes and the quality of the data at specific points in the lineage.

By building a picture of how data flows through an organisation and is transformed from source to destination, it is possible to create complete audit trails of data points, an aspect of lineage that has become increasingly necessary to meeting regulatory requirements and ensuring data integrity for the business.
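
As a rough illustration of the idea, lineage can be modelled as a directed graph of data movements and walked backwards from a destination to produce an audit trail. The sketch below is a minimal example under stated assumptions: the system names and the `FLOWS` list are hypothetical, and the flows are assumed to contain no cycles.

```python
from collections import defaultdict

# Hypothetical system names; each pair means "data moves from source to target".
FLOWS = [
    ("vendor_feed", "security_master"),
    ("order_system", "trade_store"),
    ("security_master", "risk_engine"),
    ("trade_store", "risk_engine"),
    ("risk_engine", "regulatory_report"),
]


def build_upstream_index(flows):
    """Map each target system to the set of systems that feed it directly."""
    upstream = defaultdict(set)
    for source, target in flows:
        upstream[target].add(source)
    return upstream


def trace_to_sources(node, upstream):
    """Return every path from an original source to `node` (assumes no cycles)."""
    parents = upstream.get(node, set())
    if not parents:
        return [[node]]  # nothing feeds this node, so it is an origin
    paths = []
    for parent in sorted(parents):
        for path in trace_to_sources(parent, upstream):
            paths.append(path + [node])
    return paths


if __name__ == "__main__":
    index = build_upstream_index(FLOWS)
    for path in trace_to_sources("regulatory_report", index):
        print(" -> ".join(path))
```

Each returned path is one branch of the audit trail, which also shows where flows converge (here, at the assumed `risk_engine` node).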

The necessary scope of data lineage can be determined by regulatory requirements, enterprise data management strategy, data impact and critical data elements. It is not necessary to boil the ocean – instead, best practice identifies regulatory requirements and business processes to which the application of data lineage is beneficial.

Who is involved in data lineage?

Reflecting the regulatory compliance and business use cases of data lineage, related job titles include:

Business analyst
Business intelligence developer
Compliance officer
Data analyst
Data architect
Data governance analyst
Data modeller
Data quality analyst
Solutions architect

Regulations driving adoption

Data lineage was initially implemented by financial institutions to track data across individual data management projects. It rose to prominence and became part of the regulatory landscape following the implementation of BCBS 239 in January 2016, a Basel Committee on Banking Supervision (BCBS) rule designed to improve data aggregation and reporting across financial markets, as well as accountability for data.

These requirements were the early drivers of improved data lineage, which has since been reinforced by a number of regulations that require firms to implement lineage to demonstrate exactly how they came to the results published in regulatory reports. Data lineage allows firms to not only prove the validity of report entries, but also take a proactive approach to identifying and fixing any gaps in required data.

BCBS 239

Regulatory requirement: Basel Committee on Banking Supervision rule 239 (BCBS 239) came into force on January 1, 2016 and is designed to improve risk data aggregation and reporting. It is based on 14 principles that underpin accurate risk aggregation and reporting in normal times and times of crisis. To achieve compliance, banks must capture risk data across the organisation, establish consistent data taxonomies, and store data in a way that makes it easily accessible and straightforward to understand.

Data lineage response: Data lineage must be implemented to support risk aggregation, data accuracy and reporting, and conversely, to ensure risk data can be traced back to its origin and risk reports can be defended.

GDPR

Regulatory requirement: General Data Protection Regulation (GDPR) is an EU data privacy regulation that came into force on May 25, 2018. It is designed to harmonise data privacy laws across Europe and protect EU citizens’ data privacy. The requirements of GDPR include gaining explicit consent to process personal data, giving data subjects access to their personal data, ensuring data portability, notifying authorities and individuals of data breaches, and giving individuals the right to be forgotten.

Data lineage response: Firms subject to GDPR are dependent on data lineage to track data and provide transparency about where it is and how it is used. Data lineage provides firms with the ability to demonstrate compliance with the regulation and, from a data subject’s perspective, supports access to personal data and the execution of other rights such as the right to be forgotten.

MiFID II

Regulatory requirement: Markets in Financial Instruments Directive II (MiFID II) is a principles-based directive issued by the EU. It took effect on January 3, 2018, and aims to increase transparency across Europe’s financial markets and ensure investor protection. The demand for reference and market data for both pre- and post-trade transparency, including trade reporting and transaction reporting, is unprecedented, leading to data management challenges including sourcing required data, reporting in near real-time, and uploading reference and market data to MiFID II mechanisms including Approved Publication Arrangements (APAs) and Approved Reporting Mechanisms (ARMs).

Data lineage response: MiFID II operations can benefit from data lineage in a number of ways. Lineage can be used to identify any gaps in trade reporting data, and any similarities across numerous regulatory reporting obligations. It can also be used to map MiFID II reporting data from source systems to APAs and ARMs and vice versa.
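
One way to picture gap identification is to compare the fields a report requires with the fields that have a documented source. The sketch below is illustrative only: the field names, source systems and mapping are hypothetical and are not taken from any MiFID II specification.

```python
# Hypothetical report fields and mappings, for illustration only.
REQUIRED_FIELDS = {"instrument_id", "price", "quantity", "trade_time", "venue"}

# Lineage mapping: report field -> (source system, source attribute)
FIELD_LINEAGE = {
    "instrument_id": ("security_master", "isin"),
    "price": ("trade_store", "exec_price"),
    "quantity": ("trade_store", "exec_qty"),
    "trade_time": ("trade_store", "exec_timestamp"),
}


def find_lineage_gaps(required, lineage):
    """Return required report fields that have no documented source."""
    return sorted(field for field in required if field not in lineage)


def fields_by_source(lineage):
    """Group report fields by the system that supplies them."""
    grouped = {}
    for field, (system, _attribute) in lineage.items():
        grouped.setdefault(system, []).append(field)
    return {system: sorted(fields) for system, fields in grouped.items()}


if __name__ == "__main__":
    print("Unmapped fields:", find_lineage_gaps(REQUIRED_FIELDS, FIELD_LINEAGE))
    print("Fields by source:", fields_by_source(FIELD_LINEAGE))
```

Running the same check against the field lists of several reports would also surface fields that share a source, which is one way to spot overlap between reporting obligations.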

CCAR

Regulatory requirement: The Comprehensive Capital Analysis and Review (CCAR) is an annual exercise carried out by the Federal Reserve to assess whether the largest bank holding companies (BHCs) operating in the US have sufficient capital to continue operations throughout times of economic and financial stress, and have robust, forward-looking capital planning processes that account for their unique risks. From a data management perspective, CCAR requires data sourcing, analytics and risk data aggregation for stress tests designed to assess the capital adequacy of BHCs and for regulatory reporting purposes.

Data lineage response: CCAR requires attribute-level data lineage to track data from source to destination and ensure the validity and veracity of capital plans. Data lineage can also be used to identify any data gaps in reporting and highlight any data quality issues.
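
Attribute-level lineage can be thought of as a chain of steps, each recording which input attributes produced an output attribute and under what rule. The following is a minimal sketch, assuming a hypothetical `AttributeStep` record and invented dataset and attribute names; it is not a representation of any actual CCAR data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeStep:
    """One hop of attribute-level lineage."""
    output: str          # "dataset.attribute" produced by this step
    inputs: tuple        # "dataset.attribute" names consumed by this step
    transformation: str  # free-text description of the rule applied


# Hypothetical capital-plan lineage; all names are illustrative.
STEPS = [
    AttributeStep("staging.exposure_usd",
                  ("loans.balance", "fx.rate_to_usd"),
                  "balance * rate_to_usd"),
    AttributeStep("risk.stressed_loss",
                  ("staging.exposure_usd", "scenarios.loss_rate"),
                  "exposure_usd * loss_rate"),
    AttributeStep("capital_plan.projected_loss",
                  ("risk.stressed_loss",),
                  "sum by portfolio"),
]


def explain(attribute, steps, depth=0):
    """Return indented lines describing how `attribute` was derived."""
    producers = {step.output: step for step in steps}
    step = producers.get(attribute)
    indent = "  " * depth
    if step is None:
        # No step produces this attribute, so treat it as a source.
        return [f"{indent}{attribute} (source)"]
    lines = [f"{indent}{attribute} <= {step.transformation}"]
    for name in step.inputs:
        lines.extend(explain(name, steps, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(explain("capital_plan.projected_loss", STEPS)))
```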

FRTB

Regulatory requirement: The Fundamental Review of the Trading Book (FRTB) was originally scheduled to take effect in 2022; the Basel Committee later deferred its implementation date to January 2023, and timing varies by jurisdiction. FRTB is a response to the 2008 financial crisis, which exposed fundamental weaknesses in the design of the trading book regime, and focuses on a revised internal model approach to market risk and capital requirements, a revised standardised approach, a shift from value at risk to an expected shortfall measure of risk, incorporation of the risk of market illiquidity, and reduced scope for arbitrage between banking and trading books.

The data management challenges of the regulation are significant and include data sourcing, facilitating capital calculations, and gathering historical data as well as real price observations for executed trades or committed quotes to meet requirements around non-modellable risk factors (NMRFs) and the linked risk factor eligibility test.

Data lineage response: To satisfy the demands of FRTB, data lineage may be needed to track historical data and trade data aggregation required for the risk factor eligibility test of NMRFs, essentially the provision of at least 24 real price observations of the value of the risk factor over the previous 12 months.
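
The observation count described above can be sketched as a simple window check. This is a simplified illustration, not the regulatory test itself: it covers only the 24-observations-in-12-months count stated in the text, approximates 12 months as 365 days, and assumes at most one observation is counted per date. The function names are hypothetical.

```python
from datetime import date, timedelta

MIN_OBSERVATIONS = 24   # threshold as stated in the text above
WINDOW_DAYS = 365       # "previous 12 months", approximated as 365 days


def count_recent_observations(observation_dates, as_of):
    """Count distinct observation dates in the 365 days up to and including `as_of`."""
    window_start = as_of - timedelta(days=WINDOW_DAYS)
    return len({d for d in observation_dates if window_start < d <= as_of})


def meets_observation_count(observation_dates, as_of):
    """True if the risk factor has at least MIN_OBSERVATIONS dates in the window."""
    return count_recent_observations(observation_dates, as_of) >= MIN_OBSERVATIONS


if __name__ == "__main__":
    as_of = date(2025, 12, 31)
    fortnightly = [as_of - timedelta(days=14 * i) for i in range(26)]
    monthly = [as_of - timedelta(days=30 * i) for i in range(13)]
    print("Fortnightly prices pass:", meets_observation_count(fortnightly, as_of))
    print("Monthly prices pass:", meets_observation_count(monthly, as_of))
```

Lineage matters here because each counted date should be traceable to an executed trade or committed quote in a source system.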

Business use cases of data lineage

Beyond regulatory compliance, data lineage offers business benefits, but it must be approached as a long-term activity rather than a point solution if it is to provide ongoing value.

Among the business benefits of successful data lineage implementation are:

Understanding data: It may sound simple, but understanding data that is used and stored across an organisation can be very difficult when it includes masses of internal data, several sources of external data, data silos and data in different formats. By applying data lineage, it is possible to gain a greater understanding of the data a company holds, where it is, what it is used for, its value and potential. With a good understanding of data, it is also possible to assign responsibility for data ownership to individuals, departments or lines of business within the organisation.

Improved business decisions: By providing access to accurate, trusted data quickly and efficiently, data lineage allows the business to make smarter, faster and better-informed decisions. Decisions can be made more proactively where there is data lineage and defended on the basis of being able to determine the exact data underlying a decision.

Identifying business opportunities: Using data lineage to gain a better understanding of data, and to visualise data and processes, can provide new business opportunities, such as the potential to create new products by combining certain data and processes, or the possibility of finding an external partner to upscale and commercialise specific datasets.

Data discovery: Data lineage provides the ability to decide what data is important and find the right data quickly. This is crucial to business decisions and can help firms remain competitive and identify new business opportunities.

Improved analytics: More reliable and better quality data that is understood and easily accessible supports improved analytics and the knock-on effect of better business decisions.

Increased efficiency: By eliminating duplicated data and redundant data and systems, and providing a clear view of data and how it changes and moves around an organisation, data lineage can provide increased operational efficiency that can support both cost reduction and business needs for fast access to trusted data.

Impact assessment: Data lineage can be used to study how changes to IT systems or business processes could affect specific products or reports downstream.

Cost reduction: Data lineage offers a number of ways to reduce costs. The need to review data across an organisation as a first step of data lineage allows firms to identify and delete any duplicated data, focus on data silos and decide their fate, and discover unused data that can be eradicated and redundant systems that can be switched off. This will optimise a firm’s data footprint and reduce the costs of data management.

Understanding data provides an opportunity to review licensed data, which may be licensed more than once in any one organisation or not used to any great extent, avoid the penalties of using unlicensed data, and renegotiate licenses with data vendors to make external data provision more cost effective.

Data lineage and data discovery can also support new projects at lower cost as some required data and processes can be identified and reused.

Business intelligence and change management: The ability of data lineage to expose an organisation’s data lends itself well to business intelligence and change management. What-if analyses can be made using existing data and processes, starter projects can be undertaken to predict outcomes of change, and favourable projects can be developed quickly using existing and new resources. Rather than calling on IT to build new systems from scratch, the business can discover how new commercial concepts could work before investing in systems.

Data ownership: By clarifying where data is, who uses it and what for, data lineage can allow data ownership to be handed over to relevant individuals, departments or lines of business that can best exploit the data.
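
Two of the benefits listed above, impact assessment and cost reduction, lend themselves to a short sketch. Assuming lineage is held as a list of source-to-target flows, a forward traversal shows what a change would affect downstream, and datasets that never reach a consumed output are candidates for review. The flow names and the `CONSUMED_OUTPUTS` set are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical flows: (source, target)
FLOWS = [
    ("vendor_feed", "security_master"),
    ("security_master", "risk_engine"),
    ("security_master", "client_reporting"),
    ("risk_engine", "regulatory_report"),
    ("legacy_extract", "archive_copy"),
]
# Assumed set of outputs that the business actually consumes.
CONSUMED_OUTPUTS = {"regulatory_report", "client_reporting"}


def downstream_of(node, flows):
    """Breadth-first search for everything fed, directly or indirectly, by `node`."""
    children = defaultdict(set)
    for source, target in flows:
        children[source].add(target)
    seen, queue = set(), deque([node])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def unused_datasets(flows, consumed_outputs):
    """Datasets from which no consumed output can be reached."""
    nodes = {name for flow in flows for name in flow}
    return sorted(
        name for name in nodes
        if name not in consumed_outputs
        and not (downstream_of(name, flows) & consumed_outputs)
    )


if __name__ == "__main__":
    print("Changing security_master affects:",
          sorted(downstream_of("security_master", FLOWS)))
    print("Candidates for review:", unused_datasets(FLOWS, CONSUMED_OUTPUTS))
```

A dataset flagged here is only a candidate for retirement; lineage may simply be incomplete for it, so the result is a prompt for investigation rather than a decision.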

What are the challenges of implementing data lineage?

The challenges of implementing data lineage fall into three buckets – operations, technology and data management.

Operations

The operational challenges of data lineage start with winning management buy-in and funding for a solution that can be expensive, requires significant human input, and offers only a modicum of advantage in early implementation.

The best approach here is to educate management and start small. Decide whether a pilot project is going to provide insight into business opportunities or achieve an element of regulatory compliance, prioritise the most important and relevant data, scope the project carefully, and identify stakeholders that should be involved.

In the first instance, it may be useful to assess where required data comes from manually and create baseline data lineage before considering automation. It is also important to make sure the pilot project is scalable for other data sources or areas of the organisation before making a business case for lineage.

Proving the concept of data lineage and demonstrating quick wins to the business should, hopefully, be enough to start the journey towards a larger data lineage programme spanning part or all of the organisation.

Technology

The technology challenges of data lineage arise from growing numbers of regulations with overlapping requirements, and from smarter auditors and regulators asking for responses to questions on demand. Technology innovation adds to the challenge, with cloud-based applications and services, big data systems, machine learning, artificial intelligence and natural language processing technologies creating complex infrastructure. Data can be managed in new and interesting ways, but keeping track of it and ensuring it can be trusted is increasingly difficult.

At the heart of addressing these challenges is the selection of a solution, or solutions, to support an organisation’s data lineage. Questions to consider include:

How much lineage is already in place?
To what extent will manual lineage be necessary?
How will lineage be documented?
How will it need to be scaled?
How will impact assessment be managed?
What is the long-term aim for automation?
Which areas of the organisation will be covered, and at what level in terms of technical and business lineage?
How will data lineage be sustained?
What skills are required?
How much will it cost?

There are no catch-all answers to these questions and few organisations will find answers to all of them in one solution, leading most firms to implement a combination of in-house systems and vendor solutions.

Whatever the selected solution, however, it will not provide value in isolation. It is important to consider how data lineage and its metadata will integrate with the rest of an organisation’s business metadata as this will provide rich data and the ability to slice and dice the data. Data lineage also needs to run alongside an organisation’s systems development lifecycle plan to ensure it is maintained as technologies are changed.

And, of course, scalable and flexible technology is essential, not only to master growing volumes of existing data types, but also to embrace additional datasets, alternative data, regulatory change and new regulatory requirements.

Data management

Implementing automated data lineage is a complex data management task that can involve huge volumes of data, multiple legacy systems, mountains of spreadsheets, siloed data, uncharted data flows and mixed data formats, as well as the creation of metadata to describe the data.

Early considerations include identifying all the data across an organisation, assessing data quality and bringing manual processes into an automated lineage framework wherever possible.

An inventory of data can start the process of identifying which data is important to the business and should be part of a data lineage programme, which data can be left as is, and which data can be scrapped. Challenges here include mining outsourced and black box data, which can be difficult, if not impossible, to capture.

As well as identifying data that can be scrapped, the initial data inventory can uncover redundant systems that can be switched off, reducing the operations burden and the cost of systems infrastructure.

As data lineage is built out, data quality must be constantly monitored to facilitate lineage that is fit for purpose. Data quality can be addressed separately to data lineage, perhaps using the concept of a ‘data quality firewall’ based on a data management platform that enforces data policies and ensures data quality controls are executed before data is input to systems. Alternatively, it can be addressed within a data lineage framework using rules, controls and alerts.
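
The ‘data quality firewall’ concept can be sketched as a set of rules applied to each record before it is admitted to a downstream system. The example below is a minimal illustration, not a description of any vendor platform: the rule names, record fields and thresholds are assumptions.

```python
# Each rule is (name, predicate); the rules below are illustrative assumptions.
RULES = [
    ("isin_present", lambda r: bool(r.get("isin"))),
    ("isin_length_12", lambda r: len(r.get("isin") or "") == 12),
    ("price_positive",
     lambda r: isinstance(r.get("price"), (int, float)) and r["price"] > 0),
]


def check_record(record, rules=RULES):
    """Return the names of all rules the record fails."""
    return [name for name, predicate in rules if not predicate(record)]


def firewall(records, rules=RULES):
    """Split records into those admitted and those rejected, with reasons."""
    accepted, rejected = [], []
    for record in records:
        failures = check_record(record, rules)
        if failures:
            rejected.append((record, failures))
        else:
            accepted.append(record)
    return accepted, rejected


if __name__ == "__main__":
    incoming = [
        {"isin": "US0378331005", "price": 189.5},
        {"isin": "", "price": 10.0},
        {"isin": "GB0002634946", "price": -1},
    ]
    ok, bad = firewall(incoming)
    print("Accepted:", ok)
    for record, reasons in bad:
        print("Rejected:", record, "because", reasons)
```

Keeping the failed rule names alongside each rejected record gives the alerts mentioned above something concrete to report.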

Technology solutions

While most data lineage projects start as in-house manual developments responding to a specific requirement, an increasingly regulated environment, growing volumes of data and the need to provide fast access to business data are driving automation, in many cases based on a combination of in-house and vendor solutions.

A typical data lineage automation solution includes functionality that captures and documents data flows, such as a flow of financial instruments, from the data source to its final destination, perhaps a regulatory or internal report. Drilldown functionality allows particular points in the lineage to be inspected more closely, while traceability and audit ensure it is possible to track a piece of data through its journey across an organisation and verify its accuracy. Filtering capabilities allow users to filter for different data categories, such as reference data or trade data, and understand the data’s lineage and attributes.
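
The capture, traceability and filtering functions described above can be sketched with a simple append-only log of data movements. This is an illustrative model only; the `LineageEvent` and `LineageLog` names, fields and categories are hypothetical and do not reflect any particular vendor product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageEvent:
    dataset: str           # what moved
    category: str          # e.g. "reference" or "trade"
    source: str            # system the data left
    target: str            # system the data arrived in
    recorded_at: datetime  # when the movement was captured


class LineageLog:
    """Append-only record of data movements, queryable by category or dataset."""

    def __init__(self):
        self._events = []

    def record(self, dataset, category, source, target, when=None):
        """Capture one movement; defaults to the current UTC time."""
        event = LineageEvent(dataset, category, source, target,
                             when or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def by_category(self, category):
        """Filtering: all events for one data category."""
        return [e for e in self._events if e.category == category]

    def journey(self, dataset):
        """Traceability: the hops a dataset has taken, in time order."""
        hops = sorted((e for e in self._events if e.dataset == dataset),
                      key=lambda e: e.recorded_at)
        return [(e.source, e.target) for e in hops]


if __name__ == "__main__":
    log = LineageLog()
    log.record("bond_prices", "reference", "vendor_feed", "security_master")
    log.record("bond_prices", "reference", "security_master", "regulatory_report")
    log.record("equity_trades", "trade", "order_system", "trade_store")
    print(log.journey("bond_prices"))
    print(len(log.by_category("trade")), "trade data event(s)")
```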

Another technology facet of data lineage is visualisation, which can provide a real-time view of data moving through processes and systems, improve the understanding of data, highlight any defects in data flows, and visualise the impact of any changes to data and systems. Documentation is managed dynamically to reflect these changes in lineage.

Automation can also capture business logic and/or metadata that can be stored in a repository and used to create source to target data lineage, eliminate duplicated or redundant data, and provide business and technical users with the ability to locate, understand, and manage information that supports business operations.

These types of automated solutions offer many benefits, including the ability to trace data errors, identify discrepancies, control access to information and model what would happen if a new process or department were added to the business. They can also reduce time spent on validating data accuracy and put trusted information in the hands of decision makers.

Vendor solutions provide these types of functionality. There may be slight differences in underlying technologies, scope and potential for automation, but the key difference between vendor solutions is delivery, with some vendors providing cloud-based solutions that can be up and running quickly, and others offering enterprise software solutions that need to be implemented and maintained in-house.

Outlook

Going forward, data lineage is likely to follow the steady flow of data, applications and analytics into the cloud environment, extensive automation will become the norm, and the goal of zero-gap data lineage will be within reach.

Vendor solutions

ASG Technologies

ASG Technologies Group provides more than 3,000 global organizations with a modern approach to Digital Transformation. ASG is the only solutions provider for both Information Management and IT Systems. ASG’s Information Management solutions enable companies to find, understand, govern and deliver information of any kind, from any source through its lifecycle. The IT Systems Management solutions empower companies to support digital initiatives, operate IT infrastructure more efficiently and reduce the cost of managing IT systems landscapes. For more information, visit ASG.com or connect with us on LinkedIn, Twitter and Facebook.

MarkLogic

MarkLogic is an operational and transactional Enterprise NoSQL database platform trusted by global organizations to integrate their most critical data. Designed to integrate data from silos better, faster, and with less cost, MarkLogic can help integrate data and build a 360-degree view up to four times faster than if using a traditional database.

3d innovations (3di) – Data lineage for data compliance and licensing solutions

AxiomSL – Data capture and visualisation of data sources, data flows and business logic

Bloomberg – Solutions based on the Financial Instrument Global Identifier (FIGI)

Cambridge Semantics – Automatic capture of schema and statistical metadata describing data sources

Collibra – Interactive data lineage diagrams

Compact Solutions – Metadata integration platform providing data lineage

Datum – Metadata management for use cases including General Data Protection Regulation (GDPR)

Dremio – Data lineage to support analytics

Erwin – Web-based solution mapping data elements to sources

Global IDs – Data lineage layer that maps columns and tables to establish data flow

IBM – Metadata based data and business lineage

Informatica – Data lineage based on a machine learning enterprise data catalogue

Manta – Documents data lineage as it crunches programming code and provides an interactive map

Octopai – Automated cross-platform metadata management and data lineage

Smartlogic – Data lineage based on a semantic AI platform

Solidatus – Visualised data lineage based on metadata management

Talend – Cloud-based open source and enterprise lineage solutions

Trifacta – Data wrangling column-based solution

If you want to appear on this page please contact Jo Webb at jo@a-teamgroup.com or call us on +44 (0)20 8090 2055.
