A-Team Insight Blogs

How to Extract Value from Unstructured Data

9 October 2019

Unstructured data offers untapped potential, but the platforms, tools and technologies that support it are nascent and are often deployed to solve a specific problem, with little reuse of common technologies from one application to the next.

The challenges of managing and analysing unstructured data, and considerations when making investments in the data, were discussed during a panel session at A-Team Group’s recent Data Management Summit in New York City. The panel comprised Gerry Mintz, managing partner of Percepta Partners; Matt Good, chief technology evangelist at Kingland; Steve Grill, managing director, head of legal data services, research and data management (formerly JP Morgan); Evan Schnidman, president, Prattle, a Liquidnet Company; and Gurraj Singh Sangha, global head of risk and market intelligence at State Street.

The key takeaways from the panel were the fragmentation and complexity of unstructured data in the market, and the importance of a meticulous approach and expert execution. “In terms of technology, we are finding that the space is very fragmented,” said one panellist. “Folks might say that there are a lot of vendors in the text analytics and language processing space, but the big players are making investment bets in lots of different spaces. All of these big tech vendors are providing you with the toolkit capabilities to go and potentially build these solutions yourselves. AWS Amazon just last year offered new services around extraction of text, for example. But there is still the problem that you need a full solution to meet the business use case.”

“Domain expertise matters. It’s that simple,” added another panellist. “Being an expert in financial services gives you an insight into what documents matter, what correlations you need to be looking for, and what language patterns you want to find in an earnings call. When you start digging into complex language that has real indications for KPIs you need an analyst in the room, you need to know what to look for. And unless you have specialist knowledge about both the natural language processing side and the financial services implications, the odds are that you are going to be operating in an atheoretical vacuum that is going to result in spurious correlation that breaks down pretty much as soon as you get out of sample.”

“The key word is knowledge,” agreed a third. “Unstructured data is all about knowledge. That’s how it differs from structured data, which is all about information. It’s not about just collecting data, it’s about understanding what the data means, organising it in a way that people can understand, and explaining how it applies specifically to the firm.”

The panel agreed that a key issue is that machine learning tends to be very abstract, whereas language is full of nuance: context, interpretation and so on. When training algorithms for any use case, it can be challenging to map information to an ontology that connects its various facets to specific situations in the marketplace.

“It is important when you are going down this path to move slowly,” concluded the panel. “Just processing language in and of itself can open up many wrong directions. When you are applying it to risk, ingesting an enormous amount of information connected to portfolios, it is inordinately complex.”

Ultimately, the advice is to proceed cautiously and to be careful of bias, subjectivity and interpretation. “It’s about having an excellent ecosystem with people and technology, and finding some early victories. The ability to find opportunities to automate even just some of the routine will help you – just a little bit here, a little bit there – all of that adds up to excellent knowledge work, and you’ll soon be on your way to showing some return on investment in this space.”
