How can institutions get an edge from their data when much of the information they use has become commoditised?
That’s one of the questions driving efforts to extract the most meaning and value from unstructured sources of data, pools of information that have until recently been largely out of reach for all but the biggest organisations.
Extracting data from unstructured sources is the new frontier in data management. Making sense of reports, meeting notes, voice calls, social media posts, messages and other non-linear and time-asynchronous sources has been notoriously difficult to automate, requiring time-gobbling and resource-hogging manual processes. Even then, it’s been difficult to ensure the quality of the data that’s been extracted.
Technology, however, is changing that. Digitalisation is making it possible for companies to access and build automation tools that can help them find and distil the data they need from vast stores of material.
“Unstructured data has always existed but not at the scale it does now,” said Ashly Joseph, data management lead at JP Morgan. “It was very small and very expensive for someone to look at that data.
“To look at unstructured data we need better capabilities like artificial intelligence (AI), natural language processing, the ability to look at images and extract that information – lots of technology capability,” Joseph told Data Management Insight. “Now that is something that we are actively fuelling.”
Expert Speakers
The technology that has made this possible will be the focus of A-Team Group Data Management Insight’s next webinar. Entitled “Strategies, tools and techniques to extract value from unstructured data”, the webinar will bring together a panel of experts, including Joseph, to discuss the latest approaches to harnessing a data source that was, until relatively recently, considered all but out of bounds.
Another speaker at the event on 12 September will be Vahe Andonians, founder of Cognaize, which specialises in automating unstructured data. He argued that having the capabilities to find and use data from broader sources could be the difference between an organisation’s success and its failure.
“Having more data to base your decisions on usually leads to better decisions, but the democratization of that piece means that everybody has the same information,” Andonians told Data Management Insight. “If you know just a tiny bit more than somebody else, it may now matter zero but it will matter a billion times more in four years.”
Part of the difficulty, he adds, is that unstructured data sources are hard only for machines to understand; humans can make perfect sense of them. To mine information quickly and efficiently, that human understanding needs to be translated into the digital systems that interpret the sources.
Making that work will cost, he says, but it will be critical.
“If we bring that cost down, suddenly we can process more information and come to better decisions.”
AI Application
The webinar will consider issues such as the extent to which financial firms are using unstructured data. It will look at the challenges of sourcing, managing and analysing the data and offer some practical advice on tools and techniques that can bring the best results.
“One of the big challenges regarding unstructured data is governing data quality, as the required controls tend to be manual-qualitative rather than automated-quantitative,” said Brian Greenberg, business engagement lead for enterprise data management at BNY, who will also join the webinar.
Among the biggest challenges is ensuring the consistency of unstructured data, said Joseph. When data arrives in systems in structured form, it can be easily integrated and manipulated by relatively simple rules. When data lacks that formality, however, the checks that are vital to ensuring data can be used are trickier to automate.
“What that means is better capabilities using natural language processing, using AI and machine learning and working with the data scientists to make sure that what’s the best way to look at this,” she said.
A key factor in mining and deriving value from unstructured data will be the application of AI to the process. Machine learning has been used to read and understand reports and other text-based materials for many years. More recently, generative AI models have been deployed to write code and rules for managing that information.
“The rapidly expanding capabilities of AI have proven very effective at scaling this approach, especially against large amounts of unstructured data,” Greenberg told Data Management Insight of AI’s use.
- A-Team Group Data Management’s webinar “Strategies, tools and techniques to extract value from unstructured data” will begin at 10:00am ET/3:00pm London/4:00pm CET on 12 September.