Have you ever considered how much value is tied up in unstructured data that is difficult to access? What nuggets of useful information could be buried in documents such as company reports? And how could you turn masses of unstructured data into actionable information?
If these questions matter to your firm, and they probably do for most financial firms, solutions based on artificial intelligence (AI) and natural language processing (NLP) are emerging to help. They can identify and extract otherwise hidden data gems, discover relationships between people, companies and products in unstructured data, and greatly reduce the time it takes experts at banks and trading firms to read and digest that data.
Kingland Systems has been working with AI since 2010 and in recent years has invested in people, technology and partnerships with universities to drive its cognitive computing platform to new levels. The company has also done time and motion studies showing that risk analysts spend 50% to 70% of their time reading unstructured data and looking for valuable information. With text analytics and data extraction, the same task can be done accurately in minutes, yielding actionable data that can be fed into appropriate systems.
Tony Brownlee, executive vice president of business development and chief strategy officer at Kingland, explains: “If, for example, you have a 200 page document and you are looking for specific data attributes that are all buried in different formats and languages such as legal prose, it is a time challenge for a knowledge worker to find, extract and put the data into the systems that will use it. Using NLP technology it is possible to gather unstructured data from many sources such as PDFs, scanned images and HTML documents, look at the characteristics of the documents, and train and tune the cognitive engine to read the material and extract selected attributes.”
In terms of time, it would take an average person seven hours to read and retain 60% of the content in a 478-page document with 85,056 words, 3,470 sentences, 3,390 numbers, 219 tables and 5,996 entities, people and products. Kingland’s cognitive engine, which operates as a cloud service with clients uploading scanned documents, can read the document and hundreds more in a few minutes, and extract selected data. In commercial use, DTCC applies the Kingland engine in its Mutual Fund Services business to read prospectuses and find and update required data in minutes, with output from the engine that is 99% accurate.
Brownlee acknowledges that it takes time, typically about six months, to train the engine to identify specific data and, where required, related data attributes. He notes that where requirements are highly complex, firms may still want experts to read documents and extract data. Also, every client has different priorities around elements such as data sources, data attributes and consuming systems, making projects expensive, at least for now.
The benefits, however, can be significant. For example, it is possible to read 10 years of annual reports in minutes, discover how often particular clients appear and use this data for marketing purposes. It is also possible to extract people, entities and products, and the relationships between them, to identify valuable business opportunities and potential risks.
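To make the client-mention idea concrete, here is a minimal sketch in Python. It is not Kingland's method: a trained NLP engine would resolve name variants and context, whereas this example swaps in plain case-insensitive phrase matching. The function name, report texts and client names are all hypothetical.

```python
import re
from collections import Counter


def count_client_mentions(reports, client_names):
    """Count case-insensitive whole-phrase mentions of each client.

    `reports` maps a label (for example a year) to the report's plain text.
    This is simple string matching, not entity recognition.
    """
    counts = Counter({name: 0 for name in client_names})
    for text in reports.values():
        for name in client_names:
            pattern = r"\b" + re.escape(name) + r"\b"
            counts[name] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Hypothetical report snippets, invented for illustration only.
reports = {
    "2015": "Acme Corp renewed its contract. Revenue from Acme Corp grew.",
    "2016": "Globex Ltd became a client. Acme Corp remained our largest account.",
}
mentions = count_client_mentions(reports, ["Acme Corp", "Globex Ltd"])
for name, total in mentions.most_common():
    print(name, total)
```

In practice the text would first have to be extracted from PDFs or scans, and abbreviations or misspellings of a client name would be missed by exact matching, which is where a trained engine earns its keep.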
In the financial sector, demand for text analytics and extraction is coming predominantly from global investment banks and large sell-side and buy-side firms that are highly regulated. Brownlee says they often turn to Kingland to read and extract data from reports and financial agreements. On the regulatory front, they look to Kingland for help in identifying specific attributes included in particular regulations. For example, a bank onboarding a client will fill in a form that should include information such as the client’s name, address, type of business and Legal Entity Identifier (LEI), although there is no guarantee that all this information will be collected. The Kingland cognitive engine can be trained to read tens of thousands of forms to find out whether specific information, such as the LEI, has or hasn’t been included on forms related to particular clients.
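The LEI check lends itself to a small illustration. An LEI is a 20-character alphanumeric code defined by ISO 17442, whose last two characters are check digits verified with the ISO 7064 MOD 97-10 scheme. The sketch below, in Python, scans form text for an LEI-shaped string and validates the check digits. It assumes the forms are already available as plain text; the function names, client IDs and sample code are hypothetical, and it says nothing about how the Kingland engine itself works.

```python
import re

# 18 alphanumeric characters followed by 2 numeric check digits.
LEI_PATTERN = re.compile(r"\b[A-Z0-9]{18}[0-9]{2}\b")


def _to_digits(code):
    """Map each character to its numeric value: 0-9 unchanged, A=10 ... Z=35."""
    return "".join(str(int(ch, 36)) for ch in code)


def lei_check_digits(base18):
    """Compute the two MOD 97-10 check digits for an 18-character base."""
    remainder = int(_to_digits(base18) + "00") % 97
    return f"{98 - remainder:02d}"


def is_valid_lei(candidate):
    """Return True if the string has LEI shape and valid check digits."""
    if not re.fullmatch(r"[A-Z0-9]{18}[0-9]{2}", candidate):
        return False
    return int(_to_digits(candidate)) % 97 == 1


def find_lei(form_text):
    """Return the first valid LEI found in the form text, or None."""
    for match in LEI_PATTERN.findall(form_text.upper()):
        if is_valid_lei(match):
            return match
    return None


# Made-up base for illustration; this is not a registered LEI.
base = "529900EXAMPLETEST1"
sample_lei = base + lei_check_digits(base)

# Hypothetical onboarding forms keyed by a hypothetical client ID.
forms = {
    "client-a": f"Name: Example Co\nLEI: {sample_lei}",
    "client-b": "Name: Other Co\nLEI:",
}
missing = [cid for cid, text in forms.items() if find_lei(text) is None]
print("Forms missing a valid LEI:", missing)
```

A passing checksum only shows that a code is well formed, not that it is registered to the client in question; confirming that would require a lookup against an LEI registry.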
Brownlee concludes: “We are at an inflection point in AI and NLP technology and have reached a time of ‘the art of the possible’. It is possible to unlock documents and use the resulting treasure trove of data to great effect.”