Early identification of use cases that provide a focal point for the implementation of knowledge graphs is essential to successful projects, according to a panel of experts speaking at a recent A-Team Group webinar.
The webinar discussed the benefits and pitfalls of implementing knowledge graphs, including how to build, sustain and maintain them, use cases, the challenges of deployment in existing data management landscapes, and approaches firms can take to resolve challenges and achieve return on investment.
The webinar speakers were David Newman, senior vice president and head of enterprise knowledge graph solutions, data management and insights at Wells Fargo; Michael Pool, executive director, head of ontology and semantic modeling at Morgan Stanley; and Alex Brown, chief technology officer at Datactics.
Early consensus suggested that use cases of knowledge graph technology must be identified at the outset to provide a focal point for the implementation process.
“There is a tendency to try to create an enterprise-wide ontology, which can be overwhelming,” said Pool. “The key to getting a knowledge graph built is to identify the use cases you need to address and show value – it is easy to modify the graph and the ontology as you go. You don’t need to worry about getting everything right as long as you design and solve those use cases in a way that is scalable.”
Newman added: “Organisations have a great opportunity if they start with the right investment – namely a centre of excellence where groups come together and plan deployment by defining naming conventions, file structure standards, governance, and processes for approval by owners of key data areas and how the concepts presented in those key data models will be aligned with each other.”
Once use cases are agreed, it is important to make it as easy as possible for people to put data sets into the graph and extract the data they need. Pool commented: “If you need a small group of engineers to get data into the graph, the project is not going to work. It is also important to think about what kind of metadata is going to be important at the outset and get it into the graph early so it becomes part of the process. That metadata is what is going to allow people to find the data and help them make links.”
Use cases for knowledge graphs were the subject of an audience poll conducted during the webinar. Some 75% of respondents highlighted data analytics, while business insight and data integration were both referenced by two-thirds of respondents.
A number of use cases were explored by the webinar speakers, the first of which was linking structured data with alternative data sets – an application that is becoming increasingly important in capital markets.
Newman noted the opportunity for payment monitoring and how the ability to view a network of payments is helpful in identifying financial crime. “When that layer of knowledge is injected into the network we gain insights that we would normally have many more challenges to identify,” he said.
Brown described a project Datactics has been working on to build a knowledge graph and populate a graph database with UK Companies House data and other data sets from the Office for National Statistics. The objective of the project was to link company entities to persons of significant control.
“This is a common challenge for tasks such as KYC due diligence,” Brown said. “We sought to demonstrate the importance of data quality matching by taking data verbatim from these sources, throwing it into graph databases, generating a knowledge graph, and then comparing this with a more refined approach where we ingested, normalised and cleansed the data before discovering links using fuzzy matching and natural language processing.”
Brown said the project proved the importance of carefully preparing data and imposing quality controls whenever a knowledge graph is generated or populated. “If the data is wrong, you can miss the relationship between nodes or fail to identify that two nodes are actually the same thing, so you need to be careful that your knowledge graph is giving you the complete picture if it is feeding into something like a KYC process.”
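To illustrate the point about two nodes that are "actually the same thing", here is a minimal sketch of the contrast between verbatim comparison and a normalise-then-fuzzy-match step. It is not the Datactics pipeline: the suffix list, the 0.9 threshold, the function names and the company names are all assumptions made for illustration, and Python's standard-library `difflib` stands in for whatever matching technique the project actually used.

```python
import re
from difflib import SequenceMatcher

# Hypothetical legal-form suffixes to strip; a real list would be longer.
SUFFIXES = {"ltd", "limited", "plc", "llp"}


def normalise(name: str) -> str:
    """Lower-case the name, drop punctuation and remove legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def same_entity(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two names as one node if their normalised forms are similar enough.

    The threshold is an illustrative assumption, not a recommended value.
    """
    ratio = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
    return ratio >= threshold


def verbatim_match(a: str, b: str) -> bool:
    """The naive approach: names only match if the raw strings are identical."""
    return a == b
```

With these definitions, `verbatim_match("Acme Widgets Ltd.", "ACME WIDGETS LIMITED")` treats the two records as different companies, while `same_entity` on the same pair links them, which is the kind of missed relationship Brown warned about.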
Data cataloguing can also benefit from the use of knowledge graphs. “For example, an organisation might be asked for its national identification numbers,” explained Newman. “There is typically no column in a table database that represents national identification numbers. With a knowledge graph you can organise your data so you can query the broad class of data of national identification numbers.”
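Newman's cataloguing example can be sketched in a few lines. The snippet below is an assumption-laden illustration rather than a description of any Wells Fargo system: the concept names, column names and the `columns_for` helper are hypothetical, and plain Python dictionaries stand in for an ontology and graph store. The idea is that each physical column is annotated with a specific concept, and a query for the broad class follows the subclass links.

```python
# Hypothetical ontology: each concept points to its broader parent class.
SUBCLASS_OF = {
    "SocialSecurityNumber": "NationalIdentificationNumber",
    "NationalInsuranceNumber": "NationalIdentificationNumber",
    "NationalIdentificationNumber": "Identifier",
    "AccountNumber": "Identifier",
}

# Hypothetical catalogue: physical columns annotated with a concept.
COLUMN_CONCEPT = {
    "crm.customers.ssn": "SocialSecurityNumber",
    "hr.staff.ni_no": "NationalInsuranceNumber",
    "core.accounts.acct_id": "AccountNumber",
}


def is_a(concept, target):
    """Walk up the subclass chain and report whether target is reached."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def columns_for(target):
    """Return every catalogued column whose concept falls under target."""
    return sorted(col for col, c in COLUMN_CONCEPT.items() if is_a(c, target))
```

Calling `columns_for("NationalIdentificationNumber")` returns the two columns annotated with a national identifier concept and leaves out the account number column, even though no column is itself named "national identification number".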
Challenges of implementation
Newman suggested the challenges of deploying knowledge graphs in existing data management landscapes can be divided into organisational and technological issues that are ‘natural growing pains’ of introducing new technologies and capabilities. “The challenges are about giving different groups in an organisation an opportunity to review the diversity of knowledge graph tools so the right tools can be found and adopted in a technology standards catalogue,” he said. He also noted the need to develop a semantic model to underpin selected tools.
Brown emphasised the importance of automated, reproducible, and auditable pipelines to populate knowledge graphs with high quality data.
A final audience poll considered the extent of business and operational benefits organisations are gaining, or would expect to gain, from the implementation of knowledge graphs. Some 38% of respondents are gaining, or expect to gain, significant business benefits. The same percentage are gaining, or expect to gain, some business benefits. From an operational standpoint, 13% said they are gaining, or expect to gain, some operational benefits. A similar percentage said they expect to gain neither business nor operational benefits from the implementation of knowledge graphs.
“The biggest factor in return on investment is data utilisation and the way to achieve that is making the knowledge graph the framework from which data utilisation occurs,” said Pool. “The key is to be able to say, ‘I want my front line analyst to be able to figure out the impact on this particular portfolio of something happening in this particular country’.”
He concluded: “Solve the small problems and think about how they fit into the larger framework. Don’t let perfection be the enemy of the good, because if you try to solve the whole thing before you even get started, you will never find a solution.”