Generative AI in RegTech: Promise, Pitfalls and a Glimpse Into the Future

The release of ChatGPT in November 2022 created an overnight sensation, and Artificial Intelligence (AI) now dominates every corner of the business news cycle, including financial services and capital markets.

Before ChatGPT, the world of AI/ML revolved largely around data scientists and communities of developers on GitHub. Yet financial institutions have been leveraging AI technologies for decades across multiple use-cases. NCR was using Convolutional Neural Network (CNN) technology to ‘read’ handwritten digits on its cheque-scanning machines in the 1990s. Since then, use-cases have ranged from forecasting price movements, algorithmic trading, surveillance, regulatory reporting, and risk management to robo advisors like Betterment and Wealthfront.

While there was growing awareness and some concerns with developments around areas like autonomous driving, unless you could string a few lines of Python code together and had a working knowledge of regression, clustering, and other statistical models, the world of AI remained a black box for most people.

That all changed when ChatGPT appeared. It delivered an interactive experience with human-like characteristics, and the reaction from the general public and regulators alike was dramatic. Less than two years later, AI comes up in every panel discussion at industry conferences and is increasingly a focus topic in its own right.

But is it all noise? This article examines where GenAI is currently being leveraged by RegTech suppliers, potential use-cases, headwinds to GenAI adoption, and the inherent risks associated with the technology in its current form. Finally, it discusses concerns the RegTech community should consider as the technology continues to develop, and closes on an optimistic note for the future.

Definitions and Terminology

GenAI, Large Language Models (LLMs), and Generative Pretrained Transformers (GPTs) are terms that are related but denote distinct concepts within the field of artificial intelligence. Here’s how they relate to each other:

Generative AI refers to a class of artificial intelligence technologies capable of generating new content. The core capability of GenAI is analyzing large sets of unstructured data and generating outputs that mimic the original data in a “plausible” manner. The content can range from text, images, and audio to synthetic data and more. GenAI applications include GANs (Generative Adversarial Networks) for creating realistic images (including deepfakes), or Variational Autoencoders for drug discovery.
Generative Pretrained Transformers (GPTs) use the transformer architecture, which relies on self-attention mechanisms to predict and generate an output. The ‘pre-trained’ part of the name indicates that the model is initially trained on a large corpus of data (text, programming code, etc.) in an unsupervised manner before being fine-tuned with context-rich data for specific tasks.
Large Language Models (LLMs) are a subset of Generative AI focused on understanding and generating human language. These models are trained on vast amounts of text (a corpus) to predict the next word in a sequence, giving them the ability to generate coherent and contextually relevant outputs based on the input they receive in the form of “Prompts.” LLMs are used in a variety of applications including chatbots, writing assistance tools, and content generation.

In summary, Large Language Models are a subset of Generative AI specialised in the domain of text. While all LLMs are part of Generative AI, not all Generative AI systems are LLMs. Both rely on similar underlying technologies such as neural networks and machine learning algorithms, but LLMs apply them specifically to understanding syntax, semantics, and context within language.
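To make the idea of "predicting the next word in a sequence" concrete, here is a deliberately tiny sketch. It is not how an LLM works internally: it swaps the transformer for a simple bigram frequency count, and the corpus is invented sample text. It only illustrates the principle of learning from text which word is most likely to come next.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """Count, for every word, which words follow it in the corpus."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model: dict, word: str):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Invented sample text, standing in for the "vast amounts of text" an LLM sees.
corpus = (
    "the firm must report the trade "
    "the firm must retain the record "
    "the firm must report the breach"
)
model = train_bigram(corpus)
print(predict_next(model, "must"))  # "report" follows "must" twice, "retain" once
```

A real LLM replaces the frequency table with a neural network that scores every possible next token given the whole preceding context, but the training objective is the same in spirit.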

What is Prompt Engineering?

Prompt engineering is the practice of crafting instructions that guide LLMs to generate specific and useful outputs. It acts as the interface between human intent and machine function. This practice is essential because the quality of an LLM-generated response depends heavily on the input prompt’s clarity and specificity. Prompt engineering is as much art as it is science. An article by TechTarget describes how becoming a proficient prompt engineer requires a blend of technical and soft skills. These include strong language skills, to formulate prompts in precise and clear language that the AI can understand effectively; a technical understanding of AI model architectures, training data, and tokenization processes, along with familiarity with neural networks and machine learning principles; analytical skills, to analyze the AI’s responses, refine prompts continually, and adjust the AI’s behavior through feedback loops; and creativity and problem-solving.
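As a rough illustration of what "clarity and specificity" can look like in practice, the sketch below assembles a prompt from explicit parts. The function name, the four section labels, and the sample wording are all assumptions made for this example. No model is called; the point is only the structure of the input.

```python
def build_prompt(role: str, task: str, context: str, output_format: str) -> str:
    """Assemble a structured prompt from explicit, separately stated parts."""
    sections = [
        f"Role: {role}",
        f"Task: {task}",
        f"Context:\n{context}",
        f"Output format: {output_format}",
    ]
    return "\n\n".join(sections)


# Hypothetical example values, for illustration only.
prompt = build_prompt(
    role="You are a compliance reviewer for financial marketing material.",
    task="Identify any statements that promise or imply a certain return.",
    context="Draft: 'Our new fund offers a guaranteed return for retirees.'",
    output_format="A numbered list of flagged phrases, each with a one-line reason.",
)
print(prompt)
```

Stating the role, task, context, and expected output separately makes it easier to refine one part at a time when reviewing the model's responses.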

Compliance Use Cases Using GenAI

Surveillance and Monitoring: Marketing and communications compliance covers promotional communications with clients and advertisements for financial products, all of which are subject to compliance oversight. RegTech is leveraging LLMs to assist by scanning content as it’s being created and flagging text that is potentially non-compliant because it is misleading or overpromises. For example, a “guaranteed return” or an advertisement promoting a new retirement product superimposed on an image of a luxury yacht would be a compliance violation. These solutions also recommend disclosures that should be included with the content. Unlike traditional, post-production reviews, these solutions catch potential compliance violations before supervisory review, reducing time to publication and boosting efficiency. RegTech firms offering LLM-based solutions in this space include Saifr (SaifrScan and SaifrReview) and Red Oak Compliance Solutions.
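The flag-as-you-write workflow can be sketched in a few lines. The example below is not how any of the named vendors implement it: it substitutes a small hand-written pattern list for the LLM, and the patterns (other than “guaranteed return”, taken from the example above) and the reasons attached to them are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Flag:
    phrase: str  # the text that triggered the flag
    start: int   # character offset in the draft
    reason: str  # why the phrase may be non-compliant


# Hypothetical rule list; a production system would use a trained model instead.
RULES = {
    r"guaranteed\s+returns?": "Promises a certain outcome",
    r"risk[- ]free": "Understates investment risk",
}


def scan(text: str) -> list:
    """Return flags for every rule match, ordered by position in the text."""
    flags = []
    for pattern, reason in RULES.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            flags.append(Flag(match.group(0), match.start(), reason))
    return sorted(flags, key=lambda f: f.start)


draft = "Enjoy a guaranteed return with our risk-free retirement plan."
flags = scan(draft)
for flag in flags:
    print(f"{flag.phrase!r} at {flag.start}: {flag.reason}")
```

A keyword list like this misses paraphrases and cannot judge imagery such as the yacht example, which is where a language model adds value over fixed rules.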

Regulatory intelligence: Staying abreast of new regulations and changes to existing ones can be extremely labor intensive, particularly for global firms dealing with hundreds of regulatory agencies. LLMs can significantly reduce manual effort by automatically scanning for proposed changes and generating alerts that can be routed into the compliance workflow. Corlytics Regulatory Monitoring and LEO All-In-One are among the RegTechs offering LLM-powered solutions for these use cases. The FCA and Bank of England (BofE) in the UK and FINRA in the US are supporting workflow automation by making their regulations machine-readable. Taking this a step further, LLMs can help assess the firm’s written policies against the latest regulations and flag where adjustments need to be made, delivering a significant upgrade to transparency.
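The change-detection step that feeds such alerts can be illustrated without any AI at all. The sketch below assumes two versions of a rule are already available as lists of clauses (the clause text is invented) and uses Python's standard difflib module to find what was removed and added. An LLM-based product would go further, for example by summarising the change or mapping it to affected policies.

```python
import difflib


def changed_clauses(old: list, new: list) -> tuple:
    """Return (removed, added) clauses between two versions of a rule."""
    removed, added = [], []
    for line in difflib.ndiff(old, new):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
        # Lines starting with "  " (unchanged) or "? " (hints) are ignored.
    return removed, added


# Invented sample clauses, for illustration only.
old_version = [
    "Firms must retain records for five years.",
    "Reports are due within 30 days.",
]
new_version = [
    "Firms must retain records for seven years.",
    "Reports are due within 30 days.",
]

removed, added = changed_clauses(old_version, new_version)
for clause in added:
    print(f"ALERT: new or amended clause: {clause}")
```

Machine-readable regulations make this kind of comparison far more reliable, because clauses arrive already segmented rather than having to be extracted from PDFs.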

Customer Identification for KYC and AML Compliance: Customer due diligence during onboarding and subsequent monitoring of external sources can both be enhanced by LLMs’ ability to digest large amounts of text data from websites, news sources, and sanctions watchlists. Earlier this year Saifr acquired GOST, a sanctions and adverse media screening tool powered by LLMs.
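Name screening against a watchlist is a useful way to see why plain exact matching falls short. The sketch below is a simple fuzzy-matching baseline built on Python's standard difflib module, not an LLM and not the method used by any product named here. The watchlist entries, the function names, and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def screen(name: str, watchlist: list, threshold: float = 0.85) -> list:
    """Return (entry, score) pairs at or above the threshold, best match first."""
    hits = [(entry, similarity(name, entry)) for entry in watchlist]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Invented watchlist entries, for illustration only.
watchlist = ["Ivan Petrov", "Acme Shell Holdings Ltd"]

for entry, score in screen("Ivan Petrof", watchlist):
    print(f"Possible match: {entry} (score {score:.2f})")
```

Character-level matching catches spelling variants but not aliases, transliterations, or context from news text, which is the gap LLM-powered screening tools aim to close.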

Enhanced customer experience: LLMs can significantly improve the client experience via smart Chatbots that provide a more natural, human-like interaction for resolving customer queries.

Headwinds to GenAI Adoption

Legislative and regulatory concerns, a shortage of key skills in the market, and the extreme cost of building and training fit-for-purpose LLMs are among the current issues impeding a wider uptake of these technologies across the capital markets.

Legislative and Regulatory

In March 2024, the European Parliament adopted the EU AI Act, the world’s first comprehensive law governing AI. The AI Act classifies AI systems into different risk categories, imposing stringent compliance requirements for high-risk AI-enabled applications while allowing more freedom for lower-risk use cases. The regulation emphasizes safety, transparency, traceability, and non-discrimination, aiming to ensure that AI systems used in the EU adhere to these principles. The Act also deals with unacceptable risks, banning certain AI applications outright, such as real-time remote biometric identification in public spaces unless it’s for specified law enforcement purposes.

Post-Brexit, the UK government has introduced a regulatory framework to keep pace with a rapidly advancing technology, rather than enact far-reaching regulation. The framework focuses on empowering existing regulators such as the Financial Conduct Authority (FCA) and the Competition and Markets Authority (CMA) to provide guidance within their domains. The UK’s regulatory framework is described as “pro-innovation” and “context-specific”, aiming to support technological advancement without imposing overly restrictive rules. The UK is also looking to establish global partnerships to harmonize AI safety standards, reflecting a balanced approach to fostering innovation while managing potential risks. Last year, the FCA made its Digital Sandbox generally available to market participants and third-party solution providers to test innovative solutions within a regulatory framework as close to real market conditions as possible.

In the US, President Biden issued an executive order to enhance the management of AI’s risks while promoting its potential benefits. The directive encompasses a comprehensive strategy that includes privacy protections, advancing AI innovation, ensuring safety and security standards, and promoting equitable and nondiscriminatory use of AI. This includes requiring developers of significant AI systems to share safety test results with the government and enforcing rigorous testing standards to pre-emptively manage risks associated with AI technologies.

All three regions emphasize the importance of safety, security, transparency, and the ethical use of AI. There’s a common understanding that AI should advance societal goals without compromising privacy or security. The EU has taken a more prescriptive approach with the EU AI Act, categorizing and regulating AI based on risk levels, whilst the UK has opted for a lighter, principles-based regulatory framework that leverages existing structures.

The US directive focuses on sector-specific guidelines and enhancing existing laws to accommodate the oversight of AI. The SEC has issued several proposed rules to address potential abuse and the possibility of an AI-induced market crash. Many of these rules are being challenged in the courts and the outcome remains uncertain. This uncertainty makes it difficult for US-based firms to commit to GenAI beyond narrow use-cases and proofs of concept.

While regulation might slow down innovation in certain high-risk areas, it can also, when carefully crafted, stimulate new developments by providing clear guidelines and standards that ensure safety and fairness in AI applications. This balance between innovation and regulation is crucial for fostering an environment where AI can evolve in a manner that is both innovative and responsible. Based on anecdotal comments at several recent conferences, the UK and EU are ahead of the US in this regard.

Competition for Talent in the AI Arms Race

The competition for AI talent between Big Tech (Google, Meta, et al) and well-funded FinTech start-ups that have emerged since the release of ChatGPT is making it difficult for even the largest financial firms to attract enough talent to take internal GenAI initiatives beyond the PoC stage. This is particularly acute in Europe as noted in this recent Reuters article.

Scaling to Production Challenges

Building a fit-for-purpose LLM from scratch requires a significant investment of time, resources, and money. Setting up a PoC with an open-source LLM and demonstrating potential is relatively straightforward if you have the requisite data science skills and the availability of good training data. But so far, scaling a PoC into production has proved a step too far for most firms. LLM training is compute-intensive and follows an iterative process over weeks or months. A survey of participants at a recent conference indicated very few plans to implement production-scale LLMs in 2024.

One of the few companies to invest and announce a production-ready model for the Financial Services industry is Bloomberg with its BloombergGPT. Launched in 2023, the model is trained on forty years of Bloomberg data and validated against existing finance-specific NLP benchmarks, a suite of Bloomberg internal benchmarks, and broad categories of general-purpose NLP tasks from popular benchmarks (e.g., BIG-bench Hard, Knowledge Assessments, Reading Comprehension, and Linguistic Tasks). Notably, the BloombergGPT model outperforms existing open models of a similar size on financial tasks by large margins, while still performing on par or better on general NLP benchmarks. A formal whitepaper describes the approach taken including model training, parameter tuning, and validation.

Ethics, Auditability, and Security

The nature of GenAI models makes it difficult to explain why or how the model generated a particular result, which creates a compliance issue and makes the model unfit for certain classes of production use. Add to this the fact that even a model trained on a curated corpus of clean training data can still generate an erroneous or non-compliant output. There’s also a risk that models trained on public data can generate biased results unless the data is carefully screened before training. Data cleansing itself can require a significant investment of time and resources.

Another impediment is that GenAI models are susceptible to new forms of cyber-attack. Malicious actors are feverishly working to exploit the new technology for nefarious purposes. Models trained on personal data, for example for KYC, are particularly exposed, both to attack and to potential GDPR violations.

The Road Ahead for GenAI in RegTech

The journey of GenAI for RegTech is one marked by both unprecedented potential and considerable challenges. As illustrated, GenAI is poised to revolutionize areas like compliance monitoring, regulatory intelligence, and customer due diligence by automating complex processes and providing insights at scale. These advancements promise significant operational efficiencies, cost reduction, and faster time to value for the GRC function.

Despite the regulatory uncertainties, skills shortages, ethical concerns, and other hurdles, the trajectory for GenAI in RegTech appears largely positive. The regulatory frameworks being developed in the EU, UK, and the US, while seemingly stringent, can also provide a clearer path forward by defining safety and ethical guardrails that enable innovation and the responsible deployment of AI technologies, safeguarding against potential misuse and ensuring alignment with broader societal values.

Ongoing advancements in AI models, exemplified by initiatives like BloombergGPT, demonstrate that scalable and effective solutions can be delivered for the industry. These models are setting benchmarks that promise improvements over traditional AI in accuracy, speed, and adaptability to financial tasks.

In conclusion, while GenAI faces its share of growing pains, the foundations are being laid for a future where its integration within RegTech is inevitable. With continued investment in talent, technology, and robust regulatory frameworks, GenAI is set to offer transformative benefits to the RegTech sector. This evolution will likely not only meet but exceed current expectations, driving significant performance improvements and proving indispensable for modern regulatory compliance and beyond.
