A-Team Insight Blogs

Databricks Extends Capabilities of Lakehouse Data and AI Platform

6 July 2022

Databricks, provider of the Lakehouse data and AI platform, has extended the platform’s capabilities with the addition of advanced data warehousing and governance, data sharing innovations including an analytics marketplace and data clean rooms for data collaboration, automatic cost optimisation for ETL operations, and machine learning (ML) lifecycle improvements.

The company, founded by the creators of open source solutions Delta Lake, Apache Spark and MLflow, works across business sectors including financial services, where its customer base includes the likes of Nasdaq, ABN Amro, Schroders, FIS, and Swedbank.

“Our customers want to be able to do business intelligence, AI and machine learning on one platform, where their data already resides. Databricks Lakehouse Platform gives data teams all of this on a simple, open, and multi-cloud platform,” says Ali Ghodsi, co-founder and CEO at Databricks.

The company’s additional data warehousing capabilities include Databricks SQL Serverless, available in preview on AWS, which provides fully managed elastic compute for improved performance at a lower cost; Photon, a query engine for lakehouse systems that will be made generally available on Databricks Workspaces in the coming weeks; open source connectors for Go, Node.js, and Python, which make it simpler to access the lakehouse from operational applications; and Databricks SQL CLI, which enables developers and analysts to run queries directly from their local computers.

Data governance additions include Unity Catalog, which will be made generally available on AWS and Azure. It provides centralised governance for all data and AI assets, with built-in search and discovery, and automated lineage for all workloads.

The company’s marketplace for data and AI will be available later this year, providing a place to package and distribute data and analytics assets. Unlike pure data marketplaces, Databricks’ offering enables data providers to package and monetise assets such as data tables, files, machine learning models, notebooks and analytics dashboards. Cleanrooms, the company’s data clean room offering, will also be available later this year and will provide a way to share and join data across organisations in a secure, hosted environment, with no data replication required.

ML advancements include MLflow 2.0, which includes MLflow Pipelines to handle the operational setup of ML for users. Instead of setting up orchestration of notebooks, users can define the elements of the pipeline in a configuration file and MLflow Pipelines manages execution automatically. Beyond MLflow, Databricks has added serverless model endpoints to directly support production model hosting, as well as model monitoring dashboards to analyse real-world model performance.
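To illustrate the general idea of a configuration-driven pipeline, the sketch below declares the steps in a configuration document and leaves a small runner to execute them in order. It is not the MLflow Pipelines API or its configuration schema: the step names, configuration keys, toy data and the `run_pipeline` helper are all assumptions made up for this example.

```python
import json

# Hypothetical configuration. The keys and step names are illustrative only
# and do not follow the MLflow Pipelines schema.
CONFIG = json.loads("""
{
  "steps": ["ingest", "split", "train", "evaluate"],
  "split": {"train_fraction": 0.75},
  "data": [[1, 2], [2, 4], [3, 6], [4, 8]]
}
""")


def ingest(state, cfg):
    # Load the (x, y) rows named in the configuration.
    state["rows"] = [tuple(row) for row in cfg["data"]]


def split(state, cfg):
    # Split rows into train and test sets by the configured fraction.
    cut = int(len(state["rows"]) * cfg["split"]["train_fraction"])
    state["train"] = state["rows"][:cut]
    state["test"] = state["rows"][cut:]


def train(state, cfg):
    # Fit y = slope * x by least squares through the origin.
    numerator = sum(x * y for x, y in state["train"])
    denominator = sum(x * x for x, _ in state["train"])
    state["slope"] = numerator / denominator


def evaluate(state, cfg):
    # Mean absolute error of the fitted line on the held-out rows.
    errors = [abs(y - state["slope"] * x) for x, y in state["test"]]
    state["mae"] = sum(errors) / len(errors)


# Registry mapping step names in the configuration to their implementations.
STEPS = {"ingest": ingest, "split": split, "train": train, "evaluate": evaluate}


def run_pipeline(cfg):
    # The runner, not the user, decides how the declared steps are executed.
    state = {}
    for name in cfg["steps"]:
        STEPS[name](state, cfg)
    return state


result = run_pipeline(CONFIG)
print(result["slope"], result["mae"])
```

In this toy version, changing the train fraction or the list of steps means editing the configuration only; the runner and step functions stay untouched.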

Delta Live Tables is an ETL framework using a simple, declarative approach to building data pipelines. Since its introduction earlier this year, Databricks has expanded the framework with a new performance optimisation layer designed to speed up execution and reduce the costs of ETL.

Ghodsi concludes: “These new capabilities are advancing our Lakehouse vision to make it faster and easier than ever before to maximise the value of data, both within and across companies.”
