Technically dispatch: Why Databricks wanted a database
All about their recent $1bn acquisition of Neon
Databricks, your friendly neighborhood purveyor of “Leading Data and AI Solutions for Enterprises,” recently bought a small (but beloved) database company called Neon. Neon sells a managed, serverless version of PostgreSQL, one of the most popular production databases in the world; their database helps people build bread-and-butter apps. Meanwhile, Databricks is out there helping giant enterprises build fancy data and AI workflows…so why in LeCun’s name did they make this acquisition at all?
Note: just before this post was published, Snowflake also acquired Crunchy Data, a competitor of Neon’s, for about a quarter of the price, probably for similar reasons (or maybe just because Databricks did it).
To understand the question better, let’s break down what developers actually use data for. Traditionally, databases have broken down into two categories: operational and analytical. Operational databases are what power the day-to-day of your app, like your users table. Analytical databases store long term data for later analysis.
These different types of databases also have different technical requirements. Operational queries – or requests to the database – are usually short, sweet, and need to be real fast. Analytical queries are usually long, complex, and require joining data across multiple sources; but there’s more of a tolerance for them to take longer.
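To make the difference concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The users and orders tables and their contents are made up for illustration, and in real life the two kinds of queries would usually run on separate systems (an operational database like Postgres, and an analytical warehouse) rather than one in-memory database.

```python
import sqlite3

# A throwaway in-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, plan TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount_cents INTEGER);
INSERT INTO users VALUES
    (1, 'ada@example.com', 'pro'),
    (2, 'bob@example.com', 'free'),
    (3, 'cy@example.com', 'pro');
INSERT INTO orders VALUES
    (1, 1, 1200), (2, 1, 800), (3, 2, 500), (4, 3, 3000);
""")


def get_user(user_id):
    """Operational-style query: fetch one row by primary key, as fast as possible."""
    return conn.execute(
        "SELECT id, email, plan FROM users WHERE id = ?", (user_id,)
    ).fetchone()


def revenue_by_plan():
    """Analytical-style query: join tables and aggregate over every row."""
    return conn.execute(
        """
        SELECT u.plan, COUNT(o.id) AS order_count, SUM(o.amount_cents) AS revenue_cents
        FROM users u
        JOIN orders o ON o.user_id = u.id
        GROUP BY u.plan
        ORDER BY u.plan
        """
    ).fetchall()


print(get_user(1))
print(revenue_by_plan())
```

The first function touches a single row and is the kind of thing an app runs on every page load. The second scans and combines whole tables, which is fine on four rows but gets slow on billions, which is why it tends to live in a system built for that kind of work.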
Historically, Databricks has played squarely in the analytical world. They helped developers and data teams get the most out of their analytical data, starting with basic Spark workflows and eventually expanding into models and AI. Neon, meanwhile, is squarely in the operational (also called transactional) world. Postgres is the textbook transactional database. So what is Databricks doing acquiring a transactional database, when all of their customers use them for analytical data?
Indeed, developers on Hacker News were confused:
And many worried that Databricks would ruin Neon:
I can only speculate as to the logic here, but it’s clear that Databricks is trying to expand into the operational market. They bought bit.io, another Neon-like serverless Postgres vendor, in 2023 (and promptly shut them down). I’ll give you two perspectives on why.
The first is the more obvious one: competition is getting stiff in the AI space, and Databricks is starting to lose their advantage in the warehouse part of their business (data lakes, remember them?). Hacker News user jamesblonde says it well:
In non-engineer English: open source software is getting very good. You don’t need to pay up for Databricks to run large analytical or AI workloads anymore. Expanding into production / operational data gives them a new angle to build a business on.
The second perspective is my own: this is about AI agents. In the post announcing the acquisition, Neon put out an interesting stat:
“We leaned into agent-focused development, and within a few months, over 80% of databases were being created by AI agents rather than humans.”
Even if we discount this number by half, it’s still a striking stat. Because Neon is so easy to use via API, it was becoming a popular way for AI agents building apps to get a database going. There’s no human in this loop: someone asked an agent to build an app (potentially using Neon), and the agent autonomously spun up a Neon database to power the app. This, in my mind, is clearly the future of software development: engineers as verifiers and conductors, while AI agents do most of the work.
So if you’re Databricks, and you’re trying to bet your whole $60B+ company on AI, Neon might offer an interesting opportunity for you. The AI agents that people are building on your platform are going to go out there and build apps…why not give them the option to use your production database, and thus pay you for it instead of another database vendor? Databricks could even build in preferences and native integrations to make sure models built on its platform prefer to use Neon.
Anyway, this is just speculation. But for $1B…I hope I’m right.