
Why data teams should pick data lakehouse architectures

Dec 4, 2025

Hi, it’s Kote from Tower. Welcome to part three of my lakehouse series, where I explain why you should choose a lakehouse. I believe this is the most important part so far.

In part one, I explained what data lakehouses, data lakes, and data warehouses are. In part two, I looked at the theory behind choosing between them. Now, I’ll focus on real-world decisions.

To find out why so many teams, especially smaller ones, are choosing lakehouse architecture, I spoke with experts with years of experience building and scaling data systems. Scott Haines, an O’Reilly author and distributed data systems engineer at Buf, and Serhii Sokolenko, co-founder of Tower, shared their insights. This article is based on their ideas and some of my own research.

Why this conversation matters for small data teams

If you’re in a small data team or working solo at a data-focused company, you know the challenges. Your data grows faster than your team. You’re dealing with more logs, events, product usage, JSON exports, ML training data, and more. Once you think you’ve set things up, you find you also need to manage images, audio, video, and other data types that don’t fit neatly into a traditional data warehouse.

Warehouses can handle some of this, but much of the data is unstructured and better suited to a data lake. This means you have to set up and manage both systems, which takes extra time and effort. You spend more time on infrastructure and less on building pipelines or helping your organization make decisions.

A lakehouse brings everything together. Teams get one system for all analytics and ML workloads, with open, affordable storage and the reliability of a warehouse. Building a lakehouse might seem hard, but managed services like Tower make it easier by offering a ready-to-use lakehouse architecture for developers.

This mix of features is making lakehouse architecture more popular. The experts I spoke with think it should be the default choice for teams with limited resources and big goals.

Scott Haines, author and engineer: Why catalogs, not storage, define a real lakehouse

Scott Haines, author of Delta Lake: The Definitive Guide, emphasized that the real value of a lakehouse lies in its metadata layer, not in the storage beneath it. As he puts it:

“A lakehouse without a catalog is basically a lake.”

Scott says the catalog provides governance, schema control, consistency, and context. Without a catalog, teams end up with unmanageable buckets of files, just like a basic data lake.

But technology is only part of the story.

“You can’t be sloppy with how you build things. You need context, and that’s what’s interesting.”

Scott believes that being disciplined with layout, lineage, and governance turns an object store into a real architecture. This careful approach gives small teams the reliability of a warehouse and the flexibility of a lake, without adding complexity they can’t handle.
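Scott’s point can be made concrete with a small sketch. The class and names below (`TableCatalog`, the bucket paths) are hypothetical, invented for illustration; real catalogs such as those backing Iceberg or Delta tables are far richer. The idea it shows is the same, though: a raw object store is just paths, while a catalog answers “what is this table?” with a schema and exactly the set of committed files.

```python
# Toy sketch (not a real catalog API): why a catalog turns "a bucket of
# files" into a governed table. All names here are hypothetical.

# A raw object store is just paths -> bytes, with no schema or context.
object_store = {
    "s3://bucket/events/part-0001.parquet": b"...",
    "s3://bucket/events/part-0002.parquet": b"...",
}

class TableCatalog:
    """Tracks, per table: its schema and exactly which files are committed."""
    def __init__(self):
        self.tables = {}

    def register(self, name, schema):
        self.tables[name] = {"schema": schema, "files": []}

    def commit(self, name, path):
        # Only committed files are visible to readers -- the consistency
        # a plain bucket listing cannot give you.
        self.tables[name]["files"].append(path)

    def describe(self, name):
        return self.tables[name]

catalog = TableCatalog()
catalog.register("events", {"user_id": "long", "ts": "timestamp"})
catalog.commit("events", "s3://bucket/events/part-0001.parquet")

info = catalog.describe("events")
print(info["schema"])  # the table's governed schema
print(info["files"])   # only part-0001 is committed; part-0002 is just a file
```

Without the catalog, a reader scanning the bucket would see both Parquet files and have no way to know that one of them belongs to an in-flight or abandoned write.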

Serhii Sokolenko, co-founder of Tower: Lakehouse read scalability matters for AI agents

“One maybe surprising upside of storing business facts in lakehouses is how well they scale for reads. That matters a lot for AI scenarios, where hundreds of data agents may need to pull facts at the same time while answering users’ questions. In practice, you’re pushing most read traffic onto public-cloud object storage—an infrastructure designed for near-limitless read scalability,” said Serhii Sokolenko, co-founder of Tower and a frequent speaker on agentic data engineering topics.

Data warehouse-level governance, without the extra cost

Cost efficiency is a big reason to consider the lakehouse approach. Data warehouses are usually proprietary, and vendors charge a premium for their storage and compute. Storing data in S3 and using a metadata catalog usually costs much less than using a proprietary data warehouse.

Another reason is the data warehouse-level governance that lakehouses offer, such as ACID transactions, schema enforcement, partitioning, and data versioning. You can keep your data reliable and consistent without paying warehouse storage prices, but you do need to manage things carefully. As Scott explained, you can’t afford to be sloppy. Being able to store more data, keep more history, and support more use cases without raising your infrastructure costs can be a game-changer.
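To show what two of these guarantees mean in practice, here is a minimal sketch of schema enforcement and snapshot-based data versioning in plain Python. The `VersionedTable` class is a toy invented for this example, not a real table format; formats like Iceberg and Delta Lake implement these guarantees over object storage with metadata files and commit protocols.

```python
# Toy sketch of two warehouse-style guarantees a lakehouse table format
# provides over plain files: schema enforcement and snapshot versioning.
# Illustrative only -- not a real table format implementation.

class VersionedTable:
    def __init__(self, schema):
        self.schema = schema    # column name -> Python type
        self.snapshots = [[]]   # snapshot 0 is the empty table

    def append(self, rows):
        # Schema enforcement: reject writes that don't match the schema.
        for row in rows:
            if set(row) != set(self.schema):
                raise ValueError(f"columns {set(row)} != schema {set(self.schema)}")
            for col, typ in self.schema.items():
                if not isinstance(row[col], typ):
                    raise TypeError(f"{col} must be {typ.__name__}")
        # Each commit creates a new immutable snapshot (data versioning).
        self.snapshots.append(self.snapshots[-1] + rows)

    def read(self, snapshot_id=-1):
        # "Time travel": read any historical snapshot by id.
        return self.snapshots[snapshot_id]

t = VersionedTable({"user_id": int, "event": str})
t.append([{"user_id": 1, "event": "signup"}])
t.append([{"user_id": 2, "event": "login"}])

print(len(t.read()))               # 2 rows at the latest snapshot
print(len(t.read(snapshot_id=1)))  # 1 row at the older snapshot

try:
    t.append([{"user_id": "oops", "event": "bad"}])  # wrong type
except TypeError as e:
    print("rejected:", e)
```

The old snapshot stays readable after every commit, which is what lets a real lakehouse audit history or roll back a bad write without warehouse storage prices.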

Create a platform that grows with your data, even if your team doesn’t grow

Companies can’t always afford to grow their data teams. This means the data you’re handling keeps growing, but you might still be the only person managing it. Modern catalogs, compute engines, and orchestrators let you manage a lakehouse without needing a large platform engineering team.

Because the architecture is unified, predictable, and open, you don’t have to rebuild your foundations every time you add a new tool or try a new use case. Your team can stay small, while your system remains flexible enough to support new workloads.

When a lakehouse might not be the right choice

A lakehouse isn’t right for everyone, so it’s worth checking whether it fits your needs before committing. Reasons to hold off on a lakehouse architecture right now include:

  • If you have less than 50GB of data and a single Postgres instance meets your needs, you don’t need this level of abstraction.

  • If your analytics are exclusively structured BI with no plans for ML or unstructured workloads, a warehouse may be the simpler option.

  • If your goal is simply to archive raw data without analysis, a lake is enough.

Final thoughts

Not every problem needs a lakehouse, and not every team needs one right now.

Data lakehouses offer the flexibility of a data lake, the governance of a data warehouse, and the affordability of open object storage, all in one system. They reduce complexity, avoid lock-in, support both BI and AI, and allow teams to focus on delivering insights rather than maintaining duplicated infrastructure. They didn’t become popular by accident. They solve real, everyday challenges modern data teams face.

If you’re thinking about building a lakehouse, you can use Tower with Iceberg, Lakekeeper, or other tools you may want to bring.

You can check out our examples, read our guides, or try it yourself with our free trial. If you want to talk about lakehouses, join us on Discord.

© Tower Computing 2025. All rights reserved

Data Engineering for fast-growing startups and enterprise teams.
