OneLake: Beyond the Single Lake Illusion - How Fabric's Multi-Cloud Abstraction Really Works

28 July 2025

The pitch is elegant: OneLake is the "single unified data lake" underpinning Microsoft Fabric, promising a frictionless experience across data domains, teams, and clouds. It sounds simple: one place for all your data, no matter where it actually lives. But under the hood, the reality is far more nuanced, clever, and technically ambitious.

This blog unpacks how OneLake actually abstracts underlying cloud storage systems like ADLS Gen2, AWS S3, and Google Cloud Storage (GCS). We'll go beyond marketing to explore how OneLake handles the complex realities of cloud heterogeneity, performance, security, and the idea of a "single copy of data." Is this truly revolutionary data architecture, or is it smart virtualization dressed up in new clothes? Let's get into it.

The OneLake Promise: One Copy, One Experience

Fabric's OneLake positions itself as the default storage layer for the entire Microsoft Fabric ecosystem. All Fabric workloads—Power BI, Synapse, Data Factory, Real-Time Analytics—read and write directly to OneLake. This isn't just convenience; it's a strategy to standardize how data is accessed and managed, regardless of whether it's structured, semi-structured, or unstructured.

The core claim? "One copy of data." This means avoiding redundant copies across services or tools. Instead of ETLing data into specialized storage for each use case, OneLake encourages direct access via native formats (like Delta Lake on Parquet).
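To make "direct access" concrete, here is a minimal sketch of how a OneLake location can be addressed with an ADLS-style URI. It follows the abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<itemtype>/<path> pattern as we understand it from Microsoft's documentation. The workspace, lakehouse, and table names are made up for illustration, and the helper function itself is ours, not part of any SDK.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, path: str) -> str:
    """Build an abfss:// URI for a path inside a Fabric item.

    Illustrative helper only; it simply formats a string.
    """
    clean_path = path.strip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{clean_path}"


# Hypothetical names: every engine that understands this URI reads the same
# Delta table files instead of receiving its own exported copy.
orders_uri = onelake_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Tables/orders")
print(orders_uri)
```

The point is that the address identifies one set of files. Any engine that can read Delta on Parquet from that address is working on the same copy.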

But here's the kicker: OneLake can connect to data in other clouds without moving it. That's the real magic—and the challenge.

Shortcuts: The Glue of OneLake's Illusion

At the heart of OneLake's abstraction model is the concept of Shortcuts. These are essentially metadata pointers to data that exists elsewhere, whether inside OneLake or in external cloud storage like S3 or ADLS Gen2.

Think of a Shortcut as a symbolic link or mount point. From the user's perspective, the data appears as if it's part of OneLake's unified namespace. But behind the scenes, the data may still physically reside in AWS or Google Cloud Storage.

Shortcuts support Delta Lake format natively, enabling compatibility with Fabric's analytical engines. When you query a Shortcut, Fabric doesn't pull the data into OneLake; it accesses it in-place. This is how OneLake avoids data duplication while delivering a seemingly unified experience.
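The symbolic-link analogy can be sketched in a few lines. This is a toy model under our own assumptions, not Fabric's actual implementation: the Shortcut class, the resolve function, the onelake:// scheme, and the bucket name are all hypothetical. It only shows the idea of mapping a path in the unified namespace to the place the bytes physically live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shortcut:
    mount_path: str  # where the data appears in the unified namespace
    target_uri: str  # where the data physically lives


def resolve(path: str, shortcuts: list[Shortcut]) -> str:
    """Map a logical path to a physical location (longest mount prefix wins)."""
    best = None
    for sc in shortcuts:
        mount = sc.mount_path.rstrip("/")
        if path == mount or path.startswith(mount + "/"):
            if best is None or len(mount) > len(best.mount_path.rstrip("/")):
                best = sc
    if best is None:
        # No shortcut applies: the data is stored in the lake itself.
        return f"onelake://{path}"  # hypothetical scheme for illustration
    suffix = path[len(best.mount_path.rstrip("/")):]
    return best.target_uri.rstrip("/") + suffix


shortcuts = [Shortcut("Lakehouse/Tables/clicks", "s3://example-bucket/events/clicks")]
print(resolve("Lakehouse/Tables/clicks/part-0001.parquet", shortcuts))
print(resolve("Lakehouse/Tables/orders/part-0001.parquet", shortcuts))
```

The user only ever sees the left-hand path. Nothing is copied at resolution time; the read is simply redirected to the external location.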

Benefits of Shortcuts:

Zero-copy data sharing across workspaces

Cross-cloud access without ETL

Unified governance and cataloging

But this raises some big questions.

Performance Across Clouds: Abstraction vs. Reality

The performance of cross-cloud data access depends heavily on the physical location of the data and the compute resources trying to access it. Even with Shortcuts, if your compute engine is in Azure and your Shortcut points to a dataset in S3, network latency and egress costs become non-trivial.

Microsoft acknowledges this. While Shortcuts enable seamless data access, they don't magically flatten performance differences. Accessing data in-place across cloud boundaries will always have a performance trade-off. There's no way around physics.

That means architects need to be strategic:

Use Shortcuts for metadata consolidation, occasional queries, or non-performance-critical workloads.

Co-locate performance-critical compute and data where possible.

Monitor costs related to data egress and latency.

In short, Shortcuts solve governance and access pain points, but they're not a silver bullet for performance.
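The cost side of that trade-off lends itself to back-of-the-envelope arithmetic. The sketch below compares repeatedly reading data in place across a cloud boundary with copying it once next to the compute. The per-GB rate and the workload numbers are placeholders we picked for illustration, not quoted prices; substitute your provider's current egress pricing.

```python
def monthly_egress_cost(gb_per_query: float, queries_per_day: int,
                        usd_per_gb: float, days: int = 30) -> float:
    """Egress cost of reading data in place across clouds for one month."""
    return gb_per_query * queries_per_day * days * usd_per_gb


def one_time_copy_cost(dataset_gb: float, usd_per_gb: float) -> float:
    """Egress cost of copying the whole dataset across once."""
    return dataset_gb * usd_per_gb


RATE = 0.09  # assumed USD per GB, placeholder only

in_place = monthly_egress_cost(gb_per_query=50, queries_per_day=20, usd_per_gb=RATE)
copy_once = one_time_copy_cost(dataset_gb=500, usd_per_gb=RATE)

print(f"In-place reads per month: ${in_place:,.2f}")
print(f"One-time copy:            ${copy_once:,.2f}")
```

With these made-up numbers, a frequently queried dataset costs far more to read in place than to copy once, which is why co-location matters for hot workloads. For an occasional query the comparison flips, and the Shortcut is the cheaper and simpler choice.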

Security Boundaries: Trust, Identity, and Control

Abstracting multiple clouds under OneLake raises serious security questions. How does OneLake handle identity, permissions, and data governance across different providers?

Each cloud platform has its own security model:

Azure uses Microsoft Entra ID (formerly Azure AD) and RBAC.

AWS relies on IAM roles and policies.

GCP uses IAM with its own structure.

Fabric addresses this with federated identity and delegated authorization. When creating a Shortcut to external storage, users provide credentials or delegate access using service principals or OAuth tokens. Fabric then enforces access control based on those credentials.

Still, this creates complexity:

Enterprises must manage cross-cloud identity federation securely.

Auditing and logging must span clouds and platforms.

Governance policies must be enforced consistently, which can be tricky when native policies differ.

The unified governance promise of OneLake is strong, but it requires tight integration with Microsoft Purview and careful planning across security domains.

Is This a True "Single Copy"?

On paper, OneLake's Shortcuts mean there's only one copy of the data. But it's important to be precise about what "single copy" means:

Logical single copy: All tools point to the same data location. No ETL needed. No duplicates.

Physical single copy: The data is literally stored once and accessed in-place.

OneLake generally achieves a logical single copy. But depending on performance needs, compliance constraints, or data sovereignty laws, organizations may still create physical copies.

Moreover, tools outside the Fabric ecosystem may not be able to use Shortcuts as easily. That means compatibility is limited to environments that understand OneLake's namespace and metadata structure.

So while the abstraction is powerful, it's not universal. Trade-offs exist.

Revolutionary Abstraction or Smart Virtualization?

OneLake's approach is undeniably clever. Using metadata-based Shortcuts to unify data access across clouds is an elegant form of virtualization. It's comparable to what Snowflake does with external tables or what Databricks enables with Unity Catalog.

But is it revolutionary?

Yes, in scope and integration. OneLake isn't just a storage abstraction layer; it's deeply integrated into Microsoft Fabric's end-to-end analytics stack. That means governance, security, BI, and engineering workflows all benefit from the abstraction.

No, in concept. The idea of abstracting data across systems via metadata is not new. The difference is how Microsoft has productized and operationalized it at scale.

So, the answer is: it depends on what problem you're trying to solve.

Practical Trade-offs for Architects

For data architects, OneLake is both a gift and a challenge. Here's how to think about its trade-offs:

Pros:

Simplified data architecture

Reduced data duplication

Centralized governance

Faster onboarding for cross-cloud data

Cons:

Performance unpredictability for cross-cloud Shortcuts

Complexity in securing multi-cloud access

Limited utility outside the Fabric ecosystem

The key is to balance abstraction with awareness. OneLake can streamline your data landscape, but it doesn't eliminate the fundamental realities of cloud diversity.

Final Thoughts

OneLake's unified lake abstraction is not a magic wand—but it is a powerful toolkit. The ability to federate data access across clouds, expose it through a single namespace, and govern it centrally is a leap forward in simplification.

Just remember: abstraction always comes with trade-offs. Whether you're building a modern data mesh, migrating to Fabric, or consolidating data governance, OneLake gives you more leverage—as long as you understand what it's actually doing under the hood.

Reach out to us to discuss how OneLake within Fabric can help elevate your data analytics environment.
