This section covers the foundational architecture of Microsoft Fabric, including OneLake, workspaces, capacities, and the medallion (Bronze/Silver/Gold) data organization pattern.
Core Architecture
Understanding the foundational building blocks of Microsoft Fabric.
Architecture Overview
Microsoft Fabric is organized around three key concepts: Capacities (the compute engine), Workspaces (the organizational unit), and Experiences (the specialized tools for different data roles).
Key Concepts
Capacities
A capacity is a dedicated set of compute resources. All Fabric workloads (Spark, SQL, Power BI, etc.) share the same capacity pool, measured in Capacity Units (CUs). You choose a SKU (F2, F4, F8, … F2048) based on your workload needs.
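The F SKU name encodes its size: the number after the F is the capacity's CU count, and the Spark runtime currently maps one CU to two Spark VCores. A minimal Python sketch of that arithmetic (the 2-VCore ratio is an assumption worth verifying against current Fabric documentation):

```python
def capacity_units(sku: str) -> int:
    """Return the Capacity Units for a Fabric F SKU, e.g. 'F64' -> 64.

    For F SKUs the CU count is encoded directly in the SKU name.
    """
    if not sku.upper().startswith("F") or not sku[1:].isdigit():
        raise ValueError(f"not a Fabric F SKU: {sku!r}")
    return int(sku[1:])


def spark_vcores(sku: str) -> int:
    # Assumption: 1 CU = 2 Spark VCores (check current Fabric docs).
    return capacity_units(sku) * 2
```

So an F64 capacity provides 64 CUs, or roughly 128 Spark VCores under that mapping.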
Workspaces
Workspaces are the primary organizational and security boundary in Fabric. Think of them as folders that contain your artifacts (lakehouses, warehouses, notebooks, pipelines, reports). Each workspace is mapped to a capacity and has its own access control.
OneLake
OneLake is Fabric's built-in data lake: a single, unified storage layer for the entire organization. It's built on Azure Data Lake Storage Gen2 and uses the Delta Lake open format. Every Fabric tenant gets exactly one OneLake, and every workspace automatically gets a folder in OneLake.
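Because OneLake exposes an ADLS Gen2-compatible endpoint (`onelake.dfs.fabric.microsoft.com`), any tool that speaks ABFS can address workspace items by path. A small sketch of how such a path is composed (the workspace and item names are illustrative):

```python
def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build the ABFS URI for a Delta table in a OneLake lakehouse.

    OneLake is exposed through a single ADLS Gen2-compatible endpoint;
    items are addressed as <item>.<ItemType> folders inside the workspace.
    """
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


# Illustrative names, not real defaults.
path = onelake_table_path("sales_ws", "lh_bronze", "bronze_crm_customers")
```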
Use Shortcuts to reference data in external storage (AWS S3, Google Cloud Storage, other ADLS accounts) without copying it. This lets you unify your data view without moving data.
Experiences
| Experience | Purpose | Key Artifacts |
|---|---|---|
| Data Factory | Data ingestion and orchestration | Pipelines, Dataflows Gen2 |
| Data Engineering | Big data transformation with Spark | Lakehouse, Notebooks, Spark Jobs |
| Data Science | Machine learning and experimentation | Notebooks, Experiments, Models |
| Data Warehouse | Enterprise data warehousing with T-SQL | Warehouse, SQL Queries |
| Real-Time Intelligence | Streaming and time-series analytics | Eventhouse, KQL Queryset |
| Power BI | Business intelligence and reporting | Semantic Models, Reports, Dashboards |
Medallion Architecture
The proven data organization pattern for building reliable and scalable data lakehouses.
What is the Medallion Architecture?
The medallion architecture (also known as the "multi-hop" architecture) is a data design pattern that organizes data into three logical layers: Bronze, Silver, and Gold. Each layer represents an increasing level of data quality and business readiness.
Layer Details
Bronze Layer (Raw)
- Stores data exactly as received from source systems
- Append-only ingestion: never modify or delete raw records
- Include metadata columns: `_ingestion_timestamp`, `_source_system`, `_batch_id`
- Store in Delta format for time travel and ACID transactions
- Retain raw data for compliance, auditing, and reprocessing
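In a Fabric notebook these metadata columns would typically be added with PySpark `withColumn` calls; the plain-Python sketch below shows the same idea without a Spark dependency (the function name is invented, the column names follow the bullets above):

```python
import uuid
from datetime import datetime, timezone


def stamp_bronze_metadata(records, source_system):
    """Bronze ingestion sketch: append lineage columns, never mutate the payload."""
    batch_id = str(uuid.uuid4())                 # one id per ingestion batch
    ts = datetime.now(timezone.utc).isoformat()  # when this batch landed
    return [
        {
            **record,                            # raw payload, untouched
            "_ingestion_timestamp": ts,
            "_source_system": source_system,
            "_batch_id": batch_id,
        }
        for record in records
    ]


rows = stamp_bronze_metadata([{"id": 1}, {"id": 2}], source_system="crm")
```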
Silver Layer (Cleansed & Conformed)
- Apply data quality rules: deduplication, null handling, type casting
- Standardize column names and data types across sources
- Join and enrich data from multiple Bronze tables
- Apply slowly changing dimensions (SCD Type 1/2) where needed
- This layer is the "single source of truth" for your organization
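As a concrete illustration of Silver-layer cleansing, the sketch below conforms two hypothetical source schemas, handles nulls, casts types, and deduplicates. In practice this would be PySpark or T-SQL; every field name here is invented:

```python
def to_silver(bronze_rows):
    """Silver cleansing sketch: conform schemas, cast types, drop nulls, dedupe."""
    seen, silver = set(), []
    for row in bronze_rows:
        raw_id = row.get("CustID", row.get("customer_id"))  # conform source variants
        if raw_id is None:                                  # null handling
            continue
        customer_id = int(raw_id)                           # type casting
        if customer_id in seen:                             # deduplication
            continue
        seen.add(customer_id)
        silver.append({
            "customer_id": customer_id,
            "customer_name": (row.get("Name") or row.get("name") or "").strip().title(),
        })
    return silver


silver = to_silver([
    {"CustID": "7", "Name": " ada lovelace "},  # CRM spelling
    {"customer_id": 7, "name": "duplicate"},    # same customer from ERP
    {"Name": "missing id"},                     # fails the quality rule
])
```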
Gold Layer (Business-Ready)
- Build star/snowflake schemas with facts and dimensions
- Pre-aggregate KPIs and business metrics
- Optimized for Direct Lake mode in Power BI
- Apply column-level and row-level security as needed
- This layer serves dashboards, reports, and ad-hoc analysis
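A Gold-layer build can be as simple as grouping Silver facts by day and summing a measure. A plain-Python sketch of a hypothetical daily sales summary (in Fabric this would normally be Spark SQL or T-SQL):

```python
from collections import defaultdict


def build_gold_sales_summary_daily(fact_sales):
    """Gold sketch: pre-aggregate a daily revenue KPI from Silver fact rows."""
    totals = defaultdict(float)
    for row in fact_sales:
        totals[row["order_date"]] += row["amount"]
    return [
        {"order_date": day, "total_revenue": revenue}
        for day, revenue in sorted(totals.items())
    ]


summary = build_gold_sales_summary_daily([
    {"order_date": "2024-01-01", "amount": 100.0},
    {"order_date": "2024-01-01", "amount": 50.0},
    {"order_date": "2024-01-02", "amount": 25.0},
])
```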
Naming Conventions
```
Lakehouses:
  lh_bronze   - Raw ingestion lakehouse
  lh_silver   - Cleansed and conformed data
  lh_gold     - Business-ready consumption layer

Tables (Bronze):
  bronze_crm_customers      - Source: CRM, Table: customers
  bronze_erp_sales_orders   - Source: ERP, Table: sales_orders

Tables (Silver):
  silver_dim_customer       - Conformed customer dimension
  silver_fact_sales         - Conformed sales fact

Tables (Gold):
  gold_sales_summary_daily  - Daily sales aggregation
  gold_customer_360         - Customer 360 view
```
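If you adopt these conventions, a small validator can enforce them in CI or in pipeline code. A sketch using regular expressions (the patterns mirror the examples above; adjust them to your own convention):

```python
import re

# Patterns derived from the naming examples above; tighten to taste.
LAYER_PATTERNS = {
    "bronze": re.compile(r"^bronze_[a-z0-9]+_[a-z0-9_]+$"),   # bronze_<source>_<table>
    "silver": re.compile(r"^silver_(dim|fact)_[a-z0-9_]+$"),  # silver_<dim|fact>_<name>
    "gold":   re.compile(r"^gold_[a-z0-9_]+$"),               # gold_<name>
}


def valid_table_name(layer: str, name: str) -> bool:
    """Check a table name against the medallion naming convention."""
    return bool(LAYER_PATTERNS[layer].match(name))
```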
Don't skip the Silver layer. Going directly from Bronze to Gold creates brittle pipelines and makes it harder to add new consumers later. The Silver layer provides a stable contract between producers and consumers.
When to Use Medallion vs. Other Patterns
| Pattern | Best For | Considerations |
|---|---|---|
| Medallion (Bronze/Silver/Gold) | Most Fabric implementations; batch and micro-batch workloads | Clear separation of concerns; well-understood pattern |
| Data Mesh | Large orgs with domain-oriented teams | Combine with medallion within each domain |
| Lambda / Kappa | Dual batch + real-time pipelines | Use Real-Time Intelligence alongside medallion |
Real-Time Intelligence
Streaming analytics, event processing, and time-series workloads in Microsoft Fabric, from ingestion to live dashboards in seconds.
Use Real-Time Intelligence when you need sub-second latency on streaming data: IoT telemetry, clickstreams, fraud detection, operational monitoring, or log analytics. For batch/micro-batch workloads, the Lakehouse + medallion pattern is more appropriate.
Core Components
Eventstreams
No-code event ingestion from 30+ sources: Azure Event Hubs, Kafka, IoT Hub, custom apps, and CDC streams. Transform events in-flight with built-in processors (filter, aggregate, union).
Eventhouse
The primary database for real-time data. Built on the Azure Data Explorer (Kusto) engine and optimized for append-heavy, time-series workloads with automatic indexing and compression.
KQL Queryset
Kusto Query Language for exploring streaming data. Purpose-built for time-series work: `summarize`, `make-series`, `render timechart`, anomaly detection, and pattern matching.
Real-Time Dashboards
Live dashboards with auto-refresh down to 1-second intervals. Pinned KQL visuals, parameters, and cross-filtering, with no import or refresh schedule needed.
Activator (Data Activator)
No-code trigger engine: monitor streaming data and fire actions (emails, Teams messages, Power Automate flows) when conditions are met. Think "alerts as a service."
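An Activator rule boils down to "watch a field, fire once when a condition becomes true, re-arm when it recovers." A toy Python version of that semantics; the names and the re-arm behavior are illustrative, not Activator's actual API:

```python
def make_threshold_trigger(field, limit, action):
    """Activator-style rule sketch: call `action` once per threshold crossing."""
    state = {"armed": True}

    def on_event(event):
        value = event[field]
        if state["armed"] and value > limit:
            state["armed"] = False      # fire once, suppress until recovery
            action(event)
        elif value <= limit:
            state["armed"] = True       # value recovered: re-arm the rule
    return on_event


alerts = []
trigger = make_threshold_trigger("temp_c", 80, alerts.append)
for reading in ({"temp_c": 75}, {"temp_c": 83}, {"temp_c": 85}, {"temp_c": 70}):
    trigger(reading)
# Only the first crossing (83) fires; 85 is suppressed until the value recovers.
```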
Real-Time Hub
Centralized catalog of all streaming data in your organization. Discover, subscribe to, and share real-time event streams across workspaces and domains.
Architecture Pattern: Event-Driven Pipeline
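The flow is typically Eventstream → Eventhouse → KQL query/dashboard, with Activator watching for alert conditions. The toy sketch below simulates that chain with plain Python objects; every class and method name is illustrative, not a Fabric API:

```python
class Eventhouse:
    """Toy stand-in for an Eventhouse table: append-only and queryable."""

    def __init__(self):
        self.rows = []

    def ingest(self, event):
        self.rows.append(event)

    def count_where(self, predicate):
        # Stand-in for a KQL `summarize count()` over filtered rows.
        return sum(1 for r in self.rows if predicate(r))


def eventstream(source_events, sinks):
    """Toy Eventstream: route every incoming event to all configured sinks."""
    for event in source_events:
        for sink in sinks:
            sink(event)


house = Eventhouse()
eventstream(
    [{"device": "d1", "temp_c": 71}, {"device": "d2", "temp_c": 94}],
    sinks=[house.ingest],
)
hot = house.count_where(lambda r: r["temp_c"] > 90)  # dashboard-style query
```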
Eventhouse vs. Lakehouse: When to Use What
| Aspect | Eventhouse (KQL) | Lakehouse (Spark/SQL) |
|---|---|---|
| Data pattern | Append-heavy, time-series, streaming | Batch, micro-batch, full reloads |
| Latency | Sub-second ingestion to query | Minutes (Spark jobs) to seconds (Direct Lake) |
| Query language | KQL (Kusto Query Language) | Spark SQL, PySpark, T-SQL |
| Best for | Logs, IoT, clickstream, monitoring, fraud | Data warehousing, ML feature stores, reports |
| Retention | Hot/warm caching with auto-purge policies | Persistent Delta tables in OneLake |
| Integration | Eventstreams, Real-Time Hub, Activator | Data Factory, notebooks, Power BI Direct Lake |
| OneLake | Can mirror data to OneLake as Delta for cross-engine access | Native OneLake storage |
Use both together: Eventstreams routes hot data to Eventhouse for real-time dashboards and alerts, while simultaneously landing the same events into a Lakehouse Bronze layer for historical analytics and ML. This "lambda-like" pattern is natively supported in Fabric.
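The dual-path routing described above can be sketched in a few lines; `hot_sink` and `cold_sink` are hypothetical stand-ins for an Eventhouse ingest and a Bronze Delta append:

```python
def route_events(events, hot_sink, cold_sink):
    """Lambda-like routing sketch: every event goes to BOTH paths."""
    for event in events:
        hot_sink(event)    # hot path: Eventhouse for dashboards and alerts
        cold_sink(event)   # cold path: Lakehouse Bronze for history and ML


hot, cold = [], []
route_events([{"id": 1}, {"id": 2}], hot.append, cold.append)
```

The key property is that neither path filters the other: the Bronze layer keeps the full history even when the hot path only cares about anomalies.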