👀 Who is this for?

IT Admin, Data Architect, Platform Owner, Data Steward

This page covers data catalog strategies in Microsoft Fabric: OneLake Catalog capabilities, Microsoft Purview integration, federated governance, and best practices for metadata-driven data discovery.

Overview

Overview: Data Cataloging in Fabric

Why cataloging matters and how Fabric delivers it through a dual-catalog model

Data cataloging is the discipline of creating an organized, searchable inventory of all data assets within an organization. Without a catalog, data consumers spend significant time hunting for the right datasets, duplicating existing work, or unknowingly using stale or non-compliant data. In enterprise environments where hundreds of teams produce thousands of artifacts, a well-maintained catalog becomes the foundation of effective data management.

Microsoft Fabric addresses this challenge through a dual-catalog model that operates at two complementary scopes. OneLake Catalog is Fabric's built-in catalog: deeply integrated into the Fabric experience and optimized for day-to-day discovery and governance of OneLake assets. Microsoft Purview is the enterprise-wide governance platform that spans the entire data estate: Microsoft 365, Azure services, Fabric, on-premises systems, and third-party platforms such as AWS and Snowflake.

These two catalogs are not competitors; they are layers in a unified governance architecture. OneLake Catalog provides the fast, contextual experience that data engineers and analysts need within Fabric. Purview provides the cross-estate visibility, compliance controls, and regulatory-grade governance that IT and compliance teams require. Together, they ensure that data is discoverable, trustworthy, and protected at every level.

The key pillars of effective data cataloging in Fabric are:

πŸ” Discovery

Enable data consumers to find the right data quickly through search, browse, metadata filtering, and AI-powered recommendations. Reduce time-to-insight by making assets visible and well-described.

πŸ›‘οΈ Governance

Establish policies, ownership, and accountability for data assets. Track endorsement status, metadata quality, and policy compliance. Ensure data meets organizational standards before broad consumption.

πŸ” Security

Protect sensitive data through classification, sensitivity labels, and access controls. Apply data loss prevention policies and audit who accesses what. Ensure regulatory compliance across the estate.

🔗 Lineage

Trace data from source through transformations to consumption. Understand impact of changes, validate data freshness, and build trust through transparency. Enable root-cause analysis when issues arise.

OneLake Catalog

Fabric's built-in catalog for browsing, governing, and securing OneLake assets

OneLake Catalog is the native data catalog experience within Microsoft Fabric. It provides a centralized interface for discovering, governing, and securing all data assets stored in OneLake, including lakehouses, warehouses, semantic models, pipelines, notebooks, and reports. Unlike external catalog solutions that require separate configuration and synchronization, OneLake Catalog is deeply embedded in the Fabric platform and reflects the current state of your data estate in real time.

The catalog is organized around three primary tabs, each addressing a distinct aspect of data management:

Explore Tab

The Explore tab is the primary discovery interface for data consumers. It allows users to browse all accessible data assets across workspaces, domains, and item types. Users can filter by metadata attributes, sensitivity labels, endorsement status, owner, and freshness. The search experience supports natural language queries and surfaces results based on relevance, usage patterns, and the user's permissions.

Key capabilities include cross-workspace browsing with domain-based organization, metadata-driven filtering (item type, owner, sensitivity label, endorsement), lineage visualization showing upstream and downstream dependencies, and permission-aware results that respect workspace and item-level security. Users see only what they have access to; the catalog never exposes assets beyond a user's permissions.

Govern Tab

The Govern tab provides governance insights and recommended actions for catalog administrators and data stewards. It surfaces governance reports that highlight gaps in metadata quality, ownership assignments, sensitivity labeling, and endorsement status. The actionable governance experience presents specific recommendations (such as "42 datasets lack descriptions" or "15 items have no assigned owner") and enables bulk remediation directly from the interface.
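The kind of gap report described above can be approximated outside the UI. The following is a minimal sketch, assuming item metadata has already been exported as a list of dictionaries; the field names (`description`, `owner`, `sensitivity_label`, `endorsement`) and the sample items are hypothetical and not taken from any Fabric response format.

```python
from collections import Counter

# Hypothetical metadata fields; real catalog items may expose different names.
REQUIRED_FIELDS = ("description", "owner", "sensitivity_label", "endorsement")


def governance_gaps(items):
    """Count how many items are missing each required metadata field."""
    gaps = Counter()
    for item in items:
        for field in REQUIRED_FIELDS:
            value = item.get(field)
            # Treat None, empty strings, and whitespace-only strings as missing.
            if value is None or not str(value).strip():
                gaps[field] += 1
    return dict(gaps)


def recommendations(items):
    """Turn gap counts into short, human-readable recommendations."""
    return [
        f"{count} item(s) lack a value for '{field}'"
        for field, count in sorted(governance_gaps(items).items())
    ]


# Illustrative sample data only.
sample_items = [
    {"name": "sales_lakehouse", "description": "Curated sales data",
     "owner": "alice", "sensitivity_label": "General", "endorsement": "Certified"},
    {"name": "tmp_export", "description": "", "owner": None,
     "sensitivity_label": "General", "endorsement": None},
    {"name": "hr_model", "description": "Headcount model", "owner": "bob",
     "sensitivity_label": None, "endorsement": "Promoted"},
]

for line in recommendations(sample_items):
    print(line)
```

Counting gaps per field rather than per item keeps the output close to the phrasing of the recommendations quoted above.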

Policy compliance monitoring tracks whether assets meet organizational governance requirements. Administrators can see which domains are well-governed and which need attention, track trends over time, and identify areas where governance debt is accumulating. The Govern tab transforms governance from a periodic audit exercise into a continuous, observable process.

Secure Tab

The Secure tab provides a unified view for managing security across OneLake assets. It consolidates workspace roles, item permissions, row-level security, and sensitivity labels into a single pane. Administrators can audit who has access to specific assets, review permission inheritance, and identify over-permissioned users or groups.

This tab is particularly valuable for access reviews and compliance audits. Rather than navigating to each workspace individually, security teams can assess the access landscape from a centralized view, identify anomalies, and take corrective action.

Catalog API

The OneLake Catalog exposes a REST API for programmatic metadata discovery. This enables automation scenarios such as custom governance dashboards, integration with external ITSM tools, automated metadata validation pipelines, and bulk operations on catalog assets. The API supports querying across workspaces, retrieving lineage information, and reading governance metadata.

Catalog API: Discover Items
GET https://api.fabric.microsoft.com/v1/admin/items
    ?workspaceId={workspace-id}
    &type=Lakehouse

Authorization: Bearer {token}
Content-Type: application/json
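For automation, the request shown above can be issued from a script. This is a minimal sketch using only the Python standard library; it assumes a valid bearer token has been obtained elsewhere, returns the raw JSON body without assuming its shape, and does not handle pagination or errors. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.fabric.microsoft.com/v1/admin/items"


def build_items_url(workspace_id, item_type=None):
    """Build the query URL for the item listing request shown above."""
    params = {"workspaceId": workspace_id}
    if item_type:
        params["type"] = item_type
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def list_items(token, workspace_id, item_type=None):
    """Send the GET request and return the decoded JSON body.

    Token acquisition is out of scope; `token` is assumed to be a valid
    bearer token obtained elsewhere.
    """
    request = urllib.request.Request(
        build_items_url(workspace_id, item_type),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the URL needs no network access, so it is safe to run anywhere.
print(build_items_url("00000000-0000-0000-0000-000000000000", "Lakehouse"))
```

Keeping URL construction separate from the network call makes the query parameters easy to check before any request is sent.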

The catalog also integrates with Microsoft Teams (sharing and discovering assets in context), Excel (connecting to endorsed datasets), and Copilot (natural language questions about your data estate).

💡 Organize by Business Domain

Structure your OneLake Catalog around business domains (Finance, Sales, Operations, HR) rather than technology silos. Assign domain owners who are accountable for metadata quality and endorsement within their domain. This aligns catalog organization with how business users think about data and makes discovery intuitive.

Microsoft Purview

Enterprise-wide data governance across your entire data estate

Microsoft Purview is the enterprise-wide data governance platform that provides a unified view of your entire data estate, spanning Microsoft 365, Azure services, Microsoft Fabric, on-premises systems, and third-party clouds including AWS, Google Cloud, Snowflake, and Databricks. While OneLake Catalog focuses on the Fabric experience, Purview operates at the organizational level, delivering the cross-estate visibility and compliance controls that large enterprises require.

Purview's Unified Data Catalog serves as the authoritative inventory of all data assets across the organization. It automatically discovers and classifies data wherever it resides, maintains a living data map, and enables governance teams to apply consistent policies across heterogeneous environments. For organizations using Microsoft Fabric, Purview extends governance beyond OneLake to encompass the full data supply chain.

The platform provides AI-aware security capabilities that understand how data flows through AI systems, ensuring that sensitive information is protected even when consumed by machine learning models, copilots, and automated processes. Cross-cloud coverage means that governance policies defined in Purview apply regardless of where data is stored or processed.

πŸ—ΊοΈ Data Map

Browse and search your entire data estate from a single interface. Automated scanning discovers assets across Azure, Microsoft 365, on-premises, and multi-cloud environments. Visualize the topology of your data landscape.

🔗 Data Lineage

End-to-end visual lineage from source systems through ETL transformations to Power BI reports. Understand how data flows, identify impact of changes, and trace issues back to their origin across system boundaries.

🏷️ Classification

Automated sensitive data classification using built-in and custom classifiers. Detect PII, financial data, health records, and other sensitive categories. Apply classifications consistently across all data sources.

📖 Business Glossary

Define business terms and link them to technical assets. Create a shared vocabulary across the organization. Ensure consistent interpretation of metrics, dimensions, and KPIs regardless of the underlying system.

🔒 Sensitivity Labels

Microsoft Information Protection (MIP) labels that flow downstream automatically. Labels applied in Purview propagate to Fabric items, Power BI reports, and exported files, enforcing protection throughout the data lifecycle.

📋 DLP & Compliance

Data loss prevention policies that detect and prevent unauthorized sharing of sensitive data. Regulatory compliance frameworks for GDPR, HIPAA, SOX, and industry-specific requirements. Audit trails for all governance actions.

⚡ Key Distinction

Microsoft Purview is the estate-wide governance plane: it spans your entire organization's data landscape across all platforms and clouds. OneLake Catalog is the Fabric-focused experience, optimized for fast, contextual data discovery and governance within Microsoft Fabric. Use both: Purview for cross-estate policies and compliance, OneLake Catalog for daily Fabric operations.

Comparison

OneLake Catalog vs. Purview

Understanding the scope and strengths of each catalog

OneLake Catalog and Microsoft Purview serve different but complementary roles in the governance architecture. The following comparison highlights where each excels and helps you understand when to use which tool for specific governance tasks.

| Dimension | OneLake Catalog | Microsoft Purview |
|---|---|---|
| Coverage | Fabric/OneLake assets only | Cross-cloud, multi-environment (Azure, M365, AWS, GCP, on-prem) |
| Governance | Workspace-level metadata & security management | Advanced: sensitivity, data quality, lineage, policy enforcement, full audit |
| Data Products | Listing and search within Fabric | Full marketplace: publish, endorse, business glossary, data contracts |
| Security | Fabric-level, workspace-scoped access control | Unified, regulatory-grade, estate-wide security posture |
| Lineage | Within Fabric only | End-to-end, cross-system lineage |
| Classification | Basic metadata tagging | Automated sensitive data classification with 300+ built-in classifiers |
| Integration | Embedded in Fabric UI, Teams, Excel, Copilot | M365, Azure, on-prem, multi-cloud, third-party connectors |

The key takeaway is that these tools operate at different scales. OneLake Catalog is your "inner loop": fast, contextual, and optimized for the Fabric user experience. Purview is your "outer loop": comprehensive, policy-driven, and designed for enterprise-wide governance and compliance.

Who Uses What? Persona Guide

Different roles interact with OneLake Catalog and Purview in different ways. Use this guide to understand which tool each persona should focus on and why.

πŸ‘¨β€πŸ’» Data Engineer

OneLake Catalog

Your daily driver. Use OneLake Catalog to find datasets, check lineage within Fabric, discover endorsed lakehouses and warehouses, and verify data freshness before building pipelines. You rarely need Purview directly β€” its policies flow into your Fabric experience automatically.

📊 Data Analyst / BI Developer

OneLake Catalog

Search for certified and promoted datasets to build reports on. Connect directly from Power BI, Excel, or Teams. Check endorsement status and sensitivity labels before sharing dashboards. OneLake Catalog surfaces everything you need without leaving the Fabric ecosystem.

πŸ›‘οΈ Data Steward

Both

You live in both worlds. Use OneLake Catalog's Govern tab daily to monitor metadata quality, assign ownership, manage endorsements, and remediate governance gaps within your domain. Use Purview for defining data quality rules, managing glossary terms, and ensuring your domain meets enterprise-wide compliance standards.

🔒 Compliance / Security Officer

Purview

Your primary workspace. Use Purview for estate-wide sensitivity classification, DLP policy enforcement, regulatory compliance reporting (GDPR, HIPAA, SOX), and cross-system audit trails. OneLake Catalog's Secure tab is useful for Fabric-specific access reviews, but Purview is where you define and enforce organization-wide security posture.

πŸ—οΈ Data Architect

Both

Design domain structures and workspace organization using OneLake Catalog. Use Purview's Data Map for cross-system lineage tracing, impact analysis across the full data estate, and validating that your architecture decisions support governance requirements. You bridge the gap between Fabric-local design and enterprise-wide data strategy.

βš™οΈ Platform Owner / IT Admin

Both

Manage capacity, workspaces, and tenant settings within Fabric. Use OneLake Catalog's Secure tab for access audits and permission reviews. Use Purview to define tenant-wide governance policies, configure sensitivity labels, and monitor compliance across all data platforms β€” not just Fabric.

πŸ’‘ Use Both Together

Most organizations will use both β€” OneLake Catalog for day-to-day Fabric operations (finding datasets, checking endorsements, reviewing permissions) and Microsoft Purview for enterprise-wide governance (cross-estate policies, regulatory compliance, sensitive data protection, and full lineage tracing). They are designed to work together, not as alternatives.

Integration

How They Work Together

The federated governance model connecting OneLake Catalog and Microsoft Purview

OneLake Catalog and Microsoft Purview are designed to work as an integrated governance stack. Fabric assets discovered and managed through OneLake Catalog are automatically surfaced in Purview's Unified Data Catalog, creating a seamless bridge between Fabric-local governance and enterprise-wide compliance.

Metadata Synchronization

When data assets are created or modified in Fabric, OneLake Catalog captures their metadata: item type, workspace, owner, description, sensitivity label, endorsement status, and lineage relationships. This metadata is automatically synchronized to Microsoft Purview, where it joins the broader data map alongside assets from Azure SQL, Synapse, Databricks, S3, and other sources. Changes flow bidirectionally: sensitivity labels applied in Purview propagate back into Fabric, and metadata updates in Fabric are reflected in Purview.

Federated Governance Model

The integration enables a federated governance approach. OneLake Catalog handles local management: domain owners govern their assets within Fabric, apply endorsements, manage descriptions, and control workspace access. Purview provides global compliance: enterprise policies, cross-tenant controls, regulatory frameworks, and organization-wide sensitivity classifications are defined centrally and enforced everywhere.

This model respects the principle of subsidiarity: decisions are made at the most local level possible, with central oversight ensuring consistency and compliance. Domain teams have autonomy within guardrails established by the governance office.
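The "autonomy within guardrails" idea can be illustrated with a small policy-merging sketch. It assumes a hypothetical policy shape (a set of required metadata fields and a maximum review interval in days); neither is a Fabric or Purview setting. The point is only that a domain can tighten the central baseline but never loosen it.

```python
# Hypothetical policy shape: required metadata fields plus a maximum number of
# days between reviews. Neither is a Fabric or Purview setting.
TENANT_BASELINE = {
    "required_fields": {"owner", "sensitivity_label"},
    "max_review_interval_days": 180,
}


def effective_policy(baseline, domain_rules):
    """Combine a central baseline with domain rules; domains can only tighten it."""
    return {
        # A domain may add required fields but never remove baseline ones.
        "required_fields": baseline["required_fields"]
        | set(domain_rules.get("required_fields", ())),
        # A domain may shorten the review interval but never lengthen it.
        "max_review_interval_days": min(
            baseline["max_review_interval_days"],
            domain_rules.get(
                "max_review_interval_days", baseline["max_review_interval_days"]
            ),
        ),
    }


finance = effective_policy(
    TENANT_BASELINE,
    {"required_fields": {"glossary_term"}, "max_review_interval_days": 90},
)
# A domain that asks for a looser interval still gets the baseline value.
sandbox = effective_policy(TENANT_BASELINE, {"max_review_interval_days": 365})

print(sorted(finance["required_fields"]), finance["max_review_interval_days"])
print(sorted(sandbox["required_fields"]), sandbox["max_review_interval_days"])
```

Using set union and `min` makes the "only stricter" rule structural rather than something each domain has to remember.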

Sensitivity Labels & DLP

Sensitivity labels defined in Microsoft Purview are enforced within Fabric. When a label is applied to a lakehouse or semantic model, it automatically propagates to downstream artifacts: reports, exports, and shared links all inherit the protection. Data loss prevention (DLP) policies configured at the enterprise level in Purview are visible and enforced within Fabric, preventing unauthorized sharing of sensitive data.
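To reason about downstream inheritance, a simplified model can help. The sketch below is not the actual Purview or Fabric inheritance logic, which has its own conditions; it assumes hypothetical label names ranked from least to most restrictive and a plain dictionary of upstream-to-downstream edges, and it pushes each label to consumers that are unlabeled or less restrictively labeled.

```python
from collections import deque

# Hypothetical label names ordered from least to most restrictive.
LABEL_RANK = {"Public": 0, "General": 1, "Confidential": 2, "Highly Confidential": 3}


def propagate_labels(labels, downstream):
    """Return labels after pushing each item's label to its downstream items.

    labels: dict of item name -> label name (or None when unlabeled).
    downstream: dict of item name -> list of items that consume it.
    A downstream item keeps its own label if that label is already at least
    as restrictive as the one arriving from upstream.
    """
    result = dict(labels)
    queue = deque(name for name, label in labels.items() if label is not None)
    while queue:
        source = queue.popleft()
        for target in downstream.get(source, []):
            current = result.get(target)
            if current is None or LABEL_RANK[current] < LABEL_RANK[result[source]]:
                result[target] = result[source]
                queue.append(target)  # re-check items further downstream
    return result


# Illustrative items only.
labels = {
    "sales_lakehouse": "Confidential",
    "sales_model": None,
    "sales_report": "General",
    "public_report": "Public",
}
downstream = {
    "sales_lakehouse": ["sales_model"],
    "sales_model": ["sales_report"],
}

for name, label in propagate_labels(labels, downstream).items():
    print(f"{name}: {label}")
```

Because labels only ever move toward more restrictive ranks, the loop terminates even if the dependency graph contains a cycle.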

End-to-End Lineage

Lineage tracing spans the entire data lifecycle. Purview captures lineage from external source systems, through Fabric pipelines and transformations, all the way to Power BI reports consumed by business users. This cross-system lineage enables impact analysis (what happens if this source schema changes?), root-cause investigation (where did this data quality issue originate?), and compliance auditing (can we prove the provenance of this regulatory report?).
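Impact analysis and root-cause investigation are both graph traversals over lineage. The following is a minimal sketch, assuming lineage has been exported as a dictionary mapping each item to its direct consumers; the item names are invented for illustration and no Purview API is involved.

```python
from collections import deque


def downstream_impact(lineage, changed_item):
    """Return every item reachable downstream of changed_item, in BFS order."""
    seen = {changed_item}
    order = []
    queue = deque([changed_item])
    while queue:
        current = queue.popleft()
        for consumer in lineage.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


def upstream_sources(lineage, item):
    """Return every item that feeds `item`, by walking the reversed edges."""
    reverse = {}
    for source, consumers in lineage.items():
        for consumer in consumers:
            reverse.setdefault(consumer, []).append(source)
    return downstream_impact(reverse, item)


# Illustrative lineage: source system -> pipeline -> lakehouse -> model -> report.
lineage = {
    "crm_db": ["ingest_pipeline"],
    "ingest_pipeline": ["sales_lakehouse"],
    "sales_lakehouse": ["sales_model", "churn_notebook"],
    "sales_model": ["revenue_report"],
}

print(downstream_impact(lineage, "sales_lakehouse"))
print(upstream_sources(lineage, "revenue_report"))
```

The first call answers "what breaks if this item changes?"; the second answers "where could this report's problem have come from?".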

Data Quality & Governance Policies

Data quality rules and governance policies defined at the enterprise level in Purview are visible within Fabric's OneLake Catalog. Data stewards can see which assets meet quality thresholds, which are flagged for remediation, and how their domain compares to organizational standards, all without leaving the Fabric experience.

Integration Flow

Governance Flow Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                        DATA GOVERNANCE FLOW                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  Data Sources ──→ OneLake ──→ OneLake Catalog ──→ Purview Unified   │
│  (SQL, APIs,      (Storage)   (Fabric-local       Catalog           │
│   Files, S3)                   governance)        (Enterprise-wide) │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  POLICIES FLOW BACK DOWN:                                     │  │
│  │  Purview → Sensitivity Labels → OneLake Catalog → Fabric      │  │
│  │  Purview → DLP Policies → OneLake → Reports & Exports         │  │
│  │  Purview → Data Quality Rules → Visible in Fabric UI          │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

This bidirectional integration means that governance is not a one-time setup but a living process. As data assets evolve, the catalog stays current. As policies are updated centrally, enforcement propagates automatically. The result is a governance posture that scales with the organization without requiring manual synchronization.

Best Practices

Proven strategies for effective data cataloging and governance in Fabric

🏢 Organize by Business Domain

Structure your catalog around business domains (Finance, Sales, HR, Operations) rather than technology silos. Assign domain owners who are accountable for metadata quality, endorsement decisions, and access governance within their domain. This aligns with data mesh principles and makes discovery intuitive for business users who think in terms of business processes, not infrastructure.

📋 Enforce Metadata Standards

Register every dataset, dashboard, pipeline, and ML model with rich metadata. Automate classification using Purview's built-in classifiers. Set organizational policies requiring descriptions, ownership, sensitivity labels, and business glossary terms for all published assets. Incomplete metadata is invisible metadata; undocumented assets are effectively lost.

⚖️ Implement Federated Governance

Maintain a tenant-wide baseline of compliance requirements (sensitivity labeling, ownership, minimum metadata). Delegate stricter, domain-specific rules to domain owners who understand their data best. Balance central control (consistency, compliance) with domain autonomy (agility, context). The center defines guardrails; domains operate within them.

📊 Monitor Catalog Health

Review catalog governance reports regularly: weekly for active domains, monthly for the full estate. Track gaps in ownership assignment, metadata quality scores, sensitivity labeling coverage, and endorsement status. Create governance KPI dashboards that make health visible to leadership. What gets measured gets managed.

⭐ Use Endorsement Strategically

Promote artifacts that are ready for broader use within a domain. Certify only those meeting rigorous quality criteria: accuracy, freshness, documentation completeness, and owner responsiveness. Define explicit certification criteria and a review process. Endorsement loses value if applied indiscriminately; it should signal genuine trustworthiness.
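Explicit certification criteria are easier to apply consistently when they are written down as a rule. The sketch below is one possible encoding, assuming a hypothetical `AssetStatus` record and a seven-day freshness threshold; both are illustrative choices that each organization would replace with its own criteria.

```python
from dataclasses import dataclass


@dataclass
class AssetStatus:
    """Hypothetical summary of the facts a certification review would check."""
    has_description: bool
    has_owner: bool
    days_since_refresh: int
    passed_quality_checks: bool


# Hypothetical threshold; each organization would set its own.
MAX_STALENESS_DAYS = 7


def endorsement_level(asset):
    """Suggest 'Certified', 'Promoted', or 'None' from explicit criteria."""
    documented = asset.has_description and asset.has_owner
    fresh = asset.days_since_refresh <= MAX_STALENESS_DAYS
    if documented and fresh and asset.passed_quality_checks:
        return "Certified"
    if documented:
        return "Promoted"
    return "None"


curated = AssetStatus(True, True, days_since_refresh=1, passed_quality_checks=True)
stale = AssetStatus(True, True, days_since_refresh=30, passed_quality_checks=True)
undocumented = AssetStatus(False, True, days_since_refresh=1, passed_quality_checks=True)

print(endorsement_level(curated), endorsement_level(stale), endorsement_level(undocumented))
```

A function like this would only suggest a level; the review process described above still decides whether to apply it.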

🔗 Leverage Shortcuts

Minimize data duplication with OneLake shortcuts: reference data in place rather than copying it. Use Materialized Lake Views for performance-sensitive scenarios while maintaining a single source of truth. Shortcuts preserve governance metadata and lineage without the overhead of data replication and the risk of governance divergence.

🔄 Automate Lifecycle Operations

Treat Fabric artifacts as code using Git integration and CI/CD deployment pipelines. Version pipelines, notebooks, semantic models, and reports. Maintain a full audit trail for all changes: who changed what, when, and why. Automation reduces human error and ensures governance processes are repeatable and auditable.

🚀 Plan Phased Adoption

Start with high-value pilot workloads: datasets that are widely used, business-critical, or subject to regulatory requirements. Demonstrate value with a single domain before scaling. Roll out domain by domain, incorporating lessons learned. Continuously refine governance processes based on feedback from data producers and consumers.

Checklist

Implementation Checklist

A phased approach to implementing catalog and governance in your Fabric tenant

Implementing data cataloging and governance is a journey, not a one-time project. The following phased checklist provides a structured path from initial setup to mature, organization-wide governance. Each phase builds on the previous one; complete the foundation before advancing to classification and policy enforcement.

Phase 1: Foundation

Phase 2: Classification & Labeling

Phase 3: Governance & Policy

Phase 4: Scale & Optimize

⚠️ Avoid Governance Overload

Don't try to govern everything at once. Start with your highest-value, most-used datasets and expand incrementally. Overly ambitious governance programs create friction without delivering value. Begin with a narrow scope, demonstrate ROI, build organizational buy-in, and then expand. The goal is sustainable governance, not comprehensive documentation that nobody maintains.