About the client
Headquartered in Houston, Texas, the client is one of the leading suppliers of fuels, lubricants, and petrochemicals in the United States. With a vast nationwide network of branded stations, the organization has been delivering reliable energy solutions for over a century.
Beyond fuels, the client has diversified into specialty lubricants and industrial oils, and it is recognized for its sustainability and community-driven initiatives.
Client challenges
The organization had already invested significantly in Azure Synapse to power its enterprise analytics. However, as business units began pushing for more advanced analytics and machine learning use cases, limitations in the existing setup became increasingly evident.
While enterprise data was centralized, business teams struggled to reliably access and use it for their specific needs. They had clearly defined use cases but lacked a unified platform to bring data together and operationalize these initiatives.
Accessing data was often a manual and time-intensive process. Each request required IT teams to extract, move, or recreate datasets, which introduced delays and increased dependency on central teams.
This led to several key challenges:
- Data duplication
In the absence of a governed distribution model, business units created their own copies of enterprise data. Over time, this resulted in multiple versions of the same dataset and reduced trust in insights.
- Limited enablement for advanced analytics and ML
Although teams had clear use cases, they lacked a platform that could reliably support experimentation, model development, and analytics at scale.
- Operational bottlenecks for IT
Data provisioning workflows were inefficient and heavily dependent on IT, slowing down analytics initiatives across the organization.
- Gaps in governance and standardization
There was limited control over access, lineage, and policy enforcement, along with challenges in managing Dev, Test, and Prod environments consistently.
- Fragmented data ecosystem
Critical data across procurement, maintenance, and refinery operations was spread across systems like SAP, LTRM, and the AVEVA PI System, making it difficult to build unified analytics solutions.
These challenges made it difficult for the organization to scale analytics and move toward a more data-driven operating model.
Solution
LevelShift implemented a governed, zero-copy enterprise data platform on Microsoft Fabric, designed to centralize enterprise data control while enabling scalable, self-service analytics across business units.
The engagement began with a structured discovery and assessment phase to map data sources, workloads, governance requirements, and access patterns across the organization. Based on these findings, LevelShift implemented a four-pillar Fabric architecture focused on segmentation, data distribution, automation, and governance to establish a scalable and secure enterprise analytics foundation.
Pillar 1: Segmentation Principles
Enterprise domain and business unit workspace segmentation
A domain-driven workspace model was established, with a centralized Enterprise Domain managed by IT and separate Business Unit Domains for teams such as refining, lubricants, and light oils.
Each domain included dedicated Dev, Test, and Prod workspaces aligned with Git branches, enabling controlled development, validation, and production deployment.
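
The segmentation model can be pictured as a simple grid of domains and environments. The sketch below is illustrative only: the branch names, the naming convention, and the `Workspace` and `build_workspaces` helpers are assumptions made for this example, not details of the actual engagement.

```python
from dataclasses import dataclass

# Hypothetical environment-to-branch mapping; the case study only states
# that workspaces were aligned with Git branches.
ENV_BRANCHES = {"Dev": "dev", "Test": "test", "Prod": "main"}

# Domains named in the text; the identifiers themselves are illustrative.
DOMAINS = ["Enterprise", "Refining", "Lubricants", "LightOils"]


@dataclass(frozen=True)
class Workspace:
    domain: str
    environment: str
    branch: str

    @property
    def name(self) -> str:
        # Assumed naming convention: <Domain>-<Environment>
        return f"{self.domain}-{self.environment}"


def build_workspaces(domains):
    """Return one workspace per (domain, environment) pair."""
    return [
        Workspace(domain, env, branch)
        for domain in domains
        for env, branch in ENV_BRANCHES.items()
    ]


workspaces = build_workspaces(DOMAINS)
for ws in workspaces:
    print(f"{ws.name:<18} -> branch '{ws.branch}'")
```

Under these assumptions, every domain gets the same three workspaces, so adding a new business unit amounts to adding one entry to the domain list.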
Pillar 2: Data Distribution Strategy
Zero-copy data integration using OneLake shortcuts
Instead of replicating data, enterprise datasets from Azure Synapse were connected to Microsoft Fabric using OneLake shortcuts. This enabled direct access to Synapse SQL pools without physical data movement, ensuring a single source of truth while eliminating duplication and reducing storage overhead.
Centralized data distribution with controlled access
Enterprise data was provisioned only through the Enterprise Domain, where IT configured and managed all Synapse shortcuts. Business units accessed this data in read-only mode through their respective domains, preventing duplicate connections and ensuring consistent data usage across the organization.
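
The access rule described here (only the Enterprise Domain creates shortcuts, and business units read through them) can be modeled in a few lines. This is a conceptual sketch, not Fabric's API: `ShortcutRegistry`, its methods, and the `sales_orders` dataset and source path are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

ENTERPRISE_DOMAIN = "Enterprise"  # the IT-managed domain described in the text


@dataclass
class ShortcutRegistry:
    """Toy model of centralized shortcut ownership with read-only consumers."""
    shortcuts: dict = field(default_factory=dict)   # shortcut name -> source location
    read_grants: set = field(default_factory=set)   # (domain, shortcut name) pairs

    def register(self, domain: str, name: str, source: str) -> None:
        # Only the Enterprise Domain may create shortcuts to enterprise data.
        if domain != ENTERPRISE_DOMAIN:
            raise PermissionError(f"{domain} may not create enterprise shortcuts")
        # Rejecting duplicates keeps a single connection per dataset.
        if name in self.shortcuts:
            raise ValueError(f"shortcut '{name}' already exists")
        self.shortcuts[name] = source

    def grant_read(self, domain: str, name: str) -> None:
        if name not in self.shortcuts:
            raise KeyError(name)
        self.read_grants.add((domain, name))

    def is_allowed(self, domain: str, name: str, action: str) -> bool:
        if name not in self.shortcuts:
            return False
        if domain == ENTERPRISE_DOMAIN:
            return True
        # Business units are limited to read access on granted shortcuts.
        return action == "read" and (domain, name) in self.read_grants


registry = ShortcutRegistry()
# Placeholder dataset name and source path, not taken from the engagement.
registry.register(ENTERPRISE_DOMAIN, "sales_orders", "synapse/sales_orders")
registry.grant_read("Refining", "sales_orders")

print(registry.is_allowed("Refining", "sales_orders", "read"))
print(registry.is_allowed("Refining", "sales_orders", "write"))
```

Because a business unit can neither register a shortcut nor write through one in this model, there is exactly one governed path to each enterprise dataset.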
Support for business unit-specific data and workloads
In addition to enterprise data, business units were enabled to ingest and manage non-enterprise data within their domains. This allowed teams to combine enterprise datasets with external sources such as SAP, APIs, and local systems for analytics, reporting, and machine learning use cases.
Enablement of advanced analytics and ML workloads
The platform supported analytics and data science workloads through Fabric Lakehouses, notebooks, and semantic models. Business units could perform ML experimentation, model tracking, and reporting directly within their domains using governed enterprise data.
Pillar 3: Automation & CI/CD
Template-driven provisioning and CI/CD automation
Standardized templates were created for domain setup, workspace structure, permissions, and lakehouse components.
CI/CD pipelines were implemented with Git integration, enabling automated promotion of artifacts across Dev, Test, and Prod environments with validation and approval gates.
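
A promotion flow with validation and approval gates might look roughly like the sketch below. The case study does not say which gate applies at which stage, so this example assumes that validation is required for every promotion and that an approver is required only before Prod. The `promote` function and its parameters are hypothetical.

```python
STAGES = ["Dev", "Test", "Prod"]


def promote(current_stage, validation_passed, approved_by=None):
    """Return the next stage if all gates pass, otherwise raise."""
    if current_stage not in STAGES:
        raise ValueError(f"unknown stage: {current_stage}")
    index = STAGES.index(current_stage)
    if index == len(STAGES) - 1:
        raise ValueError("Prod is the final stage; nothing to promote to")
    target = STAGES[index + 1]

    # Gate 1 (assumed): automated validation must pass for every promotion.
    if not validation_passed:
        raise RuntimeError(f"validation failed; staying in {current_stage}")

    # Gate 2 (assumed): a named approver is required before reaching Prod.
    if target == "Prod" and not approved_by:
        raise PermissionError("promotion to Prod requires an approver")

    return target


print(promote("Dev", validation_passed=True))
print(promote("Test", validation_passed=True, approved_by="release-manager"))
```

Raising an exception at a failed gate, rather than returning a status flag, means a pipeline step built this way stops by default instead of continuing silently.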
Pillar 4: Governance & Security
Governance and security with Microsoft Purview
A layered governance model was implemented using Microsoft Purview, including role-based access control, sensitivity labeling, data classification, lineage tracking, and audit logging. This ensured enterprise-wide visibility, compliance, and policy enforcement while maintaining flexibility for business users.
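
To illustrate how role-based access, sensitivity labels, and audit logging can work together, here is a minimal sketch. It does not use or represent Purview's API. The label ordering, the role-to-clearance mapping, and the `check_access` function are assumptions chosen for the example.

```python
from datetime import datetime, timezone

# Hypothetical sensitivity labels, ordered from least to most restricted.
LABEL_ORDER = ["Public", "General", "Confidential", "Highly Confidential"]

# Hypothetical mapping from role to the highest label that role may read.
ROLE_CLEARANCE = {
    "analyst": "General",
    "data_scientist": "Confidential",
    "data_steward": "Highly Confidential",
}

audit_log = []


def check_access(user, role, dataset, label):
    """Allow access if the role's clearance covers the dataset's label."""
    clearance = ROLE_CLEARANCE.get(role)
    allowed = (
        clearance is not None
        and LABEL_ORDER.index(clearance) >= LABEL_ORDER.index(label)
    )
    # Every decision is recorded, whether access is allowed or denied.
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "dataset": dataset,
        "label": label,
        "allowed": allowed,
    })
    return allowed


print(check_access("avery", "analyst", "maintenance_costs", "Confidential"))
print(check_access("jordan", "data_scientist", "maintenance_costs", "Confidential"))
print(len(audit_log))
```

The user and dataset names are placeholders. The point of the layering is that a denied request still leaves an audit record.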
This architecture replaced fragmented data access patterns with a scalable, governed framework that balanced centralized control with decentralized analytics capabilities.
Benefits
The implementation transformed how data was accessed, governed, and utilized across the organization:
- Eliminated data duplication through a zero-copy architecture
- Reduced data provisioning time by replacing manual, IT-dependent requests with near-instant access
- Improved data consistency and trust with a centralized data model
- Strengthened governance with unified control and visibility
- Enabled scalable self-service analytics and machine learning
