IBsolution Blog

Data Mesh: Decentralized approach to data architecture

Written by Raik Schleifer | Feb 24, 2023

Ongoing digitization means that companies are generating ever greater volumes of data, while user requirements for that data are increasing rapidly and changing constantly. Companies want to use this information to make data-driven decisions and optimize their business. Until now, the underlying data architecture in most companies has been centralized, typically in the form of a data warehouse or data lake.

 

 
Figure: SAP Data Warehouse Cloud supports the principles of the data mesh approach

 

 

Classic approaches reach their limits

However, such monolithic data platform architectures are increasingly reaching their limits when they have to process structured, semi-structured and unstructured data that arrives from different sources at different speeds, in different volumes and with varying quality. One reason is the central data team that is responsible for preparing and providing the data. As an intermediary between data producers and data consumers, it is often caught between the demands of both sides. Ever-shorter application lifecycles and the explosive growth of distributed IT landscapes mean that requirements change faster than a centralized data team can keep up with. The result is growing dissatisfaction among data producers and consumers alike.

 

Data producers have the domain knowledge and understand the meaning and context of the data. Data consumers recognize the potential of the data for the business and place high demands on data quality. The dilemma for central data teams: they are expected to deliver high-quality data, but they lack domain knowledge and have no influence on the quality of the data at the point where it is generated. Faced with this situation, companies are increasingly realizing that their current data architecture (in many cases an isolated data warehouse or data lake) may no longer fit current requirements. The call for more democratization and scalability in data provisioning is getting louder.

 

The principles of data mesh

This is exactly where the data mesh architecture comes into play. The approach brings data producers and consumers as close together as possible, eliminating the need for an intermediary team between them. The data-producing teams provide their data in such a way that consumers can extract value from it without detailed domain knowledge.

 

Data mesh is based on four fundamental principles:

  • Domain Ownership

  • Data as a Product

  • Self-Service Data Platform

  • Federated Computational Governance

These principles enable the creation of an efficient data architecture that best supports business goals and avoids the shortcomings of traditional data management structures. In the following, we describe the four data mesh principles in more detail and look at different solutions and technologies from SAP’s portfolio that contribute to each of the principles.

 

Principle 1: Domain Ownership

In a data mesh, data ownership is decentralized so that the individual business units (domains) take responsibility for their own data. This promotes accountability, encourages innovation, and reduces the risk of data silos. Each domain makes its data available both for operational use in systems outside the domain and for analytical purposes.

 

SAP Datasphere (formerly SAP Data Warehouse Cloud) is well suited to implementing this domain-driven approach. A key component is the Space concept: a Space corresponds to a domain in the data mesh framework. Each Space is an isolated workspace to which users and connections can be assigned as required. This enables isolated metadata, self-service data modeling and self-service data flows. At the same time, SAP Datasphere makes it possible to explicitly share objects across Spaces.
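The combination of isolation by default and explicit sharing can be illustrated with a small sketch. This is a conceptual model only and does not use or represent the SAP Datasphere API; the class Space, its fields and the example names (sales, finance, orders_view) are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Space:
    """Hypothetical model of a domain workspace; not the SAP Datasphere API."""

    name: str
    members: set[str] = field(default_factory=set)
    objects: set[str] = field(default_factory=set)
    # Object name -> names of the Spaces it has been explicitly shared with.
    shared_with: dict[str, set[str]] = field(default_factory=dict)

    def share(self, obj: str, target: Space) -> None:
        """Explicitly share one owned object with another Space."""
        if obj not in self.objects:
            raise KeyError(f"{obj!r} is not owned by Space {self.name!r}")
        self.shared_with.setdefault(obj, set()).add(target.name)

    def can_read(self, obj: str, reader: Space) -> bool:
        """An object is visible to its owning Space and to Spaces it was shared with."""
        if obj not in self.objects:
            return False
        if reader.name == self.name:
            return True
        return reader.name in self.shared_with.get(obj, set())


# Two domains, each with its own isolated workspace (illustrative names).
sales = Space("sales", members={"alice"}, objects={"orders_view", "raw_orders"})
finance = Space("finance", members={"bob"})

# Only the curated view is shared; the raw table stays private to the domain.
sales.share("orders_view", finance)
```

In this sketch, nothing is visible across domains unless the owning domain decides to share it, which is the essence of domain ownership.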

 

Principle 2: Data as a Product

Data provided by the domains is treated as a product, and the users of these products are the customers to be satisfied. The idea behind this is that responsibility for data quality lies with the business units, since they know their data best. Relevant aspects in this context include findability, security, understandability and trustworthiness. Data products are valuable assets: because they are available in a directly usable form, they help create business value and drive growth. One SAP solution that supports this principle is Data Marketplace. As a central component of SAP Datasphere, it plays an important role in making data products available for both internal and external data exchange.
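To make the product idea more tangible, the following sketch describes a data product as a small metadata record and checks it against the four aspects named above. The field names, the 20-character threshold for descriptions and the function readiness_gaps are assumptions for illustration; they are not part of Data Marketplace or any other SAP product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor that a domain team maintains for its product."""

    name: str
    owner_domain: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    allowed_roles: set[str] = field(default_factory=set)
    quality_checks_passed: bool = False


def readiness_gaps(product: DataProduct) -> list[str]:
    """Return the product aspects that still block publication."""
    gaps = []
    if not product.tags:
        gaps.append("findability: no tags for catalog search")
    if not product.allowed_roles:
        gaps.append("security: no access roles defined")
    if len(product.description.strip()) < 20:  # arbitrary illustrative threshold
        gaps.append("understandability: description missing or too short")
    if not product.quality_checks_passed:
        gaps.append("trustworthiness: quality checks not passed")
    return gaps


draft = DataProduct(name="orders", owner_domain="sales")

published = DataProduct(
    name="orders",
    owner_domain="sales",
    description="Confirmed customer orders, one row per order line.",
    tags=["sales", "orders"],
    allowed_roles={"analyst"},
    quality_checks_passed=True,
)

for gap in readiness_gaps(draft):
    print(gap)
```

A check like this could run before a product is listed, so that consumers only ever see products that meet the agreed minimum.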

 

Principle 3: Self-Service Data Platform

To autonomously manage their data products, domain teams need access to a highly abstracted infrastructure that can reduce or even eliminate complexity and friction in the provisioning and management of data. Domain teams should not have to worry about technical details such as interfaces or protocols. Therefore, a self-service data platform must be provided that includes appropriate tools to best support data producers in creating, maintaining, and operating their data products.

 

Such a self-service data platform can be built with SAP Business Technology Platform (BTP) and the other SAP products mentioned above. In addition, since AI technologies are an integral part of SAP BTP, artificial intelligence can support domain teams in these tasks.

 

Principle 4: Federated Computational Governance

In order for the independent data products to interact smoothly, a certain degree of standardization is required. In the interest of uniform use of the self-service data platform, the decentralized domain teams and the central platform teams agree on certain universal guidelines. These apply to all data products and their interfaces and ensure an interoperable data ecosystem.

 

While issues such as data protection, security and governance are managed centrally in traditional architectures, the data mesh approach deliberately shifts them to the domain teams. Overall responsibility for a data product therefore includes, for example, the protection of personal data in addition to high data quality. SAP HANA Cloud offers suitable technologies for implementing domain-specific protection measures and thus helps ensure, among other things, the legally compliant use of the data products.
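The "computational" part of this principle means that the agreed guidelines are expressed as automated checks rather than as documents. The following is a minimal sketch under that assumption: the policy names, the metadata layout and the function violations are hypothetical and are not taken from SAP HANA Cloud or any other product.

```python
from __future__ import annotations

from typing import Callable

# A data product is represented here as a plain metadata dict (assumed layout).
Policy = Callable[[dict], bool]

# Guidelines agreed between domain teams and the platform team, as code.
GLOBAL_POLICIES: dict[str, Policy] = {
    "has_owner": lambda p: bool(p.get("owner_domain")),
    "has_schema": lambda p: bool(p.get("schema")),
    # Every column flagged as personal data must also be flagged as masked.
    "pii_columns_masked": lambda p: all(
        col.get("masked", False)
        for col in p.get("schema", [])
        if col.get("pii", False)
    ),
}


def violations(product: dict) -> list[str]:
    """Return the names of all global policies the product does not satisfy."""
    return [name for name, check in GLOBAL_POLICIES.items() if not check(product)]


compliant = {
    "owner_domain": "sales",
    "schema": [
        {"name": "order_id"},
        {"name": "email", "pii": True, "masked": True},
    ],
}

non_compliant = {
    "owner_domain": "",
    "schema": [{"name": "email", "pii": True}],
}

print(violations(compliant))
print(violations(non_compliant))
```

The rules are defined once for everyone, but each domain team runs them against its own products and remains responsible for fixing any violations.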

 

Conclusion: Data architecture for digital enterprises

Data mesh represents a paradigm shift and paves the way to a domain-driven data architecture. Each business unit is responsible for the definition, quality and creation of the data from its own domain. Data mesh applies the principles of product management to data management. The goal is for business units to offer usable data as a product.

 

Data mesh is an exciting approach for modern, digital enterprises and offers a vision for organizations that want to move toward a domain-driven data architecture. However, it requires a certain level of digital maturity and the corresponding skills among employees, especially those involved in data provision. Comprehensive change management should therefore accompany the introduction in order to establish data mesh successfully as a forward-looking data architecture and a new organizational principle.