Key Learnings - The Heart of Data Mesh & Fabric
Learnings about Data Mesh, Data Products, Data Contracts, Data Catalog, AI and Data Culture
Attending a conference is about taking the time, meeting people, discussing the state of the art and getting the details behind the slides. While some insights may seem trivial, a conference always gives an understanding of the current state of the topic in practice.
Fig. 1: Pictures from the BARC event “The Heart of Data Mesh & Fabric”, Timm Grosser showing current challenges according to the upcoming BARC Data Management Survey
From the BARC conference “The Heart of Data Mesh & Fabric” I have put together my takeaways for reflection. The conference focused on Data Mesh and its core topic, Data Products. It is becoming increasingly clear that Data Contracts are turning into the enabler for Data Products, data governance and automation. In a similar way, Data Catalogs, or Data Intelligence as BARC calls it, can play an important role, even though there are currently discussions about the Meta Grid as a decentralized approach to making use of metadata. We could also see that AI is driving many topics. But as before, without high data quality, the right organization and Data Culture, it is hard to really make use of it.
Enjoy the read, join the discussion!
Data Mesh
Data Mesh is a socio-technical approach to building a decentralized data architecture by leveraging a domain-oriented, self-serve design. Here are my top 6 takeaways from the conference about Data Mesh.
Fig. 2: Jan Henning explains what is needed to make Data Mesh a success story
✳️ On average, 73% of companies “need a more distributed approach to data & analytics”. Among data & analytics experts in business units, the figure is 92%.
✳️ As benefits of implementing Data Mesh, more than 60% see greater agility and faster response to business changes, faster fulfillment of data demand in business domains and better use of data.
✳️ For defining and prioritizing the use cases of a data domain, Domain-driven Design (DDD) practices like event storming or domain storytelling work well.
✳️ There is a great divide between business departments and IT, which can be bridged by interdisciplinary collaboration via data domains and their data products.
✳️ Limited domain expertise and limited capacity in central teams, combined with increasing requirements, can prevent data & analytics from scaling further. This is where Data Mesh could help.
✳️ When it comes to data mesh, it's still a work in progress everywhere. It is a long road of learning how to do it.
Data Product
A Data Product, or Data-as-a-Product in Data Mesh terms, is a logical unit of data, metadata, code and the necessary infrastructure under clear ownership. Here are my top 5 takeaways about Data Products.
Fig. 3: Guido Schmutz and Thomas Gassmann presenting how to handle changes in the data product lifecycle
✴️ For companies, Data Products are currently more important than Data Mesh. This means that Data Products can be handled independently of Data Mesh.
✴️ Data Product development should start with continuous exploration of new Data Products.
✴️ Over time, it may become necessary to decompose a Data Product. Consider handling old and new versions of the Data Product, e.g. by using a translator to the old schema/contract.
✴️ A data marketplace can play a crucial role for access to trustworthy Data Products.
✴️ Reporting is the most obvious use of data. The real value can be created by building Data Products beyond Reporting!
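The idea of a translator to the old schema, mentioned above, could look roughly like the following sketch. It is not taken from the talk: the `CustomerV2` record, the field names and the `to_v1` function are all hypothetical and only illustrate how a new version of a Data Product might keep serving consumers of the old one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerV2:
    """Hypothetical new schema: the name is split into two fields."""
    customer_id: str
    first_name: str
    last_name: str
    country_code: str


def to_v1(record: CustomerV2) -> dict:
    """Translate a v2 record into the (assumed) old v1 schema."""
    return {
        "id": record.customer_id,
        "name": f"{record.first_name} {record.last_name}",
        "country": record.country_code,
    }


if __name__ == "__main__":
    # Old consumers keep reading the v1 shape while the product evolves.
    print(to_v1(CustomerV2("c-42", "Ada", "Lovelace", "GB")))
```

Keeping the translation in one small function makes it easy to retire once the last consumer of the old version has migrated.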
Data Contract
A Data Contract is a formal agreement between a data producer and consumer. Here are my top 6 takeaways about Data Contracts.
Fig. 4: Dr. Simon Harrer tells us, a data contract is not a contract - possibly to be discussed
✅ In a BARC study, 70% or more of the participants consider business and technical information on the data product as well as security and privacy-related information to be relevant.
✅ A Data Contract is not a contract, it is rather an offer to several potential consumers.
✅ Data Contracts are a practical way to implement federated computational governance of the Data Mesh.
✅ It is recommended to establish 'continuous documentation' from the business data owner as part of the development process.
✅ The Open Data Contract Standard (ODCS) seems to be popular as an orientation for Data Contract implementation, e.g. Lidl and Breuninger use it. But it often seems to be important to adapt it to your needs.
✅ Consider consumer-based Data Contracts as a kind of requirement on the data. Typically, Data Contracts are rather producer-oriented.
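To make the link between Data Contracts and computational governance a bit more tangible, here is a minimal sketch of an automated contract check. The `CONTRACT` structure is a simplified, made-up format and not ODCS. The field names and the `validate` function are likewise assumptions for illustration only.

```python
from typing import Any

# Hypothetical, heavily simplified contract (not the ODCS format).
CONTRACT = {
    "name": "orders",
    "owner": "sales-domain",
    "fields": {
        "order_id": {"type": str, "required": True},
        "amount": {"type": float, "required": True},
        "coupon": {"type": str, "required": False},
    },
}


def validate(record: dict[str, Any], contract: dict[str, Any]) -> list[str]:
    """Return a list of contract violations for a single record."""
    violations = []
    for name, spec in contract["fields"].items():
        value = record.get(name)
        if value is None:
            # Missing optional fields are fine, missing required ones are not.
            if spec["required"]:
                violations.append(f"missing required field: {name}")
            continue
        if not isinstance(value, spec["type"]):
            expected = spec["type"].__name__
            violations.append(f"wrong type for {name}: expected {expected}")
    return violations


if __name__ == "__main__":
    print(validate({"order_id": "o-1", "amount": 9.99}, CONTRACT))
    print(validate({"order_id": 1}, CONTRACT))
```

A check like this could run in the producer's pipeline before publishing. A consumer-based contract would use the same mechanism but list only the fields that a particular consumer relies on.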
Data Catalog
A Data Catalog is an inventory of an organization's data assets and metadata. Here are my top 5 takeaways about Data Catalogs.
Fig. 5: Sonja Dirnberger shows Schaeffler’s way to a more integrated data catalog landscape
🔵 Current enhancements in the data catalog/data intelligence market are around Data Product and Data Product management like requirement management, data domain management and data contract management.
🔵 Using a Data Catalog in a Data Mesh can increase data literacy and regulatory compliance, improve user experience and increase data quality. Furthermore, it increases trust in data and data value.
🔵 Using a technical and a business-oriented Data Catalog in parallel can lead to unclear use cases and redundant efforts.
🔵 To make Data Mesh more tangible, a Data Catalog can help the users to get in touch with it and interact, e. g. finding things, requesting access, …
🔵 A Data Catalog can help make your Data Mesh more transparent, e.g. the Data Products, key actors, dependencies, business knowledge, …
Artificial Intelligence
Artificial Intelligence (AI) refers to the ability of computers to solve human-like tasks. Machine Learning and GenAI are currently the main topics. Here are my top 5 takeaways about Artificial Intelligence.
Fig. 6: Jan Ulrich Maue shows the value of a Data Fabric for Retrieval-Augmented Generation (RAG)
🧠 Buying AI out of the box doesn’t help with differentiation. You have to build it on your own and you need the right data.
🧠 Use AI to automate routine tasks, while humans should be responsible for new situations where AI is not helpful. Furthermore, humans must be skilled and responsible for what happens, never AI.
🧠 When starting with AI, many are building lighthouses. Instead, start with AI at scale for everyone, like your own internal chatbot connected to your internal knowledge. Then make use of the embedded and stand-alone AI that vendors deliver in their tools. If you have high-quality data and the right organization, build highly individualized and game-changing AI for your company.
🧠 AI is a great opportunity for data protection, for example to classify company data.
🧠 Often domain teams cannot take on tasks like MLOps (e.g. monitoring data/concept drift), so it might make sense to centralize such tasks.
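As an illustration of what monitoring data drift can mean in its simplest form, here is a small sketch that compares the mean of a current sample with a reference sample. This is a deliberately naive check chosen for brevity, not a method presented at the conference. The function names and the threshold of 0.5 are assumptions, and concept drift would need labels or model outputs, which this sketch does not cover.

```python
from statistics import mean, stdev


def mean_shift(reference: list[float], current: list[float]) -> float:
    """Shift of the current mean, in units of the reference standard deviation."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference sample has no variance")
    return abs(mean(current) - mean(reference)) / ref_std


def has_drifted(reference: list[float], current: list[float],
                threshold: float = 0.5) -> bool:
    """Flag drift when the mean moved by more than `threshold` std deviations."""
    return mean_shift(reference, current) > threshold


if __name__ == "__main__":
    training = [10.0, 12.0, 11.0, 13.0, 9.0]
    production = [20.0, 22.0, 21.0, 23.0, 19.0]
    print(has_drifted(training, production))
```

A central team could run such checks for all domains on a schedule, which is one way to read the suggestion to centralize MLOps tasks.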
Data Culture
Data Culture describes the ability of an organization to make use of data. Here are my top 5 takeaways about Data Culture.
Fig. 7: Florian Bigelmaier uses storytelling for moderation
1️⃣ Change from “Why do you need to know?” to “Why can’t this data be accessed by everyone?”.
2️⃣ Data Culture is the foundation for a data-driven company. Not everyone works at a computer, but people record data or come into contact with it in different ways. Consider this.
3️⃣ To work effectively with data, you have to unite business and IT. Data Mesh can help.
4️⃣ A common language for data is essential to be effective. There are always many platforms and seldom a single truth.
5️⃣ There is always a lot to do in the area of data & analytics. Don't solve problems that don't exist at the moment and always ask why.
What are your most important insights at the moment about Data Mesh & Co.?