Data Catalogs in different Data Architectures
Patterns of how data catalogs can be useful in different contexts
Metadata plays a major role in modern data landscapes. It enables both the automation of data processes and the interpretation of existing data assets. For the latter, the data catalog can be seen as the end user's tool: it supports the interpretation of the data and lets users add business-specific information. Data catalogs have become increasingly important in recent years as a way to make metadata usable. The driving factors include the growing creation, volume and use of data in general.
Our customers name different motivations for considering a data catalog:
Fig. 1: Customer motivation for a data catalog
It is especially interesting that many of them are looking at new data architecture approaches, with Data Mesh and Data Fabric currently attracting the most attention. Very often they then conclude that a data catalog is a central element in enabling this kind of data architecture.
The following overview shows different types of data catalogs:
Fig. 2: Overview of different data catalog types [1]
This also means that not all data catalogs are equal. For a first orientation, it helps to understand the different types and usage scenarios. In addition, selecting the appropriate data catalog and using it in the context of different data architectures plays an important role in this and the following blog posts.
Today, data catalogs are powerful tools that let users quickly access and understand the data available in the company, without constantly requesting authorizations for systems or working through tables and databases in the hope of interpreting the data correctly. The latest technological developments, such as the integration of artificial intelligence (ChatGPT being one example), are quickly being harnessed to further improve orientation in complex, distributed architectures. Data catalogs therefore play an important role on the way to becoming a data-driven company.
In recent years, several data architecture patterns have emerged in analytical data management, and data catalogs have continuously evolved alongside them:
Fig. 3: Overview of different data catalog types
This results in various specific usage scenarios for data catalogs in the company, which the following blog posts will examine in more detail (tbd):
As this and the following blog posts are not introductory and focus on the context of data architectures, I recommend Ole Olesen-Bagneux's book “The Enterprise Data Catalog” or his blog series “Symphony of Search” as currently among the best sources for a thorough introduction to this topic.
This blog post is based on and adapted from my article in the German BI-Spektrum, issue 4/2023.
[1] Based on the Fraunhofer ISST report “Data Catalogs”: https://www.isst.fraunhofer.de/content/dam/isst-neu/documents/Publikationen/Datenwirtschaft/Fraunhofer-ISST_DataCatalogs_Report-kl.pdf