The Truth about the "Single Version of Truth"
There is more than one way to get there, if you need it at all
It is very common for customers to struggle with delivering integrated reporting and the one truth everyone expects from a data platform.
First, is this really what you need? A Single Version of Truth is a rather defensive orientation for your data strategy. If you want to be flexible, business-oriented or fast, multiple versions of truth can be acceptable. What matters is that you know what you are doing.
How it started…
I remember a scene from several years ago: I was part of a discussion between the sales department and the controlling department of a division.
Sales: “We need to adapt our planned sales revenue to the season to make the reporting usable for us!”
Controlling: “No! Everyone here plans with a linear distribution over 12 months, as this is easier to handle and aligned with the group.”
Sales: “But this is not realistic, and over the season the plan will always differ from the actuals. But fine, let us do the planning twice and call the second one level 2 reporting.”
Controlling: “This is not possible, there is only one truth!”
Fig. 1: Discuss, but come to a common understanding
In the end, Controlling won!
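To make the disagreement concrete, here is a minimal sketch in Python. The annual plan figure and the seasonal weights are invented for illustration and are not taken from the customer. It only shows that a linear and a seasonal distribution can be derived from the same annual plan, agree on the yearly total, and still differ month by month.

```python
# Hypothetical figures: the annual plan and the seasonal weights are assumptions.
ANNUAL_PLAN = 1_200_000  # planned revenue for the year

# Controlling's view: the same amount every month.
linear_plan = [ANNUAL_PLAN // 12] * 12

# Sales' view: monthly shares in percent for a seasonal business (sum to 100).
seasonal_weights_pct = [4, 4, 6, 8, 10, 12, 12, 12, 10, 8, 8, 6]
seasonal_plan = [ANNUAL_PLAN * pct // 100 for pct in seasonal_weights_pct]


def monthly_gap(seasonal, linear):
    """Return seasonal minus linear plan for each month."""
    return [s - l for s, l in zip(seasonal, linear)]


gaps = monthly_gap(seasonal_plan, linear_plan)
for month, gap in enumerate(gaps, start=1):
    print(f"Month {month:2d}: seasonal - linear = {gap:+,}")

# Both versions add up to the same annual plan.
print(sum(linear_plan) == sum(seasonal_plan) == ANNUAL_PLAN)
```

In this sketch both versions reconcile at year level, so the monthly differences are a matter of presentation rather than of conflicting data.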
Discussions like this easily lead to a landscape like the following (customer example):
Fig. 2: We can have Multiple Sources of Truth and Multiple Versions of Truth at the same time - at the price of some challenges…
Fig. 2 shows a grown architecture without clear data governance or data architecture. It is not about the technologies in place, but rather about how they are used. Within the architecture shown, there was a lot of data exchange between the different operational systems. Many local and operational tools are used, leading to different experiences, different understandings of the data, and a lot of misunderstanding. Still, Multiple Versions of Truth can be acceptable. Here the customer prefers individual freedom with data, velocity and flexibility over consistency and costs.
How it is going…
First, let’s clarify some terms:
Single Source of Truth (SSoT) - The principle of data storage according to which a particular piece of information always originates from one place. From a business and organisational perspective, this means that data is only created at the source, in the relevant master system, according to a specific process or set of processes. SSoT enables greater data transparency, a central storage system, traceability, clear ownership, cost-effective reuse, etc.
Single Version of Truth (SVoT) - The practice of providing decision makers with clear and accurate data in the form of answers to highly strategic questions. Effective decision making requires that accurate and verified data serves a clear and controlled purpose and that all stakeholders trust and recognize this purpose. SVoT enables greater data accuracy, clarity, timeliness, alignment, etc. SVoT refers to a view [of data] that everyone in an organisation agrees is the real, trusted number for specific operational data.
As mentioned, the opposite of an SVoT, Multiple Versions of Truth (MVoT), can be fine if used intentionally. MVoT can bring a lot of flexibility and autonomy to data usage. But you should understand that there are different ways to reach, or at least approach, a Single Version of Truth, even in distributed or complex data landscapes (example from another customer discussion):
Fig. 3: Different ways to approach a Single Source of Truth
The truth is, a Single Source or Version of Truth will be expensive and useless at the same time if you build it just to have one. Depending on your organisational structure and technological setup, you can create the SVoT physically (central data platform) or in a decentralised/federated way, e.g. if you have different platforms for different use cases and the data is clearly separated. If this is not possible, create the consistent view virtually via a semantic layer and a data catalog. Even the semantics can be distributed, as is typical with tools like Qlik or Power BI, and still stay consistent via a central data catalog as a reference.
You will not solve these scenarios with technology alone; you need to build the organisation and processes to make it work. But be aware: even with a rather technical approach like a central platform, it is challenging to stay consistent without a minimum of data governance.
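As a rough illustration of the last option, the following Python sketch compares metric definitions kept in distributed semantic models against a central catalog. Everything in it is hypothetical: the metric names, formulas, owners and model names are invented, and it does not use any real Qlik or Power BI API. Comparing formulas as normalised strings is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    formula: str
    owner: str


# Central data catalog: the agreed reference definitions (hypothetical).
CATALOG = {
    "net_revenue": MetricDefinition(
        "net_revenue", "gross_revenue - discounts - returns", "Controlling"
    ),
}

# Definitions as implemented in each local semantic model (hypothetical).
LOCAL_MODELS = {
    "sales_dashboard": {
        "net_revenue": "Gross_Revenue - Discounts - Returns",
        "margin_pct": "margin / net_revenue",
    },
    "finance_dashboard": {
        "net_revenue": "gross_revenue - discounts",
    },
}


def normalize(formula):
    """Lower-case and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(formula.lower().split())


def find_deviations(catalog, local_models):
    """List (model, metric, reason) for every local metric that is missing
    from the catalog or whose formula differs from the catalog entry."""
    deviations = []
    for model, metrics in sorted(local_models.items()):
        for metric, formula in sorted(metrics.items()):
            reference = catalog.get(metric)
            if reference is None:
                deviations.append((model, metric, "not in catalog"))
            elif normalize(formula) != normalize(reference.formula):
                deviations.append((model, metric, "differs from catalog"))
    return deviations


for model, metric, reason in find_deviations(CATALOG, LOCAL_MODELS):
    print(f"{model}: {metric} {reason}")
```

A check like this only flags deviations. Whether a deviation is an error or an intentional second version of the truth remains a governance decision.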
Outlook
There will always be a discussion about how to arrive at this Single Version of Truth. We have to be aware that approaches like Data Mesh explicitly give up this single source idea to empower the business and shorten time to analytics by decentralising data management and focusing consistency efforts where they matter, e.g. by using a data catalog or cross-domain modelling. Often it is not a black-or-white decision, as you have to adapt to your individual situation and needs.
This is a relocated and extended article from my experimental Substack “Data & AI Chronicles”. What do you think about the Single Source of Truth? Is it really necessary?