Data without context is meaningless
Author: Bas van Gils
If you follow the news, you are not only confronted with data (e.g. data about the stock exchange, about Covid infections, and about the government's popularity), you are also confronted with news about data: privacy concerns, the need to protect (statistical) data, and new business models based on data are discussed almost daily.
To me, data management is about two things: (1) managing data as an asset and ensuring it is fit for purpose, and (2) creating value with data. When I talk about this with organizations, eyes quickly glaze over. Data is presumed to be like water: it is always there when you need it, and it is always of sufficient quality to do whatever you want with it. It is often considered to be a resource that sits in our systems and that the IT department takes really good care of.
This point of view is fundamentally flawed for many reasons. In this blog post, I would like to consider one of them:
Data is not something we can leave with the IT department. It is a business asset, and leveraging it effectively requires close cooperation between business and IT stakeholders.
In support of this claim, I would like to go back to Shannon's famous 1948 article "A Mathematical Theory of Communication", in which he described how information can be encoded in ones and zeros so that it can be transmitted between parties. We could, for example, agree that the signal "0001" means "the weather is good", whereas "0010" means "the weather is ok, but we expect rain". On a certain day we transmit "0010", and the other end will have a pretty good idea about current weather conditions in our location. Great! But note: the numbers do not mean anything in and of themselves. We have assigned meaning to them. In a different context, "0010" could have a completely different meaning (e.g. it could mean: bombs away!). The whole scheme works only because we have agreed on the specific meaning of each code.
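To make this concrete, here is a minimal sketch of the weather example in Python. The two codebooks and the decode helper are my own illustrative assumptions (they are not taken from Shannon's article); the point is only that the same signal yields a different meaning depending on which agreement the receiver uses.

```python
from typing import Mapping

# Two hypothetical codebooks: identical signals, different agreed meanings.
WEATHER_CODEBOOK = {
    "0001": "the weather is good",
    "0010": "the weather is ok, but we expect rain",
}
MILITARY_CODEBOOK = {
    "0010": "bombs away!",
}


def decode(signal: str, codebook: Mapping[str, str]) -> str:
    """Return the meaning that sender and receiver agreed on for this signal."""
    if signal not in codebook:
        # Without a shared agreement, the bits carry no meaning at all.
        raise ValueError(f"no agreed meaning for signal {signal!r}")
    return codebook[signal]


# Same bits, different context, different meaning.
print(decode("0010", WEATHER_CODEBOOK))
print(decode("0010", MILITARY_CODEBOOK))
```

Note that decoding "0001" with the second codebook fails: the signal arrives intact, but there is no agreement to interpret it with.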
We have long since moved past the situation where we encode meaning in bits and bytes by hand, but the same discussion is still relevant, just at a higher level of abstraction. These days we talk about the meaning of data elements: what does it mean to be a Customer, and does Customer mean the exact same thing in different parts of the company? What exactly is an exception state for a business transaction? Modern technology ensures that, once we reach an agreement on these definitions, the translation to Shannon's ones and zeros is done automagically.
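The Customer question can be sketched the same way. In the example below, the Party record, the two departments, and their definitions of Customer are all hypothetical; real definitions would come out of the conversation between business stakeholders. It shows that one and the same record can be a Customer in one context and not in another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    """A hypothetical record describing an organization we deal with."""
    name: str
    has_signed_contract: bool
    has_paid_invoice: bool


# Hypothetical, context-specific definitions of the term "Customer".
CUSTOMER_DEFINITIONS = {
    "sales": lambda party: party.has_signed_contract,
    "finance": lambda party: party.has_paid_invoice,
}


def is_customer(party: Party, context: str) -> bool:
    """Apply the definition of Customer that holds in the given context."""
    return CUSTOMER_DEFINITIONS[context](party)


acme = Party("Acme", has_signed_contract=True, has_paid_invoice=False)
print(is_customer(acme, "sales"))    # True: a contract has been signed
print(is_customer(acme, "finance"))  # False: no invoice has been paid yet
```

Keeping the definitions in one explicit place, keyed by context, is a design choice for this sketch: it makes the disagreement between contexts visible instead of hiding it in separate systems.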
(a) Deciding which data elements are key for running a successful business, (b) assigning meaning to these data elements, taking the different use-contexts into account, and (c) clearly communicating these meanings are business responsibilities. These decisions set requirements for how data can be effectively stored and communicated across the enterprise. Meeting those requirements, in turn, is an IT responsibility. Together, these are the two sides of the data-coin: one cannot exist without the other.
I would like to end this article with a hypothesis: locating, sourcing, and using the right data with the right quality at the right time becomes much more effective with good (contextual) definitions of data. Investing in these definitions will pay for itself tenfold, easily.