
Data complexity is often cited as a significant barrier to the delivery of programs such as customer experience management, master data management and advanced analytics.
What is data complexity?
Data complexity refers to the intricacy and scale of data sets, particularly in terms of their structure, volume, and diversity. Assessing these characteristics shows how complex the data really is, which has implications for processes such as storage, processing, analysis, and decision-making.
How to assess data complexity
There are several aspects to consider when assessing data complexity (a simple profiling sketch follows this list):
- Volume: Refers to the quantity of data available, typically measured in terms of bytes, records, or observations. Large volumes of data can pose challenges in terms of storage, processing, and analysis.
- Variety: Describes the diversity of data formats and types. Data can come in various forms, such as text, numbers, images, audio, video, or even unstructured data like social media posts. Managing and integrating diverse data formats can be complex.
- Velocity: Refers to the speed at which data is generated and needs to be processed. Real-time data streams or rapidly changing data can create challenges in terms of capturing, storing, and analyzing information in a timely manner.
- Veracity: Relates to the quality and reliability of data. Data may contain errors, inconsistencies, or missing values, which can affect its usability and reliability for analysis and decision-making.
- Complexity of relationships: Describes the interconnectedness and dependencies within data sets, which must be understood and accounted for during analysis.
- Structure: Refers to the organization and arrangement of data. Structured data follows a predefined schema or format, while unstructured data lacks a predefined structure. Analyzing and processing unstructured data, such as text documents or social media posts, is typically more challenging than working with structured data.
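As a concrete starting point, some of these dimensions can be profiled directly from a table. The sketch below assumes Python with pandas and computes rough indicators for volume, variety, veracity and structure; velocity and relationship complexity need pipeline and dependency metadata, so they are omitted. The indicator names and scoring are illustrative assumptions, not a standard metric.

```python
# Rough complexity profile for one table. Assumes pandas; the indicator
# names and scoring are illustrative assumptions, not a standard metric.
import pandas as pd

def complexity_profile(df: pd.DataFrame) -> dict:
    """Summarise volume, variety, veracity and structure for one table."""
    return {
        "volume_rows": len(df),                              # Volume
        "variety_dtypes": df.dtypes.astype(str).nunique(),   # Variety
        "veracity_pct_missing": round(df.isna().mean().mean() * 100, 2),  # Veracity
        "structure_loose_cols": int((df.dtypes == "object").sum()),       # Structure
    }

if __name__ == "__main__":
    sample = pd.DataFrame({
        "customer_id": [1, 2, 3],
        "email": ["a@example.com", None, "c@example.com"],
        "last_order": pd.to_datetime(["2023-01-05", "2023-02-10", None]),
    })
    print(complexity_profile(sample))
```

A real assessment would run this kind of profile across every source system and combine it with lineage and usage metadata, but even a per-table profile makes the conversation about complexity more concrete.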
Why is it important to understand data complexity?
Understanding data complexity is essential for designing appropriate data management strategies, implementing efficient data processing systems, and selecting suitable analysis techniques. It helps organizations make informed decisions about data storage, processing capabilities, and the tools and technologies needed to handle their data effectively.
Some of our clients have, for example, identified more than 1000 individual systems that hold client data.
While this level of complexity is extreme, the reality is that most large organizations have numerous systems containing similar or related information.
Data complexity is poorly understood.
In one environment, we asked whether an apparently unused field in a mainframe data source could be reused for a newly identified business purpose. The answer: “it would take business analysts one year to do a full assessment and impact analysis in order to understand whether this field was, in fact, available, or was being used for something business critical.”
Sourcing the correct, or most trusted, version of data – whether to support analytics or simply to find the most recent email address for a marketing campaign – can similarly consume a lot of time.
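To make the “most trusted version” problem concrete, here is a minimal survivorship sketch that picks one email address per customer from several source systems. The record layout, the source system names and the “verified first, then most recent” rule are illustrative assumptions; real master data management tools apply far richer, configurable survivorship rules.

```python
# Minimal "most trusted version" sketch: pick one email per customer from
# several source systems. The source names and the survivorship rule
# (verified first, then most recently updated) are illustrative assumptions.
from dataclasses import dataclass
from datetime import date

@dataclass
class SourceRecord:
    system: str      # hypothetical source system name
    email: str
    verified: bool
    updated: date

def pick_trusted_email(records: list[SourceRecord]) -> SourceRecord:
    # Verified records outrank unverified ones; ties break on recency.
    return max(records, key=lambda r: (r.verified, r.updated))

candidates = [
    SourceRecord("crm", "old@example.com", True, date(2021, 3, 1)),
    SourceRecord("billing", "new@example.com", False, date(2024, 6, 2)),
    SourceRecord("web_signup", "current@example.com", True, date(2023, 9, 15)),
]
print(pick_trusted_email(candidates).email)  # -> current@example.com
```

The hard part in practice is not the rule itself but knowing which systems hold the data and how trustworthy each is, which is exactly what a governed catalogue records.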
Enter the governed data catalogue
The concept of a data catalogue – a single source for users to find, access and share data – is simple. Yet, delivery can be difficult.
To succeed, the data catalogue must simplify the process of finding, understanding, defining and, finally, cataloguing the data you have. The only practical way to get this done is a team approach: allow those closest to each business function to do the data governance research and create their part of the catalogue, then keep improving the results until, over time, a trusted enterprise picture of the data emerges.
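To give a feel for what “cataloguing” captures, the sketch below models one catalogue entry as a small record, with a simple tag search standing in for the “find” step. The field names are assumptions about what a governed catalogue typically holds (owner, definition, tags, a quality indicator); they are not the schema of any particular product.

```python
# Illustrative shape for a single catalogue entry. The field names are
# assumptions about what a governed catalogue typically records; they do
# not reflect any particular product's schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CatalogueEntry:
    name: str                  # business-friendly data set name
    source_system: str         # where the data physically lives
    owner: str                 # accountable business steward
    definition: str            # agreed business meaning
    tags: list[str] = field(default_factory=list)  # supports discovery
    quality_score: Optional[float] = None          # e.g. from profiling

def find_by_tag(entries: list[CatalogueEntry], tag: str) -> list[str]:
    """The 'find' step: return the names of entries carrying a tag."""
    return [e.name for e in entries if tag in e.tags]

catalogue = [
    CatalogueEntry(
        name="Customer email addresses",
        source_system="crm",
        owner="Customer Data Steward",
        definition="Primary contact email per active customer.",
        tags=["customer", "contact", "marketing"],
    ),
]
print(find_by_tag(catalogue, "marketing"))  # -> ['Customer email addresses']
```

The point of the team approach is that the owner, definition and tags for each entry come from the people closest to that business function, not from a central IT inventory.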

Modern data catalogues must support a range of use cases – from data discovery, lineage and relationships; to data observability and alerts; to business-friendly data governance and metadata management – as discussed in the Precisely webinar recording.
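As one illustration of the lineage use case, answering “which sources feed this data set?” amounts to walking an upstream-dependency graph. The table names and edges below are invented for illustration.

```python
# Minimal lineage sketch: answer "which sources feed this data set?"
# by walking an upstream-dependency graph. Names and edges are invented.
UPSTREAM = {
    "marketing_report": ["customer_master"],
    "customer_master": ["crm_extract", "web_signup_extract"],
    "crm_extract": [],
    "web_signup_extract": [],
}

def lineage(node: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every upstream source that contributes to `node`."""
    seen: set[str] = set()
    stack = [node]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(lineage("marketing_report", UPSTREAM)))
# -> ['crm_extract', 'customer_master', 'web_signup_extract']
```

The same walk, run in the other direction, answers the impact-analysis question from the mainframe anecdote above: which downstream uses would break if this field changed?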
What is the value of the completed data catalogue?
Even when partially completed, the data catalogue creates order and repeatability, saving time and reducing the risks of data-related projects. At its core, the data catalogue:
- Helps users easily find data
- Provides context for data, enabling impact assessment and reducing confusion
- Enforces and improves data rules and standards (see the validation sketch after this list)
- Paints an accurate picture of the data landscape and helps to identify areas for improvement
- Opens up new possibilities for data products and reuse
- Enables collaboration and improves understanding between business and technical stakeholders
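As a sketch of what enforcing data rules and standards can look like in practice, the example below applies a catalogue-registered format rule to incoming records. The rule structure, field names and pattern are assumptions for illustration, not any product's rule engine.

```python
# Illustrative sketch of a catalogue-registered data rule applied to
# incoming records. The rule structure, field names and pattern are
# assumptions for illustration, not any product's rule engine.
import re

EMAIL_RULE = {
    "entry": "Customer email addresses",  # hypothetical catalogue entry
    "field": "email",
    "pattern": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}

def violations(records: list[dict], rule: dict) -> list[dict]:
    """Return the records whose field fails the rule's pattern."""
    return [
        r for r in records
        if not rule["pattern"].match(str(r.get(rule["field"], "")))
    ]

rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "not-an-email"},
]
print(violations(rows, EMAIL_RULE))
# -> [{'customer_id': 2, 'email': 'not-an-email'}]
```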
Ultimately, a well-implemented, business-ready data catalogue is a powerful tool for knowledge workers, giving them straightforward access to a wide range of high-quality data.
With that access, they can examine the detail of their operations, surface meaningful business insights, and make better-informed decisions.
This access to data helps organizations derive real value from their information assets and gain a competitive edge in a dynamic market.
