Metadata management is a common theme across our software solution stack. The ability to give data context and meaning is a foundation of many other disciplines.
The broad range of applications for metadata means that approaches to metadata management vary widely, with no single tool set or platform addressing every need, particularly in the complex data landscapes of large, modern enterprises.
Our partner choices reflect different approaches to metadata management that address different needs. This blog post is intended to help you understand why we have selected these partners and where they may fit into your data landscape.
The Data Catalog
The data catalog is quickly becoming the cornerstone of metadata management for data-driven businesses. A catalog is intended to ensure that any knowledge worker can quickly and easily find the data they need to do their job, to give them the context they need to make decisions, and to provide insight into how data is used within the organisation.
Key capabilities that differentiate data catalogs include:
- Business driven
- Tool is accessible to all knowledge workers so that they can reference data context, crowdsource content and collaborate across and within data silos
- Interface supports both technical and non-technical users
- Provide real-time insights into how data is being used to support business processes, compliance events, reports, and metrics
- Integrated data quality
- Value-driven data scoring
- Data assets can be linked to business goals, objectives and outcomes to measure the ROI of metadata management activities
- Governance driven
- Clearly understand and communicate accountability for data and decisions around data, and automate key processes to reduce the data stewardship workload and time wasted in meetings
- Extensible data model to allow you to quickly and easily add any metadata type to your metadata repository
- Business metadata may include data policies, report definitions, KPIs and metrics, business terms, curated data sets, business processes and similar assets that provide context to data
- Technical metadata documents physical systems, cloud data sources, data dictionaries, reference data, etc. Wherever possible this metadata should be harvested automatically
- Relationships allow you to model your data landscape and understand how assets support each other.
- Open integration layer allowing metadata
We have found Infogix Data360 Govern to be a good fit for these criteria, and it is our recommended core platform to support metadata management.
Unified Data Lineage
Data catalogs, like Data360 Govern, offer some form of automated metadata harvesting. For example, we can ingest database structures by connecting directly to the underlying database and reading its tables.
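To make the idea of harvesting database structures concrete, here is a minimal sketch in Python. It is not how Data360 Govern or any other catalog implements harvesting. It simply assumes a SQLite database, uses the standard library `sqlite3` module, and reads table and column definitions from the database's own system catalog. The function name `harvest_schema` and the example tables are hypothetical.

```python
import sqlite3


def harvest_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite database."""
    # sqlite_master lists every object in the database; skip SQLite's internal tables.
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Hypothetical example database, built in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("CREATE TABLE sales_order (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")

catalog_entries = harvest_schema(conn)
for table, columns in catalog_entries.items():
    print(table, columns)
```

A real catalog would do the same kind of read against each source system's own metadata views and then attach business context to the harvested tables and columns.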
However, for most catalogs, technical metadata and lineage are ingested via connectors to underlying ETL tools, data modelling tools and the like. In many cases these connectors are limited. For example, a metadata vendor that also sells an ETL tool may provide connectors for its own tool but offer very limited connectors for third-party tools.
In practice, most organizations depend on multiple ETL tools, processes and code to move data around. For example, we may move data from operational systems to the enterprise data warehouse (EDW) using an enterprise ETL tool. Once data is in the EDW, we may use stored procedures to manipulate it further, for example to aggregate raw data or to move data into the EDW schemas. Data may be manipulated again in the reporting layer. Tracking and maintaining changes to these lineages can be very difficult, but is increasingly an organizational necessity to ensure trust in reporting.
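The multi-hop flow described above can be pictured as a directed graph in which each edge records one movement of data and the mechanism that performed it. The sketch below is purely illustrative: the asset names, the `EDGES` list and the `upstream` function are hypothetical and do not reflect any vendor's data model. It shows how, once every hop is captured in one place, the full set of sources behind a report can be traced.

```python
from collections import defaultdict

# Hypothetical lineage edges: (source asset, target asset, mechanism that moved the data).
EDGES = [
    ("crm.customer", "edw.stg_customer", "ETL job"),
    ("erp.invoice", "edw.stg_invoice", "ETL job"),
    ("edw.stg_customer", "edw.fact_revenue", "stored procedure"),
    ("edw.stg_invoice", "edw.fact_revenue", "stored procedure"),
    ("edw.fact_revenue", "report.monthly_revenue", "reporting layer"),
]


def upstream(asset, edges):
    """Return every asset that directly or indirectly feeds the given asset."""
    # Index the edges by target so each asset's direct parents can be looked up.
    parents = defaultdict(list)
    for source, target, _mechanism in edges:
        parents[target].append(source)

    # Depth-first walk back through the graph, visiting each asset once.
    seen = set()
    stack = [asset]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


sources = upstream("report.monthly_revenue", EDGES)
print(sorted(sources))
```

If any one hop is missing, for instance because the stored procedures are not scanned, the walk stops at the warehouse and the operational sources never appear, which is why lineage gathered from a single tool is often incomplete.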
Our partner, MANTA, provides a specialist unified lineage platform. MANTA makes it quick and easy to connect to and ingest metadata from most commonly used data sources, reporting tools, modelling tools and ETL tools. It can even read code, such as Java, SQL and COBOL, to trace movements of and changes to data.
MANTA provides its own lineage view of data, and it also exposes this data to third-party data catalogs such as Infogix, Collibra, IBM, Informatica and more. MANTA enhances the data catalog by harvesting data flows and keeping them synchronised with their business context.
Understanding your ERP or CRM
Another niche application we have discovered is providing business context and meaning to the metadata layers of common enterprise ERP and CRM packages such as SAP, Salesforce, and the Oracle and Microsoft stacks.
These platforms have large, complex table structures that may not be meaningful when accessed at the database level. Safyr from Silwood Technology provides self-service metadata discovery for your ERP or CRM systems. For example, Safyr makes it easy to isolate the tables and columns used for customer master data in your ERP or CRM, add business context and link it to the underlying tables, and present this metadata to your data catalog.
Finding personal data in SAP, for instance, is much simpler using Safyr. Another use case is understanding the impact of migrating from SAP ECC to SAP S/4HANA, or from PeopleSoft to Dynamics.
No one size fits all
These descriptions should help you decide which tools suit your organization’s size, complexity and priorities. What is clear is that, for large organizations, a multifaceted approach to metadata management will reduce manual effort and give a more accurate result.
Give us a call to learn more.