Data volumes and the frequency of data changes have grown rapidly in the course of digitalization. The result is a very heterogeneous data landscape with many different data sources. To leverage the advantages of big data, the key success factor is greater transparency and fast access to the data. In previous information management projects, CAMELOT found that many data-driven companies are exploring use cases to improve big data governance.

What is a Data Catalog and why is it needed?

A Data Catalog (DC) is designed to help users find the right information quickly. This need has become even more important in complex data environments with multiple sources. Appropriate tool support is required so that users can work with the information collaboratively within a defined framework of workflows and rules. Data should be liberated through fast access and searchability.

Who needs a Data Catalog?

A DC tool can be used across different domains such as Procurement, Sales, Finance and Supply Chain Management (SCM). Data analysts, data scientists, data stewards and other roles need to find and understand data. For example, a data scientist needs to access multiple data domains and sources, e.g. from an ERP or CRM system. Access to domain-specific tables could be granted via a workflow by the domain expert, e.g. a data steward for ERP material data. This governance is important because, for example, under GDPR not all data should be accessible to every user. With strict governance across all processes and roles, a DC helps ensure compliance and privacy.
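The access workflow described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular DC product works: the class names `Catalog` and `AccessRequest`, the table names and the steward names are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    """A user's request to read one catalogued table (hypothetical model)."""
    user: str
    table: str
    status: str = "pending"  # pending -> approved | rejected


@dataclass
class Catalog:
    """Toy catalog: one responsible data steward per table (assumption)."""
    stewards: dict[str, str]
    # Tables each user may read; filled only through approved requests.
    grants: dict[str, set[str]] = field(default_factory=dict)

    def request_access(self, user: str, table: str) -> AccessRequest:
        if table not in self.stewards:
            raise KeyError(f"unknown table: {table}")
        return AccessRequest(user=user, table=table)

    def decide(self, request: AccessRequest, steward: str, approve: bool) -> None:
        # Only the steward of the requested table may decide on the request.
        if self.stewards[request.table] != steward:
            raise PermissionError(f"{steward} is not the steward of {request.table}")
        request.status = "approved" if approve else "rejected"
        if approve:
            self.grants.setdefault(request.user, set()).add(request.table)

    def can_read(self, user: str, table: str) -> bool:
        return table in self.grants.get(user, set())


# Hypothetical tables and stewards, for illustration only.
catalog = Catalog(stewards={"erp.material": "erp_steward", "crm.contacts": "crm_steward"})
req = catalog.request_access("data_scientist", "erp.material")
catalog.decide(req, steward="erp_steward", approve=True)
print(catalog.can_read("data_scientist", "erp.material"))  # True
print(catalog.can_read("data_scientist", "crm.contacts"))  # False
```

In this sketch access is denied by default and only granted through an approved request, which mirrors the idea that not all data should be accessible to every user.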

How can a Data Catalog be used?

It is important that the DC contains and structures the business definitions of the data so that users from different departments can work with it independently. This clearly differentiates it from a data model, because the user group is not necessarily limited to data users. With a broad set of functionalities, a DC can turn raw data from different sources into consumption-ready data for various users. It provides certain aspects of Data Visualization, such as scorecards, data quality dashboards and reports. Data Governance is another function, supported by role-based views and workflows. Additionally, functions in the areas of Data Collaboration, Data Analytics, Data Modelling, Data Integration or Metadata Management can be part of a DC platform.


Figure 1: Data Catalog as the bridge between data sources and data users

Across our projects, CAMELOT has observed that DCs are platforms that combine several trending topics. This leads to a huge variety of solutions with different strengths. For example, some perform very well in the area of data quality, others in data collaboration. In explorative big data scenarios, DCs can successfully govern the data analytics part. It is very important to guide our customers to the actual use case they want to tackle with potential tool support. Generally, CAMELOT sees huge potential in leveraging DCs to achieve data liberation, transparency and trust.
