Data Catalogs Meet Data Marketplaces – From Selection to Implementation

In today’s data-driven world, the value of data lies not just in its abundance but in how effectively it is managed and utilized.

Robust data governance — encompassing ownership, quality, confidentiality — enables companies to stay compliant. More importantly, it turns data into a competitive advantage by unlocking analytics at scale, driving efficiency, enabling smarter decisions, and creating new revenue streams.

But achieving this requires more than governance: it demands seamless data discovery and accessibility, made possible today by the integration of data catalogs and data marketplaces.

What Is a Data Catalog Versus a Data Marketplace?

Data catalogs and data marketplaces are both essential for establishing effective data governance, but they serve distinct purposes. This article explores their differences, similarities, and how combining them can unlock data at scale.

Data Catalog in a Nutshell

A data catalog is a centralized repository that collects, organizes, and enriches metadata about an organization’s data assets. This facilitates quick and easy data discovery and management, enabling users to understand and utilize data effectively. Key features of a data catalog include:

  • Data Discovery: Improves data visibility and supports finding required data easily and quickly by allowing users to search data assets within the organization.
  • Data Quality, Profiling & Classification: Analyzes data to gather statistics and insights on quality, structure, and distribution in order to derive the classification of data items.
  • Business Term Creation: Defines and manages standardized business terms linked to data elements in the catalog – translating to business-usable concepts.
  • Metadata Management: Stores metadata about data assets such as data source, data type, data owner, and data lineage.
  • Data Lineage: Tracks the origin and transformation of data, enabling effective data and information sharing across different systems and processes.
  • Data Governance: Introduces organization-wide data governance policies, including data quality, data privacy, and compliance, to enable business efficiency.
  • Collaboration: Facilitates collaboration among data users by providing information about data usage and context.
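
To make the features above more tangible, here is a minimal sketch in Python of what a catalog entry and a keyword-based discovery search could look like. The `CatalogEntry` class, its fields, the `search` function, and the sample assets are all hypothetical and chosen for illustration only; real catalog tools use their own, much richer metadata models.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical metadata record for one data asset."""
    name: str
    source: str          # system the asset lives in
    data_type: str       # e.g. "table", "file", "report"
    owner: str           # accountable data owner
    classification: str  # e.g. "public", "internal", "confidential"
    business_terms: list[str] = field(default_factory=list)
    upstream: list[str] = field(default_factory=list)  # lineage: names of source assets


def search(entries: list[CatalogEntry], keyword: str) -> list[CatalogEntry]:
    """Return entries whose name or business terms contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        e for e in entries
        if needle in e.name.lower()
        or any(needle in term.lower() for term in e.business_terms)
    ]


# Illustrative sample assets (assumed, not taken from any real system).
catalog = [
    CatalogEntry("crm.customers", "CRM", "table", "Head of Sales", "confidential",
                 business_terms=["Customer"]),
    CatalogEntry("dwh.customer_revenue", "Data Warehouse", "table", "Head of Finance",
                 "internal", business_terms=["Customer", "Revenue"],
                 upstream=["crm.customers", "erp.invoices"]),
    CatalogEntry("erp.invoices", "ERP", "table", "Head of Finance", "internal",
                 business_terms=["Invoice"]),
]

for entry in search(catalog, "customer"):
    print(entry.name, "| owner:", entry.owner, "| upstream:", entry.upstream)
```

Even in this simplified form, the record shows how discovery, business terms, ownership, classification, and lineage all hang off the same metadata entry.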

Illustrative example of a Data Catalog by Ataccama (https://www.ataccama.com/platform/data-catalog)

Data Marketplace in a Nutshell

A data marketplace is a platform for exchanging data—whether internally within an organization or externally with other entities. Similar to an e-commerce marketplace, it enables data providers to share or monetize their data assets. Key features of a data marketplace include:

  • Data Trading: Enables the buying and selling of data between different organizations.
  • Data Monetization: Allows data providers to monetize their data by offering it for sale.
  • Data Access: Provides access to a wide variety of external data sources, which can be used to enrich internal data.
  • Data Licensing: Manages the licensing agreements for data assets.
  • Data Quality: Often includes mechanisms to ensure the quality and reliability of the data being traded.

Illustrative example of a Data Marketplace by Snowflake (https://www.snowflake.com/en/data-cloud/marketplace)

Key Differences Between Data Catalogs and Data Marketplaces

While both tools focus on data management, their primary objectives differ:

  • Data Catalogs: Primarily focus on organizing and managing internal data assets, making them a valuable tool for data governance offices within an organization.
  • Data Marketplaces: Enable external data sharing and monetization, fostering value creation beyond the organization.

Data catalogs prioritize compliance and internal transparency, while data marketplaces emphasize accessibility and the commercialization of data assets.

Combining Data Catalogs & Data Marketplaces

In the past ten years, companies have been rolling out data catalogs to govern internal data due to increased regulations. Although some companies have been successful, many failed due to lack of adoption from employees outside data management offices. Why so? Because the majority of employees are interested in data accessibility, sharing and exchange. In short, governing data is not the ultimate goal but rather a means to unlock data in a controlled and secure manner.

With the above in mind, in the past three years, Data Catalog vendors have progressively added Data Marketplace features to their Data Catalog tools. Nowadays, companies are rolling out features from both tools in a unified way to enhance data governance and enable data sharing for analytical goals.

Let’s picture an illustrative example of a financial services company combining both tools. The unified platform allows data analysts to not only discover internal datasets related to customer transactions, but also integrate external datasets such as macroeconomic indicators provided by a third-party organization. The integrated governance features ensure that all data assets comply with regulatory requirements, and the marketplace functionality enables the company to monetize anonymized and aggregated trade data by offering it to third-party researchers.

Key Components for a Successful Data Catalog & Data Marketplace Implementation

Selecting the Right Solution

Selecting a suitable tool for your company depends on a variety of factors – you should aim to find a tool that supports your data governance needs, provides the functionality you are looking for in terms of data sharing, and fits into your IT landscape and budget. 

There is a large number of tools on the market. Unit8's recommended assessment methodology combines desk research with hands-on testing of the tools, involving future end-users and business units.

Pre-condition & Requirement Gathering:

A tool is only as good as its adoption, which is why it is key to identify and involve the key stakeholders, including future end-users, as early as possible. The requirements stated by the future end-users form the basis of your assessment and provide valuable insights into the key functionalities your tool needs to provide. In addition, representatives from IT architecture and IT security need to be involved, as a data catalog requires interfaces to your key systems that need to be evaluated thoroughly.

By conducting interviews, process shadowing sessions, and process mapping workshops, you can derive the strategic capabilities and intended use cases for the data catalog and marketplace and define the key evaluation dimensions. This phase has two outcomes. First, a well-defined project team split into core members (delivering the solution) and extended stakeholders (providing end-user feedback and IT architecture and security support). Second, an importance-weighted requirement matrix to start your evaluation.

Illustrative example of weighted requirements matrix
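
To make the idea of an importance-weighted requirement matrix concrete, here is a minimal sketch in Python. The requirement names, the 1 to 5 weighting scale, the 0 to 5 scoring scale, and the vendor names are purely illustrative assumptions, not Unit8's actual matrix.

```python
# Hypothetical requirements with importance weights (1 = nice to have, 5 = critical).
weights = {
    "data_discovery": 5,
    "data_lineage": 4,
    "access_requests": 4,
    "out_of_the_box_connectors": 3,
    "licence_cost": 2,
}

# Hypothetical vendor scores per requirement (0 = not met, 5 = fully met).
scores = {
    "Vendor A": {"data_discovery": 4, "data_lineage": 5, "access_requests": 2,
                 "out_of_the_box_connectors": 4, "licence_cost": 3},
    "Vendor B": {"data_discovery": 5, "data_lineage": 3, "access_requests": 4,
                 "out_of_the_box_connectors": 3, "licence_cost": 4},
}


def weighted_score(vendor_scores: dict[str, int], weights: dict[str, int]) -> float:
    """Weighted average of a vendor's scores, on the same scale as the raw scores."""
    total_weight = sum(weights.values())
    return sum(weights[req] * vendor_scores[req] for req in weights) / total_weight


# Rank vendors from highest to lowest weighted score.
ranking = sorted(scores, key=lambda v: weighted_score(scores[v], weights), reverse=True)

for vendor in ranking:
    print(f"{vendor}: {weighted_score(scores[vendor], weights):.2f}")
```

Normalizing by the total weight keeps the result on the same scale as the individual scores, which makes it easier to discuss with stakeholders than a raw weighted sum.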

Tool evaluation & RFP process

Unit8 strongly advises following a clear, structured assessment approach when evaluating the various vendors on the market. This ensures an objective, requirement-focused selection of the tool, and a transparent decision process builds acceptance of the final decision across the company.

Unit8 data catalog assessment methodology

Starting from an initial long-list of vendors, the evaluation narrows down the pool of potential solutions based on their i) fit-for-purpose, ii) fit-for-cost, and iii) fit-for-future. It is important to also take less obvious criteria into account, such as vendor support, future product roadmap, and company stability & reputation. Once the list has been narrowed down to a manageable short-list, the RFP process is conducted and contact with the vendors is established for a more in-depth evaluation.

PoC or Testing Phase

Unit8’s experience shows that a PoC or testing phase is a key step to ensure the correct choice. The goal is to get first-hand feedback on the user experience from future end-users and to test the functionalities advertised by the vendors. In a PoC phase, you would set up the solution connected to actual systems of your company (with non-sensitive test data) to understand the complexity of integrating the tool into your IT landscape in a scenario as close to reality as possible. A less resource-intensive alternative is to conduct a testing phase based on demo instances provided by the vendors, which still provides a good test of UX but will not test the complexity of integrating the tool into your IT landscape.

Although setting up and conducting a PoC or test phase requires time and resources, Unit8 has observed that only hands-on testing of the tools reveals certain benefits and end-user preferences that need to be catered to for adoption.

After conducting the testing phase with the final candidates, keep at least two vendors in consideration to leverage your company’s negotiation power during final commercial discussions. This last step of negotiations can result not only in significant financial benefits but also in preferable terms & conditions or implementation support by the vendor.

Roles & Responsibilities

Responsibilities related to the Data Catalog must be embedded in your company-wide data governance operating model among existing data role holders. As mentioned above, data governance tools now combine cataloging and marketplace capabilities in a single tool. While the Data Catalog part focuses on the defensive objective of achieving compliance, the Data Marketplace part focuses on the offensive objective of achieving competitiveness. Hence, each role has responsibilities spanning both objectives.

Managing a Data Catalog and Marketplace involves a combination of technical, governance, and strategic responsibilities to ensure that data is discoverable, accessible, and usable while maintaining compliance and security standards. Below are the key roles and their associated responsibilities:

Data Governance Roles

  • Data Governance Officer
    • Policy Definition: Establish policies and standards for data governance, ensuring compliance with regulations and internal protocols.
    • Process & Training: Translate policies into processes on how to use the data catalog and marketplace. Train all types of users.
    • Oversight: Monitor and enforce adherence to data governance frameworks across the catalog.
  • Data Owner
    • Accountability: Maintain overall responsibility for the accuracy, quality, and security of data assets listed in the catalog, as well as for the sharing of curated data products with consumers.
  • Data Steward
    • New Data Registration: Identify & prioritize data assets that must be registered in the data catalog. Collaborate with system owners and data owners to integrate & register the assets in the catalog.
    • Metadata Curation: Maintain the metadata in the catalog, ensuring it is accurate, up-to-date, and comprehensive, with the relevant classification.
    • Data Quality: Monitor data quality metrics, addressing issues and improving data consistency.
    • Data Product Offering: Manage data products throughout their lifecycle, ensuring high data quality, business adoption, and value creation. Collect requirements from consumers and work with developers to build new or evolve existing data products, fostering shareability & reusability across the company.
    • Data Product Access: Grant & revoke data access for consumers, and frequently review who has access.
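
The recurring access review mentioned under Data Product Access can be sketched in a few lines of Python. This is only an illustration: the `AccessGrant` record, the `due_for_review` helper, the 180-day review interval, and the sample grants are assumptions, and in practice this logic usually lives inside the catalog or marketplace tool itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AccessGrant:
    """Hypothetical record of one consumer's access to one data product."""
    consumer: str
    data_product: str
    granted_on: date
    last_reviewed: date


def due_for_review(grants: list[AccessGrant], today: date,
                   max_age: timedelta = timedelta(days=180)) -> list[AccessGrant]:
    """Return grants whose last review is older than max_age (assumed interval)."""
    return [g for g in grants if today - g.last_reviewed > max_age]


# Illustrative sample grants.
grants = [
    AccessGrant("analyst.a", "customer_revenue", date(2023, 9, 1), date(2024, 3, 1)),
    AccessGrant("analyst.b", "customer_revenue", date(2024, 6, 1), date(2024, 11, 15)),
]

for grant in due_for_review(grants, today=date(2025, 1, 1)):
    print(f"Review needed: {grant.consumer} on {grant.data_product}")
```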

Data Governance Roles in relation to Business Roles

Non-Data Governance Roles

  • Business User & Data Citizen
    • Data Discovery & Collaboration: Users can easily discover existing company data, consult lineage, and check the latest data quality status. They can contribute to the Catalog by adding metadata, tagging, and providing feedback, fostering a collaborative data culture.
    • Data Access & Self-service Analytics: Users can leverage the data marketplace to easily request data access and consume curated datasets to conduct analytics, significantly accelerating time-to-market.

  • IT Administrator
    • Infrastructure Management: Oversee the technical infrastructure, provide support, ensure performance and reliability, and monitor the potential overload impact on source systems.
    • Source Systems Integration: Support teams in connecting source systems to the Catalog & Marketplace to sync metadata or run data quality checks.

Implementation Challenges & Risks

Implementing a data catalog & marketplace tool by nature involves a wide range of systems, network zones, and people (think of all the system and data owners or IT security). This creates high complexity through varying requirements and expectations from your stakeholders. Let’s focus on the key challenges and risks based on Unit8’s experience.

Risk 1: Choosing the wrong roll-out approach for your company

Risk Description: Setting the scope of initial source systems too wide, or choosing irrelevant source systems, can lead to unmanageable complexity for the project team, as each additional source system adds significant workload. It is therefore crucial to choose source systems that are actually relevant to the future end-users, so that the effort to connect each system is justified.

How to Mitigate: 

  • Choose the right roll-out approach for your project:
    i) Use case driven roll-out: Identify specific use cases that require your data catalog and start connecting the relevant systems for these use cases.
    ii) Domain driven roll-out: Identify business or, even better, data domains in the company and start connecting key systems from those areas.
  • Select the right priority systems:
    In addition to the chosen approach, we recommend using the following two guiding questions to choose the right systems in the beginning: How often is the system used as a data source for analytics purposes? Does my solution provide good out-of-the-box connectors for the system?
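
The two guiding questions above can be turned into a simple ranking. The sketch below is in Python and rests on assumptions: the `SourceSystem` record, the `prioritise` helper, the sample systems, and the choice to rank connector availability before analytics usage are all illustrative, and your project may weigh the two questions differently.

```python
from dataclasses import dataclass


@dataclass
class SourceSystem:
    """Hypothetical candidate system for the initial roll-out."""
    name: str
    analytics_use_cases: int  # how many analytics use cases read from this system
    has_connector: bool       # does the chosen tool ship an out-of-the-box connector?


def prioritise(systems: list[SourceSystem]) -> list[SourceSystem]:
    """Systems with a ready-made connector first, then by analytics usage (descending)."""
    return sorted(systems,
                  key=lambda s: (s.has_connector, s.analytics_use_cases),
                  reverse=True)


# Illustrative candidates.
candidates = [
    SourceSystem("CRM", analytics_use_cases=12, has_connector=True),
    SourceSystem("ERP", analytics_use_cases=20, has_connector=False),
    SourceSystem("Data Warehouse", analytics_use_cases=30, has_connector=True),
    SourceSystem("Legacy HR", analytics_use_cases=3, has_connector=False),
]

for system in prioritise(candidates):
    print(system.name, system.analytics_use_cases, system.has_connector)
```

Ranking connector availability first favours quick wins. A heavily used system without a connector, such as the ERP in this example, may still deserve an early slot if its use cases are critical.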

Risk 2: Permission Requirements of Scanning Tools

Risk Description: Scanning tools from the selected vendor might require extensive permissions on your source systems, which your internal teams might find unacceptable due to security risks. This situation can cause long delays in the implementation, since all parties (vendor, internal data source team, internal data catalog implementation team) need to agree on a mitigation plan.

How to Mitigate:

Engage all relevant stakeholders, including system owners and security, early in the process to review and approve the required permissions. Develop a clear and agreed-upon mitigation plan that addresses security concerns and ensures timely implementation.

Risk 3: Incomplete Data Collection by Scanning Tools

Risk Description: The scanner tool from the vendor might not be able to collect all the desired information from the source system, especially when certain custom structures are in place. It is best to test the scanning tool for each system and identify missing properties of data assets early on.

How to Mitigate: Conduct thorough testing of the scanning tool on each system to identify any missing properties. Collaborate with all parties to plan the time and effort needed to either develop a custom scanning tool internally or work with the vendor to improve the existing tool.
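
A coverage check of this kind can be sketched in Python as follows. The list of required properties, the `missing_properties` helper, and the sample scanner output are assumptions for illustration; the actual output format depends entirely on the vendor's scanning tool.

```python
# Properties every registered asset should carry (assumed list, adapt to your policy).
REQUIRED_PROPERTIES = {"name", "data_type", "owner", "description", "classification"}


def missing_properties(scanned_assets: list[dict]) -> dict[str, list[str]]:
    """Map each asset name to the required properties the scanner left out or empty."""
    gaps = {}
    for asset in scanned_assets:
        # A property counts as present only if it has a non-empty value.
        present = {key for key, value in asset.items() if value not in (None, "")}
        missing = REQUIRED_PROPERTIES - present
        if missing:
            gaps[asset.get("name", "<unnamed>")] = sorted(missing)
    return gaps


# Illustrative scanner output for three assets of one source system.
scan_result = [
    {"name": "orders", "data_type": "table", "owner": "Sales",
     "description": "", "classification": "internal"},
    {"name": "v_custom_report", "data_type": "view"},
    {"name": "customers", "data_type": "table", "owner": "Sales",
     "description": "Customer master data", "classification": "confidential"},
]

for asset_name, gaps in missing_properties(scan_result).items():
    print(f"{asset_name}: missing {', '.join(gaps)}")
```

Running such a check per source system early on gives all parties a concrete list of gaps to plan against.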

Conclusion

In summary, while both data catalogs and data marketplaces aim to enhance data accessibility, a data catalog focuses on organizing and managing internal data assets to improve discoverability and governance, whereas a data marketplace emphasizes facilitating the exchange and consumption of data products to unlock data within your organization and generate business value.

Want to explore how data catalogs and marketplaces can unlock value for your organization? Reach out to us to start the conversation.
