From Cryptic SAP Fields to Clear Metadata: Automating Documentation with the Unit8 GPT Wizard

Enterprise systems like SAP are powerful, but they come with a well-known challenge: understanding the data. Business users and data teams frequently encounter technical fields such as WAERS, MATNR, or NETWR – short identifiers whose meaning is rarely obvious without consulting documentation or domain experts.

In many organizations, this documentation is scattered: in Confluence pages, spreadsheets, internal wikis, or sometimes only in the minds of experienced employees. It is often messy and unstructured, yet it remains important. As a result, analysts and engineers often spend significant time searching for field definitions instead of working with the data itself.

This challenge is particularly visible in large ERP environments where thousands of fields must be documented and maintained. The process is typically manual, fragmented, and difficult to scale.

Recently, we added a new feature to the Unit8 GPT Wizard to explore how AI agents can automate metadata documentation by orchestrating multiple enterprise systems.

The Metadata Challenge in ERP Systems

Metadata documentation plays a critical role in enterprise data ecosystems. Without clear definitions of data fields, organizations face several challenges:

  • Data analysts struggle to interpret datasets correctly
  • Data engineers spend time answering repetitive questions about field meanings
  • Data governance initiatives slow down due to incomplete metadata
  • Onboarding new employees becomes significantly harder

Despite its importance, documentation often falls behind the pace of system evolution. Each new integration, table, or field requires updates that are rarely prioritized.

The result is a growing documentation gap between how systems evolve and how well they are understood.

We explored whether AI could help close this gap.

The Approach: Orchestrating Metadata Sources with AI

To address this problem, we built a new feature for the Unit8 GPT Wizard, an integration accelerator designed to connect ChatGPT Enterprise with internal enterprise systems.

Instead of relying on a single source of truth, the AI agent orchestrates multiple systems to collect and generate metadata documentation.

The architecture connects four main components:

  • SAP – source of entities and column metadata
  • Confluence – internal documentation source
  • External reference sources – enrichment for missing definitions via web search
  • Azure Blob Storage – centralized storage for generated documentation

The agent interacts with these systems to automatically discover fields, retrieve existing knowledge, enrich missing information, and produce structured documentation following a previously specified template. 

From Field Discovery to Structured Documentation

In the workflow, the user begins by asking the agent to retrieve available entities and columns from the SAP system.

For example, the agent may return entities such as:

  • ProductSet
  • SalesOrderLineItemSet

Along with technical column names like:

  • WAERS
  • MATNR
  • NETWR

These identifiers are typical in SAP environments, where technical field names are cryptic and difficult to interpret without additional context.
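The discovery step can be sketched as follows. This is a minimal illustration, assuming the SAP system exposes its entities through an OData service whose metadata document lists entity sets and their properties; the article does not state which interface the agent actually uses. The sample XML and the function name `list_entity_columns` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample of an OData-style metadata document (assumption: not real service output).
SAMPLE_METADATA = """
<edmx:Edmx xmlns:edmx="http://schemas.microsoft.com/ado/2007/06/edmx" Version="1.0">
  <edmx:DataServices>
    <Schema xmlns="http://schemas.microsoft.com/ado/2008/09/edm" Namespace="DEMO">
      <EntityType Name="Product">
        <Key><PropertyRef Name="MATNR"/></Key>
        <Property Name="MATNR" Type="Edm.String"/>
        <Property Name="WAERS" Type="Edm.String"/>
      </EntityType>
      <EntityType Name="SalesOrderLineItem">
        <Property Name="MATNR" Type="Edm.String"/>
        <Property Name="NETWR" Type="Edm.Decimal"/>
        <Property Name="WAERS" Type="Edm.String"/>
      </EntityType>
      <EntityContainer Name="DEMO_Entities">
        <EntitySet Name="ProductSet" EntityType="DEMO.Product"/>
        <EntitySet Name="SalesOrderLineItemSet" EntityType="DEMO.SalesOrderLineItem"/>
      </EntityContainer>
    </Schema>
  </edmx:DataServices>
</edmx:Edmx>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def list_entity_columns(metadata_xml: str) -> dict[str, list[str]]:
    """Map each entity set name to the column names of its entity type."""
    root = ET.fromstring(metadata_xml)
    type_columns: dict[str, list[str]] = {}
    set_to_type: dict[str, str] = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name == "EntityType":
            # Only direct <Property> children are columns; <Key> is skipped.
            type_columns[element.get("Name")] = [
                child.get("Name")
                for child in element
                if local_name(child.tag) == "Property"
            ]
        elif name == "EntitySet":
            # The EntityType attribute is namespace-qualified, e.g. "DEMO.Product".
            set_to_type[element.get("Name")] = element.get("EntityType").rsplit(".", 1)[-1]
    return {s: type_columns.get(t, []) for s, t in set_to_type.items()}


if __name__ == "__main__":
    for entity, columns in list_entity_columns(SAMPLE_METADATA).items():
        print(entity, columns)
```

Matching on local tag names keeps the sketch independent of the exact schema namespace version.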

Once the user selects fields to document, the agent performs several automated steps.

First, the agent inspects sample values stored in the selected SAP columns. By analyzing the actual data, the system can infer the context of the field – for example, identifying currencies, material identifiers, or numeric transaction values.
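To make the idea of inferring context from sample values concrete, here is a small rule-based sketch. It is an assumption for illustration only: the article does not describe how the agent analyzes samples (in practice a language model likely does this), and the function name `infer_context`, the category labels, and the patterns are hypothetical.

```python
import re


def infer_context(samples: list[str]) -> str:
    """Guess a coarse field category from sample values (heuristic, not authoritative)."""
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return "unknown"
    # Three uppercase letters in every sample looks like an ISO-style currency code.
    if all(re.fullmatch(r"[A-Z]{3}", v) for v in values):
        return "currency code"
    # Signed decimals with a fractional part look like monetary or quantity amounts.
    if all(re.fullmatch(r"-?\d+\.\d+", v) for v in values):
        return "decimal amount"
    # Short uppercase alphanumeric strings look like technical identifiers.
    if all(re.fullmatch(r"[A-Z0-9-]{1,40}", v) for v in values):
        return "identifier"
    return "unknown"


if __name__ == "__main__":
    print(infer_context(["EUR", "USD", "CHF"]))
    print(infer_context(["1250.00", "-3.50"]))
    print(infer_context(["000000000000001234", "HT-1000"]))
```

Rules like these can misfire (a three-letter identifier would be read as a currency), which is one reason the workflow also consults existing documentation and external references rather than relying on samples alone.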

Next, the agent searches the user’s Confluence spaces to determine whether documentation for the selected fields already exists. If relevant information is found, the agent extracts and reuses it.
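One way such a lookup could be expressed is a Confluence Query Language (CQL) text search restricted to selected spaces. The sketch below only builds the query and the request URL; the helper names, the space keys, and the base URL are placeholders, and the article does not confirm that the agent uses this endpoint.

```python
from urllib.parse import urlencode


def build_cql(field_name: str, space_keys: list[str]) -> str:
    """Build a CQL query for pages in the given spaces that mention the field name."""
    spaces = ", ".join(f'"{key}"' for key in space_keys)
    return f'type = page AND space in ({spaces}) AND text ~ "{field_name}"'


def search_url(base_url: str, field_name: str, space_keys: list[str], limit: int = 5) -> str:
    """Compose a content-search URL (assumption: the Confluence REST content search endpoint)."""
    query = urlencode({"cql": build_cql(field_name, space_keys), "limit": limit})
    return f"{base_url.rstrip('/')}/rest/api/content/search?{query}"


if __name__ == "__main__":
    # Placeholder site and space keys, for illustration only.
    print(build_cql("WAERS", ["DATA", "ERP"]))
    print(search_url("https://example.atlassian.net/wiki", "WAERS", ["DATA", "ERP"]))
```

Sending the request would additionally require authentication, which is omitted here.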

The agent then enriches the metadata using web search. This step is especially useful when internal documentation is incomplete or missing, and it also helps capture standardized SAP abbreviations and commonly used field definitions.
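The interplay between internal and external knowledge can be sketched as a simple precedence rule. This assumes that a definition found in Confluence is preferred and that web results fill the gap when it is missing; the article does not specify how the agent weighs the two sources, and the names `Definition` and `resolve_definition` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Definition:
    text: str
    source: str  # "confluence" or "web" in this sketch


def resolve_definition(internal: Optional[str], external: Optional[str]) -> Optional[Definition]:
    """Prefer the internal definition; fall back to the external one if it is missing."""
    if internal and internal.strip():
        return Definition(internal.strip(), "confluence")
    if external and external.strip():
        return Definition(external.strip(), "web")
    # Neither source produced a usable definition.
    return None


if __name__ == "__main__":
    print(resolve_definition(None, "Material Number"))
    print(resolve_definition("Currency key used in our sales orders", "Currency Key"))
```

Recording the source alongside the text lets downstream readers judge how much to trust each definition.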

Finally, the agent generates structured documentation using a predefined template and stores the results in Azure Blob Storage.

The output is a clean, standardized JSON file describing the field, its meaning, and its source.
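As an illustration of what such a file might contain, the sketch below serializes one field record to JSON. The template is not published in the article, so the keys (`entity`, `field`, `description`, `inferred_context`, `source`) and the blob naming scheme are assumptions chosen for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FieldDoc:
    entity: str            # SAP entity the field belongs to
    field: str             # technical column name
    description: str       # human-readable meaning
    inferred_context: str  # what the sample values suggested
    source: str            # where the description came from


def to_json(doc: FieldDoc) -> str:
    """Serialize one field record as pretty-printed JSON."""
    return json.dumps(asdict(doc), indent=2, ensure_ascii=False)


def blob_name(doc: FieldDoc) -> str:
    """Hypothetical storage path: one JSON file per entity and field."""
    return f"{doc.entity}/{doc.field}.json"


if __name__ == "__main__":
    doc = FieldDoc(
        entity="SalesOrderLineItemSet",
        field="WAERS",
        description="Currency key defining the currency of amounts in the document",
        inferred_context="currency code",
        source="web",
    )
    print(blob_name(doc))
    print(to_json(doc))
```

The resulting string could then be uploaded to Azure Blob Storage under the generated name; the upload itself is left out to keep the example free of external dependencies.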

Capability: Automating Metadata Documentation

This workflow demonstrates how an AI agent can automate one of the most time-consuming tasks in enterprise data management.

Instead of manually researching field definitions across multiple systems, the agent can:

  • Discover Metadata
    Retrieve entities and technical fields directly from SAP.

  • Understand Data Context
    Analyze sample values stored in SAP columns to infer the meaning of fields – for example, identifying currencies, material identifiers, or transaction amounts.

  • Reuse Existing Knowledge
    Extract documentation from internal knowledge bases such as Confluence.

  • Fill Documentation Gaps
    Enrich missing information using external reference sources.

  • Generate Structured Artifacts
    Produce standardized documentation files ready for integration with metadata catalogs or governance platforms.

This process significantly reduces the manual effort involved in maintaining metadata documentation.

The Business Value: Faster Data Understanding

While the feature focuses on a specific use case, the implications extend to broader enterprise data challenges.

Organizations managing complex ERP systems often face a trade-off between operational work and documentation quality. As systems evolve, documentation tends to fall behind.

Automating metadata generation can help address this issue by making documentation creation part of the data workflow rather than a separate manual task.

This creates several benefits:

  1. Improved Data Discoverability
    Analysts can understand datasets more quickly.
  2. Reduced Dependency on Domain Experts
    Field definitions become easier to access and share.
  3. Faster Onboarding
    New employees can navigate enterprise data models more efficiently.
  4. Stronger Data Governance
    Organizations maintain more consistent and up-to-date metadata.

In short, the goal is not simply to generate documentation faster, but to make enterprise data ecosystems more understandable and accessible.

Summary

Maintaining metadata documentation in complex ERP systems is a persistent challenge. Field definitions are often scattered across multiple systems and require manual effort to maintain.

We explored how the Unit8 GPT Wizard can automate this process by orchestrating multiple knowledge sources and generating structured documentation automatically.

The results show how AI agents can reduce manual work while improving the accessibility and quality of enterprise metadata.

As organizations continue to scale their data platforms, solutions like this can help ensure that data remains not only available, but also understandable.


Explanations:

  1. WAERS: Data Element for Currency Key, used to define the currency for financial amounts, purchasing documents, or sales orders
  2. MATNR: Material Number, representing a unique alphanumeric code assigned to products, materials, or services
  3. NETWR: Net Value of a document item (such as a sales order or billing item) in the document currency, excluding tax and potential discounts
