Metadata: Enabling Data Sharing

How do you facilitate greater usage and sharing of data to unlock more business value?

Metadata is descriptive information about data, such as its source, location, owner, and field names. With advanced analytics, the definition of what constitutes data is greatly expanded. In addition to databases, your data includes archives of photos and video, diagnostic test results, sensor readings, log files, documents, spreadsheets, and more.

Given this expanded definition, data can be found almost anywhere, and data owners are distributed throughout your company. A metadata project will help you understand what data is available to be inventoried and shared with the rest of the organization. A logical first step is to take an inventory of datasets and collect standard information about them in an electronic catalog. In the example below, all of the fields are searchable and the record provides users with enough details to search, retrieve, and evaluate the dataset.
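To make the idea of a searchable inventory concrete, here is a minimal Python sketch. It assumes the catalog is a simple list of records holding the kinds of details mentioned earlier (source, location, owner, field names). The record values, the `CATALOG` variable, and the `search_catalog` function are hypothetical and only illustrate the approach; a real catalog would typically sit in a database or a dedicated catalog tool.

```python
from typing import Dict, List

# Hypothetical catalog records; every name and value is invented for illustration.
CATALOG: List[Dict[str, object]] = [
    {
        "name": "plant_sensor_readings",
        "source": "factory sensors",
        "location": "s3://example-bucket/sensors/",
        "owner": "Operations",
        "field_names": ["timestamp", "sensor_id", "temperature"],
    },
    {
        "name": "customer_support_logs",
        "source": "help desk application",
        "location": "/archive/logs/support/",
        "owner": "Customer Service",
        "field_names": ["ticket_id", "opened_at", "message"],
    },
]


def search_catalog(catalog, keyword):
    """Return records in which any field contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for record in catalog:
        for value in record.values():
            # Lists (such as field names) are joined into one searchable string.
            text = " ".join(value) if isinstance(value, list) else str(value)
            if needle in text.lower():
                matches.append(record)
                break  # one hit per record is enough
    return matches


for hit in search_catalog(CATALOG, "sensor"):
    print(hit["name"], "owned by", hit["owner"])
```

Because the search looks at every field, a user can find a dataset by its owner, its storage location, or one of its column names without knowing in advance which field holds the term.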

Objective

Your objective is to lead the development of a metadata management strategy to facilitate the sharing of datasets, create new opportunities for collaboration, and reduce redundancy in data collection. Typically, there are three stages in metadata content development: Data Catalog, Data Definitions, and User Annotations.

Collecting Metadata

In your data catalog, the metadata (descriptive details) will enable others to discover and view the data. These six types of questions can guide you on what to include in the catalog design.

Who?

  • Who created the data?

  • Who owns the data?

  • Who will maintain it?

  • Who is using the data?

What?

  • What is the purpose of the data?

  • What is the content of the data?

  • What is the security level?

When?

  • When was the data created?

  • When was it last updated?

  • Is there a date when the data becomes invalid?

Where?

  • Where is the data stored?

  • Where did the data come from?

Why?

  • Why was the data created?

How?

  • How is the data formatted?

  • How many databases store this data?

  • How can users gain access to the data?

Organizing Metadata

Here is an example of a metadata schema for a data catalog, which shows how the descriptive information can be organized.
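As one possible way to organize this information, the Python sketch below groups catalog fields under the six guiding questions (Who, What, When, Where, Why, How). The class name `CatalogEntry`, the field names, and the sample values are all assumptions made for illustration; your own schema may need different or additional fields.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass
class CatalogEntry:
    """One dataset record, grouped by the six guiding questions (hypothetical field names)."""
    # Who?
    created_by: str
    owner: str
    maintainer: str
    users: List[str]
    # What?
    purpose: str
    content: str
    security_level: str
    # When?
    created_on: date
    last_updated: date
    valid_until: Optional[date]  # None means no known expiry date
    # Where?
    storage_location: str
    origin: str
    # Why?
    reason_created: str
    # How?
    data_format: str
    database_count: int
    access_instructions: str

    def is_expired(self, today: date) -> bool:
        """True if the dataset has an expiry date that has already passed."""
        return self.valid_until is not None and today > self.valid_until


# Example record; all values are invented for illustration.
entry = CatalogEntry(
    created_by="Data Engineering",
    owner="Operations",
    maintainer="Platform Team",
    users=["Analytics", "Quality Assurance"],
    purpose="Monitor equipment health",
    content="Temperature readings per sensor",
    security_level="internal",
    created_on=date(2023, 1, 15),
    last_updated=date(2024, 6, 1),
    valid_until=date(2024, 12, 31),
    storage_location="s3://example-bucket/sensors/",
    origin="factory sensors",
    reason_created="Predictive maintenance project",
    data_format="CSV",
    database_count=1,
    access_instructions="Request access from the data owner",
)

print([f.name for f in fields(CatalogEntry)])
print(entry.is_expired(date(2025, 1, 1)))
```

Keeping one field per question makes it easy to check that a record answers everything a prospective user is likely to ask, and the `is_expired` helper shows how the "date when the data becomes invalid" can be used to flag stale datasets.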

Conclusion

Metadata collection and management can give you significant insights into the variety and types of datasets available within your organization. You can also save time and resources in the long run. Your teams benefit by sharing reliable datasets and avoiding duplication.

When your solution is in place, you’ll make it easier for people to design and manage new analytic models that generate actionable insights to solve business challenges.