
Why Taxonomy is Critical to Master Data Management (MDM)

Organizations are paying increasing attention to Master Data Management (MDM). MDM comprises a set of processes and tools that consistently define and manage the non-transactional data entities of an organization, such as product information or customer data.

According to a study by Aberdeen, companies using MDM are more than twice as likely to be satisfied with data quality and speed of delivery, compared to those not using it. 

MDM promises not just greater control over consistent reference data, but also the ability to manage the relations between data entities in order to generate more effective business knowledge. From this perspective, MDM requires a shared understanding of, and agreement about, the meaning of terminology. This is where taxonomy plays a natural role. Taxonomy is about "semantic architecture" - it is about naming things and making decisions about how to map different concepts and terms to a consistent structure.


MDM challenges and the argument for data taxonomy

Ambiguity.  The same term can have different meanings.  Taxonomy provides a hierarchy that helps remove ambiguity. It includes mechanisms for understanding context and making meaning precise.    

Consistency.  It can be difficult to get complete agreement on what terms to use.  Also, people will use terms inconsistently if given a choice. Sometimes, in legacy situations, different terms were used in the past and for various reasons the data can't be re-tagged to provide consistent metadata.  A thesaurus can map terms together to account for these inconsistencies. 
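As a rough illustration of how a thesaurus can reconcile inconsistent terms, the sketch below maps variant terms to a single preferred term. The terms, the `THESAURUS` table, and the `preferred_term` helper are hypothetical examples, not part of any particular MDM product.

```python
# Hypothetical thesaurus: each preferred term lists the variant terms mapped to it.
THESAURUS = {
    "Mechanical Pencil": ["mech pencil", "propelling pencil", "automatic pencil"],
    "Customer": ["client", "account holder"],
}


def build_lookup(thesaurus):
    """Flatten the thesaurus into a case-insensitive variant -> preferred lookup."""
    lookup = {}
    for preferred, variants in thesaurus.items():
        lookup[preferred.casefold()] = preferred
        for variant in variants:
            lookup[variant.casefold()] = preferred
    return lookup


LOOKUP = build_lookup(THESAURUS)


def preferred_term(term):
    """Return the preferred term for a variant, or None if the term is unknown."""
    return LOOKUP.get(term.strip().casefold())
```

Legacy data tagged with an older term can then be resolved at query time, for example `preferred_term("Client")`, without re-tagging the underlying records.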

Connections. Taxonomies can also represent related concepts (technically also part of a thesaurus) that can be used to connect processes, business logic, or dynamic/related content to support specific tasks.
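One simple way to picture "related concept" links is as a symmetric relation between terms. The sketch below assumes a small in-memory structure; the `RelatedTerms` class and the sample terms are illustrative only.

```python
from collections import defaultdict


class RelatedTerms:
    """Stores symmetric 'related term' links between taxonomy concepts."""

    def __init__(self):
        self._related = defaultdict(set)

    def relate(self, term_a, term_b):
        # Record the link in both directions so it can be followed from either term.
        self._related[term_a].add(term_b)
        self._related[term_b].add(term_a)

    def related_to(self, term):
        """Return the related terms in alphabetical order (empty list if none)."""
        return sorted(self._related.get(term, set()))


links = RelatedTerms()
links.relate("Mechanical Pencil", "Lead Refill")
links.relate("Mechanical Pencil", "Eraser")
```

An application could use `links.related_to("Mechanical Pencil")` to surface accessories or supporting content alongside a product.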

An MDM strategy defines the process for cleansing the data, harmonizing the attributes, and ensuring that all required information is present. 

But master data management programs also need to leverage taxonomy, and taxonomy should make use of MDM initiatives.

  • Although taxonomy is typically applied to unstructured content, it is increasingly supporting structured and transactional content - a data taxonomy.
  • Similarly, master data plays an essential role in making unstructured information consistent, findable, and valuable. 

The following provides a brief example of key concepts and the role of taxonomy.  Note that the transactional data is on the left, the non-transactional persistent reference data on the right. 


Let's look at the product master. 

We have two different manufacturers who both offer mechanical pencils. In our product master, they are called the same thing. However, the original product manufacturers do not necessarily use the same terms to describe their products. The original bills of lading might have contained the following:

There are a couple of observations:

  1. The description uses abbreviations that are not user friendly. 
  2. The attributes are not consistent. 

One manufacturer classifies its product as Stationary and the other calls it Home Office. Further, one abbreviates the Color attribute as Bl and the other uses Blk. With these inconsistencies, it is very difficult to deliver an excellent user experience wherever this data needs to be displayed.
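The inconsistencies described above could be reconciled with simple lookup tables. The sketch below is illustrative only: the field names, the canonical category "Office Supplies", and the reading of both Bl and Blk as Black are assumptions, not values taken from the original bills of lading.

```python
# Assumed mappings from supplier-specific values to master values.
CATEGORY_MAP = {
    "stationary": "Office Supplies",   # supplier A's classification
    "home office": "Office Supplies",  # supplier B's classification
}
COLOR_MAP = {
    "bl": "Black",   # assumption: Bl abbreviates Black
    "blk": "Black",  # assumption: Blk abbreviates Black
}


def normalize(record):
    """Return a copy of a supplier record with category and color harmonized.

    Values that have no mapping are passed through unchanged.
    """
    category = record["category"].strip()
    color = record["color"].strip()
    return {
        **record,
        "category": CATEGORY_MAP.get(category.casefold(), category),
        "color": COLOR_MAP.get(color.casefold(), color),
    }


supplier_a = {"name": "Mechanical Pencil", "category": "Stationary", "color": "Bl"}
supplier_b = {"name": "Mechanical Pencil", "category": "Home Office", "color": "Blk"}
```

After normalization, both supplier records carry the same category and color, so they can be displayed and filtered consistently.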

Bringing it all together with taxonomy and master data management 

Master Data Management fixes these inconsistencies by improving data quality. Each supplier has a way of organizing and describing its products that may or may not be aligned and consistent with the others. However, the retailer needs to drive a consistent user interface and experience to achieve the best business outcomes. To do so, an MDM program provides:

  • A centralized repository where "the source of truth" exists 

  • Governance processes for fixing inconsistencies or providing feedback to suppliers 

  • Rules for automating remediation of predictable inconsistencies 

  • Tools for cleansing and normalizing the data (running scripts and converting the data) 
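To make the interplay between automated remediation rules and governance review more concrete, here is a minimal sketch. It assumes a hypothetical record layout, a small controlled vocabulary for color, and an in-memory review queue; a real MDM platform would supply its own mechanisms for each of these.

```python
REQUIRED_FIELDS = ("description", "category", "color")

# Hypothetical rules that fix predictable variants automatically.
REMEDIATION_RULES = {
    "color": {"bl": "Black", "blk": "Black"},
}

# Hypothetical controlled vocabulary; fields not listed here accept any value.
ALLOWED_VALUES = {
    "color": {"Black", "Blue", "Red"},
}


def check_record(record):
    """Apply remediation rules and return (cleaned_record, list_of_issues)."""
    cleaned, issues = {}, []
    for field in REQUIRED_FIELDS:
        raw = str(record.get(field) or "").strip()
        if not raw:
            issues.append(f"missing {field}")
            continue
        value = REMEDIATION_RULES.get(field, {}).get(raw.casefold(), raw)
        allowed = ALLOWED_VALUES.get(field)
        if allowed is not None and value not in allowed:
            issues.append(f"unrecognized {field}: {raw}")
        cleaned[field] = value
    return cleaned, issues


def load_batch(records):
    """Send clean records to the master list and the rest to a review queue."""
    master, review_queue = [], []
    for record in records:
        cleaned, issues = check_record(record)
        if issues:
            review_queue.append((record, issues))
        else:
            master.append(cleaned)
    return master, review_queue
```

Records that the rules can fix go straight into the master list, while anything incomplete or unrecognized is held for a data steward or sent back to the supplier as feedback.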

The role of a data taxonomy is even more important in multi-domain MDM, which is the direction in which the industry is heading. 

According to Gartner, 58% of the reference customers in its 2018 Magic Quadrant Report on Master Data Management Solutions are facing the requirement for multi-domain MDM.  

Whereas in the past, most MDM systems were focused on a single area such as product data or customer data, more organizations now want to bring data together from multiple domains, to allow for a broader range of business use cases and greater use of analytics. 

In order to conduct analytics across domains and develop effective governance programs, organizations need to set up consistent taxonomies and standard metadata, especially on their critical data. The data models will need to reflect a consistent taxonomy. Ultimately, the relationships among different taxonomies should be captured and documented through an ontology, but having an MDM with appropriate taxonomies is a foundational step to take.

Nothing about this is easy (or sexy) but it needs to be done if your initiatives are going to make headway.  Our team of information science experts can help.  Give us a shout if you'd like to talk.


Seth Earley
Seth Earley is the Founder & CEO of Earley Information Science and the author of the award-winning book The AI-Powered Enterprise: Harness the Power of Ontologies to Make Your Business Smarter, Faster, and More Profitable. He is an expert with 20+ years of experience in Knowledge Strategy, Data and Information Architecture, Search-based Applications, and Information Findability solutions. He has worked with a diverse roster of Fortune 1000 companies, helping them achieve higher levels of operating performance.
