Why Information Taxonomy Must Represent the Landscape of the Business

"Master Data, enterprise architecture and digital transformation begin with a conceptual view of the enterprise."

In the online world, a great deal of misunderstanding has developed regarding the term “taxonomy.”  Originally used to describe biological classifications, the term has also been used in the fields of library science, economics, education, technology, and information management.

Taxonomy is broadly understood as a way to classify the world into concepts and physical things.  The world is comprised of categories and hierarchies with all physical and biological systems being comprised of subsystems.  Similarly, information systems are comprised of hierarchies and systems built on subsystems. An enterprise taxonomy takes a holistic view of those hierarchies and provides reference for controlled vocabularies which are the official lists of terms for metadata fields – the attributes that describe the nature of content, knowledge, products and customers.

Taxonomies are essential to enterprise knowledge management. The user experience is dependent upon taxonomy. The taxonomy development process always leads to multiple taxonomies, each of which describes an aspect of the business. The combination of all of the taxonomies in the enterprise and the relationships among them comprise the enterprise ontology. When the ontology is combined with data, it becomes an enterprise knowledge graph. Enterprise ontology management requires collaboration across business units, functions, application owners, and a range of stakeholders.
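
As a rough illustration of the distinction above, the following Python sketch models two small taxonomies, the relationships that tie them together (a toy ontology), and instance data that turns the ontology into a simple knowledge graph. Every name in it (the `product` and `audience` taxonomies, the `is_a` and `intended_for` relations, the sample SKU) is hypothetical and chosen only for illustration; a real enterprise ontology would be far larger and would typically live in a dedicated tool.

```python
# Two small taxonomies, each a child -> parent map (None marks a root term).
taxonomies = {
    "product": {"Hardware": None, "Laptops": "Hardware", "Tablets": "Hardware"},
    "audience": {"Customers": None, "Enterprise": "Customers", "Consumer": "Customers"},
}

# Relationships that tie the taxonomies together: each relation names the
# taxonomy its objects must come from. Taxonomies plus relationships form
# a toy ontology.
relation_targets = {"is_a": "product", "intended_for": "audience"}

# Instance data expressed as (subject, relation, object) triples.
# Ontology plus data gives a minimal knowledge graph.
triples = [
    ("SKU-1001", "is_a", "Laptops"),
    ("SKU-1001", "intended_for", "Enterprise"),
]


def ancestors(taxonomy, term):
    """Return the chain of broader terms above `term`, nearest first."""
    chain = []
    parent = taxonomy[term]
    while parent is not None:
        chain.append(parent)
        parent = taxonomy[parent]
    return chain


def describe(subject):
    """Collect what the graph says about a subject, including broader terms."""
    facts = {}
    for s, relation, obj in triples:
        if s != subject:
            continue
        taxonomy = taxonomies[relation_targets[relation]]
        facts[relation] = [obj] + ancestors(taxonomy, obj)
    return facts


print(describe("SKU-1001"))
```

Because the hierarchy is stored once in the taxonomy, a query about the SKU can also return the broader terms ("Hardware", "Customers") without anyone tagging the item with them directly.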

An Evolving Definition

When taxonomy was first applied to the online world, agencies and developers considered the navigational structure to be a taxonomy – a method to organize the information resources of a web site or an intranet so that users could click through from one resource to another and easily understand the knowledge structure represented on the site or intranet.   This initial definition became an oversimplification as taxonomy began to be understood as more than navigation.

The classification definition of taxonomy (as opposed to the navigation definition) included data structures – the terminology used to tag content. A taxonomy is a form of controlled vocabulary that is used for asset tagging. Control is achieved through intentional change management using structured governance processes and a governance model that includes metrics for course corrections based on a specific business objective. The “term store” in Microsoft SharePoint became the container for storing and managing each controlled vocabulary in the organization.
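
A minimal sketch of what "controlled" means in practice: tags are accepted only when they come from the approved list for their metadata field. The field names and terms below are hypothetical, and the sketch does not represent the SharePoint term store API or any other product.

```python
# Hypothetical controlled vocabularies: metadata field -> approved terms.
CONTROLLED_VOCABULARIES = {
    "content_type": {"Policy", "Procedure", "Product Sheet"},
    "region": {"Americas", "EMEA", "APAC"},
}


def validate_tags(tags):
    """Return a list of problems; an empty list means every tag is approved."""
    problems = []
    for field, value in tags.items():
        approved = CONTROLLED_VOCABULARIES.get(field)
        if approved is None:
            problems.append(f"unknown field: {field}")
        elif value not in approved:
            problems.append(f"'{value}' is not an approved term for {field}")
    return problems


print(validate_tags({"content_type": "Policy", "region": "Europe"}))
```

In this sketch, adding "Europe" as a term would mean editing the vocabulary itself, which is the step a governance process is meant to review.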

Content management (including content migration and content classification), digital asset management, and enterprise knowledge management systems use taxonomy to correctly classify content for enterprise search and retrieval. Best practices for content strategy include provisions for taxonomy creation, further taxonomy development and refinement (with iterative testing), taxonomy management, and taxonomy governance.

Classification versus Navigation

Neither of these views (navigational structures at one end and classification terminology at the other) considers the processes and systems that lie between the two extremes. These processes and systems have names, but they are not navigation and are not terms applied to content, so they do not fit into the classification category either. This may seem trivial in concept but is nuanced and complex in practice. Organizations continue to have challenges around finding and using information, not because the technologies are inadequate or because people don’t know how to make information usable, but because the landscape of information and the ecosystem of technologies is a living, changing, evolving thing, so static solutions are not effective. Moreover, different people call the same systems, processes, terminology and data structures by different names, and sometimes the same name is used to describe different things. That inconsistency makes it even more difficult to make sense of the changing ecosystem.

As the information environment changes, systems need to be updated, switched out, reconfigured, and integrated on an ongoing basis.  Change is continual, and the organization is challenged with upkeep and understanding of an increasingly complex array of applications, tools, structures, processes and mechanisms.

One way to deal with this complexity is to begin with a foundation of consistency.  In some programs this is called Master Data, in others, Enterprise Architecture, in still others, data standards, data quality, data curation, metadata management, information architecture, and our favorite, "taxonomy initiatives."  In any case, the goal is to develop consistency in the way that information of all kinds is classified. At the same time, the core architecture needs to be flexible and extensible. Consistency is maintained through intentional, structured change management processes.

Taxonomy and Business Concepts

Why call this taxonomy? Perhaps because our definition begins at the conceptual level. Taxonomy begins with a view of the enterprise from the perspective of business users and business objectives. It defines the things that are most important to the business as it relates to people, products, services, customers, processes and the value-creating mechanisms at the heart of the enterprise.

Taxonomy does not start with data or data structures. 

It does not start with systems or applications. 

Taxonomy development starts with concepts, by asking the right questions about how the company functions, with the ultimate goal of improving operational efficiency:

  • What content and information do you interact with on a day-to-day basis?  
  • What processes do you engage with, applications you interact with, and people you speak to, both internally and externally?  
  • How do those people, processes and technologies interact with each other?

The result is a map of concepts and how those concepts are manipulated, transformed, interacted with and organized to produce the value of the enterprise.  It encompasses the intellectual capital of the organization and the value networks that consist largely of information flows. 

The Domain Model

The mapping process output is the “domain model” – the master diagram of everything that defines the enterprise.  Sometimes this is referred to as an ontology, but regardless of what it is called, the domain model is a way of defining the fundamental building blocks of knowledge and process in an organization.  Those building blocks represent the things that are then captured and manipulated through thousands of processes and interactions across hundreds of thousands to hundreds of millions of transactions with the marketplace. 

The conceptual building blocks get translated into design elements – including the data structures, application designs, search systems (and search results format), and information processes that comprise the business.  By starting at the highest conceptual level, a common understanding of terminology and concepts can be woven into every process and every system.  In contrast, starting at the level of data structures misses important conceptual elements and misses how those conceptual elements come together. 

Digital Transformation Requires Common Understanding

Although no organization can start with a completely clean sheet in looking at its domain model, all digital transformations should begin with this point of view.  As processes are transformed to serve customers in new ways and as more products and services are virtualized and digitized, the foundational data structures need to be consistent, data processes need to be governed, and organizations need to be in agreement about the meaning of terminology and have common taxonomies that define all aspects of the business.  Digital transformations are both data transformations and process transformations.  Those transformations need to start with common definitions and understanding at all levels of the organization. 

A classic error made when organizations embark on enterprise taxonomy programs is to try to get agreement on a single set of navigational structures.  That will never work.  There is no single way to look at all information. 

Multiple navigational structures can be derived, however, from consistent classification structures.  Consistent organizing principles, definitions, naming conventions, and data standards can be applied to various types of systems, various levels of granularity, and various contexts. 
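
To make this concrete, here is a small Python sketch in which one consistently tagged set of content yields two different navigation trees, simply by changing the order of the facets. The content items, the `product` and `audience` fields, and the `build_navigation` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical content items tagged with consistent classification fields.
items = [
    {"title": "Laptop Setup Guide", "product": "Laptops", "audience": "Consumer"},
    {"title": "Laptop Fleet Policy", "product": "Laptops", "audience": "Enterprise"},
    {"title": "Tablet Quick Start", "product": "Tablets", "audience": "Consumer"},
]


def build_navigation(items, facet_order):
    """Group item titles into a nested menu following the given facet order."""
    if not facet_order:
        return [item["title"] for item in items]
    first, rest = facet_order[0], facet_order[1:]
    groups = defaultdict(list)
    for item in items:
        groups[item[first]].append(item)
    return {value: build_navigation(group, rest) for value, group in groups.items()}


# Same classification, two navigational views.
by_product = build_navigation(items, ["product", "audience"])
by_audience = build_navigation(items, ["audience", "product"])

print(by_product)
print(by_audience)
```

Neither tree is stored anywhere; both are derived on demand from the same tags, so no single navigation has to be agreed upon as the official one.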

Enterprise Architecture and Enterprise Level Taxonomy Design

Enterprise architecture, enterprise data standards, enterprise data quality, and enterprise governance begin with enterprise taxonomies.  Unfortunately, many taxonomy consultants, information architects, digital agencies, and system integrators do not have a global view of knowledge and information that begins with the conceptual view of structured and unstructured data.  It is easier to dive into the weeds of a system deployment, functional specifications, or application development.  Many organizations do not have the patience or vision to begin with this abstract perspective.

The result of such a vision does not always have a short-term impact on the bottom line, yet tangible, immediate ROI is the driver for most project decisions and resource allocations.

The transformations occurring in business and technology are unprecedented. We are at an inflection point in human history with respect to machine learning, AI-powered digital assistants, and high-functionality chatbots. Seeing the possibilities requires understanding the organization as a series of information flows that need to be optimized. Products and services are increasingly software- and information-based. Customer interactions are increasingly digital.

Optimized Customer Experience Requires Optimized Information Flows

The goal of digital transformation is to more effectively serve customers.  However, when systems are stitched together through piecemeal integrations with one-off architectures and siloed interests, the information flows that comprise the customer experience cannot be optimized, and therefore neither can the customer experience. Every business process in support of the customer requires a reference set of “official” terms – the controlled vocabularies and reference data provided by taxonomy and ontology. Master data management depends on consistency across applications provided by taxonomy and ontology.
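
One simple way to picture a reference set of "official" terms is as a lookup that maps each application's local labels to a single preferred term. The sketch below assumes made-up labels from a hypothetical CRM and ERP; it is an illustration of the idea, not a master data management implementation.

```python
# Hypothetical mapping from local, application-specific labels to official terms.
OFFICIAL_TERMS = {
    "client": "Customer",
    "account": "Customer",
    "customer": "Customer",
    "sku": "Product",
    "item": "Product",
    "product": "Product",
}


def to_official_term(local_label):
    """Map a local label to its official term, or None if it is unmapped."""
    return OFFICIAL_TERMS.get(local_label.strip().lower())


crm_fields = ["Client", "SKU"]
erp_fields = ["Account", "Item", "Vendor"]

for label in crm_fields + erp_fields:
    print(label, "->", to_official_term(label))
```

Labels that come back unmapped, such as "Vendor" here, are candidates for review: either a new official term is needed or the label is a synonym that has not yet been recorded.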

The era of big data, analytics, cognitive computing, semantic technology and machine learning requires a holistic approach to integrating structured and unstructured data, content, information, and knowledge and allowing for endless combinations of digital services and systems.  Evolution and adaptation of the enterprise is based on the digital DNA of the business. That DNA is built on a foundation of consistent, curated, defined and – at some level – organized information.  Taxonomy is essential to digital transformations and understanding the landscape of the business is foundational to taxonomy development. This begins with understanding the business and the concepts that are important to the business and building libraries of use cases to test and refine the taxonomy and information architecture. In this way, information flows throughout the enterprise can be streamlined – speeding up the “information metabolism” of the enterprise.

The new master data starts with taxonomy.

Meet the Author
Seth Earley

Seth Earley is the Founder & CEO of Earley Information Science and the author of the award-winning book The AI-Powered Enterprise: Harness the Power of Ontologies to Make Your Business Smarter, Faster, and More Profitable. He is an expert with 20+ years of experience in Knowledge Strategy, Data and Information Architecture, Search-based Applications and Information Findability solutions. He has worked with a diverse roster of Fortune 1000 companies, helping them to achieve higher levels of operating performance.