Making the Business Case for Taxonomy

Information Overload and Other Challenges

The most important issues to business today include information overload, integration challenges, improved efficiency, adaptability to competitive challenges, and the faster pace of business change. All of these require improved ways of connecting people with each other and with the information and systems on which they depend to do their jobs.

A taxonomy is an organizing principle. It is a foundation on which to base any kind of system. Whatever kind of project you are involved in, it will benefit from clearly defined, concise language and terminology. A taxonomy helps fine-tune search tools, creates a common language for sharing concepts, and allows efficient organization of documents and content across information sources. Whether the tool is structured, such as a CRM system, or less structured, such as a content management system that organizes information for web sites or intranets, all technologies that deal with information require a basis in taxonomy. This is even more important when various systems must interact.

Take, for example, the issue of language: I call the person I do business with a Customer. Someone else calls them a Client. When we need to exchange, combine, or analyze data, which entity are we talking about? What do we call the document that outlines what we are providing: a statement of work, a proposal, an SOW, or something else?
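One way to picture the Customer versus Client problem is as a lookup from variant labels to a single preferred term. The sketch below is only an illustration: the vocabulary, the record fields, and the function name are hypothetical and do not come from any particular system.

```python
# Hypothetical controlled vocabulary: each variant label maps to one preferred term.
PREFERRED_TERMS = {
    "customer": "Customer",
    "client": "Customer",
    "statement of work": "Statement of Work",
    "sow": "Statement of Work",
}


def normalize(label: str) -> str:
    """Return the preferred term for a label, or the label unchanged if it is unknown."""
    key = " ".join(label.lower().split())  # ignore case and extra whitespace
    return PREFERRED_TERMS.get(key, label)


# Records from two hypothetical systems that name the same kind of entity differently.
crm_record = {"entity_type": "Client", "name": "Acme Corp"}
billing_record = {"entity_type": "customer", "name": "Acme Corp"}

# Once both labels are normalized, the two systems agree on what they are describing.
same_entity_type = normalize(crm_record["entity_type"]) == normalize(
    billing_record["entity_type"]
)
```

Unknown labels pass through unchanged, so gaps in the vocabulary stay visible instead of being silently guessed at.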

When knowledge workers (and face it, everyone is a knowledge worker to some degree...) search for information as an input for their tasks and then create outputs, are they using language that is unambiguous? Can their work be easily found and re-purposed? Are they sure they are not recreating information that already exists?

These are important questions, but there are larger issues that can have an even greater impact on the organization and its senior executives and officers. Are all of these challenges of business going to be magically solved with a taxonomy? Of course not, but if the underlying structure is not in place, then essential tools, technologies and processes will not function together. Connecting system A to system B makes little sense when a common language has not been established to have information make sense in the new context.

IT Problem or Business Problem?

A core challenge is that there is a disconnect between IT functions and the line-of-business people whose job it is to keep the enterprise delivering products and services to customers. Each group views the problem through its own lens. IT considers this a problem of systems: if they do their piece and build the tools, then they have done their job. Accounting looks at the performance of audit-type functions, which, if done the usual way, are costly and cumbersome but, again, get the job done. Legal addresses regulatory, compliance and liability issues from a perspective of corporate policies and contracts. And the business wants to focus on getting products and services to market. However, no one is looking at how all of these pieces need to fit together: the big ‘semantic picture’, so to speak.

Consider what happens if each constituency does its job, but accounting people speak British English, IT speaks a Cajun dialect, legal speaks an inner-city slang, and business people speak the language of scientific researchers. For all practical purposes, the languages they use in communicating with their professional peers are as different as these corners of the English language. In order for documents and pieces of content to be reusable and understandable in all of these different contexts and for these different audiences, we need to develop a Rosetta stone of the enterprise. That is an enterprise taxonomy.

Common Language Across Cultures

Many people think that getting people to agree on common terms and meanings is an insurmountable task. In some ways it is. Language is too ambiguous and variable, and needs are too diverse, to develop a common denominator of communication for all circumstances. Instead, we create a structure for defining and applying terms and for managing change. The alternative is uncontrolled and chaotic. But too much control is stifling and impractical.

Determining where to control and centralize and where to allow variability is part of the process of developing and implementing an enterprise taxonomy.

The Problem with Search

There is a prevalent opinion that a Google-like search interface is the answer to these challenges. There are many reasons why this is a fallacy. One is that in an enterprise, many of the clues that Google uses to deliver results are missing. Google uses links between sites to determine how to rank results. If lots of other sites point to a document, then that document is deemed to be more valuable. In the corporate intranet, there is no equivalent way of ranking results. (There may be some linkages in larger sites, but this is not necessarily enough to significantly affect results.)

Another fundamental flaw with pure search solutions is that meaning, value, and applicability are context dependent. The usefulness of a piece of content is in the eye of the beholder. A document is useful to me if I can use it to solve a problem. This is dependent upon my role, my task, and my background. A search engine cannot determine these factors and present results based on my needs. However, if I as a system designer perform some process analysis in order to understand a user’s tasks and how they go about solving their problem, I can present information in anticipation of their needs. The role of a taxonomy is to define the labels that correspond to user tasks, experience, needs, and context and help refine their search or guide their navigation.

Leveraging Taxonomies

Part of the analysis phase in taxonomy development is to understand what users are trying to accomplish, and then say “here is a set of documents that you should look at when you are performing these tasks”. For example, as a salesperson I may be preparing a proposal for a customer. If I search a large repository for documents that I can re-purpose, and I search on the word “proposal”, I will likely pull up a lot of documents that contain the term proposal but are not example proposals that I can reuse.

On the other hand, if I define the business development function as including proposal creation, I can infer that sample proposals will be a useful piece of content to have access to. I can therefore define a tag called “sample proposal”, or some other label that we agree will designate documents that can be reused. I may want to go further and define the specific industry that I am writing for, the product or service offering, the size of the deal and so on. By carefully defining labels for the documents, I can search based on these labels or navigate to a place where these documents reside. These results will be targeted precisely to my task at hand and will save me from creating a proposal from scratch or from endless searching for a relevant document.

So in the first case I search on “proposal” and pull up perhaps hundreds of documents containing the term proposal. In the second case I search or navigate to a place that contains a smaller subset of documents that more closely meet my criteria.
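The difference between the two cases can be sketched in a few lines. This is a minimal illustration, assuming a tiny in-memory repository; the documents, the tag names (`content_type`, `industry`), and both search functions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str
    tags: dict = field(default_factory=dict)  # hypothetical taxonomy labels


DOCUMENTS = [
    Document("Acme outsourcing proposal", "Scope and pricing for a retail customer.",
             {"content_type": "sample proposal", "industry": "retail"}),
    Document("Meeting notes", "We discussed the proposal deadline.",
             {"content_type": "meeting notes"}),
    Document("Proposal writing policy", "Every proposal must be reviewed by legal.",
             {"content_type": "policy"}),
    Document("Bank services offer", "Scope and pricing for a banking customer.",
             {"content_type": "sample proposal", "industry": "banking"}),
]


def keyword_search(docs, term):
    """First case: return every document whose title or text mentions the term."""
    term = term.lower()
    return [d.title for d in docs if term in d.title.lower() or term in d.text.lower()]


def tag_search(docs, **required):
    """Second case: return only documents whose tags match every requested label."""
    return [d.title for d in docs
            if all(d.tags.get(name) == value for name, value in required.items())]


mentions = keyword_search(DOCUMENTS, "proposal")
reusable = tag_search(DOCUMENTS, content_type="sample proposal", industry="banking")
```

In this toy data the keyword search returns the meeting notes and the policy along with one real proposal, while the tag search returns the banking proposal even though the word “proposal” never appears in it.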

Improving Recall

Imagine that in one system I refer to proposals for customer service outsourcing as “service outsourcing” and in another repository, the documents have been marked “business process outsourcing”. If I search on one term, I really also want the documents with the other term. These terms are synonymous. As I derive our taxonomy, I will make note of terms that may be used interchangeably and apply a “synonym ring” to the search mechanism, enabling search on one term to return documents containing the other terms.
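A synonym ring can be sketched as a set of interchangeable terms that is used to expand the query before matching. The example below is a simplified illustration, not the mechanism of any particular search engine; the repository contents and function names are assumptions made for the sketch.

```python
# Each ring is a set of terms treated as interchangeable at search time.
SYNONYM_RINGS = [
    {"service outsourcing", "business process outsourcing"},
]

# Hypothetical repository: (document title, label applied in its source system).
REPOSITORY = [
    ("Proposal A", "service outsourcing"),
    ("Proposal B", "business process outsourcing"),
    ("Proposal C", "facilities management"),
]


def expand_query(term):
    """Return the term plus every term that shares a synonym ring with it."""
    normalized = term.lower().strip()
    expanded = {normalized}
    for ring in SYNONYM_RINGS:
        if normalized in ring:
            expanded |= ring
    return expanded


def search(documents, term):
    """Return titles of documents labeled with the term or any of its synonyms."""
    terms = expand_query(term)
    return [title for title, label in documents if label.lower() in terms]


results = search(REPOSITORY, "service outsourcing")
```

Searching on either term in the ring returns both Proposal A and Proposal B, while a term outside any ring matches only itself.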


As we just observed, search is one area where taxonomies can be leveraged. What about navigation? Some people equate taxonomy with navigation. Taxonomy is not necessarily the same as navigation but certainly can inform navigation. What does this mean? It means that by understanding the underlying structure of information and how people access that information, we can propose a structure by which users can click through the content. Navigational structures can directly reflect the taxonomy, in some cases. For example, if I organize information about the enterprise according to departments or functional areas, with geographies comprising navigational nodes, this could be exactly the same as the taxonomy. In other cases, users may navigate according to a task or business process that could start out with a geography (say North America) and then shift to a task, such as customer service. Customer Service does not reside under North America in the taxonomy.
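The North America example can be sketched by keeping geography and function as separate branches of the taxonomy and letting a navigation path combine them. This is a minimal sketch under that assumption; the facet names, the content items, and the `navigate` function are hypothetical.

```python
# Two independent branches of a hypothetical taxonomy. "Customer Service" lives
# under the function branch, not under "North America".
TAXONOMY = {
    "geography": {"North America", "Europe"},
    "function": {"Customer Service", "Sales"},
}

CONTENT = [
    {"title": "NA support playbook", "geography": "North America", "function": "Customer Service"},
    {"title": "EU support playbook", "geography": "Europe", "function": "Customer Service"},
    {"title": "NA sales forecast", "geography": "North America", "function": "Sales"},
]


def navigate(content, path):
    """Narrow the content one click at a time; each step is a (branch, term) pair."""
    results = content
    for branch, term in path:
        if term not in TAXONOMY.get(branch, set()):
            raise ValueError(f"{term!r} is not a term in branch {branch!r}")
        results = [item for item in results if item.get(branch) == term]
    return [item["title"] for item in results]


# The user starts with a geography and then shifts to a task.
clicks = [("geography", "North America"), ("function", "Customer Service")]
titles = navigate(CONTENT, clicks)
```

The click path crosses two branches, so the navigation is informed by the taxonomy without mirroring its hierarchy.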

So where do we go from here?

Taxonomy development and maintenance is an ongoing process. There are many details around how taxonomies are derived and how they are applied, but the first point to make in your organization is that this is an important process that requires time, attention and resources. In the next few years, this function will be embedded in business and IT processes. The evolution of business and technology has led to this point, where it is essential that we agree on terminology in order to integrate, collaborate and communicate most effectively. Not addressing this issue will lead to more problems of information overload, difficulties in integrating systems and avoidable inefficiencies in the organization. In the short term, the goal should be to educate your organization on these issues. In the medium term, begin the process of formalizing the sharing and application of consistent language across systems and processes. In the long term, the goal is to develop a mature process for ongoing maintenance and governance of enterprise taxonomies. It is important to start the process now, rather than wait for search, navigation, and access of information to become a beast.

Earley Information Science Team