
Taxonomy in Information Archaeology

Clink, clink went two halves of a Japanese rifle shell case on my researcher's desk at the National Archives and Records Administration facility in College Park, Maryland. They fell from the envelope attached to a memorandum in a folder I took from the large archival documents box belonging to RG-319, Office of Assistant Secretary, Army Staff Operations. The memorandum discussed problems associated with placing Imperial Japanese Army rifles under U.S. Army control back into service as part of the mobilization effort in Japan in response to rising tensions on the Korean Peninsula between 1948 and 1950. It was a good plan except that: 1) the parts of the Japanese rifles were hand-crafted by each soldier during final assembly at the time of issue and were unique, so the guns lacked interchangeable replacement parts; 2) the shells were designed for the gun bore; and 3) the U.S. military had no practical means to mass-produce shells for these archaic weapons.

This story illustrates a number of important points. A theory requires supporting knowledge to establish its actual goodness. Knowledge is a work product that moves through an organization. The repository for a work product artifact can be in an unusual place. Navigating to that place requires both an external structure and a diligent, informed seeker. Once accessed, retrieval results may include both target and unanticipated, serendipitous materials.

The business of taxonomy or metadata projects is the successful conversion of data-stuffs into reusable intellectual assets that serve organizational strategic objectives through knowledge-oriented tactics and tools. In converting data into assets that can be developed or repurposed, an organization creates wealth that enables workers to find goodness in their ideas, decisions, and key relationships, expanding the effectiveness of their endeavors both directly and through diffusion.

Taxonomies are semantic models for governing, interpreting, and maintaining data, and they enable solutions for search and resource navigation. As such, taxonomies drive the quality (accuracy, consistency, and saliency) of serendipitous knowledge search and retrieval. They also drive efficiency, making the task of solving problems that demand high-value information easier than with baseline solutions and enabling measurement in terms of real results.

Information seekers know their needs and wants. Product branding initiatives, professional and shop jargon, and the location of information repositories may make the experience of filling those needs seem similar to dealing with old Japanese rifles. However, taxonomy-driven navigation and search provide standardized, all-purpose, reusable semantic bullets: users get meaningful results.

Effective taxonomy work requires learning the vocabulary common to an endeavor and the conceptual relationships among concept terms. This requires more than term capture; it requires mining dialogues and documents for the informed points of view of creators and seekers and for the organizing structures in their speech. Relationally structuring the language of an endeavor enables smart use of terms and a pragmatic level of search sensitivity to document nuance and to variations among information seekers.

Achieving a natural feeling in a highly structured vocabulary requires capturing the actual language of an endeavor and creating bridges across dialects, pidgins, and creoles. Language collection requires both active listening and the use of techniques such as site visits, interviews, and source/text analysis. Giving semantic form to language data requires a level of tooling. Tooling may involve standard approaches to data solicitation, such as card sorting and mental modeling activities, or may require thoughtful analysis of user-task behavior or logical concept-relationships. The goal of data collection and modeling is to discover the semantic categories the words represent, to clarify their logical relationships, and to organize the universe of discourse structurally into an integrated taxonomy or ontology.
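
The bridging idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary, the terms in it, and the function names are hypothetical and do not come from any real project. The sketch maps the variant labels that seekers actually use onto one preferred term.

```python
# Hypothetical controlled vocabulary: each preferred term lists the
# variant labels (jargon, shop talk, regional usage) that map to it.
VOCABULARY = {
    "rifle cartridge": ["shell", "round", "rifle shell", "ammo"],
    "memorandum": ["memo", "office note"],
}


def build_variant_index(vocabulary):
    """Return a lookup from every normalized label to its preferred term."""
    index = {}
    for preferred, variants in vocabulary.items():
        for label in [preferred, *variants]:
            index[label.strip().lower()] = preferred
    return index


def to_preferred(term, index):
    """Map a seeker's term to the preferred term, or None if it is unknown."""
    return index.get(term.strip().lower())


INDEX = build_variant_index(VOCABULARY)
```

Terms that come back as `None` are a useful signal in their own right: they are candidates for the next round of interviews, site visits, or text analysis.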

One tendency is to think of linguistic taxonomies as being biological taxonomies. However, this is not the case. Linguistic taxonomies are schemes for the flexible representation of concepts. Respect for the integrity and independence of major categories is preserved by creating a small number of "facets," such as roles, expertise, or product category distinctions. The facet structure provides a crystalline structure for the universe of discourse. Internally, each facet is hierarchical, with contextual membership criteria stressing similarities and functions. Semantic facets are akin to cultural totems, organizing social and cultural relationships about overarching themes and representing deep structures in perspectives and patterns of thought. Investigating a variety of environments and perspectives is essential in taxonomy work for this exact reason. Semantic facets are not Linnaean classes. A Linnaean approach focuses on internal characteristics (such as anatomy, clade, or morphology), while the linguistic approach is ecological in its focus: words mean different things in different contexts. Linguistically, the Linnaean approach yields etymological unabridged dictionaries.
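
As a rough sketch of the facet idea described above, the following Python models a few independent facets, each hierarchical inside, and filters documents by a term or any narrower term beneath it. The facet names, terms, documents, and helper functions are all hypothetical and chosen only for illustration; a real taxonomy tool would likely be structured differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One concept inside a facet; children form the facet's hierarchy."""
    label: str
    children: list = field(default_factory=list)  # list of Node

    def descendants(self):
        """Return this node's label plus every label beneath it."""
        labels = {self.label}
        for child in self.children:
            labels |= child.descendants()
        return labels


# Hypothetical facets: independent top-level categories.
FACETS = {
    "role": Node("role", [Node("researcher"), Node("archivist")]),
    "product": Node("product", [
        Node("weapon", [Node("rifle"), Node("cartridge")]),
    ]),
}

# Hypothetical documents, each tagged with one term per facet.
DOCUMENTS = [
    {"title": "Memo on rifle reuse", "role": "researcher", "product": "rifle"},
    {"title": "Cartridge supply note", "role": "archivist", "product": "cartridge"},
]


def find_node(node, label):
    """Depth-first search for the node carrying `label`, or None."""
    if node.label == label:
        return node
    for child in node.children:
        found = find_node(child, label)
        if found is not None:
            return found
    return None


def filter_by_facet(documents, facet, label):
    """Keep documents whose tag in `facet` is `label` or a narrower term."""
    node = find_node(FACETS[facet], label)
    if node is None:
        return []
    allowed = node.descendants()
    return [doc for doc in documents if doc.get(facet) in allowed]
```

Because the facets are independent, filters can be combined by applying `filter_by_facet` once per facet to the previous result, for example first on `"product"` and then on `"role"`.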

Environmental testing is required to confirm the validity of a taxonomy's facet system. If the facet system presents a complete, high-level framework for an endeavor, it has natural validity. Internal consistency within facets is a goal of first importance, but may not be essential. Mathematically, the pairing of completeness and consistency is a fundamental dilemma; pragmatically, speakers naturally speak. Linguistic taxonomies naturally enable speaker effectiveness.

The apparent budget utility of reusing archaic, idiosyncratic rifles to try to defend Japan against aggression from North Korea was unnatural to the U.S. Military Establishment (its then official name), and presented a complex array of challenges. In the end, the documents were buried in hundreds of boxes containing thousands of pages with mixed levels of preservation, some organized by the many individual offices' filing systems, others by a military decimal number system developed many years before, awaiting a persistent researcher. The decision makers selected another, very different solution, but that is another story.

Earley Information Science Team
