What Executives Need to Know about Retrieval Augmented Generation

This article originally appeared in KMWorld.

Organizations are taking a cautious approach to Generative AI – the Large Language Model (LLM) powered ChatGPT-like applications that have burst onto the technology and consumer scene. Increasingly, the C-suite is trying to factor in how LLMs and Generative AI will be part of their digital transformation roadmaps. The risks of diving into this technology are significant, and include:

Unrealistic expectations of LLMs as a magic solution to managing corporate content without requisite human involvement
Generating responses misaligned with company policies or brand image
No knowledge of events after the LLM's training cutoff, and no knowledge of the organization's internal information
Difficulty distinguishing between creative outputs and fabricated responses (hallucinations)
Absence of clear audit trails and citation sources
The threat of exposing trade secrets or other proprietary knowledge
The potential financial burden of using proprietary LLMs

However, the rewards of a successful implementation are significant. Generative AI can increase productivity, save time on routine tasks, improve human creativity, act as a sounding board or starting point for research, and help access, synthesize, and summarize large amounts of information.

How Organizations Overcome Problems Inherent in LLMs

First, it is essential to understand the limitations of LLMs. LLMs are not solutions in themselves. Human intervention is needed at critical points. Systems need to understand things that are specific to the organization. That information has to be structured and curated. Organization-specific knowledge needs to be accessed by AI systems. LLMs don't automatically know your company's language, terminology, acronyms, or processes.

At the same time, sensitive information should not be uploaded to a publicly available LLM service, where it may be retained and used as training data. Commercial LLM APIs are commonly offered under terms that keep the question and the answer out of model training, although this should be confirmed with the provider. When the organization queries a knowledge store behind its own firewall, the source content stays under its control. By specifying that answers should come only from ingested knowledge, and by instructing the system to respond with "I don't know" if the answer is not in the data source, hallucinations can be greatly reduced. This approach is called "Retrieval Augmented Generation," or RAG.

Why Use an LLM If It Doesn't Know the Answer?

Instead of answering the query directly from what it learned in training, the LLM is used to process the query, and the processed query is used to retrieve information from a knowledge source or database. The LLM then turns the retrieved results into a conversational response.
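The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the knowledge store, the passage ids, the keyword-overlap retrieval, and the `llm` callable are all assumptions made for the example. A real deployment would typically use a search index or vector database, and would call an actual LLM API where the sketch uses a stand-in function.

```python
import re

# Hypothetical in-memory knowledge store; a real system would query a
# search index or vector database behind the firewall.
KNOWLEDGE_STORE = [
    {"id": "modem-guide-3.2",
     "text": "If the power light is off, check that the adapter is plugged in."},
    {"id": "modem-guide-4.1",
     "text": "If the internet light blinks red, restart the modem and wait two minutes."},
]

STOPWORDS = {"the", "is", "a", "an", "if", "and", "that", "what",
             "do", "i", "my", "in", "of", "to", "when"}

def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def retrieve(query, store, min_overlap=2):
    """Return passages sharing at least min_overlap terms with the query, best first."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(p["text"])), p) for p in store]
    hits = [(score, p) for score, p in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits]

def build_prompt(query, passages):
    """Combine the retrieved passages and the question into a grounded prompt."""
    context = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer using only the context below. "
        'If the answer is not in the context, reply "I don\'t know."\n\n'
        f"Context:\n{context}\n\nQuestion: {query}"
    )

def answer(query, store, llm):
    """Retrieve first; only call the LLM when supporting passages exist."""
    passages = retrieve(query, store)
    if not passages:
        return "I don't know."
    return llm(build_prompt(query, passages))
```

Passing a stand-in such as `lambda prompt: prompt` for `llm` shows exactly what the model would receive. Because each passage carries its id into the prompt, the model can be asked to cite the passages it used.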

The Value of Metadata

Metadata is more important than many people recognize. In a recent research project, we found that an LLM could answer questions 53% of the time without metadata but 83% of the time with metadata. That is a vast improvement in performance. Metadata provides valuable context that may not be available in the text itself.

Guidelines for Successfully Deploying LLMs

Use cases

Choose a narrow set of use cases to begin with – the more limited and more clearly defined, the better. Use cases need an unambiguous outcome to be testable and show success. A clear, testable, and unambiguous use case would be, "Use an LLM to troubleshoot a modem installation using the installation manual as a reference" or "Use the LLM to determine milestones in a project based on a project document." These use cases form the baseline benchmark against which approaches are tested.

Identify needed content

In the case of the modem installation, the installation guide is necessary and must contain the steps needed to troubleshoot. In the case of project milestones, the project documents or database must contain those milestones. Otherwise, the system does not have the information necessary to answer.

Tune the LLM to reduce creative outputs

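Setting a parameter called "temperature" to 0 will reduce creative responses. Temperature controls how much randomness the model uses when choosing each word. At 0, it consistently picks the most likely one. Temperature alone does not restrict the model to the database. That restriction comes from the prompt, which instructs the model to use only the retrieved information and, if it encounters a question it can't answer from the data source, to respond with "I don't know." Together, these measures greatly reduce hallucinations, although they do not guarantee their absence.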
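To make the effect of temperature concrete, the sketch below converts a set of made-up scores for three candidate next words into probabilities at different temperatures. The scores, the candidate words, and the function name are illustrative assumptions. Treating temperature 0 as "always pick the top-scoring word" is a common convention, but individual LLM providers may implement it differently.

```python
import math

def next_token_probabilities(logits, temperature):
    """Convert raw model scores into probabilities, scaled by temperature."""
    if temperature == 0:
        # Assumed convention: temperature 0 means greedy decoding,
        # so all probability goes to the top-scoring candidate.
        top = max(logits, key=logits.get)
        return {tok: 1.0 if tok == top else 0.0 for tok in logits}
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Made-up scores for three candidate next words.
logits = {"restart": 2.0, "replace": 1.0, "repaint": 0.0}

for t in (0, 0.5, 1.0, 2.0):
    probs = next_token_probabilities(logits, t)
    print(t, {tok: round(p, 2) for tok, p in probs.items()})
```

As the temperature rises, probability shifts toward the less likely words, which is where more "creative" output comes from. At low temperatures the most likely word dominates.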

Gather metrics

Identify the use cases or queries to which the LLM responded with "I don't know" and identify knowledge gaps for remediation. Test on those use cases when onboarding new content and data sources to see whether the new information addresses the gaps.
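One simple way to gather this metric is to log every query with its use case and answer, then count the "I don't know" responses per use case. The log format, the field names, and the sample entries below are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter

IDK = "I don't know."

def knowledge_gaps(log):
    """Return the unanswered rate and a count of unanswered queries per use case."""
    unanswered = [entry for entry in log if entry["answer"].strip() == IDK]
    rate = len(unanswered) / len(log) if log else 0.0
    return rate, Counter(entry["use_case"] for entry in unanswered)

# Hypothetical query log collected during a pilot.
query_log = [
    {"use_case": "modem troubleshooting", "query": "Why is the power light off?",
     "answer": "Check that the adapter is plugged in."},
    {"use_case": "modem troubleshooting", "query": "How do I update the firmware?",
     "answer": "I don't know."},
    {"use_case": "project milestones", "query": "When is the design review?",
     "answer": "I don't know."},
    {"use_case": "modem troubleshooting", "query": "How do I enable guest Wi-Fi?",
     "answer": "I don't know."},
]

rate, gaps = knowledge_gaps(query_log)
print(f"Unanswered: {rate:.0%}")
for use_case, count in gaps.most_common():
    print(use_case, count)
```

Rerunning the same queries after new content is onboarded shows whether the unanswered rate for each use case has dropped.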

Use a knowledge/information architecture to enrich data

Metadata applied to content will improve the performance of an LLM. Identifying departments, processes, content types, topics, and other information characteristics will provide additional cues for the LLM to increase its ability to answer questions.
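As an illustration, metadata can be used in two ways: to narrow retrieval to the relevant content, and to give the LLM explicit context alongside the text. The documents, field names, and helper functions below are hypothetical. Note that without metadata, both sample passages begin with "Restart" and could be confused for one another.

```python
# Hypothetical documents; the metadata fields follow the characteristics
# named above (department, content type, topic).
DOCUMENTS = [
    {"text": "Restart the modem and wait two minutes.",
     "metadata": {"department": "support",
                  "content_type": "installation guide",
                  "topic": "modem"}},
    {"text": "Restart the billing export job after month end.",
     "metadata": {"department": "finance",
                  "content_type": "procedure",
                  "topic": "billing"}},
]

def filter_by_metadata(documents, **required):
    """Keep documents whose metadata matches every required key/value pair."""
    return [
        doc for doc in documents
        if all(doc["metadata"].get(key) == value for key, value in required.items())
    ]

def with_metadata_header(doc):
    """Prepend the metadata to the text so the LLM sees the context explicitly."""
    header = "; ".join(f"{key}: {value}" for key, value in sorted(doc["metadata"].items()))
    return f"[{header}]\n{doc['text']}"

support_docs = filter_by_metadata(DOCUMENTS, department="support")
print(with_metadata_header(support_docs[0]))
```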

End-user acceptance

Users will only trust what they understand. Providing the traceability of answers from an LLM using the knowledge base approach and retrieval augmented generation will reassure executives, internal users, and customers that the information is correct, accurate, and current.

Content operations and governance

Using LLMs highlights the need for knowledge and content operations. Organizations will also need a mechanism for allocating resources, measuring results, and making course corrections. Governance encompasses these important processes. While not as glamorous as the Generative AI applications, governance is foundational to their success.

Summary

It is essential to assess organizational knowledge against the guidelines above. Organizations that put their content, knowledge, and data systems in order now will be ahead of the game. Organizational knowledge and data are crucial enablers.