
    The Critical Element of Foundational Architecture

    Recently I chaired the Artificial Intelligence Accelerator Institute Conference in San Jose – in the heart of Silicon Valley. The event brought together industry innovators from both large and small organizations, providing a wide range of perspectives. For example, Mohamed Elgendy, CEO of the AI and ML testing startup Kolena, and Srujana Kaddevarmuth, Senior Director, Data & ML Engineering, Customer Products, Walmart Global Tech, discussed productization of AI solutions and ways to increase adoption. I especially liked the idea of a model catalogue from which data scientists can retrieve data sets and machine learning models that others have built rather than starting from scratch.

    The presentations were excellent and ranged from higher-level approaches to business problems and processes to deeper dives into the technology behind innovative solutions.

    A presentation from Quantiphi discussed conversational AI – the use of chatbots to offload routine tasks from call center agents as well as to provide “co-pilot” capabilities by listening in on discussions. Prabhpreet Bajaj, an associate practice lead at the company, walked through a section on ChatGPT and Large Language Models and pointed out the challenges that we have recently discussed in our webinar, writing, and podcasts.

    These include a wide range of issues, such as:

    • the problem of “hallucinations,” in which LLMs create responses that sound plausible but are completely wrong,
    • the challenges with exposing proprietary information to a public model,
    • issues of source tracking and traceability,
    • selecting and fine-tuning a foundational language model or a more specialized industry model, and
    • developing a corporate ontology.

    Addressing all of these serves as preparation for training on a corporate or departmental knowledge base.

    The need to point to an organization’s knowledge assets without exposing company IP is critical. The part that was glossed over a bit in presentations at the event was the need for curated knowledge assets. When I asked, “So, are you assuming that a company has a knowledge base that is correctly structured and organized?” the answer was “yes.” (See “Assume a Can Opener,” one of my prior articles, posted on our website.) One cannot assume that what is needed as part of the solution is available.

    We have to assume that in many cases, work needs to be done to structure, tag, and curate content. Rarely can existing content be used directly by a ChatGPT type of application. Creating a functional AI requires all of the things that we have preached over the years – correctly structured content models, taxonomies, and metadata structures. These become embeddings in the data that LLM applications can retrieve.

    The simple approach is for the LLM to process the query and use that processed query to retrieve information from a vector database. Content is ingested with “embeddings” – the correct metadata that adds context to the content, such as product name or error code for an installation or troubleshooting guide. The result is then processed and formatted by the LLM for conversational presentation to the customer.
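
    The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a production design: a toy bag-of-words count stands in for a real embedding model, an in-memory list stands in for a vector database, metadata is stored alongside each piece of content, and the final LLM call is replaced by a function that only builds the prompt. All names (Chunk, embed, retrieve, build_prompt) and the sample products and guides are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of content plus the metadata that gives it context."""
    text: str
    metadata: dict = field(default_factory=dict)


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list, filters: dict = None, top_k: int = 1) -> list:
    """Keep chunks whose metadata matches the filters, then rank by similarity."""
    filters = filters or {}
    candidates = [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in filters.items())
    ]
    q = embed(query)
    ranked = sorted(candidates, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, hits: list) -> str:
    """Assemble the text an LLM would receive; the LLM call itself is omitted."""
    context = "\n".join(
        f"[{h.metadata.get('source', 'unknown')}] {h.text}" for h in hits
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical knowledge base: two products share the same error code.
KNOWLEDGE_BASE = [
    Chunk("Error E42 means the water filter is clogged. Replace the filter.",
          {"product": "AquaPure 200", "error_code": "E42", "source": "troubleshooting-guide"}),
    Chunk("Error E42 means the battery is overheating. Power down the unit.",
          {"product": "VoltPack 9", "error_code": "E42", "source": "troubleshooting-guide"}),
    Chunk("Mount the AquaPure 200 under the sink using the supplied bracket.",
          {"product": "AquaPure 200", "source": "installation-guide"}),
]

query = "What does error E42 mean?"
hits = retrieve(query, KNOWLEDGE_BASE, filters={"product": "AquaPure 200"})
prompt = build_prompt(query, hits)
print(prompt)
```

    In this toy data set, the product filter is what keeps the answer tied to the right device; calling retrieve without it lets the other product's E42 entry rank first, which is the kind of plausible-but-wrong result that well-curated metadata is meant to prevent.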

    The challenge right now is that every vendor has talking points about how to use LLMs, but many vendors are missing the nuances. Too often, they gloss over a host of issues, such as the need for detailed knowledge process analysis, structured content development, metadata models that allow for contextualization of answers, the taxonomies that become part of the embeddings, and governance and metrics for tracking progress and ensuring continuous improvement.

    Many of the initiatives using virtual assistants are tackling the low-hanging fruit of external customer service or of reps who support the customer experience. Few are tackling the overarching challenges of knowledge across the enterprise. Curating and managing content creation at the source, at the department and functional level, needs to be operationalized using centralized standards and processes, with component models so that content is chunked into semantically meaningful pieces.
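
    One way to picture component-based chunking is the sketch below. It assumes, purely for illustration, that source content marks component boundaries with lines starting with "## " and that department-level defaults are supplied as a dictionary; the function name chunk_by_component, the metadata keys, and the sample guide are all hypothetical.

```python
import re


def chunk_by_component(document: str, base_metadata: dict) -> list:
    """Split a document at '## ' headings so each chunk is one component."""
    chunks = []
    # Split just before every line that starts with '## ',
    # so each heading stays attached to its own body text.
    for part in re.split(r"(?m)^(?=## )", document):
        part = part.strip()
        if not part.startswith("## "):
            continue  # skip any preamble before the first component heading
        heading, _, body = part.partition("\n")
        chunks.append({
            "component": heading[3:].strip(),
            "text": body.strip(),
            **base_metadata,  # shared tags applied to every chunk
        })
    return chunks


# Hypothetical source content authored against a simple component model.
GUIDE = """Intro text that is not a component.
## Installation
Mount the bracket. Connect the hose.
## Troubleshooting
If E42 appears, replace the filter.
"""

chunks = chunk_by_component(GUIDE, {"product": "AquaPure 200", "department": "support"})
for chunk in chunks:
    print(chunk)
```

    Splitting on authored component boundaries rather than on a fixed character count keeps each chunk about one topic, and applying the shared tags at the source means every chunk arrives already labeled with its context.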

    As Large Language Models advance and become increasingly commoditized, early adopters will see their differentiation in the marketplace progressively shrink. The organizations that leverage all their knowledge assets are the ones that will build an increasing competitive advantage. Starting now will give the enterprise a leg up in our increasingly conversational world, where the primary access to information systems will be through asking questions in natural language.


    Seth Earley
    Seth Earley is the Founder & CEO of Earley Information Science and the author of the award-winning book The AI-Powered Enterprise: Harness the Power of Ontologies to Make Your Business Smarter, Faster, and More Profitable. He is an expert with 20+ years of experience in Knowledge Strategy, Data and Information Architecture, Search-based Applications, and Information Findability solutions. He has worked with a diverse roster of Fortune 1000 companies, helping them to achieve higher levels of operating performance.
