Accelerating Data and Analytics Capabilities in the Age of Generative AI: How Governance Is a Key Enabler

This article originally appeared in Enterprise Viewpoint.

The underlying principles of Artificial Intelligence have been evolving over decades. Recent advances have created nothing short of a revolutionary breakthrough in information management. Generative AI is in the public consciousness, and corporate applications are promising but require certain guardrails and decision-making policies and processes. While “governance” is a term that brings to mind bureaucratic structures with little practical on-the-ground application, a correctly designed decision-making framework driven by business process and outcome measures and KPIs is a critical component of data analytics and AI programs.

What Is Generative AI?

Generative AI is a class of technology that can create new content, data, and information by ingesting large amounts of information so the algorithm can learn about relationships of terms, language, and concepts.

Generalized models called Large Language Models (LLMs) are based on broad ranges of ingested data and content. More specific models are based on a particular industry, domain, or set of tasks and processes. Generalized models cannot answer questions about company-specific knowledge unless they are fine-tuned or trained on that knowledge (which is typically proprietary and behind the organization's firewall). When generative AI is used to access company-specific information, its answers to questions sound more natural, emulating how a human might answer the question.

Use Cases for Generative AI in the Enterprise

The best-known use cases revolve around content generation: developing marketing copy, creating email campaign messaging, writing letters, creating job descriptions and carrying out other routine, time-consuming tasks. It is important to have human oversight around these tasks, both to provide more personality to the content and to ensure that it is on target and aligned with the company brand.

Generative AI can also be a good brainstorming tool, providing outlines for research, generating ideas for initiatives and fleshing out ideas with content, developing solutions and designing products based on parameters and requirements. There are also many data-focused applications, such as anomaly detection (including identifying fraud patterns), data synthesis, data fill and data security.

However, the most important application, which has been overshadowed by the astounding capabilities of content generation, is access to corporate information. Organizations have struggled for years to access, organize and manage data and content effectively. Generative AI is the key to accessing “dark data” (the unstructured content that has been accumulating for years but has not been managed or curated), as well as accessing day-to-day information and streamlining enterprise processes.

Tuning Generative AI

Because Generative AI can make up answers that sound reasonable but are incorrect, the proper guardrails need to be in place. These include a mechanism for “retrieval-augmented generation” (RAG), an approach that effectively tunes the generative AI based on knowledge sources that are the gold standard for the enterprise. The algorithm can be directed to retrieve answers only from a specific source or sources and instructed to answer “I don’t know” if it does not have the answer from those sources. This greatly reduces hallucinations while still using the language model to turn content from approved data sources into a more conversational response.
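
The guardrail described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the knowledge base contents, the source ids, the stopword list, the word-overlap scoring and the optional generate callback (a stand-in for whatever language model an organization uses) are all hypothetical. A real deployment would typically use a search index or embedding-based retrieval instead.

```python
import re

# Hypothetical approved knowledge base: source id -> curated text.
APPROVED_SOURCES = {
    "hr-policy-2023": "Employees accrue 15 days of paid vacation per year.",
    "it-handbook": "Password resets are handled by the IT service desk.",
}

# Small illustrative stopword list so common words do not count as matches.
STOPWORDS = {"the", "of", "a", "is", "are", "by", "do", "what", "how", "per", "to", "in"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(question, sources, min_overlap=2):
    """Return (source_id, text) pairs that share enough terms with the question."""
    question_terms = tokenize(question)
    scored = []
    for source_id, text in sources.items():
        overlap = len(question_terms & tokenize(text))
        if overlap >= min_overlap:
            scored.append((overlap, source_id, text))
    scored.sort(reverse=True)  # best-matching source first
    return [(source_id, text) for _, source_id, text in scored]


def answer(question, sources=APPROVED_SOURCES, generate=None):
    """Answer only from approved sources; otherwise say "I don't know"."""
    hits = retrieve(question, sources)
    if not hits:
        return {"answer": "I don't know", "sources": []}
    context = "\n".join(text for _, text in hits)
    # `generate` is a placeholder for a language model call that rewrites the
    # retrieved context conversationally. Without it, return the context as-is.
    reply = generate(question, context) if generate else context
    return {"answer": reply, "sources": [source_id for source_id, _ in hits]}
```

Calling answer("How many days of paid vacation do employees accrue?") draws on the hr-policy-2023 entry, while a question with no match in the approved sources gets “I don't know”. Returning the source ids alongside the answer keeps a record of where each response came from.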

Using RAG will be the greatest single application for generative AI in the enterprise. Using Gen AI to access business intelligence, performance measures, support material and organizational knowledge will speed the flow of information throughout the organization in unprecedented ways and become a tremendous source of competitive advantage when correctly deployed.

What Needs to Be Governed?

Because the source of training or tuning content (the knowledge base used for retrieval in RAG) is so critical, that information needs to be structured, managed, and curated, either by the departments or through a centrally managed process that ensures consistency in data and content standards and consults with subject matter experts (SMEs) as needed. Those SMEs are essential to the content lifecycle process since they understand the domain and can verify that questions are answered correctly. A governance framework consists of decision-making bodies, decision-making rules and procedures, and mechanisms to ensure compliance with those rules.

A core part of governance is monitoring the KPIs that are impacted through the Generative AI initiative. A process owner needs to gather baselines prior to deployment and establish targets to determine what success looks like after deployment. The KPIs can be monitored at various levels to show whether there is a meaningful impact.
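
As a rough illustration of comparing a baseline, a target and a post-deployment measure, the sketch below tracks how much of the baseline-to-target gap has been closed. The Kpi class, its field names and the sample figures are hypothetical and are not taken from any real program.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    baseline: float  # measured before deployment
    target: float    # what success looks like after deployment
    current: float   # latest post-deployment measurement

    def progress(self) -> float:
        """Fraction of the baseline-to-target gap closed so far.

        Works for KPIs that should rise and for KPIs that should fall,
        because the sign of the gap cancels out in the division.
        """
        gap = self.target - self.baseline
        if gap == 0:
            raise ValueError(f"{self.name}: target must differ from baseline")
        return (self.current - self.baseline) / gap

    def met(self) -> bool:
        """True once the target has been reached or exceeded."""
        return self.progress() >= 1.0


# Illustrative figures only.
handle_time = Kpi("average handle time (minutes)", baseline=12.0, target=9.0, current=10.5)
resolution = Kpi("first-contact resolution (%)", baseline=60.0, target=75.0, current=78.0)
```

In this made-up example, handle time has closed half of its gap (a progress value of 0.5) and has not yet met its target, while the resolution rate has passed its target.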

A governance body assigns more detailed levels of responsibility down to the data steward, who is actually responsible for execution and monitoring of the effort. There are many roles in deployment, but someone needs to “own” the data and be responsible for data quality and fitness for purpose.

These decision-making structures ensure data integrity and confirm the value of investments by comparing baselines to post-deployment measures. These baselines are referenced using a library of use cases that are clear and unambiguous so that they can be tested. That library can be further refined and expanded over time and used in a variety of ways, including user acceptance testing and ongoing fine-tuning.
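
One way to make a use case library testable is to store each use case as a question paired with a phrase that an acceptable answer must contain, then run the whole library against the system. The sketch below assumes that simple pass criterion. The UseCase fields, the case ids and the answer_fn callback are hypothetical, and real acceptance criteria would usually be richer and reviewed by SMEs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    case_id: str
    question: str
    expected_phrase: str  # phrase an acceptable answer must contain


def run_library(cases, answer_fn):
    """Run every use case through answer_fn and return {case_id: passed}."""
    results = {}
    for case in cases:
        reply = answer_fn(case.question)
        results[case.case_id] = case.expected_phrase.lower() in reply.lower()
    return results


def pass_rate(results):
    """Share of use cases that passed, from 0.0 to 1.0."""
    return sum(results.values()) / len(results) if results else 0.0


# Illustrative library with made-up content.
LIBRARY = [
    UseCase("UC-001", "How many vacation days do employees get?", "15 days"),
    UseCase("UC-002", "Who handles password resets?", "IT service desk"),
]
```

The same library can be rerun before and after each change to the knowledge base or model, so the pass rate becomes one of the measures compared against its baseline.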

Ethical Issues

Many ethical issues arise in the context of Generative AI, as noted below:

Misinformation: One area of generative AI that has ethical implications is misinformation and deception. Generative AI can write convincing marketing and communications copy. The ethical challenge is ensuring that content is correct, not misleading or outright deceptive. Human review of output is essential and clear policies and procedures need to be in place to ensure compliance.
Ownership of content: Another issue is that of intellectual property ownership and copyright infringement. Generative AI may be using content that has restrictions on reuse or derivative applications. Restricting the Generative AI to corporate content will reduce this risk.
Transparency and traceability: It is difficult to identify where answers come from when using Generative AI or to determine the quality of source information. This is another area where a retrieval-augmented approach has value. Since the content is coming from a specific source, an audit trail can be maintained, and people can trust the answer (assuming they trust the information source).
Bias: If data sources are biased (typically based on how the data is generated) then the algorithm will be biased. It is important to understand and mitigate this bias through human review, broad data collection that reflects various demographics and use of bias detection tools. Offerings in this area allow for exploration of model behaviors, detection of discrimination, understanding of data set distributions and other auditing and review approaches[1]. Different models will serve different purposes – it is important to test various models against core use cases.
Privacy and security: Corporate data must remain behind the firewall and not be ingested into public LLMs, where proprietary information can become public. The same issue occurs with personal data. Strong security protocols need to be in place and tested continuously.

Governance Implementation for Generative AI

Stakeholders at various levels need to understand core principles, what AI can and cannot do and how it will impact them and their role in the organization. Cross-functional teams allow for a range of perspectives for developing governance standards and guiding AI deployment. Employees need to understand how they can leverage it to make them more productive rather than fear its impact. Executives need to understand key success factors and that AI is not a silver bullet that can operate autonomously. Data and content curation is more critical to success than ever.

Clear roles and responsibilities: Bring the correct stakeholders to meetings to clearly define roles and responsibilities, and address issues that arise during implementation. If people feel that a meeting is not appropriate for them or is a waste of time, they will not continue to participate. Match each person's involvement to the task by using RACI charts to determine who is Responsible, Accountable, Consulted or Informed for a particular governance task or process.
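
A RACI chart can be kept as simple structured data and checked automatically. The sketch below assumes a commonly used convention that is not stated in this article: each task has exactly one Accountable party and at least one Responsible party. The task names and role holders are made up for illustration.

```python
VALID_ROLES = {"R", "A", "C", "I"}  # Responsible, Accountable, Consulted, Informed


def validate_raci(chart):
    """Return a list of problems found; an empty list means the chart passes."""
    problems = []
    for task, assignments in chart.items():
        roles = list(assignments.values())
        unknown = sorted(set(roles) - VALID_ROLES)
        if unknown:
            problems.append(f"{task}: unknown role codes {unknown}")
        # Assumed convention: exactly one Accountable party per task.
        if roles.count("A") != 1:
            problems.append(
                f"{task}: expected exactly one Accountable, found {roles.count('A')}"
            )
        # Assumed convention: at least one Responsible party per task.
        if "R" not in roles:
            problems.append(f"{task}: no one is Responsible")
    return problems


# Hypothetical chart: task -> {person or role: RACI code}.
CHART = {
    "Approve knowledge base content": {
        "Data steward": "R",
        "Process owner": "A",
        "SME": "C",
        "Governance body": "I",
    },
    "Review KPI dashboard": {
        "Data steward": "R",
        "SME": "C",
    },
}
```

Running validate_raci(CHART) flags the second task because nobody is Accountable for it, which is the kind of gap a governance body would want to catch before deployment.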

Finally, governance standards that are not implemented are of little value. There needs to be a process for reviewing execution and metrics for compliance.

Conclusion

Generative AI holds tremendous promise for organizations of all types. Correct deployment involves understanding the risks, the potential rewards, and the requirements and preconditions for successful implementation, maintenance and ongoing improvement. Governance is the glue that holds various interests together and enables the organization to realize the value of its investments in AI. Setting the enterprise up for success will ensure that the right resources are correctly deployed and the value from this revolutionary tool is realized.

[1] https://towardsdatascience.com/5-tools-to-detect-and-eliminate-bias-in-your-machine-learning-models-fb6c7b28b4f1