How would you feel if you discovered that your partner, someone you deeply trusted and respected, was sometimes lying to you? Would you throw them out and change the locks? Would you let them stay, but never entirely trust their word again? Or would you realize that your “partner” is actually an innocent but sometimes poorly informed artificial intelligence chatbot?

As interest in AI technology has surged alongside a mind-boggling number of new tools and language models, plenty of evidence has come to light suggesting that your AI partner needs some tough love before you can trust it. Although current and future uses of AI technology hold tremendous potential, they also present significant risks. Ensuring accurate results and protecting sensitive and personal information are vitally important, but both present a significant challenge.

As a result, many public and private organizations are taking a cautious approach to how they implement and monitor AI-powered processes.

Large Language Models (LLMs) are a type of artificial intelligence trained by analyzing enormous amounts of digital text, including books, Wikipedia articles, and online chat logs, in order to generate original content. The most prominent LLMs include OpenAI’s GPT (Generative Pre-trained Transformer), Google’s LaMDA (Language Model for Dialogue Applications), AWS’s AI tool suite anchored by Amazon Bedrock (a managed service for foundation models), and Meta’s LLaMA (Large Language Model Meta AI). The community-driven Hugging Face hub hosts a wide range of pre-trained models, including models from OpenAI as well as BERT, RoBERTa, and T5. By pinpointing patterns in all that data, an LLM learns to guess the next word in a sequence of words. Because the internet is filled with untruthful information, these systems sometimes repeat the same untruths.
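To make that last point concrete, here is a toy, illustrative sketch of the core idea: learn word-sequence patterns from a corpus, then guess the most likely next word. This is not how production LLMs are built (they use neural networks and vastly more data), and the tiny corpus below is made up for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up "training corpus" for illustration only.
corpus = (
    "the james webb telescope took the first pictures "
    "the james webb telescope launched in december"
).split()

# Count which word tends to follow each word (a crude bigram model,
# standing in for the pattern-finding real LLMs do at vastly larger scale).
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def guess_next_word(word: str) -> str:
    """Return the word most often observed after `word` in the corpus."""
    candidates = next_word_counts.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

print(guess_next_word("james"))      # -> "webb"
print(guess_next_word("telescope"))  # -> "took" (ties resolve by first-seen order)
```

The takeaway is that a model like this can only echo the patterns in its training data, true or not, which is exactly why LLMs trained on the open internet sometimes repeat its untruths.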

In a recent New York Times article, AI expert Cade Metz noted some of the unpredictable results that these LLMs are producing:

When the San Francisco start-up OpenAI unveiled its ChatGPT online chatbot late last year, millions were wowed by the humanlike way it answered questions, wrote poetry and discussed almost any topic. But most people were slow to realize that this new kind of chatbot often makes things up.

When Google introduced a similar chatbot several weeks later, it spewed nonsense about the James Webb telescope. The next day, Microsoft’s new Bing chatbot offered up all sorts of bogus information about the Gap, Mexican nightlife and the singer Billie Eilish. Then, in March, ChatGPT cited a half dozen fake court cases.

So how can you ever trust your AI buddy to live in your house? One solution is to feed it your own food…er, data. Retrieval-Augmented Generation (RAG) is a technique that enhances the output of an LLM by retrieving additional information from an external knowledge base (or knowledge graph) and using it to augment the prompt provided to the LLM (a rough code sketch of this flow follows the list below). Several knowledge bases can be used with RAG to enhance output and improve trust in LLMs, such as:

  • LlamaIndex: A data framework for ingesting, indexing, and retrieving structured, unstructured, and semi-structured data
  • Wikipedia: A free online encyclopedia that can be used to retrieve information on a wide range of topics
  • DBpedia: A knowledge base that extracts structured content from the information created in the Wikipedia project
  • Freebase: A knowledge base that contains structured data on millions of topics (now retired, with much of its data migrated to Wikidata)
  • Google Knowledge Graph: A database of billions of facts about people, places, and things
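To illustrate how RAG stitches these pieces together, here is a minimal, self-contained sketch. It substitutes a tiny in-memory document list and naive keyword overlap for a real knowledge base and vector search, and it stops just short of the LLM call; the document texts and function names are made up for illustration.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query.

    A production RAG system would typically use embeddings and a vector
    or graph store built over a knowledge base like the ones listed above.
    """
    query_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def build_augmented_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved passages and the user's question into one prompt."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Made-up "knowledge base" entries for illustration only.
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm Eastern, Monday through Friday.",
    "The James Webb Space Telescope launched in December 2021.",
]

query = "When did the James Webb telescope launch?"
prompt = build_augmented_prompt(query, retrieve(query, documents))
print(prompt)  # This augmented prompt is what you would send to your LLM of choice.
```

Because the model answers from the retrieved passages rather than from its memory of the internet, its output can be traced back to data you have vetted yourself.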

Some organizations are taking this a step further by creating their own knowledge bases fed by their internal, proprietary data. Early adopters of AI technology include healthcare, legal, and financial service providers, but they are typically hampered by existing data silos across organizational units that don’t always talk to one another. The work involved in compiling and refining the vast amounts of data required to create and maintain a knowledge base is substantial, and companies most often cannot do it on their own. Taking on the challenge is certainly daunting, but the payoff is drastically improved data governance, fraud detection, knowledge management, search, chatbots, recommendations, and other intelligent systems across organizational units.

Want to learn more about how Axyom can help you get started with an AI solution? Contact us today!