{"id":71991,"date":"2024-03-28T04:17:21","date_gmt":"2024-03-28T03:17:21","guid":{"rendered":"https:\/\/intellias.com\/?post_type=blog&p=71991"},"modified":"2024-07-01T17:48:47","modified_gmt":"2024-07-01T15:48:47","slug":"deploy-your-llm-chatbot-with-retrieval-augmented-generation-rag","status":"publish","type":"blog","link":"https:\/\/intellias.com\/deploy-llm-chatbot-with-rag\/","title":{"rendered":"Deploy Your LLM Chatbot with Retrieval Augmented Generation (RAG)"},"content":{"rendered":"
In 1966, MIT professor Joseph Weizenbaum came up with a curious concept: teaching a program called ELIZA to respond to user queries like a psychotherapist.
ELIZA was powered by a set of pattern-matching rules for generating plausible text responses. Although this first chatbot was far from perfect, it inspired the idea of building computer assistants for humans.
Fast-forward to 2010, when IBM unveiled Watson, a state-of-the-art artificial intelligence (AI) system that could chat with the likes of Bob Dylan thanks to advanced natural language processing (NLP) capabilities.
Heavily advertised and boosted by a winning appearance on the quiz show Jeopardy!, IBM Watson soon became the epitome of artificial intelligence in the public mind.
However, IBM's technology was also far from perfect. One of the first large-scale field tests of IBM Watson, at the MD Anderson Cancer Center in Houston, went badly wrong. Although the machine could scan enormous volumes of scientific literature, it wasn't capable of effective document classification, making it hard for users to obtain specific information. The project was shelved several years into the making.

Watson's machine learning algorithms struggled to process unstructured data (handwritten notes, clinical images, and free-text input from doctors), making it less effective than advertised for the most hyped use case: healthcare. The system was trained on unrepresentative data sets, was impossible to integrate with popular electronic health record (EHR) systems, and couldn't adapt to local treatment protocols. In the best case, Watson made blatantly obvious recommendations. In the worst case, it proposed downright dangerous treatment options.

Despite all of the company's expertise, IBM didn't anticipate the problems that emerged when the model had to interact with the real world.

OpenAI came better prepared with ChatGPT, training its GPT-3.5 large language model (LLM) on an estimated 570 GB of publicly available data. Because ChatGPT "read" a good part of the internet, it was much better "educated," delighting users with moderately accurate information on everything from popular dinosaurs to a recommended go-to-market strategy for a new SaaS application.

Compared to Watson, ChatGPT is far better at understanding the context of users' queries. But like Watson, its intelligence is limited by its training data. Like Wikipedia, ChatGPT provides broad-stroke information to satisfy general curiosity. However, the OpenAI model lacks awareness of anything outside the scope of its training data. It doesn't know what Lydia said in the last meeting or what your corporate policy says about performance bonuses.

But programming an LLM to know those things isn't an impossible task. Many LLMs are open-source, meaning developers can access and customize pretrained models. Moreover, whether an LLM is open-source or proprietary, retrieval augmented generation (RAG) gives the LLM a way to interact with business data and become more context-aware.

Standard chatbots have bounded knowledge, which quickly gets out of date and can result in misleading outputs when the LLM makes guesses (hallucinates) to cover missing information.

How retrieval augmented generation builds on LLMs
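In essence, RAG wraps a retrieval step around an ordinary LLM call: before the model answers, relevant documents are fetched from a knowledge base and injected into the prompt so the model can ground its response in them. Below is a minimal sketch of that loop, assuming the OpenAI Python client (v1+) with an API key in the environment; the in-memory document list, the brute-force cosine-similarity retriever, and the model names are illustrative stand-ins, not a production setup.

```python
# Minimal RAG sketch: embed documents, retrieve the most relevant ones
# for a question, and pass them to the LLM as grounding context.
# Assumes the OpenAI Python client (v1+) and OPENAI_API_KEY in the
# environment; documents and model names are illustrative.
import math
from openai import OpenAI

client = OpenAI()

# Toy in-memory "knowledge base" standing in for real business data.
documents = [
    "Performance bonuses are paid out quarterly, capped at 15% of base salary.",
    "The last all-hands meeting moved the product launch to Q3.",
    "Remote employees may expense home-office equipment up to $500 per year.",
]

def embed(texts: list[str]) -> list[list[float]]:
    """Turn a batch of texts into embedding vectors."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in resp.data]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

doc_vectors = embed(documents)  # index the knowledge base once, up front

def answer(question: str, top_k: int = 2) -> str:
    # 1. Retrieve: rank documents by similarity to the question.
    q_vec = embed([question])[0]
    ranked = sorted(
        zip(documents, doc_vectors),
        key=lambda pair: cosine(q_vec, pair[1]),
        reverse=True,
    )
    context = "\n".join(doc for doc, _ in ranked[:top_k])
    # 2. Augment and generate: the model answers from the retrieved
    #    context instead of guessing from its training data alone.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only the provided "
             "context. If the context lacks the answer, say you don't know."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(answer("What does our policy say about performance bonuses?"))
```

A real deployment would chunk company documents, store the embeddings in a vector database, and cite sources in the answer, but the core retrieve-augment-generate loop keeps exactly this shape.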