{"id":71991,"date":"2024-03-28T04:17:21","date_gmt":"2024-03-28T03:17:21","guid":{"rendered":"https:\/\/intellias.com\/?post_type=blog&p=71991"},"modified":"2024-07-01T17:48:47","modified_gmt":"2024-07-01T15:48:47","slug":"deploy-your-llm-chatbot-with-retrieval-augmented-generation-rag","status":"publish","type":"blog","link":"https:\/\/intellias.com\/deploy-llm-chatbot-with-rag\/","title":{"rendered":"Deploy Your LLM Chatbot with Retrieval Augmented Generation (RAG)"},"content":{"rendered":"
In 1966, MIT professor Joseph Weizenbaum came up with a curious concept: teaching a program called ELIZA to respond to user queries like a psychotherapist.
ELIZA was powered by a set of pattern-matching rules for generating plausible text responses. Although this first chatbot was far from perfect, it inspired the idea of building computer assistants for humans.
Fast-forward to 2010, when IBM unveiled Watson, a state-of-the-art artificial intelligence (AI) system that could chat with the likes of Bob Dylan thanks to advanced natural language processing (NLP) capabilities.
Heavily advertised and boosted by a winning appearance on the quiz show Jeopardy!, IBM Watson soon became the epitome of artificial intelligence in the public mind.
However, IBM's technology was also far from perfect. One of the first large-scale field tests of IBM Watson, at the MD Anderson Cancer Center in Houston, went badly wrong. Although the machine could scan enormous volumes of scientific literature, it wasn't capable of effective document classification, making it hard for users to obtain specific information. The project was shelved several years into the making.

Watson's machine learning algorithms struggled to process unstructured data (handwritten notes, clinical images, and free-text input from doctors), making it less effective than advertised for the most hyped use case: healthcare. The system was trained on unrepresentative data sets, was impossible to integrate with popular electronic health record (EHR) systems, and couldn't adapt to local treatment protocols. In the best case, Watson made blatantly obvious recommendations. In the worst case, it proposed downright dangerous treatment options.

Despite all of the company's expertise, IBM didn't anticipate the problems that emerged when the model had to interact with the real world.

OpenAI came better prepared with ChatGPT, training its GPT-3.5 large language model (LLM) on an estimated 570 GB of publicly available data. Because ChatGPT "read" a good part of the internet, it was much better "educated," delighting users with moderately accurate information on everything from popular dinosaurs to a recommended go-to-market strategy for a new SaaS application.

Compared to Watson, ChatGPT is far better at understanding the context of users' queries. But like Watson, its intelligence is limited by its training data. Like Wikipedia, ChatGPT provides broad-stroke information to satisfy general curiosity. However, the OpenAI model lacks awareness of anything outside the scope of its training data. It doesn't know what Lydia said in the last meeting or what your corporate policy says about performance bonuses.

But programming an LLM to know those things isn't an impossible task. Many LLMs are open-source, meaning developers can access and customize pretrained models. Moreover, whether an LLM is open-source or proprietary, retrieval augmented generation (RAG) gives the LLM a way to interact with business data and become more context-aware.

Standard chatbots have bounded knowledge, which quickly gets out of date and can result in misleading outputs when the LLM makes guesses (hallucinates) to cover missing information.

How retrieval augmented generation builds on LLMs
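In essence, RAG wraps a retrieval step around an ordinary LLM call: before the model answers, relevant documents are fetched from a knowledge base and injected into the prompt so the model can ground its response in them. Below is a minimal sketch of that loop, assuming the OpenAI Python client (v1+) with an API key in the environment; the in-memory document list, the brute-force cosine-similarity retriever, and the model names are illustrative stand-ins, not a production setup.

```python
# Minimal RAG sketch: embed documents, retrieve the most relevant ones
# for a question, and pass them to the LLM as grounding context.
# Assumes the OpenAI Python client (v1+) and OPENAI_API_KEY in the
# environment; documents and model names are illustrative.
import math
from openai import OpenAI

client = OpenAI()

# Toy in-memory "knowledge base" standing in for real business data.
documents = [
    "Performance bonuses are paid out quarterly, capped at 15% of base salary.",
    "The last all-hands meeting moved the product launch to Q3.",
    "Remote employees may expense home-office equipment up to $500 per year.",
]

def embed(texts: list[str]) -> list[list[float]]:
    """Turn a batch of texts into embedding vectors."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in resp.data]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

doc_vectors = embed(documents)  # index the knowledge base once, up front

def answer(question: str, top_k: int = 2) -> str:
    # 1. Retrieve: rank documents by similarity to the question.
    q_vec = embed([question])[0]
    ranked = sorted(
        zip(documents, doc_vectors),
        key=lambda pair: cosine(q_vec, pair[1]),
        reverse=True,
    )
    context = "\n".join(doc for doc, _ in ranked[:top_k])
    # 2. Augment and generate: the model answers from the retrieved
    #    context instead of guessing from its training data alone.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only the provided "
             "context. If the context lacks the answer, say you don't know."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(answer("What does our policy say about performance bonuses?"))
```

A real deployment would chunk company documents, store the embeddings in a vector database, and cite sources in the answer, but the core retrieve-augment-generate loop keeps exactly this shape.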