Agentic RAG - CrewAI & LangChain

Introduction    

In the rapidly evolving field of artificial intelligence, the ability to build systems that can understand and respond to human queries is of paramount importance. In this blog post, we'll explore how to create an intelligent question-answering system using LangChain, CrewAI, and various tools and agents to process and respond to user questions effectively.

We'll walk through a Python script that demonstrates how to set up a multi-agent system capable of routing queries, retrieving information from different sources, grading the relevance and accuracy of responses, and ultimately providing concise and accurate answers to users. The goal of our system is to answer user questions by either retrieving information from a local PDF document or performing a web search, depending on the nature of the question. To achieve this, we'll use:

  • LangChain: A framework for building applications with large language models (LLMs).
  • CrewAI: A toolkit for orchestrating multi-agent workflows.
  • PDFSearchTool: A tool for searching and retrieving information from PDF documents.
  • TavilySearchResults: A tool for performing web searches and retrieving results.

We'll also define several agents and tasks to process the question through various stages, ensuring the final answer is relevant and accurate.  

Table of Contents  

  1. Setting Up the Environment
  2. Initializing the Language Model
  3. Downloading and Preparing the PDF Document
  4. Configuring the PDF Search Tool
  5. Setting Up the Web Search Tool
  6. Creating the Router Function
  7. Defining the Agents
  8. Constructing the Tasks
  9. Assembling the Crew
  10. Running the System   

1. Setting Up the Environment      

First, we need to import the necessary libraries and set up our environment variables for API keys. 

import os 
import requests  
from langchain_openai import ChatOpenAI 
from crewai_tools import PDFSearchTool, tool  
from langchain_community.tools.tavily_search import TavilySearchResults 
from crewai import Crew, Task, Agent  
# Set your API keys  
os.environ['GROQ_API_KEY'] = 'your_groq_api_key'  
os.environ['TAVILY_API_KEY'] = 'your_tavily_api_key'  

Replace 'your_groq_api_key' and 'your_tavily_api_key' with your actual API keys.
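Hard-coding keys is fine for a demo, but a safer pattern is to read them from the environment and fail fast if any are missing. Here's a minimal sketch; the `require_env` helper is our own convention, not part of LangChain or CrewAI:

```python
import os

def require_env(*names: str) -> dict:
    """Return the requested environment variables, raising if any is unset."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise EnvironmentError(f"Missing API keys: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}

# Fail at startup rather than deep inside an agent run:
# keys = require_env('GROQ_API_KEY', 'TAVILY_API_KEY')
```

Calling this once at startup surfaces a missing key immediately, rather than as a cryptic authentication error mid-pipeline.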

2. Initializing the Language Model      

We'll initialize our language model using ChatOpenAI, pointing to the Groq API endpoint and specifying the model parameters.

llm = ChatOpenAI(  
   openai_api_base="https://api.groq.com/openai/v1",  
   openai_api_key=os.environ['GROQ_API_KEY'], 
   model_name="llama3-8b-8192",  
   temperature=0.1,  
   max_tokens=1000,  
)  


3. Downloading and Preparing the PDF Document      

Next, we'll download the "Attention Is All You Need" paper, which introduces the Transformer model—a foundational concept in modern NLP.  

pdf_url = 'https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf'  
response = requests.get(pdf_url)  
with open('attention_is_all_you_need.pdf', 'wb') as file:  
   file.write(response.content)  
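The snippet above assumes the request succeeds. A slightly more defensive version checks the HTTP status and verifies the payload actually looks like a PDF before writing it to disk; the `save_pdf` helper and the `%PDF` magic-bytes check are our own sketch, not required by CrewAI:

```python
import requests

def save_pdf(content: bytes, path: str) -> None:
    """Write raw bytes to disk only if they start with the PDF magic number."""
    if not content.startswith(b'%PDF'):
        raise ValueError("Downloaded content is not a PDF")
    with open(path, 'wb') as file:
        file.write(content)

# response = requests.get(pdf_url, timeout=30)
# response.raise_for_status()  # fail loudly on 4xx/5xx responses
# save_pdf(response.content, 'attention_is_all_you_need.pdf')
```

This catches the common failure mode where a server returns an HTML error page that would otherwise be silently saved with a .pdf extension.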


4. Configuring the PDF Search Tool

We'll configure the PDFSearchTool to enable searching within the downloaded PDF using embeddings.

rag_tool = PDFSearchTool(  
   pdf='attention_is_all_you_need.pdf',
   config=dict(  
       llm=dict(  
           provider="groq",  
           config=dict(  
               model="llama3-8b-8192",  
           ),  
       ),  
       embedder=dict(  
           provider="huggingface",  
           config=dict(  
               model="BAAI/bge-small-en-v1.5",  
           ),  
       ),  
   )  
)  


5. Setting Up the Web Search Tool       

We'll also set up the TavilySearchResults tool to perform web searches when necessary.

web_search_tool = TavilySearchResults(max_results=3)


6. Creating the Router Function        

The router function determines whether to use the vector store (PDF) or perform a web search based on the user's question.  

@tool
def router_tool(question):
   """Route a question to 'vectorstore' or 'websearch' based on its keywords."""
   if 'self-attention' in question.lower():
       return 'vectorstore'
   else:
       return 'websearch'
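The keyword check above is deliberately minimal. A slightly richer variant matches any of a set of topics covered by the paper, returning the same labels the downstream tasks expect. This is our own illustrative sketch; the keyword set and function name are not part of CrewAI:

```python
# Topics covered by the "Attention Is All You Need" paper (illustrative list)
PDF_KEYWORDS = {'self-attention', 'transformer', 'attention mechanism',
                'encoder', 'decoder', 'positional encoding'}

def route(question: str) -> str:
    """Return 'vectorstore' if the question mentions a paper topic, else 'websearch'."""
    q = question.lower()
    return 'vectorstore' if any(kw in q for kw in PDF_KEYWORDS) else 'websearch'
```

For production use, this keyword heuristic could be replaced with an LLM-based classifier, but a deterministic router is cheaper and easier to test.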


7. Defining the Agents 

We define several agents, each responsible for a specific task in the pipeline.  

Router Agent 

Routes the question to the appropriate tool.  

Router_Agent = Agent(  
   role='Router',  
   goal='Route user question to a vectorstore or web search',  
   backstory=(  
       "You are an expert at routing a user question to a vectorstore or web search. "  
       "Use the vectorstore for questions on self-attention and the Transformer architecture. "  
       "Otherwise, use web search."  
   ),  
   verbose=True,  
   allow_delegation=False,  
   llm=llm,  
)  

Retriever Agent     

Retrieves information based on the router's decision. 

Retriever_Agent = Agent(  
   role="Retriever",  
   goal="Use the information retrieved to answer the question",  
   backstory=(  
       "You are an assistant for question-answering tasks. "  
       "Use the information present in the retrieved context to answer the question. "  
       "Provide a clear, concise answer."  
   ),  
   verbose=True,  
   allow_delegation=False,  
   llm=llm,
)  

Grader Agents   

Assess the relevance and accuracy of the retrieved information.  

Grader_agent = Agent(  
   role='Retrieval Grader',  
   goal='Filter out erroneous retrievals',  
   backstory=(  
       "You are a grader assessing the relevance of a retrieved document to a user question. "  
       "Grade it as relevant if it contains keywords related to the question."  
   ),  
   verbose=True,  
   allow_delegation=False,  
   llm=llm,  
)  
hallucination_grader = Agent(  
   role="Hallucination Grader",  
   goal="Filter out hallucinations",  
   backstory=(  
       "You assess whether an answer is grounded in facts. "  
       "Meticulously review the answer and check if it aligns with the question."  
   ),  
   verbose=True,  
   allow_delegation=False,  
   llm=llm,  
)  
answer_grader = Agent(  
   role="Answer Grader",  
   goal="Provide the final answer or perform a web search if necessary",  
   backstory=(  
       "You assess whether an answer is useful to resolve a question. "  
       "If relevant, generate a clear and concise response. "  
       "If not, perform a web search using 'web_search_tool'."
   ),  
   verbose=True,  
   allow_delegation=False,  
   llm=llm,  
)  


8. Constructing the Tasks

We define tasks that the agents will perform, each corresponding to a step in processing the user's question.  

Router Task       

Determines which tool to use.  

router_task = Task(  
   description=(  
       "Analyze the keywords in the question {question}. "  
       "Decide whether it's eligible for a vector store search or a web search."  
   ),  
   expected_output=(  
       "Return 'vectorstore' or 'websearch' based on the question. "  
       "Do not provide any other preamble or explanation." 
   ),  
   agent=Router_Agent,  
   tools=[router_tool],  
)  

Retriever Task     

Retrieves information using the chosen tool.

retriever_task = Task(
   description=(  
       "Based on the router task's response, extract information for the question {question} using the appropriate tool." 
   ),  
   expected_output=(  
       "Use 'web_search_tool' if 'websearch' was returned. "  
       "Use 'rag_tool' if 'vectorstore' was returned. "  
       "Return a clear and concise text as a response." 
   ),  
   agent=Retriever_Agent,  
   context=[router_task],  
)  

Grader Task    

Evaluates the relevance of the retrieved information.   

grader_task = Task( 
   description=(  
       "Evaluate whether the retrieved content for the question {question} is relevant."  
   ),  
   expected_output=(  
       "Provide 'yes' if relevant, 'no' if not. " 
       "Do not provide any preamble or explanations."  
   ),  
   agent=Grader_agent,  
   context=[retriever_task],  
)  

Hallucination Task    

This task checks for hallucinations in the answer. 

hallucination_task = Task(  
   description=(  
       "Assess whether the answer is grounded in facts for the question {question}."  
   ),  
   expected_output=(  
       "Provide 'yes' if the answer is factual, 'no' if not. " 
       "Do not provide any preamble or explanations."
   ),  
   agent=hallucination_grader,  
   context=[grader_task],  
)  

Answer Task    

Provides the final answer or performs a web search if necessary.  

answer_task = Task(  
   description=(  
       "Based on the hallucination task's response, provide the final answer for the question {question}."
   ),  
   expected_output=(  
       "Return a clear and concise response if 'yes'. "  
       "Perform a web search and return a response if 'no'. "  
       "Otherwise, respond 'Sorry! Unable to find a valid response'."  
   ),  
   context=[hallucination_task],
   agent=answer_grader,
)  
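Stripped of the agent machinery, the answer task encodes a small decision chain: keep the answer if the hallucination grade is 'yes', fall back to web search if it is 'no', and apologize otherwise. A plain-Python sketch of that logic (the function name and signature are ours, purely illustrative):

```python
def decide_final_answer(hallucination_grade: str, answer: str, web_search) -> str:
    """Mirror the answer task: trust graded answers, fall back to web search, else apologize."""
    grade = hallucination_grade.strip().lower()
    if grade == 'yes':
        return answer          # answer is grounded: return it as-is
    if grade == 'no':
        return web_search()    # not grounded: retry via web search
    return 'Sorry! Unable to find a valid response'
```

Seeing the control flow in plain code makes it easier to verify that every grading outcome has a defined behavior before handing the logic to an LLM agent.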


9. Assembling the Crew          

We assemble all the agents and tasks into a Crew that will process the user's question.

rag_crew = Crew(  
   agents=[Router_Agent, Retriever_Agent, Grader_agent, hallucination_grader, answer_grader], 
   tasks=[router_task, retriever_task, grader_task, hallucination_task, answer_task],  
   verbose=True,  
)  


10. Running the System        

We define the user's question and run the system.  

inputs = {"question": "How does self-attention mechanism help large language models?"}  
result = rag_crew.kickoff(inputs=inputs)
print(result)  


Expected Output      

The system will process the question through the following steps: 

  1. Router Task - Determines that the question is about "self-attention" and routes it to the vector store.
  2. Retriever Task - Retrieves relevant information from the PDF using rag_tool.
  3. Grader Task - Assesses the relevance of the retrieved information.
  4. Hallucination Task - Checks if the answer is grounded in facts.
  5. Answer Task - Provides the final answer if everything checks out or performs a web search if not.  

The final output should be a clear and concise explanation of how the self-attention mechanism helps large language models, retrieved from the PDF document.  

Conclusion   

By leveraging LangChain, CrewAI, and various tools and agents, we've built an intelligent question-answering system capable of dynamically selecting the best information source based on the user's query. This approach demonstrates the power of combining language models with retrieval tools and multi-agent orchestration to create robust AI applications.  

This system can be extended and customized further to handle a wide range of queries and integrate additional data sources or processing steps, opening up possibilities for more advanced AI assistants and knowledge retrieval systems.  

Arun Gopalakrishnan
Senior Module Lead