Use watsonx.ai and LangGraph to build Adaptive RAG Graph
Disclaimers
Use only Projects and Spaces that are available in watsonx context.
Notebook content
This notebook contains the steps and code to demonstrate how to build an Adaptive RAG application using LangGraph and watsonx.ai models.
Some familiarity with Python is helpful. This notebook uses Python 3.11.
Learning goal
The purpose of this notebook is to demonstrate how to use language models, e.g. meta-llama/llama-3-70b-instruct, to create Adaptive RAG applications using the tools available in LangGraph. LangGraph is an agent orchestrator with which you can build graph applications that automatically execute sequences of actions, and in which the LLM is the key decision maker that determines the next step.
Contents
This notebook contains the following parts:

Set up the environment
Adaptive RAG Graph
Summary and next steps
Set up the environment
Before you use the sample code in this notebook, you must perform the following setup tasks:
Create a watsonx.ai Runtime Service instance (a free plan is offered and information about how to create the instance can be found here).
Install and import dependencies
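A minimal sketch of the installs this flow relies on; the exact package set and versions are assumptions:

```python
# Assumed package set for this notebook; pin versions as needed.
!pip install -U langchain langchain-ibm langchain-community langchain-chroma \
    langchainhub langgraph ibm-watsonx-ai beautifulsoup4
```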
Defining the watsonx.ai credentials
This cell defines the watsonx.ai credentials required to work with watsonx Foundation Model inferencing.
Action: Provide the IBM Cloud user API key. For details, see documentation.
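One way to provide them, sketched below; the endpoint URL is illustrative (use the one for your region):

```python
import getpass

credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",  # illustrative; pick your region's endpoint
    "apikey": getpass.getpass("Please enter your watsonx.ai API key (hit enter): "),
}
```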
Defining the project id
The Foundation Model requires a project id that provides the context for the call. We will try to obtain the id from the project in which this notebook runs; otherwise, please provide the project id explicitly.
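A sketch of the usual pattern (the PROJECT_ID environment variable is set automatically when the notebook runs inside a watsonx project):

```python
import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")
```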
Adaptive RAG Graph
Adaptive RAG is a RAG strategy that combines query analysis with self-correcting RAG. The basic idea behind the LangGraph orchestrator is to capture the entire flow of a selected RAG strategy in graph form. In general, each node in our graph where we call the LLM may have a different model underneath. Therefore, before each step definition, we initialize a separate llm with a specific set of generation parameters.
The example below is based on the LangGraph tutorial.
Initialise WatsonxEmbeddings
To embed the documents needed for RAG, we use IBM watsonx.ai embeddings.
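A minimal sketch, assuming the langchain-ibm integration; the embedding model id is illustrative:

```python
from langchain_ibm import WatsonxEmbeddings

embeddings = WatsonxEmbeddings(
    model_id="ibm/slate-125m-english-rtrvr",  # illustrative embedding model
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id,
)
```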
Create VectorStore and Retriever
As additional data, we will use information about IBM watsonx.ai, watsonx Orchestrate and watsonx Assistant, which can be found on the official pages of the aforementioned products.
Next, we create a collection in Chromadb and add documents.
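A sketch of loading, splitting, and indexing the pages; the URLs and chunking parameters are assumptions:

```python
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Illustrative product pages
urls = [
    "https://www.ibm.com/products/watsonx-ai",
    "https://www.ibm.com/products/watsonx-orchestrate",
    "https://www.ibm.com/products/watsonx-assistant",
]

# Load the pages and flatten the per-page document lists
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]

# Split into small chunks so the retriever returns focused passages
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
doc_splits = text_splitter.split_documents(docs_list)

# Create the Chroma collection and expose it as a retriever
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    collection_name="adaptive-rag-chroma",
    embedding=embeddings,
)
retriever = vectorstore.as_retriever()
```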
Router stage
This is the entry stage of our graph. At this step, the LLM decides which path to take to answer the user's question: either rely on the LLM's own knowledge or retrieve the most relevant documents from the vector store to improve the response.
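A minimal sketch of this stage, assuming WatsonxLLM from the langchain-ibm integration; the prompt wording and generation parameters are illustrative:

```python
from langchain_ibm import WatsonxLLM
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

# Short, deterministic generations are enough for routing decisions
llm = WatsonxLLM(
    model_id="meta-llama/llama-3-70b-instruct",
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id,
    params={"decoding_method": "greedy", "max_new_tokens": 100},
)

router_prompt = PromptTemplate(
    template="""You are an expert at routing a user question to a vectorstore or to the LLM's own knowledge.
Use the vectorstore for questions about watsonx.ai, watsonx Orchestrate or watsonx Assistant.
Return a JSON object with a single key "datasource" set to "vectorstore" or "generate", and no preamble.

Question: {question}""",
    input_variables=["question"],
)

question_router = router_prompt | llm | JsonOutputParser()
```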
Let's check how question_router works.
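For example, with an illustrative question:

```python
# A question about a product described in the vector store should route there
print(question_router.invoke({"question": "What is watsonx Orchestrate?"}))
```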
Retrieval Grader
One node in our adaptive RAG graph will be concerned with evaluating the retrieved documents. The purpose of this stage is to let the LLM decide whether the retrieved documents are relevant to the user question. Since we do not need a separate llm at this stage for the purposes of this example, we are going to reuse the llm from the router stage.
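A sketch of the grader chain; the prompt wording is illustrative:

```python
grader_prompt = PromptTemplate(
    template="""You are a grader assessing the relevance of a retrieved document to a user question.
If the document contains keywords or meaning related to the question, grade it as relevant.
Return a JSON object with a single key "score" set to "yes" or "no", and no preamble.

Document: {document}

Question: {question}""",
    input_variables=["document", "question"],
)

retrieval_grader = grader_prompt | llm | JsonOutputParser()
```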
Let's see whether the second document on the list docs is relevant to our test question.
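For example (test question illustrative):

```python
question = "What is watsonx.ai?"
docs = retriever.invoke(question)
print(retrieval_grader.invoke({"question": question, "document": docs[1].page_content}))
```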
Generate
The pivotal node of our graph. The purpose of this node is to generate a response based on the retrieved relevant documents or the model's own knowledge. Since the prompt for the scenario where the model needs to use "own" knowledge to generate the response may be slightly different from the RAG prompt, we create a separate one, rag_simple_chain
. On the other hand, for the RAG route we download a specially adapted prompt from the LangChain hub.
Moreover, the llm used at this stage should be able to generate a relatively long response, so we initialize a new llm where we set an appropriately large value for the max_new_tokens
parameter.
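A sketch of both chains; the hub prompt name (rlm/rag-prompt) and the generation parameters are assumptions:

```python
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_ibm import WatsonxLLM

# A separate llm with a larger generation budget for full answers
llm_generate = WatsonxLLM(
    model_id="meta-llama/llama-3-70b-instruct",
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id,
    params={"decoding_method": "greedy", "max_new_tokens": 1000},
)

# RAG route: a RAG prompt pulled from the LangChain hub (prompt name assumed)
rag_prompt = hub.pull("rlm/rag-prompt")
rag_chain = rag_prompt | llm_generate | StrOutputParser()

# Non-RAG route: answer from the model's own knowledge
simple_prompt = PromptTemplate(
    template="Answer the question below using your own knowledge.\n\nQuestion: {question}\n\nAnswer:",
    input_variables=["question"],
)
rag_simple_chain = simple_prompt | llm_generate | StrOutputParser()

# Generate an answer for the test question using the retrieved documents
generation = rag_chain.invoke({"context": docs, "question": question})
print(generation)
```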
Hallucination Grader
Hallucination is one of the phenomena that may occur when an LLM generates new content. It appears when a large language model produces a response that is either factually incorrect or ungrounded in the input prompt.
Since a separate llm is not needed for our purposes, we will reuse the one created in the router stage.
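A sketch of the grader chain; the prompt wording is illustrative:

```python
hallucination_prompt = PromptTemplate(
    template="""You are a grader assessing whether a generation is grounded in a set of documents.
Return a JSON object with a single key "score" set to "yes" or "no", and no preamble.

Documents: {documents}

Generation: {generation}""",
    input_variables=["documents", "generation"],
)

hallucination_grader = hallucination_prompt | llm | JsonOutputParser()
```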
Let us check whether the text generated above is grounded in our documents.
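For example:

```python
print(hallucination_grader.invoke({"documents": docs, "generation": generation}))
```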
Answer Grader
After the LLM generates an answer, we check whether it actually answers the user's question. We are going to use the same llm as in the router stage.
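A sketch of the grader and a quick check on the answer generated above (prompt wording illustrative):

```python
answer_prompt = PromptTemplate(
    template="""You are a grader assessing whether an answer resolves a user question.
Return a JSON object with a single key "score" set to "yes" or "no", and no preamble.

Answer: {generation}

Question: {question}""",
    input_variables=["generation", "question"],
)

answer_grader = answer_prompt | llm | JsonOutputParser()

print(answer_grader.invoke({"question": question, "generation": generation}))
```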
The LLM's generated response should be directly related to the user's question.
Question Re-writer
Finally, when the selected documents are not appropriate for the question or the generated answer is not good enough, we can use LLM to rewrite the question in such a way that the newly retrieved documents will better match it.
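A sketch of the re-writer chain (prompt wording illustrative):

```python
rewrite_prompt = PromptTemplate(
    template="""You are a question re-writer that converts an input question into a version
optimized for vectorstore retrieval. Output only the improved question, with no preamble.

Question: {question}

Improved question:""",
    input_variables=["question"],
)

question_rewriter = rewrite_prompt | llm | StrOutputParser()
```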
Graph state
Graph state is one of the central concepts of LangGraph. Each node's action updates the state, which is passed between nodes in the graph as they execute.
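A sketch of the state definition, following the LangGraph tutorial this example is based on:

```python
from typing import List
from typing_extensions import TypedDict

class GraphState(TypedDict):
    """State passed between the nodes of the graph."""
    question: str         # current (possibly rewritten) user question
    generation: str       # last LLM generation
    documents: List[str]  # retrieved documents
```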
Node actions
The execution of an action by a node updates the internal state of the graph. Note that each action is defined by a custom function.
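A sketch of three of the node actions; the grading and rewriting nodes follow the same pattern:

```python
def retrieve(state: GraphState) -> dict:
    """Fetch documents from the vector store for the current question."""
    question = state["question"]
    documents = retriever.invoke(question)
    return {"documents": documents, "question": question}

def generate(state: GraphState) -> dict:
    """Generate an answer from the retrieved documents (RAG route)."""
    question = state["question"]
    documents = state["documents"]
    generation = rag_chain.invoke({"context": documents, "question": question})
    return {"documents": documents, "question": question, "generation": generation}

def generate_simple(state: GraphState) -> dict:
    """Generate an answer from the model's own knowledge (non-RAG route)."""
    question = state["question"]
    generation = rag_simple_chain.invoke({"question": question})
    return {"question": question, "generation": generation}
```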
Set Conditional Edges
The next node to call is selected based on the current node's decision. Therefore, we need to set conditional functions on the edges to make the executor follow the right path.
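A sketch of the routing function used at the entry point; the node names it returns must match the names used when building the graph below:

```python
def route_question(state: GraphState) -> str:
    """Decide whether to answer via RAG or from the model's own knowledge."""
    source = question_router.invoke({"question": state["question"]})
    if source["datasource"] == "vectorstore":
        return "retrieve"
    return "generate_simple"
```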
Build graph
Since all needed components are ready, we can proceed to build an Adaptive RAG Graph.
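A simplified sketch of the wiring, using only the nodes defined above; the full graph would also wire in the grading and rewriting nodes with their own conditional edges:

```python
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(GraphState)

workflow.add_node("retrieve", retrieve)
workflow.add_node("generate", generate)
workflow.add_node("generate_simple", generate_simple)

# Entry point: route the question to the RAG path or the plain-LLM path
workflow.add_conditional_edges(
    START,
    route_question,
    {"retrieve": "retrieve", "generate_simple": "generate_simple"},
)
workflow.add_edge("retrieve", "generate")
workflow.add_edge("generate", END)
workflow.add_edge("generate_simple", END)
```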
Compile the graph.
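Compilation is a single call:

```python
app = workflow.compile()
```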
Moreover, to better understand the flow in our graph, we can visualize it.
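One way to render it inline (a sketch; requires the optional mermaid drawing dependencies):

```python
from IPython.display import Image, display

display(Image(app.get_graph().draw_mermaid_png()))
```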
Now let us test our adaptive RAG graph and see which path the Agent will choose if we pass a question about an object whose description can be found in the vector store.
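For example, with an illustrative in-store question:

```python
inputs = {"question": "What is watsonx Orchestrate?"}
for output in app.stream(inputs):
    for node, state in output.items():
        print(f"Finished node: {node}")
print(state["generation"])
```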
Next, let us check how the Agent will behave if we pass a question unrelated to the data stored in the vector store.
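For example, with an illustrative out-of-store question:

```python
inputs = {"question": "Who wrote the opera Don Giovanni?"}
for output in app.stream(inputs):
    for node, state in output.items():
        print(f"Finished node: {node}")
print(state["generation"])
```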
If we ask a question whose content is related to data stored in the vector store, then the application will choose the RAG path. However, when the question is not related to the data stored in the vector store, the LLM will choose to use its "own" knowledge and generate an answer.
Summary and next steps
You successfully completed this notebook!
You learned how to build an Adaptive RAG graph using LangGraph and WatsonxLLM.
Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.
Authors:
Mateusz Świtała, Software Engineer at watsonx.ai.
Copyright © 2024-2025 IBM. This notebook and its source code are released under the terms of the MIT License.