Path: blob/master/cloud/notebooks/python_sdk/experiments/autoai_rag/Use AutoAI RAG with predefined Milvus index to create a pattern about IBM.ipynb
5168 views

Use AutoAI RAG with predefined Milvus index to create a pattern about IBM
Disclaimers
Use only Projects and Spaces that are available in watsonx context.
Notebook content
This notebook contains the steps and code to demonstrate the usage of IBM AutoAI RAG with predefined vector store collection. Although this example uses Milvus, Elasticsearch and Chroma databases can be used similarly. Note that the AutoAI RAG experiment conducted in this notebook uses data scraped from the ibm-watsonx-ai SDK documentation.
Some familiarity with Python is helpful. This notebook uses Python 3.11.
Learning goal
The learning goals of this notebook are:
Create an AutoAI RAG job that will find the best RAG pattern based on collection created from
ibm-watsonx-aiSDK documentation.
Contents
This notebook contains the following parts:
Set up the environment
Before you use the sample code in this notebook, you must perform the following setup tasks:
Create a watsonx.ai Runtime Service instance (a free plan is offered and information about how to create the instance can be found here).
Install and import the required modules and dependencies
Successfully installed Pillow-12.0.0 SQLAlchemy-2.0.44 XlsxWriter-3.2.9 aiohappyeyeballs-2.6.1 aiohttp-3.13.2 aiosignal-1.4.0 annotated-types-0.7.0 anyio-4.11.0 attrs-25.4.0 backoff-2.2.1 bcrypt-5.0.0 beautifulsoup4-4.13.5 build-1.3.0 cachetools-6.2.2 certifi-2025.11.12 charset_normalizer-3.4.4 chromadb-1.3.5 click-8.3.1 coloredlogs-15.0.1 dataclasses-json-0.6.7 distro-1.9.0 durationpy-0.10 elastic-transport-8.17.1 elasticsearch-8.19.2 et-xmlfile-2.0.0 filelock-3.20.0 flatbuffers-25.9.23 frozenlist-1.8.0 fsspec-2025.10.0 google-auth-2.43.0 googleapis-common-protos-1.72.0 grpcio-1.76.0 h11-0.16.0 hf-xet-1.2.0 httpcore-1.0.9 httptools-0.7.1 httpx-0.28.1 httpx-sse-0.4.3 huggingface-hub-1.1.5 humanfriendly-10.0 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 ibm-db-3.2.7 ibm-watsonx-ai-1.4.7 idna-3.11 importlib-resources-6.5.2 jmespath-1.0.1 joblib-1.5.2 jsonpatch-1.33 jsonpointer-3.0.0 jsonschema-4.25.1 jsonschema-specifications-2025.9.1 kubernetes-33.1.0 langchain-0.3.27 langchain-chroma-0.2.5 langchain-community-0.3.31 langchain-core-0.3.80 langchain-db2-0.1.7 langchain-elasticsearch-0.3.2 langchain-ibm-0.3.20 langchain-milvus-0.2.1 langchain-text-splitters-0.3.11 langgraph-0.6.11 langgraph-checkpoint-3.0.1 langgraph-prebuilt-0.6.5 langgraph-sdk-0.2.10 langsmith-0.4.48 lomond-0.3.3 lxml-6.0.2 markdown-3.8.2 markdown-it-py-4.0.0 marshmallow-3.26.1 mdurl-0.1.2 mmh3-5.2.0 mpmath-1.3.0 multidict-6.7.0 mypy-extensions-1.1.0 numpy-2.3.5 oauthlib-3.3.1 onnxruntime-1.23.2 openpyxl-3.1.5 opentelemetry-api-1.38.0 opentelemetry-exporter-otlp-proto-common-1.38.0 opentelemetry-exporter-otlp-proto-grpc-1.38.0 opentelemetry-proto-1.38.0 opentelemetry-sdk-1.38.0 opentelemetry-semantic-conventions-0.59b0 orjson-3.11.4 ormsgpack-1.12.0 overrides-7.7.0 pandas-2.2.3 posthog-5.4.0 propcache-0.4.1 protobuf-6.33.1 pyYAML-6.0.3 pyasn1-0.6.1 pyasn1-modules-0.4.2 pybase64-1.4.2 pydantic-2.12.5 pydantic-core-2.41.5 pydantic-settings-2.12.0 pymilvus-2.6.4 pypdf-6.4.0 pypika-0.48.9 pyproject_hooks-1.2.0 python-docx-1.2.0 python-dotenv-1.2.1 python-pptx-1.0.2 pytz-2025.2 referencing-0.37.0 requests-2.32.5 requests-oauthlib-2.0.0 requests-toolbelt-1.0.0 rich-14.2.0 rpds-py-0.29.0 rsa-4.9.1 scikit-learn-1.7.2 scipy-1.16.3 shellingham-1.5.4 simsimd-6.5.3 sniffio-1.3.1 soupsieve-2.8 sympy-1.14.0 tabulate-0.9.0 tenacity-9.1.2 threadpoolctl-3.6.0 tokenizers-0.22.1 tqdm-4.67.1 typer-0.20.0 typer-slim-0.20.0 typing-inspect-0.9.0 typing-inspection-0.4.2 tzdata-2025.2 urllib3-2.5.0 uvicorn-0.38.0 uvloop-0.22.1 watchfiles-1.1.1 websocket-client-1.9.0 websockets-15.0.1 xxhash-3.6.0 yarl-1.22.0 zstandard-0.25.0
Note: you may need to restart the kernel to use updated packages.
Defining the watsonx.ai credentials
This cell defines the credentials required to work with the watsonx.ai Runtime service.
Action: Provide the IBM Cloud user API key. For details, see documentation.
Working with spaces
You need to create a space that will be used for your work. If you do not have a space, you can use Deployment Spaces Dashboard to create one.
Click New Deployment Space
Create an empty space
Select Cloud Object Storage
Select watsonx.ai Runtime instance and press Create
Go to Manage tab
Copy
Space GUIDinto your env file or else enter it in the window which will show up after running below cell
Tip: You can also use SDK to prepare the space for your work. More information can be found here.
Action: assign space ID below
Create an instance of APIClient with authentication details.
Download example data. You can also assign your own text to document content.
Chunk and upload your document to the vector store.
Define a reference to knowledge base.
Defining a connection to test data
Upload a json file that will be used for benchmarking to COS and then define a connection to this file. Define benchmarking question about your knowledge base. Replace the questions below.
Upload testing data to the bucket as a json file.
Define connection information to testing data.
RAG Optimizer configuration
Provide the input information for AutoAI RAG optimizer:
name- experiment namedescription- experiment descriptionmax_number_of_rag_patterns- maximum number of RAG patterns to createoptimization_metrics- target optimization metrics
Configuration parameters can be retrieved via get_params().
You can use the get_run_status() method to monitor AutoAI RAG jobs in background mode.
Additionally, you can pass the scoring parameter to the summary method, to filter RAG patterns starting with the best.
Get selected pattern
Get the RAGPattern object from the RAG Optimizer experiment. By default, the RAGPattern of the best pattern is returned.
The pattern details can be retrieved by calling the get_pattern_details method:
Query the RAGPattern locally, to test it.
Deploy RAGPattern
Deployment is done by storing the defined RAG function and then by creating a deployed asset.
Test the deployed function
RAG service is now deployed in our space. To test our solution we can run the cell below. Questions have to be provided in the payload. Their format is provided below.
Get executed optimizer's configuration parameters
Get historical rag_optimizer instance and training details
List trained patterns for selected optimizer
To delete the deployment, use the delete method.
Warning: Keeping the deployment active may lead to unnecessary consumption of Compute Unit Hours (CUHs).
To delete obsolete collections, us the clear method of MilvusVectorStore
If you want to clean up all created assets:
experiments
trainings
pipelines
model definitions
models
functions
deployments
please follow up this sample notebook.
Summary and next steps
You successfully completed this notebook!
You learned how to use ibm-watsonx-ai to run AutoAI RAG experiments.
Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.
Authors
Paweł Kocur, Software Engineer watsonx.ai
Copyright © 2025-2026 IBM. This notebook and its source code are released under the terms of the MIT License.