
Use AutoAI RAG with watsonx Text Extraction service
Disclaimers
Use only Projects and Spaces that are available in the watsonx context.
Notebook content
This notebook demonstrates how to process data using the IBM watsonx.ai Text Extraction service and use the result in an AutoAI RAG experiment. The data used in this notebook is from the Granite Code Models paper.
Some familiarity with Python is helpful. This notebook uses Python 3.12.
Learning goals
The learning goals of this notebook are:
Process data using the IBM watsonx.ai Text Extraction service
Create an AutoAI RAG job that will find the best RAG pattern based on processed data
Contents
This notebook contains the following parts:
- Install and import the required modules and dependencies
- Connect to WML
- Working with spaces
- Create instances of the APIClient and the COS client
- Prepare data and connections for the Text Extraction service
- Run the AutoAI RAG experiment
- Get the selected pattern
- Deploy the RAGPattern
- Test the deployed function
- Summary
Install and import the required modules and dependencies
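The install cell producing the output below would look roughly as follows; a minimal sketch, assuming the rag extra of ibm-watsonx-ai pulls in the packages listed in the output:

```python
%pip install -U "ibm-watsonx-ai[rag]"
%pip install wget
```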
Successfully installed Pillow-12.1.0 SQLAlchemy-2.0.45 XlsxWriter-3.2.9 aiohappyeyeballs-2.6.1 aiohttp-3.13.3 aiosignal-1.4.0 annotated-types-0.7.0 anyio-4.12.1 attrs-25.4.0 backoff-2.2.1 bcrypt-5.0.0 beautifulsoup4-4.13.5 build-1.4.0 cachetools-6.2.4 certifi-2026.1.4 charset_normalizer-3.4.4 chromadb-1.4.0 click-8.3.1 coloredlogs-15.0.1 dataclasses-json-0.6.7 distro-1.9.0 durationpy-0.10 elastic-transport-8.17.1 elasticsearch-8.19.3 et-xmlfile-2.0.0 filelock-3.20.2 flatbuffers-25.12.19 frozenlist-1.8.0 fsspec-2025.12.0 google-auth-2.47.0 googleapis-common-protos-1.72.0 grpcio-1.76.0 h11-0.16.0 hf-xet-1.2.0 httpcore-1.0.9 httptools-0.7.1 httpx-0.28.1 httpx-sse-0.4.3 huggingface-hub-1.2.4 humanfriendly-10.0 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 ibm-db-3.2.8 ibm-watsonx-ai-1.4.11 idna-3.11 importlib-metadata-8.7.1 importlib-resources-6.5.2 jmespath-1.0.1 joblib-1.5.3 jsonpatch-1.33 jsonpointer-3.0.0 jsonschema-4.26.0 jsonschema-specifications-2025.9.1 kubernetes-33.1.0 langchain-0.3.27 langchain-chroma-0.2.5 langchain-community-0.3.31 langchain-core-0.3.81 langchain-db2-0.1.7 langchain-elasticsearch-0.3.2 langchain-ibm-0.3.20 langchain-milvus-0.2.1 langchain-text-splitters-0.3.11 langgraph-0.6.11 langgraph-checkpoint-3.0.1 langgraph-prebuilt-0.6.5 langgraph-sdk-0.2.15 langsmith-0.6.1 lomond-0.3.3 lxml-6.0.2 markdown-3.8.2 markdown-it-py-4.0.0 marshmallow-3.26.2 mdurl-0.1.2 mmh3-5.2.0 mpmath-1.3.0 multidict-6.7.0 mypy-extensions-1.1.0 numpy-2.4.0 oauthlib-3.3.1 onnxruntime-1.23.2 openpyxl-3.1.5 opentelemetry-api-1.39.1 opentelemetry-exporter-otlp-proto-common-1.39.1 opentelemetry-exporter-otlp-proto-grpc-1.39.1 opentelemetry-proto-1.39.1 opentelemetry-sdk-1.39.1 opentelemetry-semantic-conventions-0.60b1 orjson-3.11.5 ormsgpack-1.12.1 overrides-7.7.0 pandas-2.2.3 posthog-5.4.0 propcache-0.4.1 protobuf-6.33.2 pyYAML-6.0.3 pyasn1-0.6.1 pyasn1-modules-0.4.2 pybase64-1.4.3 pydantic-2.12.5 pydantic-core-2.41.5 pydantic-settings-2.12.0 pymilvus-2.6.6 pypdf-6.5.0 pypika-0.48.9 pyproject_hooks-1.2.0 python-docx-1.2.0 python-dotenv-1.2.1 python-pptx-1.0.2 pytz-2025.2 referencing-0.37.0 requests-2.32.5 requests-oauthlib-2.0.0 requests-toolbelt-1.0.0 rich-14.2.0 rpds-py-0.30.0 rsa-4.9.1 scikit-learn-1.8.0 scipy-1.16.3 shellingham-1.5.4 simsimd-6.5.12 soupsieve-2.8.1 sympy-1.14.0 tabulate-0.9.0 tenacity-9.1.2 threadpoolctl-3.6.0 tokenizers-0.22.2 tqdm-4.67.1 typer-0.21.1 typer-slim-0.21.1 typing-inspect-0.9.0 typing-inspection-0.4.2 tzdata-2025.3 urllib3-2.6.3 uuid-utils-0.13.0 uvicorn-0.40.0 uvloop-0.22.1 watchfiles-1.1.1 websocket-client-1.9.0 websockets-15.0.1 xxhash-3.6.0 yarl-1.22.0 zipp-3.23.0 zstandard-0.25.0
Note: you may need to restart the kernel to use updated packages.
Collecting wget
Using cached wget-3.2-py3-none-any.whl
Installing collected packages: wget
Successfully installed wget-3.2
Note: you may need to restart the kernel to use updated packages.
Connect to WML
Authenticate the Watson Machine Learning service on IBM Cloud Pak® for Data. You need to provide the platform URL, your username, and your API key:
- url - URL that points to your CPD instance
- username - username for your CPD instance
- api_key - API key for your CPD instance
Alternatively, you can use your username and password to authenticate WML services.
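A minimal authentication sketch (the host URL, username, and API key below are placeholders; version corresponds to CPD 5.3):

```python
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://<your-cpd-host>",   # platform URL of your CPD instance
    username="<your-username>",
    api_key="<your-api-key>",        # alternatively: password="<your-password>"
    instance_id="openshift",
    version="5.3",
)
```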
Working with spaces
First, you need to create a space for your work. If you do not have a space already created, you can use {PLATFORM_URL}/ml-runtime/spaces?context=icp4data to create one.
- Click New Deployment Space
- Create an empty space
- Go to the space Settings tab
- Copy the Space GUID into your env file, or enter it in the prompt that appears after running the cell below
Tip: You can also use SDK to prepare the space for your work. Find more information in the Space Management sample notebook.
Action: Assign the space ID below
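For example, reading the ID from an environment variable or prompting for it:

```python
import os

try:
    space_id = os.environ["SPACE_ID"]
except KeyError:
    space_id = input("Please enter your space_id (hit enter): ")
```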
Create an instance of APIClient with authentication details
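For example:

```python
from ibm_watsonx_ai import APIClient

client = APIClient(credentials=credentials, space_id=space_id)
```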
Create an instance of COS client
Connect to the default COS instance for the provided space by using the ibm_boto3 package.
Create a new bucket.
Initialize the client connection to the created bucket and get the connection ID.
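A sketch of these three steps, assuming you have the endpoint URL and HMAC keys of the default COS instance (all credential values and the bucket name below are placeholders):

```python
import ibm_boto3

cos_endpoint = "<cos-endpoint-url>"
cos_access_key = "<cos-access-key>"
cos_secret_key = "<cos-secret-key>"

# Client connection to the COS instance.
cos_client = ibm_boto3.client(
    "s3",
    endpoint_url=cos_endpoint,
    aws_access_key_id=cos_access_key,
    aws_secret_access_key=cos_secret_key,
)

# Create a new bucket (illustrative name).
bucket_name = "autoai-rag-text-extraction-docs"
cos_client.create_bucket(Bucket=bucket_name)

# Register the bucket as a connection asset and keep its ID
# for the DataConnection objects created later.
datasource_type_id = client.connections.get_datasource_type_uid_by_name(
    "bluemixcloudobjectstorage"
)
connection_details = client.connections.create({
    client.connections.ConfigurationMetaNames.NAME: "COS connection",
    client.connections.ConfigurationMetaNames.DATASOURCE_TYPE: datasource_type_id,
    client.connections.ConfigurationMetaNames.PROPERTIES: {
        "bucket": bucket_name,
        "access_key": cos_access_key,
        "secret_key": cos_secret_key,
        "url": cos_endpoint,
    },
})
connection_id = client.connections.get_id(connection_details)
```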
Prepare data and connections for the Text Extraction service
The document from which we are going to extract text is located in IBM Cloud Object Storage (COS). In this notebook, we use the Granite Code Models paper as the source document. The final results file, which contains the extracted text and the necessary metadata, will also be placed in COS. We therefore use the ibm_watsonx_ai.helpers.DataConnection and ibm_watsonx_ai.helpers.S3Location classes to create Python objects that represent references to the processed files. The reference to the final results will be used as input for the AutoAI RAG experiment.
Download and upload training data to the COS bucket. Then define a connection to the uploaded file.
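A sketch of the download and upload; the arXiv URL and file name are assumptions:

```python
import wget

paper_url = "https://arxiv.org/pdf/2405.04324"  # Granite Code Models paper (assumed URL)
document_filename = "granite_code_models_paper.pdf"

wget.download(paper_url, document_filename)
cos_client.upload_file(document_filename, bucket_name, document_filename)
```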
Input file connection.
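For example:

```python
from ibm_watsonx_ai.helpers import DataConnection, S3Location

document_reference = DataConnection(
    connection_asset_id=connection_id,
    location=S3Location(bucket=bucket_name, path=document_filename),
)
```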
Output file connection.
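For example (the results file name is illustrative):

```python
results_filename = "granite_code_models_paper.extraction.json"

results_reference = DataConnection(
    connection_asset_id=connection_id,
    location=S3Location(bucket=bucket_name, path=results_filename),
)
```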
Initialize the Text Extraction service endpoint.
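A sketch using the SDK's TextExtractions helper:

```python
from ibm_watsonx_ai.foundation_models.extractions import TextExtractions

extraction = TextExtractions(api_client=client, space_id=space_id)
```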
Run a text extraction job for connections created in the previous step.
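For example; the processing steps shown (OCR and table processing) are illustrative:

```python
from ibm_watsonx_ai.metanames import TextExtractionsMetaNames

extraction_details = extraction.run_job(
    document_reference=document_reference,
    results_reference=results_reference,
    steps={
        TextExtractionsMetaNames.OCR: {"languages_list": ["en"]},
        TextExtractionsMetaNames.TABLE_PROCESSING: {"enabled": True},
    },
)
# The job ID is assumed to be available in the details payload.
extraction_job_id = extraction_details["metadata"]["id"]
```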
Get the text extraction result.
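A sketch that polls the job and previews the result, assuming the job details expose a status field and the results JSON exposes the extracted text under a top-level "text" key (exact field names may differ between SDK versions):

```python
import json
import time

# Poll until the extraction job finishes.
while True:
    job_details = extraction.get_job_details(extraction_id=extraction_job_id)
    state = job_details["entity"]["results"]["status"]
    if state in ("completed", "failed"):
        break
    time.sleep(10)

# Download the results file from COS and preview the extracted text.
cos_client.download_file(bucket_name, results_filename, results_filename)
with open(results_filename) as f:
    extraction_result = json.load(f)
print(extraction_result["text"][:2000])
```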
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Mayank Mishra⋆ Matt Stallone⋆ Gaoyuan Zhang⋆ Yikang Shen Aditya Prasad Adriana Meza Soria Michele Merler Parameswaran Selvam Saptha Surendran Shivdeep Singh Manish Sethi Xuan-Hong Dang Pengyuan Li Kun-Lung Wu Syed Zawad Andrew Coleman Matthew White Mark Lewis Raju Pavuluri Yan Koyfman Boris Lublinsky Maximilien de Bayser Ibrahim Abdelaziz Kinjal Basu Mayank Agarwal Yi Zhou Chris Johnson Aanchal Goyal Hima Patel Yousaf Shah Petros Zerfos Heiko Ludwig Asim Munawar Maxwell Crouse Pavan Kapanipathi Shweta Salaria Bob Calio Sophia Wen Seetharami Seelam Brian Belgodere Carlos Fonseca Amith Singhee Nirmit Desai David D. Cox Ruchir Puri† Rameswar Panda†
IBM Research ⋆Equal Contribution
†Corresponding Authors [email protected], [email protected]
Abstract
Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabilities, including code generation, fixing bugs, explaining and documenting code, maintaining repositories, and more. In this work, we introduce the Granite series of decoder-only code models for code generative tasks, trained with code written in 116 programming languages. The Granite Code models family consists of models ranging in size from 3 to 34 billion parameters, suitable for applications ranging from complex application modernization tasks to on-device memory-constrained use cases. Evaluation on a comprehensive set of tasks demonstrates that Granite Code models consistently reaches state-of-the-art performance among available open-source code LLMs. The Granite Code model family was optimized for enterprise software development workflows and performs well across a range of coding tasks (e.g. code generation, fixing and explanation), making it a versatile “all around” code model. We release all our Granite Code models under an Apache 2.0 license for both research and commercial use.
https://github.com/ibm-granite/granite-code-models
1 Introduction
Over the last several decades, software has been woven into the fabric of every aspect of our society. As demand for software development surges, it is more critical than ever to increase software development productivity, and LLMs provide a promising path for augmenting human programmers. Prominent enterprise use cases for LLMs in software development productivity include code generation, code explanation, code fixing, unit test and documentation generation, application modernization, vulnerability detection, code translation, and more.
Recent years have seen rapid progress in LLM’s ability to generate and manipulate code, and a range of models with impressive coding abi
Upload a JSON file with benchmarking data to COS and define a connection to this file.
Note: correct_answer_document_ids must refer to the document produced by the Text Extraction service, not the initial document.
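A sketch of the benchmarking file; the question and answer shown are illustrative, and correct_answer_document_ids points at the extraction results file:

```python
import json

benchmark_filename = "benchmark.json"

benchmarking_data = [
    {
        "question": "What are the two main variants of Granite Code models?",
        "correct_answer": "The two main variants are Granite Code Base and Granite Code Instruct.",
        "correct_answer_document_ids": [results_filename],
    }
]

with open(benchmark_filename, "w") as f:
    json.dump(benchmarking_data, f)

cos_client.upload_file(benchmark_filename, bucket_name, benchmark_filename)

test_data_reference = DataConnection(
    connection_asset_id=connection_id,
    location=S3Location(bucket=bucket_name, path=benchmark_filename),
)
```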
Test the data connection.
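For example, reading the file back through the connection (raw=True returns the content without parsing it into a DataFrame):

```python
test_data_reference.set_client(client)
test_data_reference.read(raw=True)
```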
Use the reference to the Text Extraction job result as input for the AutoAI RAG experiment.
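A sketch of the experiment setup; the experiment name, description, and optimizer parameters are illustrative:

```python
from ibm_watsonx_ai.experiment import AutoAI

experiment = AutoAI(credentials, space_id=space_id)

rag_optimizer = experiment.rag_optimizer(
    name="AutoAI RAG - Granite Code Models paper",
    description="RAG experiment on text extracted from the Granite Code Models paper",
    max_number_of_rag_patterns=4,
    optimization_metrics=[AutoAI.RAGMetrics.ANSWER_CORRECTNESS],
)

# The Text Extraction results file is the experiment input.
input_data_references = [results_reference]
```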
Call the run() method to trigger the AutoAI RAG experiment. Choose one of two modes:
- To use the interactive mode (synchronous job), specify background_mode=False
- To use the background mode (asynchronous job), specify background_mode=True
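For example, in interactive mode:

```python
run_details = rag_optimizer.run(
    input_data_references=input_data_references,
    test_data_references=[test_data_reference],
    background_mode=False,
)
```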
Get the selected pattern
Get the RAGPattern object from the RAG Optimizer experiment. By default, the pattern with the best evaluation score is returned.
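For example:

```python
# Inspect the leaderboard of evaluated patterns, then fetch the winner.
summary = rag_optimizer.summary()
print(summary)

best_pattern = rag_optimizer.get_best_pattern()
```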
Test the RAGPattern by querying it locally.
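A sketch of a local query, assuming RAGPattern.query accepts a standard scoring payload (the question is illustrative):

```python
questions = ["What programming languages were the Granite Code models trained on?"]

payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [
        {"values": questions}
    ]
}

print(best_pattern.query(payload))
```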
Deploy the RAGPattern
Store the defined RAG function and create a deployment of the RAGPattern in the space.
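A sketch, assuming RAGPattern exposes deploy() as in the SDK's AutoAI RAG samples (the deployment name is illustrative):

```python
deployment_details = best_pattern.deploy(
    name="AutoAI RAG deployment - Granite Code Models paper",
    space_id=space_id,
)
deployment_id = client.deployments.get_id(deployment_details)
```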
Test the deployed function
The RAG service is now deployed in the space. To test the solution, run the cell below. Questions must be provided in the payload, in the format shown below.
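For example (the question is illustrative):

```python
questions = ["Under which license are the Granite Code models released?"]

payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [
        {"values": questions}
    ]
}

score_response = client.deployments.score(deployment_id, payload)
print(score_response)
```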
Summary
You successfully completed this notebook!
You learned how to use AutoAI RAG with documents processed by the Text Extraction service.
Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.
Author:
Paweł Kocur, Software Engineer at watsonx.ai.
Copyright © 2025-2026 IBM. This notebook and its source code are released under the terms of the MIT License.