GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cpd5.2/notebooks/python_sdk/deployments/ai_services/Use watsonx to run generate_batch job using AI service.ipynb
Kernel: test_env

Use watsonx to run `generate_batch` job using AI service

Disclaimers

Use only Projects and Spaces that are available in watsonx context.

Notebook content

This notebook provides a detailed demonstration of the steps and code required to showcase support for watsonx.ai AI service.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goal

The primary objective of this notebook is to illustrate how to utilize watsonx.ai AI services to execute a generate_batch job, facilitating the ingestion of documents into a Milvus vector database.

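This notebook contains the following parts:

Set up the environment
Create an embedding function for VectorStore
Set up connectivity information to Milvus
Set up VectorStore with Milvus credentials
References to input data
Create AI service
Testing AI service's function locally
Deploy AI service
Example of executing an AI service
Cleanup
Summary and next steps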

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

Contact your Cloud Pak for Data administrator and ask for your account credentials

Install and import `ibm-watsonx-ai` and its dependencies

Note: ibm-watsonx-ai documentation can be found here.

In [ ]:

%pip install -U langchain_community | tail -n 1
%pip install -U "ibm_watsonx_ai>=1.3.20" | tail -n 1

Connect to WML

Authenticate the Watson Machine Learning service on IBM Cloud Pak for Data. You need to provide the platform url, your username, and your api_key.

url - url which points to your CPD instance.
username - username to your CPD instance.
api_key - api_key to your CPD instance.

In [ ]:

url = "PASTE YOUR CPD INSTANCE URL HERE"
api_key = "PASTE YOUR CPD INSTANCE API KEY HERE"
username = "PASTE YOUR CPD INSTANCE USERNAME HERE"

In [ ]:

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    username=username,
    api_key=api_key,
    url=url,
    instance_id="openshift",
    version="5.2"
)

Alternatively you can use username and password to authenticate WML services.

credentials = Credentials(
    username=***,
    password=***,
    url=***,
    instance_id="openshift",
    version="5.2"
)
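The two authentication variants differ only in whether `api_key` or `password` is supplied. The following is a minimal sketch of a helper that assembles the keyword arguments and rejects ambiguous input. `build_credential_kwargs` is a hypothetical name and not part of the SDK; the URL and key in the example are placeholders. The resulting dict is meant to be unpacked as `Credentials(**kwargs)`.

```python
def build_credential_kwargs(url, username, api_key=None, password=None, version="5.2"):
    """Assemble keyword arguments for Credentials (hypothetical helper, not an SDK function)."""
    # Exactly one secret must be supplied: an API key or a password.
    if (api_key is None) == (password is None):
        raise ValueError("Provide exactly one of api_key or password")
    kwargs = {
        "url": url,
        "username": username,
        "instance_id": "openshift",  # value used throughout this notebook
        "version": version,
    }
    if api_key is not None:
        kwargs["api_key"] = api_key
    else:
        kwargs["password"] = password
    return kwargs


# Placeholder values for illustration only.
example = build_credential_kwargs("https://cpd.example.com", "admin", api_key="dummy-key")
print(sorted(example))
```

Rejecting the case where both or neither secret is given surfaces a misconfiguration before any request is sent.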

Working with spaces

First, you need to create a space for your work. If you do not have a space already created, you can use {PLATFORM_URL}/ml-runtime/spaces?context=icp4data to create one.

Click New Deployment Space
Create an empty space
Go to the space Settings tab
Copy the space_id and paste it below

PLATFORM_URL is the url which points to your CPD instance.

Tip: You can also use the SDK to prepare the space for your work. Find more information in the Space Management sample notebook.

Action: Assign the space ID below

In [3]:

import os

try:
    space_id = os.environ["SPACE_ID"]
except KeyError:
    space_id = input("Please enter your space_id (hit enter): ")

Create an instance of APIClient with authentication details.

In [4]:

from ibm_watsonx_ai import APIClient

api_client = APIClient(
    credentials=credentials, 
    space_id=space_id
)

Create an embedding function for VectorStore

Note that you can feed a custom embedding function to be used by Milvus. The performance of Milvus may differ depending on the embedding model used.

Note: To list available embedding models use:

api_client.foundation_models.EmbeddingModels.show()

In [6]:

from ibm_watsonx_ai.foundation_models import Embeddings

embedding_model_id="ibm/granite-embedding-107m-multilingual"

embeddings = Embeddings(
    model_id=embedding_model_id,
    api_client=api_client
)

Set up connectivity information to Milvus

This notebook focuses on a self-managed Milvus cluster using IBM watsonx.data.

The following cell retrieves the Milvus username, password, host, and port from the environment (if available) and prompts you to provide them manually in case of failure.

You can provide a connection asset ID to read all required connection data from it. Before doing so, make sure that a connection asset was created in your space.

In [8]:

import os
import getpass

milvus_connection_id = input("Provide connection asset ID in your space. Skip this, if you wish to type credentials by hand and hit enter: ") or None

if milvus_connection_id is None:
    try:
        username = os.environ["USERNAME"]
    except KeyError:
        username = input("Please enter your Milvus user name and hit enter: ")
    try:
        password = os.environ["PASSWORD"]
    except KeyError:
        password = getpass.getpass("Please enter your Milvus password and hit enter: ")
    try:
        host = os.environ["HOST"]
    except KeyError:
        host = input("Please enter your Milvus hostname and hit enter: ")
    try:
        port = os.environ["PORT"]
    except KeyError:
        port = input("Please enter your Milvus port number and hit enter: ")
    try:
        ssl = os.environ["SSL"]
    except KeyError:
        ssl = bool(input("Please enter ('y'/anything) if your Milvus instance has SSL enabled. Skip if it is not: "))

    # Create connection
    milvus_data_source_type_id = api_client.connections.get_datasource_type_uid_by_name(
        "milvus"
    )
    details = api_client.connections.create(
        {
            api_client.connections.ConfigurationMetaNames.NAME: "Milvus Connection",
            api_client.connections.ConfigurationMetaNames.DESCRIPTION: "Connection created by the sample notebook",
            api_client.connections.ConfigurationMetaNames.DATASOURCE_TYPE: milvus_data_source_type_id,
            api_client.connections.ConfigurationMetaNames.PROPERTIES: {
                "host": host,
                "port": port,
                "username": username,
                "password": password,
                "ssl": ssl,
            },
        }
    )

    milvus_connection_id = api_client.connections.get_id(details)
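The cell above repeats the same "environment variable, else prompt" pattern five times. As a sketch only, the pattern could be factored into one helper. `env_or_prompt` and `parse_flag` are hypothetical names, and the example uses a fake environment and a stubbed prompt so it does not block on input. Note one assumption that differs from the cell: `parse_flag` treats only a few spellings as true, whereas the cell counts any non-empty answer as "SSL enabled".

```python
import os


def env_or_prompt(name, prompt, env=None, ask=input):
    """Return env[name] if set, otherwise ask the user (hypothetical helper)."""
    env = os.environ if env is None else env
    try:
        return env[name]
    except KeyError:
        return ask(prompt)


def parse_flag(value):
    """Interpret common truthy spellings; anything else, including '', is False."""
    return str(value).strip().lower() in {"1", "true", "y", "yes"}


# Fake environment and stubbed prompt, so the example runs unattended.
fake_env = {"HOST": "milvus.example.com", "SSL": "true"}
host = env_or_prompt("HOST", "Milvus hostname: ", env=fake_env)
port = env_or_prompt("PORT", "Milvus port: ", env=fake_env, ask=lambda _: "19530")
use_ssl = parse_flag(env_or_prompt("SSL", "SSL enabled? ", env=fake_env))
print(host, port, use_ssl)
```

Passing `ask=getpass.getpass` would keep the password prompt hidden, as in the original cell.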

Set up VectorStore with Milvus credentials

Create a VectorStore instance that automatically detects the database type (in our case, Milvus) and allows us to add, search, and delete documents.

It works as a wrapper for LangChain VectorStore classes. You can customize the settings as long as they are supported. Consult the LangChain documentation for more information about the Milvus connector.

Provide the name of your Milvus index for subsequent operations:

In [20]:

index_name = input("Please enter Milvus index name and hit enter: ")

print(f"{index_name=}")

Out[20]:

index_name='example_milvus_index'

In [21]:

from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import VectorStore

vector_store = VectorStore(
    api_client=api_client, 
    connection_id=milvus_connection_id, 
    index_name=index_name, 
    embeddings=embeddings
)

Verify that the index in the Milvus instance is empty.

Note: If the collection is not empty, you can use the clear method on the VectorStore object:

vector_store.clear()

In [22]:

vector_store.count()

Out[22]:

0

References to input data

This example uses the ModelInference description from the ibm_watsonx_ai documentation.

In [23]:

from langchain_community.document_loaders import WebBaseLoader

url = "https://ibm.github.io/watsonx-ai-python-sdk/fm_model_inference.html"

docs = WebBaseLoader(url).load()
model_inference_content = docs[0].page_content

In [24]:

import os

document_filename = "ModelInference.txt"

if not os.path.isfile(document_filename):
    with open(document_filename, "w") as file:
        file.write(model_inference_content)

Upload the input data to the space as a data asset.

In [25]:

document_asset_details = api_client.data_assets.create(name=document_filename, file_path=document_filename)

document_asset_id = api_client.data_assets.get_id(document_asset_details)
document_asset_id

Out[25]:

Creating data asset...
SUCCESS

'ddf93f89-095a-4146-bcb3-3e03f4a9d178'

Define a connection to the input data.

In [26]:

from ibm_watsonx_ai.helpers import DataConnection

data_connection = DataConnection(data_asset_id=document_asset_id)
data_connection.set_client(api_client=api_client)

Create input_data_references as a dict representation.

In [27]:

import json

input_data_references = [
  data_connection.to_dict()
]

print(json.dumps(input_data_references, indent=2))

Out[27]:

[
  {
    "type": "data_asset",
    "location": {
      "href": "/v2/assets/ddf93f89-095a-4146-bcb3-3e03f4a9d178?space_id=d29cf5c7-428e-46a6-97d2-6ceec613fbc9",
      "id": "ddf93f89-095a-4146-bcb3-3e03f4a9d178"
    }
  }
]
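To make the structure of the printed reference explicit, here is a small sketch that rebuilds the same dict by hand and checks its minimal shape. It only mirrors the output shown above; `data_asset_reference` and `is_data_asset_reference` are hypothetical helpers, and in practice `DataConnection.to_dict()` should remain the source of the reference.

```python
import json


def data_asset_reference(asset_id, space_id):
    """Build a dict shaped like the data_asset reference printed above (illustrative only)."""
    return {
        "type": "data_asset",
        "location": {
            "href": f"/v2/assets/{asset_id}?space_id={space_id}",
            "id": asset_id,
        },
    }


def is_data_asset_reference(ref):
    """Check the minimal structure seen in this notebook's output."""
    location = ref.get("location", {})
    return ref.get("type") == "data_asset" and "id" in location and "href" in location


ref = data_asset_reference(
    "ddf93f89-095a-4146-bcb3-3e03f4a9d178",
    "d29cf5c7-428e-46a6-97d2-6ceec613fbc9",
)
print(json.dumps([ref], indent=2))
```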

Create AI service

Prepare the function that will be deployed as an AI service.

Specify the default parameters that will be passed to the function.

In [28]:

def deployable_ai_service(context, url=None, embedding_model_id=None, milvus_connection_id=None, index_name=None):

    from ibm_watsonx_ai import APIClient
    from ibm_watsonx_ai import Credentials
    from ibm_watsonx_ai.helpers import DataConnection
    from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
    from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import VectorStore
    from ibm_watsonx_ai.data_loaders.datasets.documents import DocumentsIterableDataset
    from ibm_watsonx_ai.foundation_models.extensions.rag.chunker.langchain_chunker import LangChainChunker
    
    api_client = APIClient(
        credentials=Credentials(url=url, token=context.generate_token(), instance_id="openshift"),
        space_id=context.get_space_id(),
    )
    print("Successfully initialized APIClient")
    
    embeddings = Embeddings(
        model_id=embedding_model_id, 
        api_client=api_client
    )
    print("Successfully initialized Embeddings")

    def generate_batch(input_data_references: list[dict], output_data_reference: dict | None = None) -> None:
        
        vector_store = VectorStore(
            api_client=api_client, 
            connection_id=milvus_connection_id, 
            index_name=index_name, 
            embeddings=embeddings
        )
        print("Successfully initialized VectorStore")
        
        connections = []
        
        for input_data_reference in input_data_references:
            connections.append(DataConnection.from_dict(input_data_reference))
    
        dataset = DocumentsIterableDataset(
            connections=connections, 
            enable_sampling=False, 
            api_client=api_client
        )
        print("Successfully initialized DocumentsIterableDataset")

        chunker = LangChainChunker(
            chunk_size=512,
        )
        print("Successfully initialized LangChainChunker")

        documents = chunker.split_documents(dataset)
        print("Successfully splitted documents")
        
        vector_store.add_documents(documents)
        print("Documents added")
        
        vector_store_count = vector_store.count()
        print(f"Vector Store count: {vector_store_count}")

    return generate_batch
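The function above follows a closure pattern: the outer function performs setup and returns an inner `generate_batch` that keeps access to the objects created during setup. The sketch below isolates that pattern with standard-library code only. `FakeContext` is a stand-in that is assumed to expose just `get_space_id()`, and, unlike the notebook's `generate_batch` (which returns nothing), the inner function here returns a list so the effect is visible.

```python
def deployable_service_sketch(context, prefix="doc"):
    """Outer function: performs setup and captures shared state."""
    space_id = context.get_space_id()  # same accessor the notebook's function uses

    def generate_batch(input_data_references, output_data_reference=None):
        """Inner function: reads prefix and space_id from the enclosing scope."""
        return [f"{prefix}:{space_id}:{ref['location']['id']}" for ref in input_data_references]

    return generate_batch


class FakeContext:
    """Stand-in for RuntimeContext, exposing only the method this sketch needs."""

    def get_space_id(self):
        return "space-123"


batch = deployable_service_sketch(FakeContext(), prefix="asset")
print(batch([{"location": {"id": "a1"}}, {"location": {"id": "a2"}}]))
```

Keeping the imports inside the outer function, as the notebook does, means the stored function carries everything it needs when it is serialized and run elsewhere.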

Testing AI service's function locally

You can test the AI service's function locally. First, initialize a RuntimeContext.

In [29]:

from ibm_watsonx_ai.deployments import RuntimeContext

context = RuntimeContext(api_client=api_client)

generate_batch = deployable_ai_service(
    context, 
    url=credentials["url"], 
    embedding_model_id=embedding_model_id, 
    milvus_connection_id=milvus_connection_id, 
    index_name=index_name
)

Out[29]:

Successfully initialized APIClient
Successfully initialized Embeddings

In [ ]:

generate_batch(input_data_references)

Successfully initialized VectorStore
Successfully initialized DocumentsIterableDataset
Successfully initialized LangChainChunker
Successfully splitted documents
Documents added
Vector Store count: 82
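The count of 82 refers to chunks rather than files: the single input document is split before it is added to the vector store. The sketch below shows the basic idea with a naive fixed-size splitter. This is an assumption-laden simplification: it cuts on character positions with no overlap and no separator awareness, and is not a description of how `LangChainChunker` actually splits text.

```python
def naive_chunks(text, chunk_size=512):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


sample = "x" * 1100
pieces = naive_chunks(sample)
print(len(pieces), [len(p) for p in pieces])  # 3 [512, 512, 76]
```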

Verify the total number of documents currently stored within the Milvus collection.

Note: Due to the implementation specifics of Milvus, it is necessary to initialize a new VectorStore instance in order to accurately retrieve the count of indexed elements.

In [32]:

vector_store = VectorStore(
    api_client=api_client, 
    connection_id=milvus_connection_id, 
    index_name=index_name, 
    embeddings=embeddings
)

vector_store.count()

Out[32]:

82

Once the collection accurately reflects the expected number of items, proceed to clear its contents to prepare the environment for subsequent testing activities.

In [33]:

vector_store.clear()
vector_store.count()

Out[33]:

0

Deploy AI service

Store the AI service using the `runtime-24.1-py3.11` software specification.

In [34]:

sw_spec_id = api_client.software_specifications.get_id_by_name("runtime-24.1-py3.11")
print(f"{sw_spec_id=}")

Out[34]:

sw_spec_id='45f12dfe-aa78-5b8d-9f38-0ee223c47309'

In [35]:

meta_props = {
    api_client.repository.AIServiceMetaNames.NAME: "AI service with generate_batch",
    api_client.repository.AIServiceMetaNames.DESCRIPTION: "Sample AI service with implemented generate_batch",
    api_client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: sw_spec_id
}
stored_ai_service_details = api_client.repository.store_ai_service(deployable_ai_service, meta_props)

print(json.dumps(stored_ai_service_details, indent=2))

Out[35]:

{
  "metadata": {
    "name": "AI service with generate_batch",
    "description": "Sample AI service with implemented generate_batch",
    "space_id": "d29cf5c7-428e-46a6-97d2-6ceec613fbc9",
    "id": "f4b23cc2-3b36-4201-adc5-72318dfc39a9",
    "created_at": "2025-05-28T08:02:01Z",
    "rov": {
      "member_roles": {
        "1000330999": {
          "user_iam_id": "1000330999",
          "roles": [
            "OWNER"
          ]
        }
      }
    },
    "owner": "1000330999"
  },
  "entity": {
    "software_spec": {
      "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
    },
    "code_type": "python"
  }
}

In [36]:

ai_service_id = api_client.repository.get_ai_service_id(stored_ai_service_details)
print(f"{ai_service_id=}")

Out[36]:

ai_service_id='f4b23cc2-3b36-4201-adc5-72318dfc39a9'

Create batch deployment of AI service.

In [37]:

deployment_details = api_client.deployments.create(
    artifact_id=ai_service_id,
    meta_props={
        api_client.deployments.ConfigurationMetaNames.NAME: "Vector Store Batch Deployment",
        api_client.deployments.ConfigurationMetaNames.BATCH: {
            "parameters": {
                "url": credentials["url"],
                "embedding_model_id": embedding_model_id,
                "milvus_connection_id": milvus_connection_id,
                "index_name": index_name
            }
        }, 
        api_client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {
            "id": api_client.hardware_specifications.get_id_by_name("L")
        }
    }
)

deployment_id = api_client.deployments.get_id(deployment_details)
print(f"{deployment_id=}")

Out[37]:

######################################################################################

Synchronous deployment creation for id: 'f4b23cc2-3b36-4201-adc5-72318dfc39a9' started

######################################################################################

ready.

-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='f3bd2825-7eca-45ed-b598-7aa5e87cf217'
-----------------------------------------------------------------------------------------------

deployment_id='f3bd2825-7eca-45ed-b598-7aa5e87cf217'
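The keys under `parameters` in the deployment mirror the keyword arguments of `deployable_ai_service`, which suggests they are matched by name; that is an inference from this notebook rather than a documented guarantee. Under that assumption, a quick local check for misspelled keys could look like the sketch below. `unknown_parameters` and `service_stub` are hypothetical names.

```python
import inspect


def unknown_parameters(func, parameters):
    """Return parameter names that func does not accept as keyword arguments."""
    signature = inspect.signature(func)
    # A **kwargs parameter would accept any name, so nothing is unknown then.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in signature.parameters.values()):
        return []
    accepted = {
        name
        for name, p in signature.parameters.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(set(parameters) - accepted)


def service_stub(context, url=None, embedding_model_id=None, milvus_connection_id=None, index_name=None):
    """Stub with the same signature as deployable_ai_service."""


# "index_nam" is a deliberate typo to show what the check reports.
params = {"url": "https://cpd.example.com", "embedding_model_id": "m", "index_nam": "typo"}
print(unknown_parameters(service_stub, params))
```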

Example of executing an AI service

Run the following cells to create and run a job with the deployed AI service.

In [38]:

def poll_async_job(wml_client, job_id):
    import time

    while True:
        job_status = wml_client.deployments.get_job_status(job_id)
        print(job_status)
        state = job_status["state"]
        # Stop on any terminal state, including a canceled job
        if state in ("completed", "canceled") or "fail" in state:
            return wml_client.deployments.get_job_details(job_id)
        time.sleep(5)

In [39]:

batch_reference_payload = {
    "input_data_references": input_data_references,
}

job_details = api_client.deployments.create_job(deployment_id, batch_reference_payload)
job_id = api_client.deployments.get_job_id(job_details)
print(f"{job_id=}")

Out[39]:

job_id='b7e944de-5172-44ea-9680-3e2e8580bbfc'

In [ ]:

job_details = poll_async_job(api_client, job_id)

{'completed_at': '', 'running_at': '', 'state': 'queued'}
{'completed_at': '', 'running_at': '', 'state': 'queued'}
{'completed_at': '', 'running_at': '', 'state': 'queued'}
{'completed_at': '', 'running_at': '', 'state': 'queued'}
{'completed_at': '', 'running_at': '2025-05-28T08:06:14.912Z', 'state': 'running'}
{'completed_at': '', 'running_at': '2025-05-28T08:06:14.912Z', 'state': 'running'}
{'completed_at': '', 'running_at': '2025-05-28T08:06:14.912Z', 'state': 'running'}
{'completed_at': '', 'running_at': '2025-05-28T08:06:14.912Z', 'state': 'running'}
{'completed_at': '2025-05-28T08:06:36.948059Z', 'running_at': '2025-05-28T08:06:14.582494Z', 'state': 'completed'}
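The polling loop used above has no upper bound on waiting time. A variant with a timeout might look like the following sketch. `poll_job` is a hypothetical helper, the set of terminal states is an assumption, and the status source is injected as a callable so the example runs against a simulated sequence instead of a live deployment.

```python
import time


def poll_job(get_status, timeout_s=600.0, interval_s=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal state is seen or timeout_s elapses."""
    terminal = {"completed", "failed", "canceled"}  # assumed set of terminal states
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status["state"] in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status['state']!r} after {timeout_s} s")
        sleep(interval_s)


# Simulated status sequence, mirroring the shape of the output above.
states = iter(["queued", "running", "completed"])
result = poll_job(lambda: {"state": next(states)}, sleep=lambda _: None)
print(result)
```

With the notebook's client, the callable could be `lambda: api_client.deployments.get_job_status(job_id)`.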

Verify the total number of documents currently stored within the Milvus collection.

Note: Due to the implementation specifics of Milvus, it is necessary to initialize a new VectorStore instance in order to accurately retrieve the count of indexed elements.

In [43]:

vector_store = VectorStore(
    api_client=api_client, 
    connection_id=milvus_connection_id, 
    index_name=index_name, 
    embeddings=embeddings
)

vector_store.count()

Out[43]:

82

Cleanup

Please execute the following commands to clean up and decommission all resources provisioned during the execution of this notebook.

In [44]:

# Delete deployment job

api_client.deployments.delete_job(job_id)

Out[44]:

'SUCCESS'

In [45]:

# Delete batch deployment

api_client.deployments.delete(deployment_id)

Out[45]:

'SUCCESS'

In [46]:

# Delete AI service asset

api_client.repository.delete(ai_service_id)

Out[46]:

'SUCCESS'

In [47]:

# Delete `ModelInference.txt` asset

api_client.data_assets.delete(document_asset_id)

Out[47]:

'SUCCESS'

In [48]:

# Delete `ModelInference.txt` file locally

import os

if os.path.exists(document_filename):
    os.remove(document_filename)
    print(f"{document_filename} has been deleted.")
else:
    print(f"{document_filename} does not exist.")

Out[48]:

ModelInference.txt has been deleted.
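The cleanup cells above run in dependency order: job, deployment, AI service, data asset, local file. If one step fails partway, the later resources stay behind. As a sketch, the steps could be run through a small helper that continues after a failure and reports what went wrong. `run_cleanup` is a hypothetical helper and the callables below are stand-ins.

```python
def run_cleanup(steps):
    """Run (label, callable) steps in order; collect failures instead of stopping."""
    failures = []
    for label, action in steps:
        try:
            action()
            print(f"{label}: done")
        except Exception as exc:  # keep going so later resources are still removed
            failures.append((label, exc))
            print(f"{label}: failed ({exc})")
    return failures


def failing_step():
    raise RuntimeError("already deleted")


# Stand-in callables; with the notebook's client these would be lambdas such as
# lambda: api_client.deployments.delete(deployment_id)
failed = run_cleanup([
    ("job", lambda: None),
    ("deployment", failing_step),
    ("ai service", lambda: None),
])
print([label for label, _ in failed])
```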

Summary and next steps

You successfully completed this notebook!

You learned how to design and deploy an AI service that implements generate_batch using the ibm_watsonx_ai SDK.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Author

Mateusz Szewczyk, Software Engineer at watsonx.ai.
