Path: blob/master/cpd5.2/notebooks/python_sdk/deployments/ai_services/Use watsonx to run AI service and switch deployments using serving name.ipynb

Use watsonx to run AI service and switch deployments using serving name
Disclaimers
Use only Projects and Spaces that are available in watsonx context.
Notebook content
This notebook provides a detailed, step-by-step demonstration of the code required to create, deploy, and invoke a watsonx.ai AI service.
Some familiarity with Python is helpful. This notebook uses Python 3.11.
Learning goal
The goal is to demonstrate how an AI service deployment, identified by a serving name, can switch between different deployments that use different LLMs with minimal downtime.
Table of Contents
This notebook contains the following parts:
Install dependencies
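The original install cells are not preserved in this rendering. Judging from the install output shown below, the setup likely resembles the following sketch (package pins and the `tail` filter are assumptions, not the notebook's exact commands):

```python
%pip install -U "ibm_watsonx_ai" | tail -n 1
%pip install -U "langchain_ibm" | tail -n 1
```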
Successfully installed anyio-4.11.0 cachetools-6.2.0 certifi-2025.8.3 charset_normalizer-3.4.3 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 idna-3.10 jmespath-1.0.1 lomond-0.3.3 numpy-2.3.3 pandas-2.2.3 pytz-2025.2 requests-2.32.5 sniffio-1.3.1 tabulate-0.9.0 tzdata-2025.2 urllib3-2.5.0
Successfully installed PyYAML-6.0.3 annotated-types-0.7.0 jsonpatch-1.33 jsonpointer-3.0.0 langchain-core-0.3.77 langchain-ibm-0.3.18 langsmith-0.4.32 orjson-3.11.3 pydantic-2.11.9 pydantic-core-2.33.2 requests-toolbelt-1.0.0 tenacity-9.1.2 typing-inspection-0.4.2 zstandard-0.25.0
Define credentials
Authenticate the watsonx.ai Runtime service on IBM Cloud Pak® for Data. You need to provide the admin's username and the platform URL.
Use the admin's api_key to authenticate watsonx.ai Runtime services:
Alternatively you can use the admin's password:
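A minimal sketch of building the credentials object, assuming a Cloud Pak for Data 5.2 cluster; the placeholder values are illustrative, and the password variant simply swaps `api_key` for `password`:

```python
from ibm_watsonx_ai import Credentials

username = "PASTE YOUR USERNAME HERE"
api_key = "PASTE YOUR API KEY HERE"
url = "PASTE THE PLATFORM URL HERE"

credentials = Credentials(
    username=username,
    api_key=api_key,          # or: password="PASTE YOUR PASSWORD HERE"
    url=url,
    instance_id="openshift",  # required for Cloud Pak for Data deployments
    version="5.2",
)
```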
Working with spaces
First of all, you need to create a space that will be used for your work. If you do not have a space, you can use {PLATFORM_URL}/ml-runtime/spaces?context=icp4data to create one.
Click New Deployment Space
Create an empty space
Go to space Settings tab
Copy space_id and paste it below
Tip: You can also use the SDK to prepare the space for your work. More information can be found here.
Action: Assign space ID below
Create APIClient instance
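With the credentials from above, assigning the space ID and creating the client is straightforward (a sketch; the placeholder space ID is yours to fill in):

```python
from ibm_watsonx_ai import APIClient

space_id = "PASTE YOUR SPACE ID HERE"

client = APIClient(credentials)
client.set.default_space(space_id)
```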
Specify model
This notebook uses text models ibm/granite-3-2b-instruct and meta-llama/llama-3-1-8b-instruct, which have to be available on your IBM Cloud Pak® for Data environment for this notebook to run successfully. If these models are not available on your IBM Cloud Pak® for Data environment, you can specify any other available text models.
You can list available text models by running the cell below.
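A sketch of listing the available text models and picking the two model IDs used later; the variable names are illustrative:

```python
model_id = "ibm/granite-3-2b-instruct"
second_model_id = "meta-llama/llama-3-1-8b-instruct"

# Print the IDs of all text foundation models available on this environment
for spec in client.foundation_models.get_model_specs().get("resources", []):
    print(spec["model_id"])
```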
Execute the generate function locally.
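Neither the AI service source nor the local test call is preserved in this rendering. The sketch below loosely follows the pattern used in the watsonx.ai AI service samples: a deployable function that closes over a model ID, builds a ModelInference client from the request context, and returns the generate and generate_stream handlers. The `{"question": ...}` payload shape, the parameter defaults, and the RuntimeContext arguments are assumptions, not the notebook's actual code.

```python
def deployable_ai_service(context, url=url, model_id=model_id, params=None, **kwargs):
    from ibm_watsonx_ai import APIClient, Credentials
    from ibm_watsonx_ai.foundation_models import ModelInference

    api_client = APIClient(
        credentials=Credentials(url=url, token=context.generate_token(), instance_id="openshift"),
        space_id=context.get_space_id(),
    )
    model = ModelInference(model_id=model_id, api_client=api_client, params=params)

    def generate(context) -> dict:
        # Refresh the token from the incoming request and answer a single question
        api_client.set_token(context.get_token())
        question = context.get_json()["question"]
        return {"body": {"answer": model.generate_text(prompt=question)}}

    def generate_stream(context):
        # Same as generate, but yields the answer chunk by chunk
        api_client.set_token(context.get_token())
        yield from model.generate_text_stream(prompt=context.get_json()["question"])

    return generate, generate_stream


# Run the generate handler locally, without deploying anything yet
from ibm_watsonx_ai.deployments import RuntimeContext

context = RuntimeContext(api_client=client, request_payload_json={"question": "What is Generative AI?"})
local_generate = deployable_ai_service(context)[0]
print(local_generate(context))
```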
Execute the generate_stream function locally.
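The streaming handler can be exercised locally the same way (same assumptions as the sketch above):

```python
context = RuntimeContext(api_client=client, request_payload_json={"question": "What is inference serving?"})
local_generate_stream = deployable_ai_service(context)[1]

for chunk in local_generate_stream(context):
    print(chunk, end="", flush=True)
```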
Store AI service which uses ibm/granite-3-2b-instruct
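A hedged sketch of storing the AI service in the repository; the software specification name (runtime-24.1-py3.11) and the display name are assumptions:

```python
software_spec_id = client.software_specifications.get_id_by_name("runtime-24.1-py3.11")

stored_ai_service_details = client.repository.store_ai_service(
    deployable_ai_service,
    {
        client.repository.AIServiceMetaNames.NAME: "AI service (granite-3-2b-instruct)",
        client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: software_spec_id,
    },
)
ai_service_id = client.repository.get_ai_service_id(stored_ai_service_details)
```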
Create online deployment of AI service with serving name
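A sketch of the first deployment; the serving name ai_service_demo is an illustrative placeholder (serving names must be unique within the instance), and the SERVING_NAME metaname placement follows the SDK samples:

```python
serving_name = "ai_service_demo"  # hypothetical serving name used throughout this notebook

deployment_details = client.deployments.create(
    ai_service_id,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "AI service deployment (granite)",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
        client.deployments.ConfigurationMetaNames.SERVING_NAME: serving_name,
    },
)
deployment_id = client.deployments.get_id(deployment_details)
```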
Execute the generate_stream method using the serving name
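A sketch of calling the deployed streaming handler, assuming the serving name is accepted wherever a deployment ID is expected, which is the premise of this notebook:

```python
payload = {"question": "What is inference serving?"}

for chunk in client.deployments.run_ai_service_stream(serving_name, payload):
    print(chunk, end="", flush=True)
```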
Store the second AI service which uses meta-llama/llama-3-1-8b-instruct
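The second AI service can use the same structure with the llama model as its default model_id. It is written out in full here (rather than calling the first function) so the stored source is self-contained when deployed; as before, the payload shape and names are assumptions:

```python
def deployable_ai_service_llama(context, url=url, model_id=second_model_id, params=None, **kwargs):
    from ibm_watsonx_ai import APIClient, Credentials
    from ibm_watsonx_ai.foundation_models import ModelInference

    api_client = APIClient(
        credentials=Credentials(url=url, token=context.generate_token(), instance_id="openshift"),
        space_id=context.get_space_id(),
    )
    model = ModelInference(model_id=model_id, api_client=api_client, params=params)

    def generate(context) -> dict:
        api_client.set_token(context.get_token())
        return {"body": {"answer": model.generate_text(prompt=context.get_json()["question"])}}

    def generate_stream(context):
        api_client.set_token(context.get_token())
        yield from model.generate_text_stream(prompt=context.get_json()["question"])

    return generate, generate_stream


stored_ai_service_details_2 = client.repository.store_ai_service(
    deployable_ai_service_llama,
    {
        client.repository.AIServiceMetaNames.NAME: "AI service (llama-3-1-8b-instruct)",
        client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: software_spec_id,
    },
)
ai_service_id_2 = client.repository.get_ai_service_id(stored_ai_service_details_2)
```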
Create an online deployment of the AI service without a serving name; creating it with the same serving name at this point would fail, because serving names must be unique. A sketch follows below.
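```python
deployment_details_2 = client.deployments.create(
    ai_service_id_2,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "AI service deployment (llama)",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
    },
)
deployment_id_2 = client.deployments.get_id(deployment_details_2)
```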
Wait a few seconds, delete the prior deployment, and patch the second deployment with the serving name.
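A sketch of the switch itself: free the serving name by deleting the prior deployment, then immediately patch the new deployment with it. The exact metaname accepted by deployments.update for the serving-name patch is an assumption:

```python
import time

time.sleep(10)  # give the new deployment a moment to become ready

# Free the serving name by deleting the prior deployment ...
client.deployments.delete(deployment_id)

# ... and immediately patch the second deployment so it takes over the serving name
client.deployments.update(
    deployment_id_2,
    {client.deployments.ConfigurationMetaNames.SERVING_NAME: serving_name},
)
```

The downtime window is only the gap between the delete and the patch above.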
The new deployment is now accessible with the serving name
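Once patched, the same serving name resolves to the llama-backed deployment, so callers need no change (a sketch, reusing the assumed payload shape):

```python
response = client.deployments.run_ai_service(serving_name, {"question": "What is Generative AI?"})
print(response)
```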
Execute generate_batch method with serving name
Summary and next steps
You successfully completed this notebook!
You learned how to create and deploy an AI service using the ibm_watsonx_ai SDK.
You also learned how to use a serving name to switch between deployments, enabling the service to back the same endpoint with different LLMs. The only downtime is the short window between deleting the prior deployment and patching the new one with the serving name, so the switch completes with minimal interruption.
Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.
Author
Ginbiaksang Naulak, Senior Software Engineer at IBM watsonx.ai
Copyright © 2025-2026 IBM. This notebook and its source code are released under the terms of the MIT License.