GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cpd5.2/notebooks/python_sdk/deployments/ai_services/Use watsonx, and `granite-3-8b-instruct` to run as an AI service.ipynb
Kernel: watsonx-ai-samples-py-312


Use watsonx, and ibm/granite-3-8b-instruct to run as an AI service

Disclaimers

  • Use only Projects and Spaces that are available in watsonx context.

Notebook content

This notebook demonstrates, step by step, the code required to create, test, and deploy a watsonx.ai AI service.

Some familiarity with Python is helpful. This notebook uses Python 3.12.

Learning goal

The learning goal of this notebook is to use AI services to generate accurate and contextually relevant responses to a question.

Table of Contents

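This notebook contains the following parts:

  • Set up the environment

  • Create AI service

  • Testing AI service's function locally

  • Deploy AI service

  • Example of executing an AI service

  • Summary and next steps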

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

  • Contact your Cloud Pak for Data administrator and ask for your account credentials

Install dependencies

%pip install -U "ibm_watsonx_ai>=1.2.4" | tail -n 1
Successfully installed anyio-4.9.0 certifi-2025.4.26 charset-normalizer-3.4.2 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 ibm-cos-sdk-2.14.1 ibm-cos-sdk-core-2.14.1 ibm-cos-sdk-s3transfer-2.14.1 ibm_watsonx_ai-1.3.20 idna-3.10 jmespath-1.0.1 lomond-0.3.3 numpy-2.2.6 pandas-2.2.3 pytz-2025.2 requests-2.32.2 sniffio-1.3.1 tabulate-0.9.0 typing_extensions-4.13.2 tzdata-2025.2 urllib3-2.4.0

Define credentials

Authenticate with the watsonx.ai Runtime service on IBM Cloud Pak for Data. You need to provide the admin's username and the platform URL.

username = "PASTE YOUR USERNAME HERE"
url = "PASTE THE PLATFORM URL HERE"

Use the admin's api_key to authenticate watsonx.ai Runtime services:

import getpass

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    username=username,
    api_key=getpass.getpass("Enter your watsonx.ai API key and hit enter: "),
    url=url,
    instance_id="openshift",
    version="5.2",
)

Alternatively, you can use the admin's password. The cell below only prompts for it if no API key was provided above:

import getpass

from ibm_watsonx_ai import Credentials

if "credentials" not in locals() or not credentials.api_key:
    credentials = Credentials(
        username=username,
        password=getpass.getpass("Enter your watsonx.ai password and hit enter: "),
        url=url,
        instance_id="openshift",
        version="5.2",
    )
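The two cells above differ only in which secret they pass. As a minimal sketch, the choice can be made in one place by assembling the keyword arguments first. The helper name `build_credential_kwargs` and the example URL are hypothetical and not part of `ibm_watsonx_ai`; the returned dictionary is intended to be unpacked as `Credentials(**kwargs)`.

```python
def build_credential_kwargs(username, url, api_key=None, password=None,
                            instance_id="openshift", version="5.2"):
    """Hypothetical helper: assemble keyword arguments for Credentials(...)."""
    if not api_key and not password:
        raise ValueError("Provide either api_key or password.")
    kwargs = {
        "username": username,
        "url": url,
        "instance_id": instance_id,
        "version": version,
    }
    # Prefer the API key when both secrets are supplied, as the cells above do.
    if api_key:
        kwargs["api_key"] = api_key
    else:
        kwargs["password"] = password
    return kwargs


# Placeholder values for illustration only.
example = build_credential_kwargs("admin", "https://cpd.example.com", password="secret")
print(sorted(example))
```

Keeping the selection logic in plain Python makes it easy to check before any network call is made.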

Working with spaces

First, you need a deployment space for your work. If you do not have one, open {PLATFORM_URL}/ml-runtime/spaces?context=icp4data and create it as follows:

  • Click New Deployment Space

  • Create an empty space

  • Go to the space's Settings tab

  • Copy the space_id and paste it below

Tip: You can also use the SDK to prepare the space for your work. More information can be found here.

Action: Assign space ID below

space_id = "PASTE YOUR SPACE ID HERE"

Create APIClient instance

from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials, space_id=space_id)

Specify model

This notebook uses the chat model ibm/granite-3-8b-instruct, which must be available in your Cloud Pak for Data environment for the notebook to run successfully. If this model is not available in your Cloud Pak for Data environment, you can specify any other available chat model. You can list the available chat models by running the cell below.

if len(api_client.foundation_models.ChatModels):
    print(*api_client.foundation_models.ChatModels, sep="\n")
else:
    print(
        "Chat models are missing in this environment. Install chat models to proceed."
    )
ibm/granite-3-3-8b-instruct
ibm/granite-3-8b-instruct

Specify the model_id of the model you will use for the chat.

model_id = "ibm/granite-3-8b-instruct"
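If the preferred model might be missing from an environment, the choice can be made programmatically. The following is a minimal sketch, assuming the available model IDs have already been collected into a plain list of strings; `pick_model` is a hypothetical helper, not an SDK function.

```python
def pick_model(available, preferred):
    """Return the first preferred model ID that is available, else None."""
    available_set = set(available)
    for candidate in preferred:
        if candidate in available_set:
            return candidate
    return None


# Model IDs taken from the listing printed above.
available = ["ibm/granite-3-3-8b-instruct", "ibm/granite-3-8b-instruct"]
chosen = pick_model(
    available,
    ["ibm/granite-3-8b-instruct", "ibm/granite-3-3-8b-instruct"],
)
print(chosen)
```

A `None` result signals that no preferred chat model is installed, which is the same situation the listing cell reports with its "Chat models are missing" message.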

Create AI service

Prepare the function that will be deployed as an AI service.

Specify the default parameters that will be passed to the function.

def deployable_ai_service(
    context,
    space_id=space_id,
    url=credentials["url"],
    model_id=model_id,
    params={"temperature": 1},
    **kwargs
):
    from ibm_watsonx_ai import APIClient, Credentials
    from ibm_watsonx_ai.foundation_models import ModelInference

    api_client = APIClient(
        credentials=Credentials(
            url=url,
            token=context.generate_token(),
            instance_id="openshift",
        ),
        space_id=space_id,
    )

    model = ModelInference(
        model_id=model_id,
        api_client=api_client,
        params=params,
    )

    def generate(context) -> dict:
        api_client.set_token(context.get_token())

        payload = context.get_json()
        question = payload["question"]

        messages = [
            {
                "role": "system",
                "content": "You are a helpful assistant.",
            },
            {"role": "user", "content": question},
        ]

        response = model.chat(messages=messages)

        return {"body": response}

    def generate_stream(context):
        api_client.set_token(context.get_token())

        payload = context.get_json()
        question = payload["question"]

        messages = [
            {
                "role": "system",
                "content": "You are a helpful assistant.",
            },
            {"role": "user", "content": question},
        ]

        yield from model.chat_stream(messages)

    return generate, generate_stream
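The function above follows a closure pattern: an outer function builds shared objects and returns two inner handlers, one for regular requests and one for streaming. The sketch below illustrates only that shape, with no watsonx dependency. `StubContext` and `toy_ai_service` are hypothetical stand-ins; the stub models just the `get_json` method that the handlers read.

```python
class StubContext:
    """Hypothetical stand-in for RuntimeContext; models only get_json()."""

    def __init__(self, payload):
        self._payload = payload

    def get_json(self):
        return self._payload


def toy_ai_service(context, greeting="Echo", **kwargs):
    # Outer function: called once to build the handlers.
    # Defaults such as `greeting` are captured by the inner functions.

    def generate(context) -> dict:
        # Called per request; reads the request payload from the context.
        question = context.get_json()["question"]
        return {"body": {"answer": f"{greeting}: {question}"}}

    def generate_stream(context):
        # Streaming variant; yields the question word by word.
        question = context.get_json()["question"]
        yield from question.split()

    return generate, generate_stream


ctx = StubContext({"question": "When was IBM founded?"})
generate, generate_stream = toy_ai_service(ctx)
print(generate(ctx))
print(list(generate_stream(ctx)))
```

The same two-element tuple is what the notebook later indexes as `local_function[0]` and `local_function[1]`.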

Testing AI service's function locally

You can test the AI service's function locally. First, initialize RuntimeContext.

from ibm_watsonx_ai.deployments import RuntimeContext

context = RuntimeContext(api_client=api_client)
local_function = deployable_ai_service(context=context)

Prepare the request JSON payload for the local invocation.

context.request_payload_json = {"question": "When was IBM founded?"}

Execute the generate function locally.

resp = local_function[0](context)
resp
{'body': {'id': 'chatcmpl-bdc8958b-30aa-4153-b59e-2a274b206a50', 'object': 'chat.completion', 'model_id': 'ibm/granite-3-8b-instruct', 'model': 'ibm/granite-3-8b-instruct', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'IBM was founded on June 16, 1911, in Endicott, New York. It was initially called the Computing-Tabulating-Recording Company (CTR), but it was later renamed International Business Machines in 1924.'}, 'finish_reason': 'stop'}], 'created': 1746532682, 'model_version': '1.0.0', 'created_at': '2025-05-06T11:58:02.540Z', 'usage': {'completion_tokens': 56, 'prompt_tokens': 25, 'total_tokens': 81}, 'system': {'warnings': [{'message': "The value of 'max_tokens' for this model was set to value 1024", 'id': 'unspecified_max_token', 'additional_properties': {'limit': 0, 'new_value': 1024, 'parameter': 'max_tokens', 'value': 0}}]}}}
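The response above is a nested dictionary, and the assistant's text sits under `body`, `choices`, `message`, `content`. The following sketch pulls it out defensively. `extract_answer` is a hypothetical helper, and the sample dictionary is an abbreviated copy of the output shown above, not a real API response.

```python
def extract_answer(resp):
    """Return the assistant's text from a chat response, or None if absent."""
    # Local calls wrap the result in "body"; fall back to the dict itself.
    body = resp.get("body", resp)
    choices = body.get("choices") or []
    if not choices:
        return None
    return choices[0].get("message", {}).get("content")


# Abbreviated sample with the same key layout as the output above.
sample = {
    "body": {
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "IBM was founded on June 16, 1911.",
                },
                "finish_reason": "stop",
            }
        ]
    }
}
print(extract_answer(sample))
```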

Execute the generate_stream function locally.

response = local_function[1](context)
for chunk in response:
    if chunk["choices"]:
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
IBM was formally established on June 16, 1911. The initiating process, however, began when a group of investors, including Charles Ranlett Flint, purchased the Computing-Tabulating-Recording Company (CTR) from its founder, Herman Hollerith. CTR was responsible for the development of mechanical tabulating systems, which were later used during the 1890 U.S. Census. This acquisition laid the foundation for what is now recognized as IBM. The rebranding to International Business Machines occurred four years later, in 1915.
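When the streamed text is needed as a single string rather than printed incrementally, the same loop can accumulate the pieces. This is a minimal sketch. `collect_stream_text` is a hypothetical helper, and the sample chunks are invented for illustration, modelled only on the `choices` and `delta` keys that the loop above reads.

```python
def collect_stream_text(chunks):
    """Join the 'content' deltas of streamed chat chunks into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices; skip them as the loop above does.
        if chunk.get("choices"):
            delta = chunk["choices"][0].get("delta", {})
            parts.append(delta.get("content") or "")
    return "".join(parts)


# Invented chunks for illustration; real chunks carry more fields.
sample_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "IBM was "}}]},
    {"choices": []},
    {"choices": [{"delta": {"content": "founded in 1911."}}]},
]
print(collect_stream_text(sample_chunks))
```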

Deploy AI service

Store the AI service using the runtime-25.1-py3.12 software specification.

sw_spec_id = api_client.software_specifications.get_id_by_name("runtime-25.1-py3.12")
sw_spec_id
'f47ae1c3-198e-5718-b59d-2ea471561e9e'
meta_props = {
    api_client.repository.AIServiceMetaNames.NAME: "AI service with SDK",
    api_client.repository.AIServiceMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
}
stored_ai_service_details = api_client.repository.store_ai_service(
    deployable_ai_service, meta_props
)
ai_service_id = api_client.repository.get_ai_service_id(stored_ai_service_details)
ai_service_id
'f024dde7-fcc3-4c39-adf7-6b750f77d5a3'

Create online deployment of AI service.

meta_props = {
    api_client.deployments.ConfigurationMetaNames.NAME: "AI service with SDK",
    api_client.deployments.ConfigurationMetaNames.ONLINE: {},
}

deployment_details = api_client.deployments.create(ai_service_id, meta_props)
######################################################################################

Synchronous deployment creation for id: 'f024dde7-fcc3-4c39-adf7-6b750f77d5a3' started

######################################################################################

initializing
Note: online_url is deprecated and will be removed in a future release. Use serving_urls instead.
.......
ready

-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='a924c998-c6d4-41a7-a93a-86b1358d8de1'
-----------------------------------------------------------------------------------------------

Obtain the deployment_id of the previously created deployment.

deployment_id = api_client.deployments.get_id(deployment_details)

Example of executing an AI service

Execute the generate method.

question = "When was IBM founded?"

deployments_results = api_client.deployments.run_ai_service(
    deployment_id, {"question": question}
)
import json

print(json.dumps(deployments_results, indent=2))
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "IBM was founded on June 16, 1911, in Endicott, New York, originally under the name of Computing-Tabulating-Recording Company (CTR). After several mergers and name changes, it was officially renamed International Business Machines Corporation (IBM) on March 16, 1924.",
        "role": "assistant"
      }
    }
  ],
  "created": 1746532766,
  "created_at": "2025-05-06T11:59:27.395Z",
  "id": "chatcmpl-ac084a1f-5131-4cd4-a00d-9d469fb650bb",
  "model": "ibm/granite-3-8b-instruct",
  "model_id": "ibm/granite-3-8b-instruct",
  "model_version": "1.0.0",
  "object": "chat.completion",
  "system": {
    "warnings": [
      {
        "additional_properties": {
          "limit": 0,
          "new_value": 1024,
          "parameter": "max_tokens",
          "value": 0
        },
        "id": "unspecified_max_token",
        "message": "The value of 'max_tokens' for this model was set to value 1024"
      }
    ]
  },
  "usage": {
    "completion_tokens": 73,
    "prompt_tokens": 25,
    "total_tokens": 98
  }
}
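Unlike the local call, the deployed result above has no `body` wrapper, so `usage` and `system` are top-level keys. The sketch below summarizes token counts and any warning IDs from such a result. `summarize_usage` is a hypothetical helper, and the sample dictionary copies only the relevant fields from the output above.

```python
def summarize_usage(result):
    """Collect token counts and warning IDs from a chat completion result."""
    usage = result.get("usage", {})
    warnings = result.get("system", {}).get("warnings", [])
    return {
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
        "warning_ids": [w.get("id") for w in warnings],
    }


# Fields copied from the deployed response shown above.
sample_result = {
    "usage": {"completion_tokens": 73, "prompt_tokens": 25, "total_tokens": 98},
    "system": {"warnings": [{"id": "unspecified_max_token"}]},
}
summary = summarize_usage(sample_result)
print(summary)
```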

Execute the generate_stream method.

question = "When was IBM founded?"

deployments_results = api_client.deployments.run_ai_service_stream(
    deployment_id, {"question": question}
)
import json

for chunk in deployments_results:
    chunk_json = json.loads(chunk)
    if chunk_json["choices"]:
        print(chunk_json["choices"][0]["delta"].get("content", ""), end="", flush=True)
IBM (International Business Machines Corporation) was established on June 16, 1911. It was formed from the consolidation of several companies, most notably Computing-Tabulating-Recording Company (CTR). The new corporation, originally named International Business Machines, was created with the goal of entering the tabulating machine business, which was transforming industries through improved methods of data processing and recording.

Summary and next steps

You successfully completed this notebook!

You learned how to create and deploy an AI service using the ibm_watsonx_ai SDK.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Author

Rafał Chrzanowski, Software Engineer Intern at watsonx.ai.

Copyright © 2025 IBM. This notebook and its source code are released under the terms of the MIT License.