GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cloud/notebooks/python_sdk/deployments/foundation_models/Use watsonx, and LangChain to make a series of calls to a language model.ipynb
Kernel: .venv_watsonx_ai_samples_py_312


Use watsonx, and LangChain to make a series of calls to a language model

Disclaimers

  • Use only Projects and Spaces that are available in the watsonx context.

Notebook content

This notebook contains the steps and code to demonstrate a sequential chain using the LangChain integration with watsonx models.

Some familiarity with Python is helpful. This notebook uses Python 3.12.

Learning goal

The goal of this notebook is to demonstrate how to chain the mistral-small-3-1-24b-instruct-2503 and mistral-medium-2505 models: the first model generates a description of a person related to a given topic, and the second model is asked to name that person. Along the way, the notebook introduces the LangChain framework by composing prompt templates and WatsonxLLM models into a sequential chain.

Contents

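This notebook contains the following parts:

  • Set up the environment

  • Foundation Models on watsonx.ai

  • LangChain integration

  • Sequential Chain experiment

  • AI Service

  • Summary and next steps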

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

Install and import dependencies

Note: ibm-watsonx-ai documentation can be found here.

%pip install -U "langchain>=0.3.25,<0.4" | tail -n 1
%pip install -U "langchain_ibm>=0.3.10,<0.4" | tail -n 1
Successfully installed PyYAML-6.0.3 SQLAlchemy-2.0.45 annotated-types-0.7.0 anyio-4.12.1 certifi-2026.1.4 charset_normalizer-3.4.4 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 idna-3.11 jsonpatch-1.33 jsonpointer-3.0.0 langchain-0.3.27 langchain-core-0.3.83 langchain-text-splitters-0.3.11 langsmith-0.6.3 orjson-3.11.5 pydantic-2.12.5 pydantic-core-2.41.5 requests-2.32.5 requests-toolbelt-1.0.0 tenacity-9.1.2 typing-extensions-4.15.0 typing-inspection-0.4.2 urllib3-2.6.3 uuid-utils-0.13.0 zstandard-0.25.0
Successfully installed cachetools-6.2.4 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 ibm-watsonx-ai-1.5.0 jmespath-1.0.1 langchain_ibm-0.3.20 lomond-0.3.3 numpy-2.4.1 pandas-2.2.3 pytz-2025.2 tabulate-0.9.0 tzdata-2025.3

Defining the watsonx.ai credentials

This cell defines the watsonx.ai credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see documentation.

import getpass

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Please enter your watsonx.ai api key (hit enter): "),
)
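For unattended runs, an interactive prompt is inconvenient. The following is a minimal sketch, assuming the key has been exported in an environment variable; the variable name WATSONX_APIKEY and the helper resolve_api_key are assumptions made for illustration. It falls back to the hidden prompt only when the variable is missing, in the same spirit as the SPACE_ID lookup used later in this notebook.

```python
import getpass
import os


def resolve_api_key(env=None, prompt=getpass.getpass, var_name="WATSONX_APIKEY"):
    """Return the API key from the environment, or ask for it interactively.

    `var_name` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if key:
        return key
    # Fall back to a hidden prompt, as the credentials cell does.
    return prompt("Please enter your watsonx.ai api key (hit enter): ")
```

The returned string can then be passed as api_key to Credentials.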

Working with spaces

You need to create a space that will be used for your work. If you do not have a space, you can use the Deployment Spaces Dashboard to create one.

  • Click New Deployment Space

  • Create an empty space

  • Select Cloud Object Storage

  • Select watsonx.ai Runtime instance and press Create

  • Go to Manage tab

  • Copy Space GUID and paste it below

Tip: You can also use the SDK to prepare the space for your work. More information can be found here.

Action: Assign the space ID below.

import os

try:
    space_id = os.environ["SPACE_ID"]
except KeyError:
    space_id = input("Please enter your space_id (hit enter): ")

Create an instance of APIClient with authentication details.

from ibm_watsonx_ai import APIClient

client = APIClient(credentials=credentials, space_id=space_id)

Foundation Models on watsonx.ai

List available models

All available models are presented under the TextModels class.

client.foundation_models.TextModels.show()
{'GRANITE_3_2_8B_INSTRUCT': 'ibm/granite-3-2-8b-instruct', 'GRANITE_3_3_8B_INSTRUCT': 'ibm/granite-3-3-8b-instruct', 'GRANITE_3_8B_INSTRUCT': 'ibm/granite-3-8b-instruct', 'GRANITE_4_H_SMALL': 'ibm/granite-4-h-small', 'GRANITE_8B_CODE_INSTRUCT': 'ibm/granite-8b-code-instruct', 'GRANITE_GUARDIAN_3_8B': 'ibm/granite-guardian-3-8b', 'LLAMA_3_2_11B_VISION_INSTRUCT': 'meta-llama/llama-3-2-11b-vision-instruct', 'LLAMA_3_2_90B_VISION_INSTRUCT': 'meta-llama/llama-3-2-90b-vision-instruct', 'LLAMA_3_3_70B_INSTRUCT': 'meta-llama/llama-3-3-70b-instruct', 'LLAMA_3_405B_INSTRUCT': 'meta-llama/llama-3-405b-instruct', 'LLAMA_4_MAVERICK_17B_128E_INSTRUCT_FP8': 'meta-llama/llama-4-maverick-17b-128e-instruct-fp8', 'LLAMA_GUARD_3_11B_VISION': 'meta-llama/llama-guard-3-11b-vision', 'MISTRAL_MEDIUM_2505': 'mistralai/mistral-medium-2505', 'MISTRAL_SMALL_3_1_24B_INSTRUCT_2503': 'mistralai/mistral-small-3-1-24b-instruct-2503', 'GPT_OSS_120B': 'openai/gpt-oss-120b'}

You need to specify the model IDs that will be used for inferencing:

model_id_1 = "mistralai/mistral-small-3-1-24b-instruct-2503"
model_id_2 = "mistralai/mistral-medium-2505"
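Because the model listing is a plain mapping of names to IDs, a quick membership check can catch a mistyped ID before any model object is created. The sketch below abbreviates the mapping to the two entries used in this notebook; the helper check_model_id is hypothetical and not part of the SDK.

```python
# Subset of the mapping printed by TextModels.show(), copied by hand for this sketch.
AVAILABLE_MODELS = {
    "MISTRAL_MEDIUM_2505": "mistralai/mistral-medium-2505",
    "MISTRAL_SMALL_3_1_24B_INSTRUCT_2503": "mistralai/mistral-small-3-1-24b-instruct-2503",
}


def check_model_id(model_id, available=AVAILABLE_MODELS):
    """Return model_id unchanged if it is a known ID, otherwise raise ValueError."""
    if model_id not in available.values():
        known = ", ".join(sorted(available.values()))
        raise ValueError(f"Unknown model ID {model_id!r}; expected one of: {known}")
    return model_id


checked_id = check_model_id("mistralai/mistral-medium-2505")
```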

Defining the model parameters

You might need to adjust model parameters for different models or tasks. To do so, refer to the documentation for the GenTextParamsMetaNames class.

Action: If you run into any complications, refer to the documentation.

from ibm_watsonx_ai.foundation_models.utils.enums import DecodingMethods
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.SAMPLE.value,
    GenParams.MAX_NEW_TOKENS: 200,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.TEMPERATURE: 0.1,
    GenParams.TOP_K: 50,
    GenParams.TOP_P: 1,
}
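The GenParams constants resolve to plain string keys, as the dict() output of the WatsonxLLM object later in this notebook shows. The sketch below writes the same parameters with literal keys and runs a few sanity checks on them. The helper find_parameter_problems and the rules it applies are assumptions chosen for illustration, not the service's official validation.

```python
# The same parameters written with literal keys, matching the dict() output in this notebook.
PLAIN_PARAMETERS = {
    "decoding_method": "sample",
    "max_new_tokens": 200,
    "min_new_tokens": 1,
    "temperature": 0.1,
    "top_k": 50,
    "top_p": 1,
}


def find_parameter_problems(params):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if params.get("min_new_tokens", 0) > params.get("max_new_tokens", float("inf")):
        problems.append("min_new_tokens exceeds max_new_tokens")
    if params.get("temperature", 0) < 0:
        problems.append("temperature must not be negative")
    if params.get("top_k", 1) < 1:
        problems.append("top_k must be at least 1")
    if not 0 < params.get("top_p", 1) <= 1:
        problems.append("top_p must be in (0, 1]")
    return problems


print(find_parameter_problems(PLAIN_PARAMETERS))
```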

LangChain integration

WatsonxLLM is a wrapper around watsonx.ai models that provides chain integration around the models.

Action: For more details about CustomLLM check the LangChain documentation

Initialize the WatsonxLLM class.

from langchain_ibm import WatsonxLLM

mistral_small_llm = WatsonxLLM(
    model_id=model_id_1,
    url=credentials.url,
    apikey=credentials.api_key,
    space_id=space_id,
    params=parameters,
)
mistral_medium_llm = WatsonxLLM(
    model_id=model_id_2,
    url=credentials.url,
    apikey=credentials.api_key,
    space_id=space_id,
)

You can print the configuration of the WatsonxLLM object using the dict() method.

mistral_small_llm.dict()
{'model_id': 'mistralai/mistral-small-3-1-24b-instruct-2503', 'deployment_id': None, 'params': {'decoding_method': 'sample', 'max_new_tokens': 200, 'min_new_tokens': 1, 'temperature': 0.1, 'top_k': 50, 'top_p': 1}, 'project_id': None, 'space_id': 'fb3d528a-bf16-460e-bcf6-06f05d8ba57c', '_type': 'IBM watsonx.ai'}

Sequential Chain experiment

The simplest type of sequential chain is one in which each step has a single input and a single output, and the output of one step serves as the input for the following step. In LangChain's legacy chain classes this is SimpleSequentialChain; this notebook builds the same flow with the | operator.

The experiment consists of generating a description of a person related to a given topic and then asking a second model to name that person.

An object called PromptTemplate assists in generating prompts using a combination of user input, additional non-static data, and a fixed template string.

In our case we would like to create two PromptTemplate objects: one that asks for the description and one that asks for the name of the person described.

from langchain_core.prompts import PromptTemplate

prompt_1 = PromptTemplate(
    input_variables=["topic"],
    template="Describe a single specific person related to {topic}. The description must fit only this exact person. Do not use their name anywhere in your response. ",
)
prompt_2 = PromptTemplate(
    input_variables=["description"],
    template="{description} The previous text describes a person. Their name is: ",
)
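Conceptually, a PromptTemplate with a single variable behaves much like Python's str.format. The following stdlib-only sketch (not the LangChain implementation) renders both templates by hand, which makes it easy to see the exact text each model receives. The placeholder description string is made up for illustration.

```python
TEMPLATE_1 = (
    "Describe a single specific person related to {topic}. "
    "The description must fit only this exact person. "
    "Do not use their name anywhere in your response. "
)
TEMPLATE_2 = "{description} The previous text describes a person. Their name is: "

# Fill the first template with the topic supplied by the user.
first_prompt = TEMPLATE_1.format(topic="Formula 1")

# Fill the second template with whatever the first model returned
# (a placeholder string stands in for the model output here).
second_prompt = TEMPLATE_2.format(description="<output of the first model>")

print(first_prompt)
print(second_prompt)
```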

In the chain below, prompt_1 is passed to mistral_small_llm, then its response is added to prompt_2, which is then passed to mistral_medium_llm, after which we receive the response:

chain = prompt_1 | mistral_small_llm | prompt_2 | mistral_medium_llm
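To see how the | operator moves data through the chain, here is a small stand-in written in plain Python. The Step class and the fake model functions are hypothetical and exist only for this sketch; they imitate the left-to-right hand-off and are not how LangChain implements its runnables.

```python
class Step:
    """Tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Compose left to right: the output of self becomes the input of other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Each step mirrors one element of the real chain; the "models" just echo their input.
sketch_prompt_1 = Step(lambda inputs: f"Describe a person related to {inputs['topic']}.")
fake_small_llm = Step(lambda prompt: f"[description for: {prompt}]")
sketch_prompt_2 = Step(lambda description: f"{description} Their name is: ")
fake_medium_llm = Step(lambda prompt: f"[name for: {prompt}]")

sketch_chain = sketch_prompt_1 | fake_small_llm | sketch_prompt_2 | fake_medium_llm
sketch_result = sketch_chain.invoke({"topic": "Formula 1"})
print(sketch_result)
```

The nesting in the printed string shows the order of calls: the first prompt is wrapped by the first model, whose output is embedded in the second prompt and passed to the second model.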

Generate a description for a topic and a guess at the person's name.

from langchain.callbacks.tracers import ConsoleCallbackHandler

chain.invoke({"topic": "Formula 1"}, config={"callbacks": [ConsoleCallbackHandler()]})
[chain/start] [chain:RunnableSequence] Entering Chain run with input: { "topic": "Formula 1" } [chain/start] [chain:RunnableSequence > prompt:PromptTemplate] Entering Prompt run with input: { "topic": "Formula 1" } [chain/end] [chain:RunnableSequence > prompt:PromptTemplate] [0ms] Exiting Prompt run with output: [outputs] [llm/start] [chain:RunnableSequence > llm:WatsonxLLM] Entering LLM run with input: { "prompts": [ "Describe a single specific person related to Formula 1. The description must fit only this exact person. Do not use their name anywhere in your response." ] } [llm/end] [chain:RunnableSequence > llm:WatsonxLLM] [2.96s] Exiting LLM run with output: { "generations": [ [ { "text": "1960s Formula 1 driver, born in the United Kingdom, who was the first to win a Grand Prix in a car with a rear-mounted engine. The driver was also the first to win a Grand Prix in a car with a monocoque chassis. The driver was the first to win a Grand Prix in a car with a semi-automatic gearbox. The driver was the first to win a Grand Prix in a car with a carbon-fiber chassis. The driver was the first to win a Grand Prix in a car with a turbocharged engine. The driver was the first to win a Grand Prix in a car with a ground-effect aerodynamics. The driver was the first to win a Grand Prix in a car with a carbon-fiber brake discs. The driver was the first to win a Grand Prix in a car with a traction control system. The driver was the first to win a Grand Prix in a car with a carbon-fiber wishbones. 
The driver", "generation_info": { "finish_reason": "max_tokens" }, "type": "Generation" } ] ], "llm_output": { "token_usage": { "completion_tokens": 200, "prompt_tokens": 33, "total_tokens": 233 }, "model_id": "mistralai/mistral-small-3-1-24b-instruct-2503", "deployment_id": null }, "run": null, "type": "LLMResult" } [chain/start] [chain:RunnableSequence > prompt:PromptTemplate] Entering Prompt run with input: { "input": "1960s Formula 1 driver, born in the United Kingdom, who was the first to win a Grand Prix in a car with a rear-mounted engine. The driver was also the first to win a Grand Prix in a car with a monocoque chassis. The driver was the first to win a Grand Prix in a car with a semi-automatic gearbox. The driver was the first to win a Grand Prix in a car with a carbon-fiber chassis. The driver was the first to win a Grand Prix in a car with a turbocharged engine. The driver was the first to win a Grand Prix in a car with a ground-effect aerodynamics. The driver was the first to win a Grand Prix in a car with a carbon-fiber brake discs. The driver was the first to win a Grand Prix in a car with a traction control system. The driver was the first to win a Grand Prix in a car with a carbon-fiber wishbones. The driver" } [chain/end] [chain:RunnableSequence > prompt:PromptTemplate] [0ms] Exiting Prompt run with output: [outputs] [llm/start] [chain:RunnableSequence > llm:WatsonxLLM] Entering LLM run with input: { "prompts": [ "1960s Formula 1 driver, born in the United Kingdom, who was the first to win a Grand Prix in a car with a rear-mounted engine. The driver was also the first to win a Grand Prix in a car with a monocoque chassis. The driver was the first to win a Grand Prix in a car with a semi-automatic gearbox. The driver was the first to win a Grand Prix in a car with a carbon-fiber chassis. The driver was the first to win a Grand Prix in a car with a turbocharged engine. 
The driver was the first to win a Grand Prix in a car with a ground-effect aerodynamics. The driver was the first to win a Grand Prix in a car with a carbon-fiber brake discs. The driver was the first to win a Grand Prix in a car with a traction control system. The driver was the first to win a Grand Prix in a car with a carbon-fiber wishbones. The driver The previous text describes a person. Their name is:" ] } [llm/end] [chain:RunnableSequence > llm:WatsonxLLM] [793ms] Exiting LLM run with output: { "generations": [ [ { "text": "1960s Formula 1 driver, born in the United Kingdom, who was the first", "generation_info": { "finish_reason": "max_tokens" }, "type": "Generation" } ] ], "llm_output": { "token_usage": { "completion_tokens": 20, "prompt_tokens": 213, "total_tokens": 233 }, "model_id": "mistralai/mistral-medium-2505", "deployment_id": null }, "run": null, "type": "LLMResult" } [chain/end] [chain:RunnableSequence] [3.75s] Exiting Chain run with output: { "output": "1960s Formula 1 driver, born in the United Kingdom, who was the first" }
'1960s Formula 1 driver, born in the United Kingdom, who was the first'
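In the trace above, both generations end with finish_reason "max_tokens", and the second model stops after 20 completion tokens, so the final answer repeats the start of the description instead of giving a name. This is presumably because mistral_medium_llm was created without params and therefore runs with the service defaults. A small helper can flag such cut-off generations. The sketch below works on a dictionary shaped like the LLMResult shown in the trace; the helper name truncated_texts is hypothetical.

```python
def truncated_texts(llm_result):
    """Return the texts of all generations that stopped because of the token limit."""
    cut_off = []
    for candidates in llm_result.get("generations", []):
        for generation in candidates:
            info = generation.get("generation_info") or {}
            if info.get("finish_reason") == "max_tokens":
                cut_off.append(generation["text"])
    return cut_off


# Abbreviated copy of the second LLM result from the trace.
sample_result = {
    "generations": [
        [
            {
                "text": "1960s Formula 1 driver, born in the United Kingdom, who was the first",
                "generation_info": {"finish_reason": "max_tokens"},
                "type": "Generation",
            }
        ]
    ],
    "llm_output": {
        "token_usage": {"completion_tokens": 20, "prompt_tokens": 213, "total_tokens": 233}
    },
}

print(truncated_texts(sample_result))
```

If a generation is flagged, raising GenParams.MAX_NEW_TOKENS in the params passed to that model is the first thing to try.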

AI Service

Let's wrap the chain code in a Python function that can be used to create an AI service.

Function implementation.

The outer function builds the API client, the models, and the chain once; the inner generate function handles each request.

def chain_text_generator(context, url=credentials.url, parameters=parameters):
    from ibm_watsonx_ai import APIClient, Credentials
    from langchain_core.prompts import PromptTemplate
    from langchain_ibm import WatsonxLLM

    api_client = APIClient(
        credentials=Credentials(url=url, token=context.generate_token()),
        space_id=context.get_space_id(),
    )

    mistral_small_llm = WatsonxLLM(
        model_id="mistralai/mistral-small-3-1-24b-instruct-2503",
        params=parameters,
        watsonx_client=api_client,
    )
    mistral_medium_llm = WatsonxLLM(
        model_id="mistralai/mistral-medium-2505",
        watsonx_client=api_client,
    )

    prompt_1 = PromptTemplate(
        input_variables=["topic"],
        template="Describe a single specific person related to {topic}. The description must fit only this exact person. Do not use their name anywhere in your response. ",
    )
    prompt_2 = PromptTemplate(
        input_variables=["description"],
        template="{description} The previous text describes a person. Their name is: ",
    )

    chain = prompt_1 | mistral_small_llm | prompt_2 | mistral_medium_llm

    def generate(context) -> dict:
        """Generates a description of a person based on provided topic and returns guesses on who that might be."""
        api_client.set_token(context.get_token())

        topic = context.get_json()["topic"]
        answer = chain.invoke({"topic": topic})

        return {"body": {"topic": topic, "answer": answer}}

    return generate

Test the function

It is good practice to validate the code locally first.

from ibm_watsonx_ai.deployments import RuntimeContext

context = RuntimeContext(api_client=client)

context.request_payload_json = {"topic": "Football"}

inference = chain_text_generator(context)
inference(context)
{'body': {'topic': 'Football', 'answer': '1. Eric Cantona 2. Paul Gascoigne 3. David Beckham 4'}}
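The request handling can also be exercised without credentials or network access by swapping in stand-ins. In the sketch below, StubContext and make_generate are hypothetical test doubles and are not part of ibm_watsonx_ai; only the get_json method used by the generate function is mimicked, and a simple lambda replaces the real chain.

```python
class StubContext:
    """Minimal stand-in for the runtime context: only serves the request payload."""

    def __init__(self, payload):
        self._payload = payload

    def get_json(self):
        return self._payload


def make_generate(chain_invoke):
    """Build a request handler around any callable that maps inputs to text."""

    def generate(context) -> dict:
        topic = context.get_json()["topic"]
        answer = chain_invoke({"topic": topic})
        # Same response shape as the AI service function in this notebook.
        return {"body": {"topic": topic, "answer": answer}}

    return generate


handler = make_generate(lambda inputs: f"someone related to {inputs['topic']}")
stub_response = handler(StubContext({"topic": "Football"}))
print(stub_response)
```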

Custom inference endpoint

Create an online deployment of the AI service.

Store the function

sw_spec_id = client.software_specifications.get_id_by_name("genai-A25-py3.12")

meta_props = {
    client.repository.FunctionMetaNames.NAME: "SequenceChain LLM AI service",
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
}

ai_service_details = client.repository.store_ai_service(
    chain_text_generator, meta_props
)
ai_service_id = client.repository.get_ai_service_id(ai_service_details)

Create online deployment

metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Deployment of LLMs chain AI service",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
}

ai_service_deployment = client.deployments.create(ai_service_id, meta_props=metadata)
######################################################################################
Synchronous deployment creation for id: '727d35d4-08ea-41b7-b44b-de6b50f51ed7' started
######################################################################################
initializing
Note: online_url and serving_urls are deprecated and will be removed in a future release. Use inference instead.
.....
ready
-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='e0a20388-d51e-4c3d-a54a-da4b0aa5fdb7'
-----------------------------------------------------------------------------------------------

Scoring

Generate text using custom inference endpoint.

deployment_id = client.deployments.get_id(ai_service_deployment)

client.deployments.run_ai_service(deployment_id, {"topic": "Football"})
{'answer': '1. Paul Gascoigne 2. Glenn Hoddle 3. Chris Waddle ', 'topic': 'Football'}
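The answer comes back as one string containing a numbered list of guesses, sometimes cut off after a dangling number. A minimal post-processing sketch, assuming the "1. Name 2. Name" layout seen in the outputs above (other layouts would need different handling), splits it into a Python list. The helper name split_guesses is hypothetical.

```python
import re


def split_guesses(answer):
    """Split a string like '1. A 2. B 3. C ' into ['A', 'B', 'C']."""
    # Drop a trailing number left over when generation was cut off (e.g. '... 4').
    cleaned = re.sub(r"\s+\d+$", "", answer.strip())
    # Split on the 'N.' markers and discard empty fragments.
    parts = re.split(r"\s*\d+\.\s*", cleaned)
    return [part.strip() for part in parts if part.strip()]


deployed_guesses = split_guesses("1. Paul Gascoigne 2. Glenn Hoddle 3. Chris Waddle ")
local_guesses = split_guesses("1. Eric Cantona 2. Paul Gascoigne 3. David Beckham 4")
print(deployed_guesses)
print(local_guesses)
```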

Summary and next steps

You successfully completed this notebook!

You learned how to build a sequential chain using the custom LLM WatsonxLLM.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors:

Lukasz Cmielowski, PhD, Senior Technical Staff Member at watsonx.ai.

Mateusz Szewczyk, Software Engineer at watsonx.ai.

Copyright © 2024-2026 IBM. This notebook and its source code are released under the terms of the MIT License.