GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cloud/notebooks/python_sdk/deployments/foundation_models/Use watsonx, and Mistral `mistral-medium-2505` to analyze car rental customer satisfaction from text.ipynb


Use watsonx, and Mistral mistral-medium-2505 to analyze car rental customer satisfaction from text.

Disclaimers

  • Use only Projects and Spaces that are available in watsonx context.

Notebook content

This notebook contains the steps and code to demonstrate support of text sentiment analysis in watsonx. It introduces commands for data retrieval, model testing and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goal

The goal of this notebook is to demonstrate how to use the mistral-medium-2505 model to analyze customer satisfaction from text.

Contents

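This notebook contains the following parts:

  • Set up the environment
  • Data loading
  • Foundation Models on watsonx.ai
  • Analyze the sentiment
  • Score the model
  • Summary and next steps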

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

Install and import dependencies

Note: ibm-watsonx-ai documentation can be found here.

%pip install wget | tail -n 1
%pip install datasets | tail -n 1
%pip install "scikit-learn==1.3.2" | tail -n 1
%pip install -U ibm-watsonx-ai | tail -n 1

Successfully installed wget-3.2
Successfully installed aiohappyeyeballs-2.6.1 aiohttp-3.13.2 aiosignal-1.4.0 anyio-4.11.0 attrs-25.4.0 certifi-2025.10.5 charset_normalizer-3.4.4 click-8.3.0 datasets-4.3.0 dill-0.4.0 filelock-3.20.0 frozenlist-1.8.0 fsspec-2025.9.0 h11-0.16.0 hf-xet-1.2.0 httpcore-1.0.9 httpx-0.28.1 huggingface-hub-1.0.1 idna-3.11 multidict-6.7.0 multiprocess-0.70.16 numpy-2.3.4 pandas-2.3.3 propcache-0.4.1 pyarrow-22.0.0 pytz-2025.2 pyyaml-6.0.3 requests-2.32.5 shellingham-1.5.4 sniffio-1.3.1 tqdm-4.67.1 typer-slim-0.20.0 tzdata-2025.2 urllib3-2.5.0 xxhash-3.6.0 yarl-1.22.0
Successfully installed joblib-1.5.2 numpy-1.26.4 scikit-learn-1.3.2 scipy-1.16.3 threadpoolctl-3.6.0
Successfully installed cachetools-6.2.1 ibm-cos-sdk-2.14.3 ibm-cos-sdk-core-2.14.3 ibm-cos-sdk-s3transfer-2.14.3 ibm-watsonx-ai-1.4.4 jmespath-1.0.1 lomond-0.3.3 pandas-2.2.3 tabulate-0.9.0

Defining the watsonx.ai credentials

This cell defines the watsonx.ai credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see documentation.

import getpass

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Please enter your watsonx.ai api key (hit enter): "),
)

Defining the project id

The Foundation Model requires a project ID that provides the context for the call. The notebook first tries to read the ID from the PROJECT_ID environment variable of the project in which it runs. If the variable is not set, you are prompted to enter the project ID.

import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

from ibm_watsonx_ai import APIClient

api_client = APIClient(credentials=credentials, project_id=project_id)

Data loading

Download the car_rental_training_data dataset. The dataset provides insight into customers' opinions on car rental. Its Satisfaction label takes two values: 0 (unsatisfied) and 1 (satisfied).

import pandas as pd
import wget

filename = "car_rental_training_data.csv"
url = "https://raw.githubusercontent.com/IBM/watsonx-ai-samples/master/cloud/data/cars-4-you/car_rental_training_data.csv"

if not os.path.isfile(filename):
    wget.download(url, out=filename)

data = pd.read_csv("car_rental_training_data.csv", sep=";")
comments = list(data.Customer_Service)
satisfaction = list(data.Satisfaction)
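The `sep=";"` argument matters because the comments themselves can contain commas. As a minimal sketch of the same parsing using only the standard library, the two rows below are a hypothetical miniature of the file (the phrases and labels are taken from the few-shot prompt printed later in this notebook; the real file may contain additional columns):

```python
import csv
import io

# Hypothetical two-row sample in the same semicolon-delimited layout.
sample = io.StringIO(
    "Customer_Service;Satisfaction\n"
    "Customer service was fine;1\n"
    "They had problem to find reservation number, so whole process took too long time.;0\n"
)

# delimiter=";" keeps the comma inside the second comment from splitting the field.
rows = list(csv.DictReader(sample, delimiter=";"))
comments = [row["Customer_Service"] for row in rows]
satisfaction = [int(row["Satisfaction"]) for row in rows]

print(comments)
print(satisfaction)
```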

Examine the downloaded data.

data.head()

Define label map.

label_map = {0: "unsatisfied", 1: "satisfied"}

Inspect data labels distribution.

data.value_counts("Satisfaction")
Satisfaction
1    274
0    212
Name: count, dtype: int64
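The counts above give a useful reference point for any accuracy figure reported later. The short sketch below turns them into class shares and into the accuracy of a trivial classifier that always predicts the majority class; the variable names are illustrative and not part of the notebook.

```python
# Label counts taken from the value_counts output above.
counts = {1: 274, 0: 212}
total = sum(counts.values())

for label, n in counts.items():
    print(f"label {label}: {n} rows ({n / total:.1%})")

# Accuracy of a trivial classifier that always predicts the majority class.
majority_baseline = max(counts.values()) / total
print(f"majority-class baseline accuracy: {majority_baseline:.4f}")
```

A model is only adding value on this dataset if it scores clearly above that baseline.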

Prepare train and test sets.

from sklearn.model_selection import train_test_split

data_train, data_test, y_train, y_test = train_test_split(
    data.Customer_Service,
    data.Satisfaction,
    test_size=0.3,
    random_state=33,
    stratify=data.Satisfaction,
)

data_train = pd.DataFrame(data_train)
data_test = pd.DataFrame(data_test)

data_train["satisfaction"] = list(map(label_map.get, y_train))
data_test["satisfaction"] = list(map(label_map.get, y_test))
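To make the effect of `test_size=0.3` and `stratify` concrete, the sketch below estimates the split sizes from the label counts printed earlier. It assumes that scikit-learn rounds a fractional test size up and gives the remainder to the training set; the per-class figures are expectations only, since the exact integer allocation is decided by the library.

```python
import math

counts = {"satisfied": 274, "unsatisfied": 212}  # from the label counts above
n_samples = sum(counts.values())
test_size = 0.3

# Assumption: fractional test sizes are rounded up, the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test
print(f"train rows: {n_train}, test rows: {n_test}")

# Stratification keeps each class share roughly equal in both subsets.
for label, n in counts.items():
    expected = n / n_samples * n_test
    print(f"expected {label} rows in the test set: about {expected:.1f}")
```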

Foundation Models on watsonx.ai

List available models

All available models are listed under the TextModels class. For more information, refer to the documentation.

print([model.name for model in api_client.foundation_models.TextModels])
['DEFENCE_GRANITE', 'GRANITE_3_2_8B_INSTRUCT', 'GRANITE_3_2B_INSTRUCT', 'GRANITE_3_3_8B_INSTRUCT', 'GRANITE_3_8B_INSTRUCT', 'GRANITE_4_H_SMALL', 'GRANITE_8B_CODE_INSTRUCT', 'GRANITE_GUARDIAN_3_8B', 'GRANITE_VISION_3_2_2B', 'LLAMA_3_2_11B_VISION_INSTRUCT', 'LLAMA_3_2_90B_VISION_INSTRUCT', 'LLAMA_3_3_70B_INSTRUCT', 'LLAMA_3_405B_INSTRUCT', 'LLAMA_4_MAVERICK_17B_128E_INSTRUCT_FP8', 'LLAMA_GUARD_3_11B_VISION', 'MISTRAL_MEDIUM_2505', 'MISTRAL_SMALL_3_1_24B_INSTRUCT_2503', 'GPT_OSS_120B', 'ALLAM_1_13B_INSTRUCT']

You need to specify model_id that will be used for inferencing:

model_id = api_client.foundation_models.TextModels.MISTRAL_MEDIUM_2505

Defining the model parameters

You might need to adjust the model parameters for different models or tasks. To do so, refer to the documentation.

from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

parameters = {GenParams.DECODING_METHOD: "greedy"}
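For a classification task, the answer is a single word, so it can help to cap the generation length and stop at the first line break. The following is only a sketch of such a parameter set, not the notebook's configuration. It uses plain string keys on the assumption that the GenParams constants resolve to them (for example, that `GenParams.DECODING_METHOD` equals `"decoding_method"`), and the chosen values are hypothetical.

```python
# Assumption: these lowercase keys are what the GenParams constants resolve to.
short_answer_parameters = {
    "decoding_method": "greedy",
    "max_new_tokens": 5,        # hypothetical cap; a label needs only a few tokens
    "stop_sequences": ["\n"],   # hypothetical: stop at the first line break
}

print(short_answer_parameters)
```

A stop sequence of this kind fits the few-shot prompt format used later, where each answer ends with a line break; check the values against the parameter documentation before relying on them.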

Initialize the model

Initialize the ModelInference class with the previously set parameters.

from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id=model_id, params=parameters, credentials=credentials, project_id=project_id
)

Model's details

model.get_details()

Analyze the sentiment

Define instructions for the model.

instruction = "Classify the satisfaction expressed in this sentence using: satisfied, unsatisfied.\n"

Prepare model inputs - build zero-shot examples from the test set.

import json

zero_shot_inputs = [{"input": text} for text in data_test.Customer_Service.values]
print(json.dumps(zero_shot_inputs[:5], indent=2))

[
  {
    "input": "Provide more convenient car pickup from the airport parking."
  },
  {
    "input": "They could really try work harder."
  },
  {
    "input": "the rep was friendly but it was so loud in there that I could not hear what she was saying. I HATE having to walk across a big lot with all of my bags in search of my car which is always in the furthest corner."
  },
  {
    "input": "The agents were not friendly when I checked in initially, that was annoying because I had just spent 3 hours on a plane and wanted to be greeted with a better attitude."
  },
  {
    "input": "It was not as bad as it usually is."
  }
]
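The inputs above are not yet complete prompts: the instruction still has to be prepended. The sketch below reproduces the joining expression used in this notebook's generation loop on two of the inputs, so the exact text sent to the model can be inspected. Only the variable name `prompts` is introduced here.

```python
instruction = (
    "Classify the satisfaction expressed in this sentence using: "
    "satisfied, unsatisfied.\n"
)
zero_shot_inputs = [
    {"input": "Provide more convenient car pickup from the airport parking."},
    {"input": "It was not as bad as it usually is."},
]

# Same joining expression as the notebook's generation loop.
prompts = [" ".join([instruction, item["input"]]) for item in zero_shot_inputs]

# repr() makes the newline and the space inserted by the join visible.
print(repr(prompts[0]))
```

Note that the prompt ends with the customer's sentence and carries no explicit answer cue, which may be one reason a model continues the sentence before it emits a label.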

Prepare model inputs - build few-shot examples. To build a few-shot example, a few phrases from the training data are passed together with their reference sentiment, and a test data phrase is then appended at the end.

In this notebook, training phrases are stratified over all possible sentiments for each test case.

few_shot_inputs = []
singleoutput = []

for test_phrase in data_test.Customer_Service.values:
    for train_phrase, sentiment in (
        data_train.groupby("satisfaction", group_keys=False)
        .apply(lambda x: x.sample(2))
        .values
    ):
        singleoutput.append(
            f"\tsentence:\t{train_phrase}\n\tsatisfaction: {sentiment}\n"
        )
    singleoutput.append(f"\tsentence:\t{test_phrase}\n\tsatisfaction:")
    few_shot_inputs.append("".join(singleoutput))
    singleoutput = []

Inspect an exemplary few-shot prompt.

print(few_shot_inputs[0])

	sentence:	Customer service was fine
	satisfaction: satisfied
	sentence:	they were trying to satisfy me even though my wife tells me I was being rude
	satisfaction: satisfied
	sentence:	I do not understand why I have to pay additional fee if vehicle is returned without a full tank.
	satisfaction: unsatisfied
	sentence:	They had problem to find reservation number, so whole process took too long time.
	satisfaction: unsatisfied
	sentence:	Provide more convenient car pickup from the airport parking.
	satisfaction:
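The few-shot construction can also be sketched without pandas, which makes the stratified sampling step explicit. The helper name `build_few_shot_prompt`, the seeded generator, and all training phrases except the first are hypothetical and serve only as illustration.

```python
import random

# Hypothetical miniature training set: (phrase, label) pairs.
train = [
    ("Customer service was fine", "satisfied"),
    ("The agent was quick and helpful", "satisfied"),
    ("Pickup was easy", "satisfied"),
    ("The whole process took too long", "unsatisfied"),
    ("I was charged an unexpected fee", "unsatisfied"),
    ("Nobody answered the phone", "unsatisfied"),
]


def build_few_shot_prompt(test_phrase, train, per_label=2, rng=random):
    """Sample per_label examples from every label, then append the test phrase."""
    by_label = {}
    for phrase, label in train:
        by_label.setdefault(label, []).append(phrase)

    parts = []
    for label in sorted(by_label):
        for phrase in rng.sample(by_label[label], per_label):
            parts.append(f"\tsentence:\t{phrase}\n\tsatisfaction: {label}\n")
    # The prompt ends with an open "satisfaction:" cue for the model to complete.
    parts.append(f"\tsentence:\t{test_phrase}\n\tsatisfaction:")
    return "".join(parts)


rng = random.Random(33)  # seeded so the sampled examples are reproducible
prompt = build_few_shot_prompt("It was not as bad as it usually is.", train, rng=rng)
print(prompt)
```

Sampling per label guarantees that every prompt shows the model both possible answers, regardless of how the classes are distributed in the training data.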

Analyze the satisfaction using the Mistral mistral-medium-2505 model.

Analyze the sentiment for a sample of zero-shot inputs from the test set.

results = []

for inp in zero_shot_inputs[:5]:
    results.append(model.generate(" ".join([instruction, inp["input"]]))["results"][0])

Explore model output.

print(json.dumps(results, indent=2))
[
  {
    "generated_text": " On the other hand, the parking facility is inadequate.\"\n2 : Unsatisfied .\nConversation:\n",
    "generated_token_count": 20,
    "input_token_count": 26,
    "stop_reason": "max_tokens"
  },
  {
    "generated_text": " = unsatisfied.\nCreate a Syntax tree for the sentence below:\nIt's just blowing through temperature gau",
    "generated_token_count": 20,
    "input_token_count": 23,
    "stop_reason": "max_tokens"
  },
  {
    "generated_text": "*\n\nThe satisfaction expressed in this sentence would be classified as **unsatisfied**.\n\nThe tone of the",
    "generated_token_count": 20,
    "input_token_count": 66,
    "stop_reason": "max_tokens"
  },
  {
    "generated_text": " Also my room had a faint odor of smoke making it uncomfortable.\nunsatisfied.\n```",
    "generated_token_count": 19,
    "input_token_count": 51,
    "stop_reason": "eos_token"
  },
  {
    "generated_text": " It was not a big issue.\n\n1. satisfied\n2. unsatisfied\n\nIn my opinion the",
    "generated_token_count": 20,
    "input_token_count": 26,
    "stop_reason": "max_tokens"
  }
]

Score the model

Note: To run the Score section on the whole car rental customer satisfaction test set, convert the following markdown cells to code cells. Keep in mind that scoring the model on the whole test set can take a significant amount of time.

Get the true labels.

y_true = [label for label in data_test.satisfaction[:5]]

Get the sentiment labels returned by the mistral-medium-2505 model.

y_pred = [res["generated_text"] for res in results]
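The raw `generated_text` values shown in the model output earlier contain extra words around the label, so comparing them directly with the true labels would rarely match. One possible way to normalize them is sketched below. The helper `extract_label` and the `"unknown"` fallback are hypothetical, and taking the first label word found is a heuristic, not the notebook's method.

```python
import re

# "unsatisfied" contains "satisfied", so match whole words only.
LABEL_PATTERN = re.compile(r"\b(unsatisfied|satisfied)\b", re.IGNORECASE)


def extract_label(generated_text, default="unknown"):
    """Return the first label word found in the text, lowercased."""
    match = LABEL_PATTERN.search(generated_text)
    return match.group(1).lower() if match else default


# Generated texts copied from the model output shown earlier.
samples = [
    ' On the other hand, the parking facility is inadequate."\n2 : Unsatisfied .\nConversation:\n',
    " = unsatisfied.\nCreate a Syntax tree for the sentence below:\nIt's just blowing through temperature gau",
    " It was not a big issue.\n\n1. satisfied\n2. unsatisfied\n\nIn my opinion the",
]

print([extract_label(text) for text in samples])
# → ['unsatisfied', 'unsatisfied', 'satisfied']
```

The third sample shows the limit of the heuristic: the model lists both options, and the first one wins regardless of which the model would have chosen.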

Calculate the accuracy score.

from sklearn.metrics import accuracy_score

print(accuracy_score(y_true, y_pred))

HINT: Sentiments generated using few-shot input prompts might provide better accuracy than the zero-shot ones. The following cells present the test score obtained with zero-shot prompts for the mistral-medium-2505 model on the whole test set from this notebook.

The zero-shot test accuracy score:

0.9178082191780822
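The reported score can be read as a count of correct predictions. The sketch below assumes the test set holds 146 rows, which is what a 30% split of the 486-row dataset yields if scikit-learn rounds the test size up; under that assumption the figure corresponds to a whole number of correct answers.

```python
import math

reported_accuracy = 0.9178082191780822

# Assumption: the test set holds ceil(0.3 * 486) rows.
n_test = math.ceil(0.3 * 486)
n_correct = round(reported_accuracy * n_test)

print(n_correct, "of", n_test)
# Check that the fraction reproduces the reported score.
print(math.isclose(n_correct / n_test, reported_accuracy))
```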

Summary and next steps

You successfully completed this notebook!

You learned how to analyze car rental customer satisfaction with Mistral's mistral-medium-2505 on watsonx.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors

Mateusz Szewczyk, Software Engineer at watsonx.ai.

Copyright © 2023-2026 IBM. This notebook and its source code are released under the terms of the MIT License.