GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cpd5.1/notebooks/python_sdk/deployments/foundation_models/Use watsonx, and Google `flan-t5-xxl` to analyze car rental customer satisfaction from text.ipynb
Kernel: Python 3 (ipykernel)


Use watsonx, and Google flan-t5-xxl to analyze car rental customer satisfaction from text

Disclaimers

  • Use only Projects and Spaces that are available in watsonx context.

Notebook content

This notebook contains the steps and code to demonstrate support of text sentiment analysis in watsonx. It introduces commands for data retrieval, model testing and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goal

The goal of this notebook is to demonstrate how to use the flan-t5-xxl model to analyze customer satisfaction from text.

Contents

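This notebook contains the following parts:

  • Set up the environment
  • Data loading
  • Foundation Models on watsonx.ai
  • Analyze the sentiment
  • Score the model
  • Summary and next steps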

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

  • Contact your Cloud Pak for Data administrator and ask them for your account credentials

Install and import ibm-watsonx-ai and its dependencies

Note: ibm-watsonx-ai documentation can be found here.

!pip install wget | tail -n 1
!pip install datasets | tail -n 1
!pip install "scikit-learn==1.3.2" | tail -n 1
!pip install -U ibm-watsonx-ai | tail -n 1

Connection to WML

Authenticate the Watson Machine Learning service on IBM Cloud Pak for Data. You need to provide the platform url, your username, and your api_key.

username = 'PASTE YOUR USERNAME HERE'
api_key = 'PASTE YOUR API_KEY HERE'
url = 'PASTE THE PLATFORM URL HERE'

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    username=username,
    api_key=api_key,
    url=url,
    instance_id="openshift",
    version="5.1"
)

Alternatively you can use username and password to authenticate WML services.

credentials = Credentials(
    username=***,
    password=***,
    url=***,
    instance_id="openshift",
    version="5.1"
)

Defining the project id

The Foundation Model requires a project id that provides the context for the call. The id is read from the project in which this notebook runs. If it is not available there, you are asked to provide the project id.

import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

Data loading

Download the car_rental_training_data dataset. The dataset contains customers' opinions on car rental. Its label takes one of two values: unsatisfied, satisfied.

import wget
import pandas as pd

filename = 'car_rental_training_data.csv'
url = 'https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/data/cars-4-you/car_rental_training_data.csv'

# Download the file only if it is not already present locally.
if not os.path.isfile(filename):
    wget.download(url, out=filename)

data = pd.read_csv("car_rental_training_data.csv", sep=';')
comments = list(data.Customer_Service)
satisfaction = list(data.Satisfaction)

Examine the downloaded data.

data.head()

Define label map.

label_map = {0: "unsatisfied", 1: "satisfied"}

Inspect the distribution of the data labels.

pd.Series(data['Satisfaction']).value_counts()

Satisfaction
1    274
0    212
Name: count, dtype: int64

Prepare train and test sets.

from sklearn.model_selection import train_test_split

data_train, data_test, y_train, y_test = train_test_split(data.Customer_Service,
                                                          data.Satisfaction,
                                                          test_size=0.3,
                                                          random_state=33,
                                                          stratify=data.Satisfaction)
data_train = pd.DataFrame(data_train)
data_test = pd.DataFrame(data_test)

# Add a text label column next to each comment.
data_train["satisfaction"] = list(map(label_map.get, y_train))
data_test["satisfaction"] = list(map(label_map.get, y_test))
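The effect of the stratify argument can be illustrated without scikit-learn. The sketch below is not how train_test_split is implemented. It is a simplified stand-in that splits each label group separately, using the label counts reported for this dataset (274 and 212) with placeholder comment strings. The function name stratified_split and its rounding rule are assumptions made for illustration only.

```python
import random
from collections import Counter, defaultdict


def stratified_split(texts, labels, test_fraction=0.3, seed=33):
    """Split (text, label) pairs so each label keeps a similar share in both parts."""
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    rng = random.Random(seed)
    train, test = [], []
    for label, items in by_label.items():
        items = items[:]          # copy so the caller's data is not shuffled
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)  # illustrative rounding rule
        test += [(text, label) for text in items[:n_test]]
        train += [(text, label) for text in items[n_test:]]
    return train, test


# Placeholder data with the same label counts as the dataset: 274 ones, 212 zeros.
labels = [1] * 274 + [0] * 212
texts = [f"comment {i}" for i in range(len(labels))]

train, test = stratified_split(texts, labels)
print(Counter(label for _, label in train))
print(Counter(label for _, label in test))
```

Both printed counters show roughly the same ratio of satisfied to unsatisfied comments, which is the property that stratification is meant to preserve.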

Foundation Models on watsonx.ai

List available models

All available models are listed in the ModelTypes class. For more information, refer to the documentation.

from ibm_watsonx_ai.foundation_models.utils.enums import ModelTypes

print([model.name for model in ModelTypes])

['FLAN_T5_XXL', 'FLAN_UL2', 'MT0_XXL', 'GPT_NEOX', 'MPT_7B_INSTRUCT2', 'STARCODER', 'LLAMA_2_70B_CHAT', 'LLAMA_2_13B_CHAT', 'GRANITE_13B_INSTRUCT', 'GRANITE_13B_CHAT', 'FLAN_T5_XL', 'GRANITE_13B_CHAT_V2', 'GRANITE_13B_INSTRUCT_V2', 'ELYZA_JAPANESE_LLAMA_2_7B_INSTRUCT', 'MIXTRAL_8X7B_INSTRUCT_V01_Q', 'CODELLAMA_34B_INSTRUCT_HF', 'GRANITE_20B_MULTILINGUAL']

Specify the model_id that will be used for inference:

model_id = ModelTypes.FLAN_T5_XXL

Defining the model parameters

You might need to adjust the model parameters for different models or tasks. To do so, refer to the documentation.

from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

parameters = {
    GenParams.DECODING_METHOD: "greedy"
}

Initialize the model

Initialize the ModelInference class with the previously set parameters.

from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(
    model_id=model_id,
    params=parameters,
    credentials=credentials,
    project_id=project_id)

Model's details

model.get_details()
{'model_id': 'google/flan-t5-xxl', 'label': 'flan-t5-xxl-11b', 'provider': 'Google', 'source': 'Hugging Face', 'functions': [{'id': 'text_generation'}], 'short_description': 'flan-t5-xxl is an 11 billion parameter model based on the Flan-T5 family.', 'long_description': 'flan-t5-xxl (11B) is an 11 billion parameter model based on the Flan-T5 family. It is a pretrained T5 - an encoder-decoder model pre-trained on a mixture of supervised / unsupervised tasks converted into a text-to-text format, and fine-tuned on the Fine-tuned Language Net (FLAN) with instructions for better zero-shot and few-shot performance.', 'tier': 'class_2', 'number_params': '11b', 'min_shot_size': 0, 'task_ids': ['question_answering', 'summarization', 'retrieval_augmented_generation', 'classification', 'generation', 'extraction'], 'tasks': [{'id': 'question_answering', 'ratings': {'quality': 4}}, {'id': 'summarization', 'ratings': {'quality': 4}}, {'id': 'retrieval_augmented_generation', 'ratings': {'quality': 3}}, {'id': 'classification', 'ratings': {'quality': 4}}, {'id': 'generation'}, {'id': 'extraction', 'ratings': {'quality': 4}}], 'lifecycle': [{'id': 'available', 'since_version': '8.0.0', 'current_state': True}]}

Analyze the sentiment

Define instructions for the model.

instruction = "Classify the satisfaction expressed in this sentence using: satisfied, unsatisfied.\n"

Prepare model inputs - build zero-shot examples from the test set.

import json

zero_shot_inputs = [{"input": text} for text in data_test.Customer_Service.values]
print(json.dumps(zero_shot_inputs[:5], indent=2))

[
  {
    "input": "Provide more convenient car pickup from the airport parking."
  },
  {
    "input": "They could really try work harder."
  },
  {
    "input": "the rep was friendly but it was so loud in there that I could not hear what she was saying. I HATE having to walk across a big lot with all of my bags in search of my car which is always in the furthest corner."
  },
  {
    "input": "The agents were not friendly when I checked in initially, that was annoying because I had just spent 3 hours on a plane and wanted to be greeted with a better attitude."
  },
  {
    "input": "It was not as bad as it usually is."
  }
]

Prepare model inputs - build few-shot examples. To build a few-shot example, a few training phrases are passed together with their reference sentiment, and a test phrase is appended at the end.

In this notebook, training phrases are stratified over all possible sentiments for each test case.

few_shot_inputs = []
singleoutput = []

for test_phrase in data_test.Customer_Service.values:
    # Sample two training phrases per sentiment for every test phrase.
    for train_phrase, sentiment in data_train.groupby('satisfaction', group_keys=False).apply(lambda x: x.sample(2)).values:
        singleoutput.append(f"\tsentence:\t{train_phrase}\n\tsatisfaction: {sentiment}\n")
    # End the prompt with the test phrase and an open label for the model to complete.
    singleoutput.append(f"\tsentence:\t{test_phrase}\n\tsatisfaction:")
    few_shot_inputs.append("".join(singleoutput))
    singleoutput = []

Inspect an example few-shot prompt.

print(few_shot_inputs[0])

sentence: Last time I rented a car was when I went skiing with my whole family. We got a Chevy Blazer. We didn't think it was as large as a Ford Explorer, so we asked to switch. The agent was very nice and gave us the Ford Explorer.
satisfaction: satisfied
sentence: The service was polite and professional. I was attended to quickly and courteously.
satisfaction: satisfied
sentence: Do not try sell what I do not need.
satisfaction: unsatisfied
sentence: Most windows were closed.
satisfaction: unsatisfied
sentence: Provide more convenient car pickup from the airport parking.
satisfaction:
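The few-shot construction can also be written as a small reusable function that does not depend on pandas. This is only a sketch of the same idea: the function name build_few_shot_prompt, its per_label and seed parameters, and the hand-written training pairs are assumptions for illustration, not part of the notebook or of ibm-watsonx-ai.

```python
import random


def build_few_shot_prompt(train_examples, test_phrase, per_label=2, seed=None):
    """Build one prompt: per_label labelled examples per sentiment, then the test phrase."""
    rng = random.Random(seed)

    # Group training phrases by their label.
    by_label = {}
    for phrase, label in train_examples:
        by_label.setdefault(label, []).append(phrase)

    parts = []
    for label in sorted(by_label):
        for phrase in rng.sample(by_label[label], per_label):
            parts.append(f"\tsentence:\t{phrase}\n\tsatisfaction: {label}\n")

    # The last entry has no label, so the model is expected to complete it.
    parts.append(f"\tsentence:\t{test_phrase}\n\tsatisfaction:")
    return "".join(parts)


# Hand-written (phrase, label) pairs used only for this illustration.
train_examples = [
    ("The service was polite and professional.", "satisfied"),
    ("The agent was very nice.", "satisfied"),
    ("Pickup was quick and easy.", "satisfied"),
    ("Do not try sell what I do not need.", "unsatisfied"),
    ("Most windows were closed.", "unsatisfied"),
    ("They could really try work harder.", "unsatisfied"),
]

prompt = build_few_shot_prompt(
    train_examples,
    "Provide more convenient car pickup from the airport parking.",
    seed=33,
)
print(prompt)
```

Passing a fixed seed makes the sampled examples reproducible, which the notebook's sampling without a seed does not guarantee.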

Analyze the satisfaction using the Google flan-t5-xxl model.

Analyze the sentiment for a sample of zero-shot inputs from the test set.

results = []

for inp in zero_shot_inputs[:5]:
    # Send the instruction followed by the customer comment and keep the first result.
    results.append(model.generate(" ".join([instruction, inp['input']]))["results"][0])

Explore model output.

print(json.dumps(results, indent=2))
[
  {
    "generated_text": "unsatisfied",
    "generated_token_count": 6,
    "input_token_count": 29,
    "stop_reason": "eos_token"
  },
  {
    "generated_text": "unsatisfied",
    "generated_token_count": 6,
    "input_token_count": 26,
    "stop_reason": "eos_token"
  },
  {
    "generated_text": "unsatisfied",
    "generated_token_count": 6,
    "input_token_count": 71,
    "stop_reason": "eos_token"
  },
  {
    "generated_text": "unsatisfied",
    "generated_token_count": 6,
    "input_token_count": 57,
    "stop_reason": "eos_token"
  },
  {
    "generated_text": "satisfied",
    "generated_token_count": 2,
    "input_token_count": 29,
    "stop_reason": "eos_token"
  }
]
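To see exactly what text each zero-shot request carries, the prompt assembly can be isolated from the model call. The sketch below reuses the instruction string and the joining rule from this notebook; the helper name build_zero_shot_prompt is an assumption introduced for illustration.

```python
instruction = "Classify the satisfaction expressed in this sentence using: satisfied, unsatisfied.\n"


def build_zero_shot_prompt(instruction, text):
    """Join the instruction and the customer comment with a single space."""
    return " ".join([instruction, text])


prompt = build_zero_shot_prompt(instruction, "It was not as bad as it usually is.")
print(repr(prompt))
```

Because the instruction already ends with a newline, the joined prompt places the comment on a new line with one leading space.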

Score the model

Note: To score the model on the whole car rental customer satisfaction test set, convert the following markdown cells to code cells. Keep in mind that scoring the model on the whole test set can take a significant amount of time.

Get the true labels.

y_true = [label for label in data_test.satisfaction[:5]]

Get the sentiment labels returned by the flan-t5-xxl model.

y_pred = [res["generated_text"] for res in results]
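Generated text is free-form, so it may help to normalize it before comparing it with the true labels. The following is a minimal sketch under that assumption: the helper normalize_label and the sample strings are hypothetical, and the outputs shown in this notebook already match the labels exactly.

```python
ALLOWED_LABELS = ("satisfied", "unsatisfied")


def normalize_label(generated_text, allowed=ALLOWED_LABELS):
    """Return the cleaned label, or None when the text is not an allowed label."""
    cleaned = generated_text.strip().strip(".").lower()
    return cleaned if cleaned in allowed else None


# Hypothetical model outputs, not taken from the notebook run.
raw_outputs = ["unsatisfied", " Satisfied.", "neutral"]
normalized = [normalize_label(text) for text in raw_outputs]
print(normalized)
```

Returning None for unexpected text keeps such predictions visible instead of silently counting them as one of the two classes.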

Calculate the accuracy score.

from sklearn.metrics import accuracy_score

# accuracy_score expects the true labels first, then the predictions.
print(accuracy_score(y_true, y_pred))
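Since the two labels are not equally frequent in this dataset, a breakdown per label can complement the single accuracy number. The sketch below counts (true, predicted) pairs with the standard library only. The helper per_label_counts and the demo labels are hypothetical and are not results from this notebook.

```python
from collections import Counter


def per_label_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs, i.e. the cells of a confusion matrix."""
    return Counter(zip(y_true, y_pred))


# Hypothetical labels used only to demonstrate the helper.
y_true_demo = ["unsatisfied", "unsatisfied", "satisfied", "satisfied", "satisfied"]
y_pred_demo = ["unsatisfied", "satisfied", "satisfied", "satisfied", "unsatisfied"]

counts = per_label_counts(y_true_demo, y_pred_demo)
accuracy = sum(n for (true, pred), n in counts.items() if true == pred) / len(y_true_demo)

for (true, pred), n in sorted(counts.items()):
    print(f"true={true:<12} predicted={pred:<12} count={n}")
print("accuracy:", accuracy)
```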

HINT: Sentiments generated using few-shot input prompts might provide better accuracy than the zero-shot ones. The following cell presents the test score obtained with zero-shot prompts for the flan-t5-xxl model on the whole test set from this notebook.

The zero-shot test accuracy score:

0.9178082191780822
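The reported score can be translated into a count of correct predictions. The arithmetic below assumes the test set holds 30% of the 486 labelled rows (274 + 212) and that the split rounds the test size up, which gives 146 test comments. Under these assumptions the score corresponds to 134 correct predictions.

```python
import math

n_samples = 274 + 212                      # label counts reported for the dataset
test_size = math.ceil(0.3 * n_samples)     # assumption: fractional test size is rounded up
reported_accuracy = 0.9178082191780822     # zero-shot score reported above

implied_correct = reported_accuracy * test_size
print(test_size, round(implied_correct))
print(134 / 146)
```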

Summary and next steps

You successfully completed this notebook!

You learned how to analyze car rental customer satisfaction with Google's flan-t5-xxl on watsonx.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors

Mateusz Szewczyk, Software Engineer at Watson Machine Learning.

Copyright © 2023-2025 IBM. This notebook and its source code are released under the terms of the MIT License.