GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cpd4.8/notebooks/python_sdk/deployments/foundation_models/Use watsonx, and `granite-13b-instruct` to analyze car rental customer satisfaction from text.ipynb
Kernel: Python 3 (ipykernel)


Use watsonx and ibm/granite-13b-instruct-v2 to analyze car rental customer satisfaction from text

Disclaimers

  • Use only Projects and Spaces that are available in the watsonx context.

Notebook content

This notebook contains the steps and code to demonstrate support of text sentiment analysis in watsonx. It introduces commands for data retrieval, model testing and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.10.

Learning goal

The goal of this notebook is to demonstrate how to use ibm/granite-13b-instruct-v2 model to analyze customer satisfaction from text.

Contents

This notebook contains the following parts:

  • Set up the environment
  • Data loading
  • Foundation Models on watsonx.ai
  • Analyze the satisfaction
  • Summary and next steps

Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

  • Contact your Cloud Pak for Data administrator and ask for your account credentials (see the sketch below for one way to keep them out of the notebook source).
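
If you prefer not to paste credentials into a cell, one option is to export them as environment variables before starting the notebook and read them at runtime. This is only a sketch; the variable names WML_USERNAME, WML_API_KEY, and WML_URL are an assumption, not part of the sample.

import os

# Hypothetical variable names -- use whatever your team has standardized on.
username = os.environ.get("WML_USERNAME")
api_key = os.environ.get("WML_API_KEY")
url = os.environ.get("WML_URL")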

Install and import the datasets and dependencies

!pip install wget | tail -n 1
!pip install datasets | tail -n 1
!pip install scikit-learn | tail -n 1
!pip install "ibm-watson-machine-learning>=1.0.348" | tail -n 1

Connection to WML

Authenticate the Watson Machine Learning service on IBM Cloud Pak for Data. You need to provide the platform url, your username, and your api_key.

username = 'PASTE YOUR USERNAME HERE'
api_key = 'PASTE YOUR API_KEY HERE'
url = 'PASTE THE PLATFORM URL HERE'

wml_credentials = {
    "username": username,
    "apikey": api_key,
    "url": url,
    "instance_id": 'openshift',
    "version": '4.8'
}

Alternatively, you can use a username and password to authenticate WML services.

wml_credentials = {
    "username": ***,
    "password": ***,
    "url": ***,
    "instance_id": 'openshift',
    "version": '4.8'
}
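
As an optional sanity check, you can instantiate an APIClient with the same credentials before going further. This is a sketch and assumes the ibm-watson-machine-learning package installed above.

from ibm_watson_machine_learning import APIClient

# Constructing the client authenticates against the platform, so a typo in the
# username, api_key, or url should surface here rather than later during inference.
client = APIClient(wml_credentials)
print(client.version)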

Defining the project id

The Foundation Model requires a project id that provides the context for the call. We will obtain the id from the project in which this notebook runs; otherwise, please provide the project id.

import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Please enter your project_id (hit enter): ")

Data loading

Download the car_rental_training_data dataset. The dataset provides insight into customers' opinions on car rental. It has a label, Satisfaction, whose values indicate whether the customer was unsatisfied (0) or satisfied (1).

import wget
import pandas as pd

filename = 'car_rental_training_data.csv'
url = 'https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/data/cars-4-you/car_rental_training_data.csv'

if not os.path.isfile(filename):
    wget.download(url, out=filename)

df = pd.read_csv("car_rental_training_data.csv", sep=';')
data = df[['Customer_Service', 'Satisfaction']]

Examine the downloaded data.

data.head()
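
Before splitting the data, it can be useful to check how the two satisfaction labels are distributed and whether any values are missing. This is a small optional sketch using standard pandas calls.

# Count how many reviews carry each Satisfaction label (0 = unsatisfied, 1 = satisfied).
print(data['Satisfaction'].value_counts())

# Check for missing values in the two columns we keep.
print(data.isna().sum())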

Prepare train and test sets.

from sklearn.model_selection import train_test_split

train, test = train_test_split(data, test_size=0.2)
comments = list(test.Customer_Service)
satisfaction = list(test.Satisfaction)
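
If you want the scores below to be reproducible across runs, you can fix the seed of the split. This is an optional sketch; the notebook itself leaves the split random, so your sampled comments and accuracy may differ.

# Passing random_state makes the train/test split deterministic,
# so repeated runs score the model on the same comments.
train, test = train_test_split(data, test_size=0.2, random_state=42)
comments = list(test.Customer_Service)
satisfaction = list(test.Satisfaction)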

Foundation Models on watsonx.ai

List available models

All available models are presented under the ModelTypes class. For more information, refer to the documentation.

from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes

print([model.name for model in ModelTypes])
['FLAN_T5_XXL', 'FLAN_UL2', 'MT0_XXL', 'GPT_NEOX', 'MPT_7B_INSTRUCT2', 'STARCODER', 'LLAMA_2_70B_CHAT', 'LLAMA_2_13B_CHAT', 'GRANITE_13B_INSTRUCT', 'GRANITE_13B_CHAT', 'FLAN_T5_XL', 'GRANITE_13B_CHAT_V2', 'GRANITE_13B_INSTRUCT_V2', 'ELYZA_JAPANESE_LLAMA_2_7B_INSTRUCT']

You need to specify the model_id that will be used for inferencing:

model_id = ModelTypes.GRANITE_13B_INSTRUCT_V2

Defining the model parameters

You might need to adjust model parameters for different models or tasks; to do so, please refer to the documentation.

from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods

parameters = {
    GenParams.MIN_NEW_TOKENS: 0,
    GenParams.MAX_NEW_TOKENS: 1,
    GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
    GenParams.REPETITION_PENALTY: 1
}
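
The greedy, one-token setup above fits a yes/no classification. For more open-ended generation you could switch to sampling instead; the block below is only a sketch, and the parameters a given model accepts may vary, so check the documentation.

# Alternative, sampling-based parameters for longer, less deterministic outputs.
sampling_parameters = {
    GenParams.DECODING_METHOD: DecodingMethods.SAMPLE,
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.MAX_NEW_TOKENS: 50,
    GenParams.TEMPERATURE: 0.7,
    GenParams.TOP_K: 50,
    GenParams.TOP_P: 1
}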

Initialize the model

Initialize the ModelInference class with the previously set parameters.

from ibm_watson_machine_learning.foundation_models import ModelInference

model = ModelInference(
    model_id=model_id,
    params=parameters,
    credentials=wml_credentials,
    project_id=project_id)

Model's details

model.get_details()
{'model_id': 'ibm/granite-13b-instruct-v2', 'label': 'granite-13b-instruct-v2', 'provider': 'IBM', 'source': 'IBM', 'short_description': 'The Granite model series is a family of IBM-trained, dense decoder-only models, which are particularly well-suited for generative tasks.', 'long_description': 'Granite models are designed to be used for a wide range of generative and non-generative tasks with appropriate prompt engineering. They employ a GPT-style decoder-only architecture, with additional innovations from IBM Research and the open community.', 'tier': 'class_2', 'number_params': '13b', 'min_shot_size': 0, 'task_ids': ['question_answering', 'summarization', 'classification', 'generation', 'extraction'], 'tasks': [{'id': 'question_answering', 'ratings': {'quality': 3}, 'tags': ['function_prompt_tune_trainable']}, {'id': 'summarization', 'ratings': {'quality': 2}, 'tags': ['function_prompt_tune_trainable']}, {'id': 'retrieval_augmented_generation', 'ratings': {'quality': 2}, 'tags': ['function_prompt_tune_trainable']}, {'id': 'classification', 'ratings': {'quality': 3}, 'tags': ['function_prompt_tune_trainable']}, {'id': 'generation', 'tags': ['function_prompt_tune_trainable']}, {'id': 'extraction', 'ratings': {'quality': 2}, 'tags': ['function_prompt_tune_trainable']}], 'model_limits': {'max_sequence_length': 8192}, 'limits': {'lite': {'max_output_tokens': 8192}, 'v2-professional': {'max_output_tokens': 8192}, 'v2-standard': {'max_output_tokens': 8192}}}

Analyze the satisfaction

Prepare prompt and generate text

instruction = """Determine if the customer was satisfied with the experience based on the comment. Return simple yes or no. Comment:The car was broken. They couldn't find a replacement. I've waster over 2 hours. Satisfied:no"""
prompt1 = "\n".join([instruction, "Comment:" + comments[2], "Satisfied:"]) print(prompt1)
Determine if the customer was satisfied with the experience based on the comment. Return simple yes or no.
Comment:The car was broken. They couldn't find a replacement. I've waster over 2 hours.
Satisfied:no
Comment:I thought that they were very short and not very friendly. I felt like they hated their job and could care less about the customer.
Satisfied:

Analyze the sentiment of a sample comment from the test set (the instruction already contains one labeled example, so this is a one-shot prompt).

print(model.generate_text(prompt=prompt1))
no
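
If you also want generation metadata, such as the number of generated tokens or the stop reason, you can call generate() instead of generate_text() and inspect the raw response. This is a sketch; the exact fields returned may differ between service versions.

# generate() returns the full response dictionary rather than only the text.
response = model.generate(prompt=prompt1)
result = response['results'][0]

print(result['generated_text'])
print(result.get('generated_token_count'))
print(result.get('stop_reason'))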

Calculate the accuracy

sample_size = 10
prompts_batch = ["\n".join([instruction, "Comment:" + comment, "Satisfied:"]) for comment in comments[:sample_size]]
results = model.generate_text(prompt=prompts_batch)
print(prompts_batch[0])
Determine if the customer was satisfied with the experience based on the comment. Return simple yes or no.
Comment:The car was broken. They couldn't find a replacement. I've waster over 2 hours.
Satisfied:no
Comment:Excellent response dealing with child seat.
Satisfied:

Score the model

from sklearn.metrics import accuracy_score

label_map = {0: "no", 1: "yes"}
y_true = [label_map[sat] for sat in satisfaction][:sample_size]

print('accuracy_score', accuracy_score(y_true, results))
accuracy_score 0.8
print('true', y_true, '\npred', results)
true ['yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes', 'yes', 'no']
pred ['no', 'no', 'no', 'yes', 'yes', 'no', 'no', 'no', 'yes', 'no']
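
For a slightly richer view than a single accuracy number, you can also print a per-class breakdown and a confusion matrix with scikit-learn. This sketch reuses the y_true and results lists from the scoring cell above.

from sklearn.metrics import classification_report, confusion_matrix

# Precision, recall, and F1 per class for the scored comments.
print(classification_report(y_true, results, labels=["no", "yes"]))

# Rows are true labels, columns are predicted labels.
print(confusion_matrix(y_true, results, labels=["no", "yes"]))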

Summary and next steps

You successfully completed this notebook!

You learned how to analyze car rental customer satisfaction with a watsonx.ai foundation model.

Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors

Mateusz Szewczyk, Software Engineer at Watson Machine Learning.

Lukasz Cmielowski, PhD, is an Automation Architect and Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients' ability to turn data into actionable knowledge.

Copyright © 2024-2025 IBM. This notebook and its source code are released under the terms of the MIT License.