GitHub Repository: CloudPak-Outcomes/Outcomes-Projects
Path: blob/main/L4assets/DSandMLOpsAssets/HandsOn/Notebooks/DS Model evaluation.ipynb

Kernel: Python 3.10

Model evaluation

In this notebook we test the models on data that was not used in the model creation.

CPDaaS: Make sure to first insert a "project token"

Click on the three vertical dots icon in the upper right of the screen, then click on Insert project token.

Once inserted, execute the cell.

A project token is only available if you followed the prerequisite instructions to create one in your project.

WML python library

The ibm-watson-machine-learning library is already part of the runtime environment, so it does not need to be installed with the command:

!pip install ibm-watson-machine-learning

At the time of this writing, the library supports Cloud Pak for Data as a Service as well as Cloud Pak for Data versions 3.5, 4.0, 4.5, and 4.6.
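To confirm which version of the library the runtime actually provides, a minimal sketch using only the standard library follows. The helper name installed_version is hypothetical; importlib.metadata is part of Python 3.8 and later.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The distribution name below is the one used in the pip command above.
print(installed_version("ibm-watson-machine-learning"))
```

If the call prints None, the library is not present in the current environment and the pip command would be needed after all.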

For more information on the Python libraries see:

Get a WML connection

Replace the value of cpd_url with the proper cloud region or the proper cluster URL
Set the value of API_key to your API key

In [ ]:

# cluster URL, make sure it ends with "/", and no "zen" ending
#cpd_url = "https://cpd-cpd.ai-governance-12345a678e90addd123c4567c8f9a012-3456.us-east.containers.appdomain.cloud/"
cpd_url = "https://us-south.ml.cloud.ibm.com"
API_key = "<YOUR_API_KEY>" # either CPD or CPDaaS
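The comment about the cluster URL can be turned into a small clean-up step. This is only a sketch: the helper name normalize_cluster_url and the example host are hypothetical, and it assumes the "zen" ending refers to a trailing /zen path segment copied from the browser address bar.

```python
def normalize_cluster_url(url):
    """Drop a trailing "zen" path segment and make sure the URL ends with "/"."""
    cleaned = url.strip().rstrip("/")
    if cleaned.endswith("/zen"):
        cleaned = cleaned[: -len("/zen")]
    return cleaned + "/"


# Hypothetical host name, used only to illustrate the clean-up.
print(normalize_cluster_url("https://cpd.example.com/zen/"))
```

The helper is meant for a Cloud Pak for Data cluster URL; the cloud region URL used for CPDaaS can be left as it is.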

Create a Watson Machine Learning (WML) client connection

In [ ]:

import os
import json
import pandas as pd
from sklearn.metrics import mean_squared_error
from ibm_watson_studio_lib import access_project_or_space

from ibm_watson_machine_learning import APIClient

if "USER_ID" in os.environ :
    wslib = access_project_or_space()
    wml_credentials = {
                   "url": cpd_url,
                   "username": "<USERNAME>",
                   "apikey" : API_key,
                   "instance_id": "openshift",
                   "version" : "4.0"
                  }
else :
    wml_credentials = {
                   "url": cpd_url,
                   "apikey": API_key
                  }

client = APIClient(wml_credentials)
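The two credential shapes used above can also be produced by one helper that refuses a placeholder key. This is a sketch, not the library's API: build_wml_credentials and the WML_API_KEY environment variable are hypothetical names, and the dictionary keys are copied from the cell above.

```python
import os


def build_wml_credentials(url, api_key, username=None):
    """Return the credentials dictionary in one of the two shapes used above."""
    if not api_key or api_key.startswith("<"):
        raise ValueError("Set a real API key before creating the client")
    credentials = {"url": url, "apikey": api_key}
    if username is not None:
        # Cloud Pak for Data (software) also needs these entries.
        credentials.update(
            {"username": username, "instance_id": "openshift", "version": "4.0"}
        )
    return credentials


# WML_API_KEY is a hypothetical environment variable name.
api_key = os.environ.get("WML_API_KEY") or "example-key"
print(sorted(build_wml_credentials("https://us-south.ml.cloud.ibm.com", api_key)))
```

Reading the key from the environment keeps it out of the saved notebook; the resulting dictionary would be passed to APIClient exactly as wml_credentials is.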

Listing project models

Also get the model details

In [ ]:

# Need to set either the space id (deployment) or the project id
# for the next few cells, we need to set it to the project ID
## space_details = client.spaces.get_details()
## space_uid = client.spaces.get_id(space_details["resources"][0]) # deployment space
## client.set.default_space(space_uid)
client.set.default_project(os.environ["PROJECT_ID"])

In [ ]:

models_details = client.repository.get_model_details()
print("\n".join([item['metadata']['name'] for item in models_details['resources']]))

In [ ]:

models_pd = client.repository.list_models()
AutoAI_id = models_pd.loc[models_pd['NAME'].str.startswith("AutoAI")][["ID"]].reset_index().loc[0]['ID']
SPSS_id = models_pd.loc[models_pd['NAME'].str.startswith("SPSS")][["ID"]].reset_index().loc[0]['ID']
print("AutoAI id: {}\nSPSS id: {}".format(AutoAI_id,SPSS_id))
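The lookup above fails with a KeyError that is hard to interpret when no model name matches the prefix. A minimal sketch of the same lookup over the details dictionary, with a clearer error, could look like this. The helper name find_model_id and the sample entries are hypothetical, and it assumes each entry stores its ID under metadata, next to the name.

```python
def find_model_id(resources, prefix):
    """Return the ID of the first model whose name starts with prefix."""
    for item in resources:
        if item["metadata"]["name"].startswith(prefix):
            return item["metadata"]["id"]
    raise LookupError(f"No model name starts with {prefix!r}")


# Hypothetical entries shaped like models_details['resources'].
sample_resources = [
    {"metadata": {"name": "SPSS risk model", "id": "id-spss"}},
    {"metadata": {"name": "AutoAI risk model", "id": "id-autoai"}},
]
print(find_model_id(sample_resources, "AutoAI"))
```

In the notebook it would be called as find_model_id(models_details['resources'], "AutoAI").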

In [ ]:

# Another way to get the model IDs
wml_models = wslib.assets.list_assets("wml_model")
print("\n".join(["{:26} : {}".format(item['name'], item['asset_id']) for item in wml_models]))

AutoAI model details

There are a lot of details attached to a model. Here is a small sample:

Number of records involved in the model training
Label column
Feature importance
Input fields schema
Training data reference
Model name
and more

List the feature importance

List the important features in order of importance. This is something that can be seen through the Watson Studio UI in the AutoAI experiment.

The two most important features are HOTSPOT2 and PRIM_DRIVER_GENDER

In [ ]:

AutoAI_details = [item for item in models_details['resources'] if "AutoAI" in item['metadata']['name']][0]
importance = AutoAI_details['entity']['metrics'][0]['context']['features_importance'][0]['features']
# Use only the entries greater than zero
importance = {k:v for (k,v) in importance.items() if v > 0.0 }
importance = dict(sorted(importance.items(), key=lambda item: item[1], reverse=True))
importance
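The filtering and sorting done in the cell above can be checked on a small dictionary. In this sketch the helper name top_features and the scores are hypothetical; only the two feature names come from the text.

```python
def top_features(importance, count=2):
    """Return the count highest-scoring (name, score) pairs, dropping zero scores."""
    positive = [(name, score) for name, score in importance.items() if score > 0.0]
    return sorted(positive, key=lambda pair: pair[1], reverse=True)[:count]


# Hypothetical scores; only the two feature names come from the text above.
sample_importance = {"PRIM_DRIVER_GENDER": 0.21, "OTHER_FEATURE": 0.0, "HOTSPOT2": 0.35}
print(top_features(sample_importance))
# prints [('HOTSPOT2', 0.35), ('PRIM_DRIVER_GENDER', 0.21)]
```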

List the input column names

Note that the RISK column is not part of the input.

The output columns definition is an empty array.

In [ ]:

print("Input columns:")
print(", ".join([item['name'] for item in AutoAI_details['entity']['schemas']['input'][0]['fields']]))

SPSS model details

The SPSS model carries different information from the AutoAI model but still includes details such as:

Schemas: input, output
Software specifications
Name
and more

List input and output column names

The output columns include the input columns with the addition of:

Partition
$XR-RISK
$XRE-RISK

In [ ]:

SPSS_details = [item for item in models_details['resources'] if "SPSS" in item['metadata']['name']][0]
print("Input columns:")
print(", ".join([item['name'] for item in SPSS_details['entity']['schemas']['input'][0]['fields']]))
print("\nOutput columns:")
print(", ".join([item['name'] for item in SPSS_details['entity']['schemas']['output'][0]['fields']]))

Load data to score

This data was not seen during model creation. It contains over 20 thousand records.

This section uses the first 20 records of the dataset.

In [ ]:

body = wslib.load_data("ValidationRecords.csv")
records_df = pd.read_csv(body)
print("Number of available records: {}".format(records_df.shape[0]))
# scoring_records = records_df.sample(frac = 0.001) would be 20-21 records
scoring_records_df = records_df[:20]
records_df.head()

Some statistics on the records

Ideally these statistics would be computed on the training data, but with over 20,000 validation records they should be representative enough.

In [ ]:

print("Min risk: {}, max risk: {}, range: {}".format(records_df['RISK'].min(), 
                                                     records_df['RISK'].max(),
                                                    records_df['RISK'].max() - records_df['RISK'].min())
     )

Load the AutoAI model and score some records

The AutoAI model can run in the notebook runtime. There is no need to first deploy it. This makes it quite simple to use in a notebook.

In [ ]:

AutoAI_model = client.repository.load(AutoAI_id)
AutoAI_results = AutoAI_model.predict(scoring_records_df.drop(columns=['RISK']).to_numpy())

Compare results

Values, differences and percentage of differences

In [ ]:

percent = 100 * (scoring_records_df["RISK"] - AutoAI_results) / scoring_records_df["RISK"]
diff = scoring_records_df["RISK"] - AutoAI_results
d = {'risk': scoring_records_df["RISK"], 'AutoAI': AutoAI_results,
     "diff": diff, "percent": percent}
df = pd.DataFrame(data=d)
df.head(20)
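The same comparison is needed again for the SPSS results, so it can be worth wrapping in a function. The sketch below uses plain Python lists so the arithmetic is easy to follow; the name compare_predictions and the sample values are hypothetical. Unlike the cell above, it reports None instead of dividing by a zero actual value.

```python
def compare_predictions(actual, predicted):
    """Return one row per record with the difference and percentage difference."""
    rows = []
    for a, p in zip(actual, predicted):
        diff = a - p
        # A zero actual value has no meaningful percentage, so report None.
        percent = 100 * diff / a if a != 0 else None
        rows.append({"risk": a, "predicted": p, "diff": diff, "percent": percent})
    return rows


# Hypothetical values chosen so the arithmetic is easy to check by hand.
for row in compare_predictions([20, 40, 0], [18, 44, 1]):
    print(row)
```

The list of dictionaries it returns can be passed to pd.DataFrame to get the same tabular view as above.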

Root mean squared error (RMSE)

Computed only on the small set of records scored. It can vary greatly from the value returned by the AutoAI experiment on the training data.

The RISK values in the validation records range from 4 to 59. An RMSE of 5 would represent roughly a 9% error relative to that range (5 / 55).

In [ ]:

# mean_squared_error returns the MSE; take the square root to get the RMSE
mean_squared_error(scoring_records_df["RISK"], AutoAI_results) ** 0.5
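RMSE is the square root of the mean squared error, and the roughly 9% figure quoted above comes from dividing the RMSE by the span of RISK values. A small worked sketch in plain Python, with hypothetical helper names and hypothetical predictions:

```python
import math


def rmse(actual, predicted):
    """Square root of the mean of the squared differences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def error_as_percent_of_range(error, low, high):
    """Express an error as a percentage of the span between low and high."""
    return 100 * error / (high - low)


# Hypothetical predictions that are each off by exactly 5.
print(rmse([10, 20, 30], [15, 15, 35]))  # prints 5.0
# The range 4 to 59 comes from the text above.
print(round(error_as_percent_of_range(5, 4, 59), 1))  # prints 9.1
```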

Deploy the SPSS model and score some records

For an SPSS model, the load command would only create a local file containing the model and return the path to that file.

The SPSS model depends on a runtime that is not part of the notebook. It must use Watson Machine Learning (WML) for execution.

We need a deployment space!

In [ ]:

spaces_list = client.spaces.list()
spaces_list.head()

Promote the model to the deployment space

The model is promoted from the project to the deployment space. This deployment space must be associated with a Watson Machine Learning (WML) service.

In [ ]:

# This assumes the desired deployment space is at index 0. If not, please change the index.
space_id = spaces_list.iloc[0]['ID']
SPSS_details = [item for item in models_details['resources'] if "SPSS" in item['metadata']['name']][0]
client.set.default_space(space_id)
promoted_asset_id = client.spaces.promote(SPSS_id, source_project_id=os.environ['PROJECT_ID'], 
                                          target_space_id=space_id)
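Relying on index 0 is fragile when several deployment spaces exist. A sketch of selecting the space by name instead is shown below. The helper find_space_id and the sample rows are hypothetical, and it assumes the spaces listing has a NAME column alongside ID, as the model listing does.

```python
def find_space_id(spaces, name):
    """Return the ID of the space with the given name, or raise if it is missing."""
    matches = [row["ID"] for row in spaces if row["NAME"] == name]
    if not matches:
        raise LookupError(f"No deployment space named {name!r}")
    return matches[0]


# Hypothetical rows shaped like spaces_list.to_dict("records").
sample_spaces = [
    {"ID": "space-1", "NAME": "sandbox"},
    {"ID": "space-2", "NAME": "risk-models"},
]
print(find_space_id(sample_spaces, "risk-models"))
```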

Deploy the newly promoted model

Note: online_url is deprecated and will be removed in a future release. Use serving_urls instead.

In [ ]:

deployment = client.deployments.create(
    artifact_uid=promoted_asset_id,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "SPSS risk factor deployment",
        client.deployments.ConfigurationMetaNames.ONLINE:{}}
)
deployment_id = client.deployments.get_id(deployment)

Score the records

Notice that the field names are extracted from the model details and the scoring records are taken directly from the dataframe. The payload must not contain invalid values, so missing values are replaced with zeros (fillna(0)).

In [ ]:

scoring_data = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        "fields": [item['name'] for item in SPSS_details['entity']['schemas']['input'][0]['fields']],
        "values": scoring_records_df.fillna(0).to_numpy()
    }]
}
predictions = client.deployments.score(deployment_id, scoring_data)
print(predictions)
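Because the field names and the values come from two different places, it can help to check that every row has as many values as there are declared fields before calling the service. The sketch below builds the same payload shape in plain Python. The helper name build_scoring_payload and the sample fields are hypothetical, and the literal key "input_data" is assumed to be the string behind ScoringMetaNames.INPUT_DATA.

```python
def build_scoring_payload(fields, rows):
    """Build an online scoring payload after checking every row's width."""
    for index, row in enumerate(rows):
        if len(row) != len(fields):
            raise ValueError(
                f"Row {index} has {len(row)} values but {len(fields)} fields are declared"
            )
    # "input_data" is the string behind ScoringMetaNames.INPUT_DATA (assumed here).
    return {"input_data": [{"fields": list(fields), "values": [list(r) for r in rows]}]}


# Hypothetical field names and values.
payload = build_scoring_payload(["AGE", "HOTSPOT2"], [[42, 1], [35, 0]])
print(payload)
```

With the notebook's data, the rows could be obtained with scoring_records_df[fields].fillna(0).values.tolist(), which also restricts the dataframe to the declared input fields.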

Compare results

Values, differences and percentage of differences

In [ ]:

# Extract the values for column "$XR-RISK" at position: -2
SPSS_results = [item[-2] for item in predictions['predictions'][0]['values']]
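Extracting the column by position breaks if the output schema changes. A sketch of looking the column up by name follows. The helper extract_column and the sample response are hypothetical, and it assumes the scoring response lists its column names under "fields" next to "values".

```python
def extract_column(predictions, column_name):
    """Return the values of one named column from a scoring response."""
    result = predictions["predictions"][0]
    position = result["fields"].index(column_name)
    return [row[position] for row in result["values"]]


# Hypothetical response; a real one has many more fields.
sample_response = {
    "predictions": [
        {
            "fields": ["AGE", "$XR-RISK", "$XRE-RISK"],
            "values": [[42, 17.5, 1.2], [35, 22.0, 0.9]],
        }
    ]
}
print(extract_column(sample_response, "$XR-RISK"))  # prints [17.5, 22.0]
```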

In [ ]:

percent = 100 * (scoring_records_df["RISK"] - SPSS_results) / scoring_records_df["RISK"]
diff = scoring_records_df["RISK"] - SPSS_results
d = {'risk': scoring_records_df["RISK"], 'SPSS': SPSS_results,
     "diff": diff, "percent": percent}
df = pd.DataFrame(data=d)
df.head(20)

Root mean squared error (RMSE)

Computed only on the small set of records scored. It can vary greatly depending on the records used; only 20 values are scored here.

In [ ]:

# mean_squared_error returns the MSE; take the square root to get the RMSE
mean_squared_error(scoring_records_df["RISK"], SPSS_results) ** 0.5

Cleanup

Remove the deployment
Remove the promoted asset

In [ ]:

client.deployments.delete(deployment_id)

In [ ]:

client.data_assets.delete(promoted_asset_id)

Author

Jacques Roy is a member of the IBM Enablement for Data and AI
