GitHub Repository: CloudPak-Outcomes/Outcomes-Projects
Path: blob/main/L4assets/DSandMLOpsAssets/HandsOn/Notebooks/DS Model evaluation.ipynb
Kernel: Python 3.10

Model evaluation

In this notebook we test the models on data that was not used in the model creation.

CPDaaS: Make sure to first insert a "project token"

Click on the three vertical dots icon in the upper right of the screen, then click on Insert project token.

Once inserted, execute the cell.

A project token is only available if you followed the prerequisite instructions to create one in your project.

WML python library

The ibm-watson-machine-learning library is already part of the runtime environment, so it does not need to be installed with the command:

!pip install ibm-watson-machine-learning

At the time of this writing, the library supports Cloud Pak for Data as a Service as well as Cloud Pak for Data versions 3.5, 4.0, 4.5, and 4.6.

For more information on the Python libraries see:

Get a WML connection

  • Replace the value of cpd_url with the proper cloud region or the proper cluster URL

  • Set the value of API_key to your API key

# cluster URL, make sure it ends with "/", and no "zen" ending
#cpd_url = "https://cpd-cpd.ai-governance-12345a678e90addd123c4567c8f9a012-3456.us-east.containers.appdomain.cloud/"
cpd_url = "https://us-south.ml.cloud.ibm.com"
API_key = "<YOUR_API_KEY>" # either CPD or CPDaaS
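The rule in the comment (a trailing "/" and no "zen" ending) can be checked before connecting. The following is a minimal sketch, assuming the rule applies to cluster URLs only. `normalize_cluster_url` is a hypothetical helper, not part of any IBM library, and the host name is a placeholder.

```python
def normalize_cluster_url(url: str) -> str:
    """Return the cluster URL with one trailing "/" and no "zen" path ending.

    Hypothetical helper for illustration; it is not part of the WML client.
    """
    cleaned = url.strip().rstrip("/")
    if cleaned.endswith("/zen"):
        cleaned = cleaned[: -len("/zen")]
    return cleaned + "/"


# Placeholder host name, not a real cluster.
print(normalize_cluster_url("https://cpd.example.com/zen/"))
```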

Create a Watson Machine Learning (WML) client connection

import os
import json
import pandas as pd
from sklearn.metrics import mean_squared_error
from ibm_watson_studio_lib import access_project_or_space
from ibm_watson_machine_learning import APIClient

if "USER_ID" in os.environ:
    wslib = access_project_or_space()
    wml_credentials = {
        "url": cpd_url,
        "username": "<USERNAME>",
        "apikey": API_key,
        "instance_id": "openshift",
        "version": "4.0"
    }
else:
    wml_credentials = {
        "url": cpd_url,
        "apikey": API_key
    }
client = APIClient(wml_credentials)
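The branching on USER_ID can be isolated in a small function so it can be exercised without a live connection. This is a sketch only: `build_wml_credentials` is a hypothetical name, and the dictionary keys are copied from the cell above rather than taken from the library documentation.

```python
from typing import Mapping


def build_wml_credentials(url: str, api_key: str, environ: Mapping[str, str],
                          username: str = "<USERNAME>") -> dict:
    """Build the credentials dictionary the same way the notebook cell does."""
    if "USER_ID" in environ:
        # Branch taken when USER_ID is set: username, instance_id and version are added.
        return {
            "url": url,
            "username": username,
            "apikey": api_key,
            "instance_id": "openshift",
            "version": "4.0",
        }
    # Otherwise only the URL and the API key are passed.
    return {"url": url, "apikey": api_key}


creds = build_wml_credentials("https://us-south.ml.cloud.ibm.com", "<YOUR_API_KEY>", {})
print(sorted(creds))
```

In the notebook, `os.environ` would be passed as the `environ` argument.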

List the project's models

Also get the model details

# Need to set either the space id (deployment) or the project id
# for the next few cells, we need to set it to the project ID
## space_details = client.spaces.get_details()
## space_uid = client.spaces.get_id(space_details["resources"][0]) # deployment space
## client.set.default_space(space_uid)
client.set.default_project(os.environ["PROJECT_ID"])

models_details = client.repository.get_model_details()
print("\n".join([item['metadata']['name'] for item in models_details['resources']]))

models_pd = client.repository.list_models()
AutoAI_id = models_pd.loc[models_pd['NAME'].str.startswith("AutoAI")][["ID"]].reset_index().loc[0]['ID']
SPSS_id = models_pd.loc[models_pd['NAME'].str.startswith("SPSS")][["ID"]].reset_index().loc[0]['ID']
print("AutoAI id: {}\nSPSS id: {}".format(AutoAI_id, SPSS_id))

# Another way to get the model IDs
wml_models = wslib.assets.list_assets("wml_model")
print("\n".join(["{:26} : {}".format(item['name'], item['asset_id']) for item in wml_models]))
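A model ID can also be looked up by name prefix directly in the details dictionary. The sketch below runs on a made-up stand-in for the `get_model_details()` result. The `metadata.name` key appears in the cells above, while `metadata.id` is an assumption about the response layout, and `find_model_id` is a hypothetical helper.

```python
from typing import Optional


def find_model_id(models_details: dict, prefix: str) -> Optional[str]:
    """Return the id of the first model whose name starts with prefix, else None."""
    for item in models_details.get("resources", []):
        metadata = item.get("metadata", {})
        if metadata.get("name", "").startswith(prefix):
            return metadata.get("id")  # "id" is an assumed key name
    return None


# Made-up sample; real details come from client.repository.get_model_details().
sample_details = {"resources": [
    {"metadata": {"name": "AutoAI example model", "id": "id-1"}},
    {"metadata": {"name": "SPSS example model", "id": "id-2"}},
]}
print(find_model_id(sample_details, "SPSS"))
```

Returning None when nothing matches avoids the index error that a positional lookup raises when no model name has the expected prefix.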

AutoAI model details

There are a lot of details attached to a model. Here is a small sample:

  • Number of records involved in the model training

  • Label column

  • Feature importance

  • Input fields schema

  • Training data reference

  • Model name

  • and more

List the feature importance

List the important features in order of importance. This is something that can be seen through the Watson Studio UI in the AutoAI experiment.

The two most important features are HOTSPOT2 and PRIM_DRIVER_GENDER

AutoAI_details = [item for item in models_details['resources'] if "AutoAI" in item['metadata']['name']][0]
importance = AutoAI_details['entity']['metrics'][0]['context']['features_importance'][0]['features']
# Use only the entries greater than zero
importance = {k: v for (k, v) in importance.items() if v > 0.0}
importance = dict(sorted(importance.items(), key=lambda item: item[1], reverse=True))
importance
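The filter-and-sort step can be tried on its own with a small dictionary. The feature names and scores below are invented for illustration and are not the values stored in the model. `rank_features` is a hypothetical helper.

```python
def rank_features(importance: dict) -> dict:
    """Drop zero-importance features and order the rest from most to least important."""
    positive = {name: score for name, score in importance.items() if score > 0.0}
    return dict(sorted(positive.items(), key=lambda pair: pair[1], reverse=True))


# Invented scores, for illustration only.
sample_importance = {"feature_a": 0.10, "feature_b": 0.0, "feature_c": 0.45, "feature_d": 0.25}
ranked = rank_features(sample_importance)
print(list(ranked))  # ['feature_c', 'feature_d', 'feature_a']
```

A plain dict preserves insertion order in Python 3.7 and later, which is why the sorted result can be kept as a dictionary.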

List the input column names

Note that the RISK column is not part of the input.

The output columns definition is an empty array.

print("Input columns:")
print(", ".join([item['name'] for item in AutoAI_details['entity']['schemas']['input'][0]['fields']]))

SPSS model details

The SPSS model details differ from those of the AutoAI model but still include information such as:

  • Schemas: input, output

  • Software specifications

  • Name

  • and more

List input and output column names

The output columns include the input columns with the addition of:

  • Partition

  • $XR-RISK

  • $XRE-RISK

SPSS_details = [item for item in models_details['resources'] if "SPSS" in item['metadata']['name']][0]
print("Input columns:")
print(", ".join([item['name'] for item in SPSS_details['entity']['schemas']['input'][0]['fields']]))
print("\nOutput columns:")
print(", ".join([item['name'] for item in SPSS_details['entity']['schemas']['output'][0]['fields']]))
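The columns that the model adds can be computed instead of read off the two printed lists. A minimal sketch on a made-up details dictionary that follows the same `entity.schemas` layout used above. The input column names are placeholders, and `field_names` and `added_columns` are hypothetical helpers.

```python
def field_names(details: dict, kind: str) -> list:
    """Return the field names of the first 'input' or 'output' schema."""
    return [field["name"] for field in details["entity"]["schemas"][kind][0]["fields"]]


def added_columns(details: dict) -> list:
    """Return the output columns that are not input columns, in output order."""
    inputs = set(field_names(details, "input"))
    return [name for name in field_names(details, "output") if name not in inputs]


# Made-up schema with placeholder input columns.
sample_model = {"entity": {"schemas": {
    "input": [{"fields": [{"name": "COLUMN_A"}, {"name": "COLUMN_B"}]}],
    "output": [{"fields": [
        {"name": "COLUMN_A"}, {"name": "COLUMN_B"},
        {"name": "Partition"}, {"name": "$XR-RISK"}, {"name": "$XRE-RISK"},
    ]}],
}}}
print(added_columns(sample_model))  # ['Partition', '$XR-RISK', '$XRE-RISK']
```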

Load data to score

This data was not seen during model creation. It contains over 20 thousand records.

This section uses the first 20 records of the dataset.

body = wslib.load_data("ValidationRecords.csv")
records_df = pd.read_csv(body)
print("Number of available records: {}".format(records_df.shape[0]))
# scoring_records = records_df.sample(frac = 0.001) would be 20-21 records
scoring_records_df = records_df[:20]
records_df.head()

Some statistics on the records

Ideally this would be done on the training data, but with over 20,000 validation records the statistics should be representative enough.

print("Min risk: {}, max risk: {}, range: {}".format(
    records_df['RISK'].min(),
    records_df['RISK'].max(),
    records_df['RISK'].max() - records_df['RISK'].min()))

Load the AutoAI model and score some records

The AutoAI model can run in the notebook runtime. There is no need to first deploy it. This makes it quite simple to use in a notebook.

AutoAI_model = client.repository.load(AutoAI_id)
AutoAI_results = AutoAI_model.predict(scoring_records_df.drop(columns=['RISK']).to_numpy())

Compare results

Values, differences and percentage of differences

percent = 100 * (scoring_records_df["RISK"] - AutoAI_results) / scoring_records_df["RISK"]
diff = scoring_records_df["RISK"] - AutoAI_results
d = {'risk': scoring_records_df["RISK"], 'AutoAI': AutoAI_results, "diff": diff, "percent": percent}
df = pd.DataFrame(data=d)
df.head(20)

Root mean squared error (RMSE)

Applied only to the small set of records scored. This can vary greatly from the value returned in the AutoAI experiment on the training data.

The RISK values in the validation records range from 4 to 59. An RMSE of 5 would represent roughly 9% of that range.

# mean_squared_error returns the mean squared error; take the square root to get the RMSE
mean_squared_error(scoring_records_df["RISK"], AutoAI_results) ** 0.5
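By default, scikit-learn's mean_squared_error returns the mean of the squared differences, so the RMSE is the square root of that value. The sketch below computes it with the standard library only and reproduces the "roughly 9%" arithmetic from the 4 to 59 range. `rmse` and `percent_of_range` are hypothetical helper names, and the short lists in the usage note are made up.

```python
import math


def rmse(actual, predicted) -> float:
    """Root mean squared error of two equally long, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def percent_of_range(error: float, low: float, high: float) -> float:
    """Express an error as a percentage of the value range [low, high]."""
    return 100.0 * error / (high - low)


# An RMSE of 5 against RISK values ranging from 4 to 59.
print(round(percent_of_range(5, 4, 59), 1))  # 9.1
```

For example, `rmse([0, 0], [3, 4])` is the square root of (9 + 16) / 2, which is about 3.54.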

Deploy the SPSS model and score some records

In this case, the load command would create a local file containing the model and return the path to that file.

The SPSS model depends on a runtime that is not part of the notebook environment. It must use Watson Machine Learning (WML) for execution.

We need a deployment space!

spaces_list = client.spaces.list()
spaces_list.head()

Promote the model to the deployment space

The model is promoted from the project to the deployment space. This deployment space must be associated with a Watson Machine Learning (WML) service.

# This assumes the desired deployment space is at index 0. If not, please change the index.
space_id = spaces_list.iloc[0]['ID']
SPSS_details = [item for item in models_details['resources'] if "SPSS" in item['metadata']['name']][0]
client.set.default_space(space_id)
promoted_asset_id = client.spaces.promote(SPSS_id, source_project_id=os.environ['PROJECT_ID'], target_space_id=space_id)

Deploy the newly promoted model

Note: online_url is deprecated and will be removed in a future release. Use serving_urls instead.

deployment = client.deployments.create(
    artifact_uid=promoted_asset_id,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "SPSS risk factor deployment",
        client.deployments.ConfigurationMetaNames.ONLINE: {}
    }
)
deployment_id = client.deployments.get_id(deployment)

Score the records

Notice that the field names are extracted from the model details and the scoring records come directly from the dataframe. The records must not contain missing values, so these are replaced with zeros (fillna(0)).

scoring_data = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        "fields": [item['name'] for item in SPSS_details['entity']['schemas']['input'][0]['fields']],
        "values": scoring_records_df.fillna(0).to_numpy()
    }]
}
predictions = client.deployments.score(deployment_id, scoring_data)
print(predictions)
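The shape of the scoring payload can be illustrated without a dataframe or a client. This sketch assumes that ScoringMetaNames.INPUT_DATA resolves to the string "input_data". `build_scoring_payload` is a hypothetical helper, and the field names and rows are invented.

```python
import math


def build_scoring_payload(fields, rows) -> dict:
    """Build a fields/values payload, replacing missing values with 0."""
    def clean(value):
        # Treat None and float NaN as missing, mirroring fillna(0).
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return 0
        return value

    return {
        "input_data": [{  # assumed value of ScoringMetaNames.INPUT_DATA
            "fields": list(fields),
            "values": [[clean(value) for value in row] for row in rows],
        }]
    }


# Invented field names and rows.
payload = build_scoring_payload(["COLUMN_A", "COLUMN_B"], [[1.0, float("nan")], [None, "x"]])
print(payload["input_data"][0]["values"])  # [[1.0, 0], [0, 'x']]
```

Each row in "values" must list its entries in the same order as "fields".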

Compare results

Values, differences and percentage of differences

# Extract the values for column "$XR-RISK" at position: -2
SPSS_results = [item[-2] for item in predictions['predictions'][0]['values']]

percent = 100 * (scoring_records_df["RISK"] - SPSS_results) / scoring_records_df["RISK"]
diff = scoring_records_df["RISK"] - SPSS_results
d = {'risk': scoring_records_df["RISK"], 'SPSS': SPSS_results, "diff": diff, "percent": percent}
df = pd.DataFrame(data=d)
df.head(20)
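Selecting the prediction column by name is less fragile than relying on position -2. The sketch below assumes the scoring response carries a "fields" list next to "values", which should be verified against an actual response. `column_values` is a hypothetical helper and the response shown is made up.

```python
def column_values(predictions: dict, column: str) -> list:
    """Return the values of one named column from the first prediction set."""
    result = predictions["predictions"][0]
    index = result["fields"].index(column)  # raises ValueError if the column is absent
    return [row[index] for row in result["values"]]


# Made-up response in the assumed fields/values layout.
sample_predictions = {"predictions": [{
    "fields": ["COLUMN_A", "Partition", "$XR-RISK", "$XRE-RISK"],
    "values": [[1, "1_Training", 12.5, 0.8], [2, "1_Training", 30.0, 1.1]],
}]}
print(column_values(sample_predictions, "$XR-RISK"))  # [12.5, 30.0]
```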

Root mean squared error (RMSE)

Applied only to the small set of records scored. With only 20 values here, this can vary greatly depending on which records are used.

# mean_squared_error returns the mean squared error; take the square root to get the RMSE
mean_squared_error(scoring_records_df["RISK"], SPSS_results) ** 0.5

Cleanup

  • Remove the deployment

  • Remove the promoted asset

client.deployments.delete(deployment_id)
client.data_assets.delete(promoted_asset_id)
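If the first delete fails, the second never runs and the promoted asset is left behind. One possible way to attempt both steps regardless is sketched below. `cleanup` is a hypothetical wrapper that uses only the two calls shown above and takes the client as a parameter.

```python
def cleanup(client, deployment_id, promoted_asset_id) -> list:
    """Attempt both deletions and return a list of (step, error) for any failures."""
    steps = [
        ("deployment", lambda: client.deployments.delete(deployment_id)),
        ("promoted asset", lambda: client.data_assets.delete(promoted_asset_id)),
    ]
    failures = []
    for label, action in steps:
        try:
            action()
        except Exception as error:  # keep going so the remaining step still runs
            failures.append((label, error))
    return failures


# Usage in the notebook, with the variables defined in earlier cells:
# failures = cleanup(client, deployment_id, promoted_asset_id)
```

An empty list means both removals succeeded.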

Author

Jacques Roy is a member of the IBM Enablement for Data and AI team.

Copyright © 2023. This notebook and its source code are released under the terms of the MIT License.