GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cloud/notebooks/python_sdk/lifecycle-management/Use ibm-watsonx-ai to extract data about historical AutoAI experiments and deployments stored in spaces.ipynb
Kernel: Python 3.11

Use ibm-watsonx-ai to extract data about historical AutoAI experiments and deployments stored in spaces

This notebook contains the steps and code to extract information about available historical AutoAI experiment runs, as well as deployments, in the watsonx.ai Runtime service. It uses the ibm-watsonx-ai library available in the PyPI repository.

Some familiarity with Python is helpful. This notebook uses Python 3.11.

Learning goals

The learning goals of this notebook are:

  • Work with watsonx.ai Runtime to extract info about past AutoAI experiments

  • Work with watsonx.ai Runtime to extract info about AutoAI models that are deployed into spaces

  • Store the desired data in CSV files for further analysis

Contents

This notebook contains the following parts:

  1. Installing and importing the ibm-watsonx-ai library and its dependencies

  2. Connecting to watsonx.ai Runtime

  3. Getting available projects and spaces

  4. Getting all available historical trainings

  5. Getting all available deployments info

  6. Extracting trainings info from all available projects

  7. Extracting trainings and deployments info from all available spaces

  8. Displaying the results

  9. Exporting results to file

  10. Cleanup

  11. Summary and next steps

1. Installing and importing the ibm-watsonx-ai library and its dependencies

Note: ibm-watsonx-ai documentation can be found here.

!pip install ibm-watsonx-ai | tail -n 1
Requirement already satisfied: six>=1.10.0 in /opt/conda/envs/Python-RT24.1/lib/python3.11/site-packages (from lomond->ibm-watsonx-ai) (1.16.0)
import json
import os
import getpass
import requests

from IPython.display import display
import pandas as pd
from dateutil import parser

from ibm_watsonx_ai import Credentials, APIClient
from ibm_watsonx_ai.wml_client_error import ApiRequestFailure
from ibm_watsonx_ai.utils import create_download_link

2. Connecting to watsonx.ai Runtime

Authenticate with the watsonx.ai Runtime service on IBM Cloud. You need to provide your Cloud API key and location.

Tip: Your Cloud API key can be generated by going to the Users section of the Cloud console. From that page, click your name, scroll down to the API Keys section, and click Create an IBM Cloud API key. Give your key a name and click Create, then copy the created key and paste it below. You can also get a service specific url by going to the Endpoint URLs section of the watsonx.ai Runtime docs. You can check your instance location in your watsonx.ai Runtime Service instance details.

You can use IBM Cloud CLI to retrieve the instance location.

ibmcloud login --apikey API_KEY -a https://cloud.ibm.com
ibmcloud resource service-instance INSTANCE_NAME

NOTE: You can also get a service specific apikey by going to the Service IDs section of the Cloud Console. From that page, click Create, and then copy the created key and paste it in the following cell.

Action: Enter your api_key and location in the following cells.

api_key = getpass.getpass("Insert your API key (hit enter): ...")
Insert your API key (hit enter): ... ········
location = "INSERT YOUR LOCATION HERE"
url = f"https://{location}.ml.cloud.ibm.com"

credentials = Credentials(
    api_key=api_key,
    url=url
)
api_client = APIClient(credentials=credentials)

3. Getting available projects and spaces

Accessible projects are retrieved directly via API requests; once this is supported in the SDK, the notebook will be updated.

projects_url = f"{api_client.PLATFORM_URL}/v2/projects"
params = api_client._params(skip_for_create=True)
params["limit"] = 100
response = requests.get(url=projects_url, params=params, headers=api_client._get_headers())
available_projects = [
    resource.get("metadata", {}).get("guid")
    for resource in response.json().get("resources", [])
]
print(f"Available projects: [{', '.join(available_projects)}]")
Available projects: [31610097-87bb-45aa-ab76-2aee29b5857f, dbc99479-c4ac-46c1-90b1-1d7a7749587d, eac8bfe2-a00b-43ca-846b-305af5cc6395]

Getting spaces via SDK.

available_spaces = api_client.spaces.list()["ID"].to_list()
print(f"Available spaces: [{', '.join(available_spaces)}]")
Available spaces: [9f44cc2b-b3d0-4472-824e-4941afb1617b, 7ba02c5f-a50a-4105-b9c3-2fdb54fe1829, d68da17a-ab98-44fa-b2e1-a21ab1b76058]

Introducing a method for setting the client's scope - either a space, or a project.

def set_scope(client, scope_name, scope_id):
    if scope_name == "project":
        client.set.default_project(scope_id)
    else:
        client.set.default_space(scope_id)
    print(f"Working on {scope_name}_id: {scope_id}")

4. Getting all available historical trainings

Training can be executed in a project or in a space. The methods below extract the desired data from the training service instance.

training_results = pd.DataFrame()
def is_autoai(metrics):
    return any("ml_metrics" in i or "ts_metrics" in i for i in metrics)


def get_training_info(api_client):
    training_service = api_client.training
    if (scope_id := api_client.default_project_id) is not None:
        scope = "project"
    else:
        scope = "space"
        scope_id = api_client.default_space_id
    trainings = training_service.list(get_all=True)
    trainings_ids_list = trainings["ID (training)"].to_list()
    info = []
    for training in trainings_ids_list:
        details = training_service.get_details(training)
        metadata = details.get("metadata", {})
        created_at = parser.parse(metadata.get("created_at"))
        status = details.get("entity", {}).get("status", {})
        metrics = status.get("metrics", [])
        if (state := status.get("state")) == "completed" and is_autoai(metrics):
            completed_at = parser.parse(status.get("completed_at"))
            # collecting only finished trainings
            info.append({
                "ID (training)": training,
                "Created at": created_at,
                "Finished at": completed_at,
                "Status": state,
                "Took": completed_at - created_at,
                "Scope": scope,
                "Scope ID": scope_id
            })
        print(".", end="")
    print()
    return pd.DataFrame(info)
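To illustrate the filter, is_autoai simply checks whether any entry in the metrics list carries AutoAI-style keys (ml_metrics for classification/regression runs, ts_metrics for time-series runs). A minimal, self-contained sketch, where the metric dicts are hypothetical examples rather than real watsonx.ai payloads:

```python
# Standalone sketch of the AutoAI check used above; the sample
# metric entries below are invented for illustration.
def is_autoai(metrics):
    # "in i" tests dict keys, so this is True when any entry
    # contains an ml_metrics or ts_metrics key
    return any("ml_metrics" in i or "ts_metrics" in i for i in metrics)

autoai_run = [{"ml_metrics": {"accuracy": 0.91}}]    # AutoAI classification run
ts_run = [{"ts_metrics": {"mae": 1.2}}]              # AutoAI time-series run
other_run = [{"custom_metrics": {"loss": 0.3}}]      # non-AutoAI training

print(is_autoai(autoai_run))  # True
print(is_autoai(ts_run))      # True
print(is_autoai(other_run))   # False
```

A training with an empty metrics list is also rejected, which is why only completed AutoAI runs end up in the result frame.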

5. Getting all available deployments info

Deployments can be stored only in a space. The methods below extract the desired data from each deployment.

def get_model_deployments_ids(client):
    deployments = client.deployments.list()
    return deployments[deployments["ARTIFACT_TYPE"] == "model"]["ID"]


def get_model_details(client, deployment_id):
    asset_id = client.deployments.get_details(deployment_id).get("entity", {}).get("asset", {}).get("id")
    return client.data_assets.get_details(asset_id)


def is_autoai_pipeline(deployment_details):
    return deployment_details.get("metadata", {}).get("asset_type") == "wml_model"


def extract_details(deployment_details):
    space_id = deployment_details.get("metadata", {}).get("space_id")
    wml_model = deployment_details.get("entity", {}).get("wml_model", {})
    training_id = wml_model.get("training_id")
    pipeline_id = wml_model.get("pipeline", {}).get("id")
    metrics = wml_model.get("metrics", [])[0]
    model = metrics.get("context", {}).get("intermediate_model", {})
    pipeline_steps = f'[{", ".join(model.get("composition_steps", []))}]'
    pipeline_nodes = f'[{", ".join(model.get("pipeline_nodes", []))}]'
    print(".", end="")
    return {
        "Scope": "space",
        "Scope ID": space_id,
        "ID (pipeline)": pipeline_id,
        "ID (training)": training_id,
        "Pipeline steps": pipeline_steps,
        "Pipeline nodes": pipeline_nodes
    }

6. Extracting trainings info from all available projects

for project_id in available_projects:
    set_scope(api_client, "project", project_id)
    training_results = pd.concat([training_results, get_training_info(api_client)])
Working on project_id: 31610097-87bb-45aa-ab76-2aee29b5857f ........................................
Working on project_id: dbc99479-c4ac-46c1-90b1-1d7a7749587d ...
Working on project_id: eac8bfe2-a00b-43ca-846b-305af5cc6395 ...................

7. Extracting trainings and deployments info from all available spaces

models_list = []
for space_id in available_spaces:
    try:
        set_scope(api_client, "space", space_id)
        training_results = pd.concat([training_results, get_training_info(api_client)])
        deployments_list = get_model_deployments_ids(api_client)
        for deployment_id in deployments_list:
            model_details = get_model_details(api_client, deployment_id)
            if is_autoai_pipeline(model_details):  # collecting only AutoAI models
                models_list.append(extract_details(model_details))
    except Exception as e:
        print(f"Error for space {space_id}: {e}")
    print()

deployments_df = pd.DataFrame(models_list)

8. Displaying the results

training_results
deployments_df
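With both frames in memory you can already run quick analyses before exporting, for example the average experiment duration per scope. A sketch on a small synthetic frame; the rows are invented, but the column names match those produced by get_training_info:

```python
import pandas as pd

# Synthetic stand-in for training_results; IDs and durations are made up.
df = pd.DataFrame({
    "ID (training)": ["t1", "t2", "t3"],
    "Took": pd.to_timedelta(["10min", "20min", "30min"]),
    "Scope": ["project", "project", "space"],
})

# Mean experiment duration per scope, using the timedelta "Took" column
mean_took = df.groupby("Scope")["Took"].mean()
print(mean_took)
```

Because "Took" is stored as a timedelta rather than a string, aggregations like mean and sum work directly.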

9. Exporting results to file

trainings_csv = "trainings.csv"
deployments_csv = "deployments.csv"

training_results.to_csv(trainings_csv, index=False)
deployments_df.to_csv(deployments_csv, index=False)
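Since both frames share the ID (training) column, you can also join them to see which completed experiments ended up deployed. A sketch on synthetic rows (the IDs are invented for illustration):

```python
import pandas as pd

# Invented rows mimicking the two exported frames
trainings = pd.DataFrame({
    "ID (training)": ["t1", "t2", "t3"],
    "Status": ["completed", "completed", "completed"],
})
deployments = pd.DataFrame({
    "ID (training)": ["t2"],
    "ID (pipeline)": ["p2"],
})

# Left join keeps every training; deployed ones get a pipeline ID,
# the rest get NaN in "ID (pipeline)"
joined = trainings.merge(deployments, on="ID (training)", how="left")
print(joined)
print(f"Deployed trainings: {joined['ID (pipeline)'].notna().sum()}")
```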

Export data from cloud (optional)

If you are running this notebook on the cloud, execute the cell below to download the saved results.

display(create_download_link(trainings_csv, title=f"Download {trainings_csv}"))
display(create_download_link(deployments_csv, title=f"Download {deployments_csv}"))

10. Cleanup

If you want to clean up all created assets:

  • experiments

  • trainings

  • pipelines

  • model definitions

  • models

  • functions

  • deployments

please follow this sample notebook.

11. Summary and next steps

You successfully completed this notebook! You learned how to use ibm-watsonx-ai to extract information about historical AutoAI experiments and deployments stored in projects and spaces, and how to export that data to CSV files. Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Authors

Marta Tomzik, Software Engineer at watsonx.ai.

Copyright © 2025 IBM. This notebook and its source code are released under the terms of the MIT License.