GitHub Repository: ibm/watson-machine-learning-samples
Path: blob/master/cloud/notebooks/python_sdk/experiments/federated_learning/Federated Learning Demo Part I - for Admin.ipynb

Kernel: Python 3

WML Federated Learning with MNIST for Admin using `ibm-watsonx-ai`.

With IBM Federated Learning, you can combine data from multiple sources to train a model on the collective data without actually sharing the data. This allows enterprises to train a model together with other companies without delegating resources for security. Another advantage is that the remote data does not have to be centralized in one location, which eliminates the need to move potentially large datasets. This notebook demonstrates how to start Federated Learning with the Python client.

Learning Goals

After completing this notebook, you should know how to:

Load an untrained model
Create a Remote Training System
Start a training job

This notebook is intended to be run by the administrator of the Federated Learning experiment.

1. Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

Create a watsonx.ai Runtime Service instance (a free plan is offered and information about how to create the instance can be found here).

Install and import `ibm-watsonx-ai` and its dependencies

Note: ibm-watsonx-ai documentation can be found here.

In [ ]:

!pip install -U ibm-watsonx-ai | tail -n 1

Connection to watsonx.ai Runtime

Authenticate the watsonx.ai Runtime service on IBM Cloud. You need to provide the platform api_key and the instance location.

You can use the IBM Cloud CLI to retrieve the platform API key and the instance location.

An API key can be generated as follows:

ibmcloud login
ibmcloud iam api-key-create API_KEY_NAME

From the output, copy the value of api_key.

The location of your watsonx.ai Runtime instance can be retrieved as follows:

ibmcloud login --apikey API_KEY -a https://cloud.ibm.com
ibmcloud resource service-instance INSTANCE_NAME

From the output, copy the value of location.

Tip: Your Cloud API key can be generated by going to the Users section of the Cloud console. From that page, click your name, scroll down to the API Keys section, and click Create an IBM Cloud API key. Give your key a name and click Create, then copy the created key and paste it below. You can also get a service-specific URL by going to the Endpoint URLs section of the watsonx.ai Runtime docs. You can check your instance location in your watsonx.ai Runtime Service instance details.

You can also get a service-specific API key by going to the Service IDs section of the Cloud Console. From that page, click Create, then copy the created key and paste it below.

Action: Enter your api_key and location in the following cell.

In [ ]:

api_key = 'PASTE YOUR PLATFORM API KEY HERE'
location = 'PASTE YOUR INSTANCE LOCATION HERE'
cloud_user_id = 'PASTE YOUR USER ID HERE (IBMid-xxx)'
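
If you would rather not paste secrets into the notebook itself, one option is to read them from environment variables. The following is a minimal sketch, not part of the original notebook: the helper `read_setting` and the variable names `WATSONX_API_KEY` and `WATSONX_LOCATION` are hypothetical, and `us-south` is only an example value.

```python
import os


def read_setting(name, default=None):
    """Return environment variable `name` with surrounding whitespace removed.

    Raises ValueError when the variable is unset or blank and no default is given.
    """
    value = os.environ.get(name, default)
    if value is None or not value.strip():
        raise ValueError(f"Environment variable {name} is not set")
    return value.strip()


# Hypothetical variable names; choose whatever fits your environment.
# api_key = read_setting("WATSONX_API_KEY")
# location = read_setting("WATSONX_LOCATION", default="us-south")
```

Failing early on a missing value gives a clearer error than an authentication failure later in the notebook.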

In [1]:

from ibm_watsonx_ai import Credentials

credentials = Credentials(
    api_key=api_key,
    url='https://' + location + '.ml.cloud.ibm.com'
)
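
The URL above is built by plain string concatenation, so a leftover placeholder or stray whitespace in `location` produces an invalid endpoint. As a small sketch (the helper name `runtime_url` is an assumption, not part of the SDK), the same URL can be built with a basic sanity check:

```python
def runtime_url(location):
    """Build the watsonx.ai Runtime endpoint URL for an IBM Cloud region.

    Rejects empty values and values containing whitespace, which also
    catches the unedited 'PASTE YOUR INSTANCE LOCATION HERE' placeholder.
    """
    region = location.strip().lower()
    if not region or any(ch.isspace() for ch in region):
        raise ValueError(f"Invalid instance location: {location!r}")
    return f"https://{region}.ml.cloud.ibm.com"
```

For example, `runtime_url("us-south")` returns `https://us-south.ml.cloud.ibm.com`, which can be passed as the `url` argument of `Credentials`.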

In [2]:

from ibm_watsonx_ai import APIClient

client = APIClient(credentials)

Action: Assign project ID below

In [3]:

project_id = 'PASTE YOUR PROJECT ID HERE'

In [4]:

client.set.default_project(project_id)

Out[4]:

'SUCCESS'

2. Load the model

You need an untrained model asset for Federated Learning to work with. In this tutorial, an untrained TensorFlow 2 Keras model is provided for you. Federated Learning supports Scikit-learn and TensorFlow 2, which are free machine learning packages with tutorials. Additionally, the IBM documentation provides details on how to configure an untrained model for Federated Learning.

2.1 Create Untrained Model Asset

Create an untrained model asset in your project.

In [5]:

import urllib3
import requests
import json
from string import Template

urllib3.disable_warnings()

In [ ]:

software_specification_name = "tensorflow_rt24.1-py3.11"
sw_spec_id = client.software_specifications.get_id_by_name(software_specification_name)
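# Download the untrained Keras model archive and save it locally.
model_resp = requests.get("https://github.com/IBMDataScience/sample-notebooks/raw/master/Files/tf_mnist_model.zip")
model_resp.raise_for_status()  # stop here if the download failed
MY_MODEL_ZIP = "./tf_mnist_model.zip"
with open(MY_MODEL_ZIP, 'wb') as f:
    f.write(model_resp.content)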

model_metadata = {
        client.repository.ModelMetaNames.NAME: 'untrained TF model',
        client.repository.ModelMetaNames.SOFTWARE_SPEC_ID: sw_spec_id,
        client.repository.ModelMetaNames.TYPE: 'tensorflow_2.14'
    }

untrained_model_details = client.repository.store_model(MY_MODEL_ZIP,model_metadata)
base_model_id = client.repository.get_model_id(untrained_model_details)
print("Saved model id: %s" % base_model_id)
base_model_content_uri = "/ml/v4/models/"+ base_model_id + "/content"
print("Host URL = " + credentials["url"] + base_model_content_uri)
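
A download can succeed at the HTTP level and still return something other than the expected archive (for example, an HTML error page). The sketch below is an optional check that could be run on the downloaded bytes before calling `store_model`; the helper name `check_model_archive` is hypothetical and uses only the standard library.

```python
import io
import zipfile


def check_model_archive(data):
    """Return the member names of a zip archive given as bytes.

    Raises ValueError if the bytes are not a readable, non-empty zip file.
    """
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("Downloaded content is not a zip archive")
    with zipfile.ZipFile(buffer) as archive:
        bad_member = archive.testzip()  # first corrupt member, or None
        if bad_member is not None:
            raise ValueError(f"Corrupt archive member: {bad_member}")
        names = archive.namelist()
    if not names:
        raise ValueError("Archive is empty")
    return names
```

In this notebook it could be called as `check_model_archive(model_resp.content)` right after the download.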

3. Create Remote Training System Asset

Now you will create a Remote Training System (RTS). An RTS handles the calls that your parties make to the aggregator to run the training.

allowed_identities are the users permitted to connect to the Federated Learning experiment. In this tutorial, only your user ID is permitted to connect, but you can update the template and add more users as required.
remote_admin is the admin of the RTS. The template for the admin is the same as for a user. In this tutorial, the admin is created from the same user ID; in practice, however, the admin does not have to be one of the users.

In [ ]:

metadata = {
 client.remote_training_systems.ConfigurationMetaNames.NAME: "Remote Party 1",
 client.remote_training_systems.ConfigurationMetaNames.TAGS: ["Federated Learning"],
 client.remote_training_systems.ConfigurationMetaNames.ORGANIZATION: {"name": "IBM", "region": "US"},
 client.remote_training_systems.ConfigurationMetaNames.ALLOWED_IDENTITIES: [{"id": cloud_user_id, "type": "user"}],
 client.remote_training_systems.ConfigurationMetaNames.REMOTE_ADMIN: {"id": cloud_user_id, "type": "user"}
}

details = client.remote_training_systems.store(meta_props=metadata)
print("Create wml_remote_training_system_one asset response: %s"  % json.dumps(details, indent=4))
wml_remote_training_system_one_asset_id = details["metadata"]["id"]
print("WML wml_remote_training_system_one asset id: %s" % wml_remote_training_system_one_asset_id)
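
To permit more than one user to connect, the `ALLOWED_IDENTITIES` value needs one entry per user in the same `{"id": ..., "type": "user"}` shape used above. The following sketch builds that list from plain user IDs; the helper name `build_allowed_identities` is hypothetical and the IDs in the usage note are placeholders.

```python
def build_allowed_identities(user_ids):
    """Return the allowed_identities list for the given IBM Cloud user IDs.

    Blank entries and duplicates are dropped; the original order is kept.
    """
    seen = set()
    identities = []
    for user_id in user_ids:
        cleaned = user_id.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            identities.append({"id": cleaned, "type": "user"})
    return identities
```

The result, for example `build_allowed_identities([cloud_user_id, "IBMid-yyy"])`, can then be used as the value for `ConfigurationMetaNames.ALLOWED_IDENTITIES`.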

4. Create FL Training Job

In this section, you will launch the Federated Learning experiment.

In [ ]:

training_payload_str = Template(""" 
{
    "model": {
      "spec": {
        "id": "$modelID"
      },
      "type": "tensorflow"
    },
    "fusion_type": "iter_avg",
    "rounds": 5,
    "remote_training" : {
      "quorum": 1.0,
      "remote_training_systems": [ { "id" : "$rts_one", "required" : true  } ]
    },
    "hardware_spec": {
      "name": "XS"
    }
  }
""").substitute(modelID = base_model_id,
                rts_one = wml_remote_training_system_one_asset_id
               )

print("Training payload: %s" % training_payload_str)
training_payload = json.loads(training_payload_str)

aggregator_metadata = {
 client.training.ConfigurationMetaNames.NAME: 'FL Aggregator',
 client.training.ConfigurationMetaNames.DESCRIPTION: 'Sample FL Aggregator',
 client.training.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: [],
 client.training.ConfigurationMetaNames.TRAINING_RESULTS_REFERENCE: {
    "type": "container",
    "name": "outputData",
    "connection": {},
    "location": {
      "path": "."
      }
    },
 client.training.ConfigurationMetaNames.FEDERATED_LEARNING: training_payload
}

aggregator = client.training.run(aggregator_metadata, asynchronous=True)
training_id = client.training.get_id(aggregator)
print("Training ID: %s" % training_id)
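
The payload above is assembled as a JSON string with `Template` and then parsed back with `json.loads`. An alternative sketch, assuming the same fields and values as that cell, is to build the dictionary directly in Python, which avoids quoting problems and extends naturally to several remote training systems. The function name `build_fl_payload` is hypothetical.

```python
def build_fl_payload(model_id, rts_ids, rounds=5, quorum=1.0):
    """Build the Federated Learning payload as a plain dictionary.

    Mirrors the JSON template used in this notebook: a TensorFlow model,
    iterative-average fusion, and one entry per remote training system.
    """
    return {
        "model": {"spec": {"id": model_id}, "type": "tensorflow"},
        "fusion_type": "iter_avg",
        "rounds": rounds,
        "remote_training": {
            "quorum": quorum,
            "remote_training_systems": [
                {"id": rts_id, "required": True} for rts_id in rts_ids
            ],
        },
        "hardware_spec": {"name": "XS"},
    }
```

With the variables from this notebook, `build_fl_payload(base_model_id, [wml_remote_training_system_one_asset_id])` should produce the same structure as `training_payload`.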

4.1 Get Training Job Status

In [11]:

training_run_details = client.training.get_details(training_id)
print("Full training job status: "+ json.dumps(training_run_details, indent=4))
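
Instead of rerunning the status cell by hand, the job can be polled until it reaches a state of interest. This is a minimal sketch under two assumptions: that the state is found at `details["entity"]["status"]["state"]` in the returned details, and that the state names you pass in (such as `accepting_parties`, `completed`, or `failed`) match what your service reports. The helper `wait_for_state` is hypothetical; it takes the details function as an argument so it can be tried without a live connection.

```python
import time


def wait_for_state(get_details, training_id, target_states,
                   interval=10.0, timeout=600.0, sleep=time.sleep):
    """Poll get_details(training_id) until the job state is in target_states.

    Returns the matching state, or raises TimeoutError once `timeout`
    seconds have passed without reaching one of the target states.
    """
    deadline = time.monotonic() + timeout
    while True:
        details = get_details(training_id)
        state = details["entity"]["status"]["state"]
        if state in target_states:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Training {training_id} still in state {state!r}")
        sleep(interval)


# Possible usage in this notebook (state names are assumptions):
# wait_for_state(client.training.get_details, training_id,
#                {"accepting_parties", "completed", "failed", "canceled"})
```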

5. Clean up

If you want to clean up all created assets:

experiments
trainings
pipelines
model definitions
models
functions
deployments

please follow this sample notebook.

6. Summary and next steps

You successfully completed this notebook! Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.

Get Variables And Paste Into Party Notebook

Run the following cell and copy the output.

In [ ]:

print(f'WML_SERVICES_HOST = https://{location}.ml.cloud.ibm.com')
print(f'IAM_APIKEY = {api_key}')
print(f'RTS_ID = {wml_remote_training_system_one_asset_id}')
print(f'TRAINING_ID = {training_id}')
print(f'PROJECT_ID = {project_id}')

As the Admin, you have now launched a Federated Learning experiment. Copy the output from the previous cell. Open Part II - WML Federated Learning with MNIST for Party and paste the output into the first code cell.

Author

Rinay Shah, Software Developer at IBM.
