GitHub Repository: CloudPak-Outcomes/Outcomes-Projects
Path: blob/main/L4assets/DSandMLOpsAssets/CLIandSDK/Notebooks/CPDall-03b Project Management.ipynb
Kernel: Python 3.10

Project management

This notebook can run on either Cloud Pak for Data (CPD) or Cloud Pak for Data as a Service (CPDaaS).

This notebook demonstrates how to:

  • Create a project

  • Import a file into the project

  • Remove the new project

  • Import a project

  • Export a project

import json
import sys, os
import platform as compute
import requests
import urllib.request  # urlretrieve is used later, so import the submodule explicitly
from datetime import datetime
#import inspect
import pandas as pd
from ibm_watson_studio_lib import access_project_or_space
from random import randint
import zipfile
from io import BytesIO
import tarfile

# Default to CPDaaS; the USER_ID environment variable indicates a CPD cluster
platform = "cpdaas"
if "USER_ID" in os.environ:
    platform = "cpd"

Make sure to set the variables in the next cell

For Cloud Pak for Data (CPD):

  • Set the cpd_url value to the endpoint for your CPD cluster

  • Set the API_key to the API key for your user

For Cloud Pak for Data as a Service (CPDaaS):

  • Set cpd_url to https://api.dataplatform.cloud.ibm.com/

  • Set the API_key to the API key for your user

# Cluster URL: make sure it ends with "/" and does not end with "zen"
#cpd_url = "https://cpd-cpd.ai-governance-12345a678e90addd123c4567c8f9a012-3456.us-east.containers.appdomain.cloud/"
cpd_url = "https://cloud-pak-for-data/"
API_key = "<YOUR_API_KEY>" # either CPD or CPDaaS
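The URL rule in the comment above (trailing "/", no "zen" ending) can be enforced in code rather than by eye. The following is a minimal sketch; `normalize_cpd_url` is a hypothetical helper that is not part of the notebook or of any IBM library.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_cpd_url(url):
    """Return the URL without a trailing 'zen' path segment and with one trailing '/'.

    Hypothetical helper for illustration; query strings and fragments are dropped.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    if path.endswith("/zen"):
        path = path[: -len("/zen")]
    return urlunsplit((parts.scheme, parts.netloc, path + "/", "", ""))


print(normalize_cpd_url("https://cloud-pak-for-data/zen/"))
```

One way to use it would be `cpd_url = normalize_cpd_url(cpd_url)` right after the variables are set.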

Get an access token

We have a chicken-and-egg problem here: getting the token requires a support function, but loading the support functions requires the token. To solve it, we define the one support function we need, getToken, before we load the rest.

An access token is used to identify a user in API requests. Note that the token becomes invalid after an hour and must be re-created.

# Support functions
if platform == "cpdaas":
    def getToken(key):
        """Get the access token required to interface with CPDaaS"""
        headers = {
            'Accept': 'application/json',
            'Content-type': 'application/x-www-form-urlencoded'
        }
        data = "grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey={}"
        resp = requests.post('https://iam.cloud.ibm.com/identity/token',
                             headers=headers, data=data.format(key))
        return resp
else:
    def getToken(admin, passwd, url):
        """Get the access token required to interface with CPD"""
        # NOTE: IDENTAUTH (the path of the CPD authorization endpoint) is not
        # defined in this notebook; it must be set before this function is called.
        headers = {
            'Accept': 'application/json',
            'Content-type': 'application/json'
        }
        data = {
            "username": admin,
            "password": passwd
        }
        resp = requests.post(url + IDENTAUTH, data=json.dumps(data),
                             headers=headers, verify=True)
        return resp

Create a bearer (access) token

token = "invalid"
if platform == "cpdaas":
    resp = getToken(API_key)
    token = resp.json()['access_token']
else:
    # NOTE: username and password must be defined for CPD;
    # the variables cell above only sets cpd_url and API_key
    resp = getToken(username, password, cpd_url)
    token = resp.json()['token']

# Header to use in subsequent queries
headersAPI = {
    'accept': 'application/json',
    'Content-type': 'application/json',
    'Authorization': 'Bearer ' + token,
    'cache-control': 'no-cache'
}
# utcnow() so that the printed time really is GMT
print("Got a token at {} GMT".format(datetime.utcnow().time().isoformat("seconds")))

# Needed later to look at project assets
params = {
    'project_id': os.environ['PROJECT_ID'],
    'url': cpd_url,
    'token': token
}
wslib = access_project_or_space(params)
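Because the token becomes invalid after about an hour, a long session has to request a new one. The sketch below shows one possible way to do that with a small cache that calls a fetch function again once the token is older than a chosen age. `TokenCache` and its parameters are hypothetical names, and the 55-minute default is an assumption chosen to stay under the one-hour lifetime mentioned above.

```python
import time


class TokenCache:
    """Cache a token and fetch a new one once it is older than max_age_seconds.

    Hypothetical helper: `fetch` is any zero-argument callable returning a token.
    """

    def __init__(self, fetch, max_age_seconds=55 * 60, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._token = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        # Fetch on first use, or when the cached token has reached its maximum age
        if self._token is None or now - self._fetched_at >= self._max_age:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token
```

On CPDaaS, the fetch function could be `lambda: getToken(API_key).json()['access_token']`. Note that `headersAPI` and the `USER_ACCESS_TOKEN` environment variable embed the token, so they would need to be rebuilt after each refresh.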

Support functions

raw_data_1 = wslib.load_data('cpdalllibs.zip')
!rm -rf cpdalllibs
myzip = zipfile.ZipFile(BytesIO(raw_data_1.read()))
myzip.extractall('.')
sys.path.append(".")
if platform == "cpdaas":
    from cpdalllibs.cpdaaslibfns import *
    importcpdaas()
else:
    from cpdalllibs.cpdlibfns import *
    importcpd()

# Test if we have access
help(getProjects)

On CPDaaS, get the details of the API key

Execute the next cell even if you are on CPD

account_id = None
iam_id = None
if platform == "cpdaas":
    resp = apikeyDetails(API_key, token)
    key_details_json = resp.json()
    account_id = key_details_json['account_id']
    iam_id = key_details_json['iam_id']

List available projects

# Get the project info in the Techzone account
# It needs the 'cpdaas-include-permissions' header
projects_json = getProjects(headersAPI, cpd_url, account_id=account_id)
print("Number of projects: {}\n".format(len(projects_json)))

format_str = "{:40} | {:26} | {}"
print(format_str.format("Project name", "Creator", "Creation date"))
#print(format_str.format("=" * 40, "=" * 26, "=" * 13))
print("-" * 85)
print("\n".join([format_str.format(item['entity']['name'], item['entity']['creator'],
                                   item['metadata']['created_at'][:10])
                 for item in projects_json]))

Get the current project

This makes it easier to get all the arguments needed for the creation of a new project.

projectid = os.environ['PROJECT_ID']
resp = getProject(headersAPI, projectid, cpd_url, cpdaas=False)
if resp.status_code > 204:
    print("Status code: {}, reason: {}".format(resp.status_code, resp.reason))
current_project = resp.json()
print(json.dumps(current_project, indent=2, sort_keys=True))
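The cell above prints a message when the status code is above 204 but then still goes on to parse the response. A stricter variant could stop at the first failure instead. This is a minimal sketch: `ensure_ok` is a hypothetical helper, and the 204 threshold simply mirrors the check used in this notebook. The `SimpleNamespace` objects stand in for real `requests` responses.

```python
from types import SimpleNamespace


def ensure_ok(resp, max_ok_status=204):
    """Return resp unchanged, or raise if its status code is above max_ok_status."""
    if resp.status_code > max_ok_status:
        raise RuntimeError(
            "Request failed with status {}: {}".format(resp.status_code, resp.reason)
        )
    return resp


# Stand-in responses for illustration only
ok = ensure_ok(SimpleNamespace(status_code=200, reason="OK"))
print("Accepted status:", ok.status_code)

try:
    ensure_ok(SimpleNamespace(status_code=404, reason="Not Found"))
except RuntimeError as err:
    print("Rejected:", err)
```

With this helper, the cell could read `current_project = ensure_ok(resp).json()`, so a failed request never reaches the JSON parsing step.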

Create a project

This notebook uses API calls to create a project. It is also possible to create projects using the cpdctl command.

The notebook uses cpdctl later to import and export projects.

We generate a random project name

from random import randint
project_name = "Temp project {}".format(randint(999, 10000))
print("New project name: {}".format(project_name))

crn = None
if platform == "cpdaas":
    crn = current_project['entity']['storage']['properties']['credentials']['editor']['resource_key_crn']
new_project_info = {
    "name": project_name,
    "description": "Temporary project demonstrating creation",
    "generator": "L4 CPDall project",
    "storage": {
        "type": current_project['entity']['storage']['type'], # "assetfiles" or "bmcos_object_storage"
        "guid": current_project['entity']['storage']['guid'],
        "resource_key_crn": crn
    }
}
new_project_id = cre8Project(headersAPI, new_project_info, cpd_url)
print("New project id: {}".format(new_project_id))
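The same payload is built again later in this notebook for the imported project, so it may be convenient to wrap the construction in a function. The sketch below assumes only the fields used in the cell above; `build_project_payload` is a hypothetical name, and the sample dictionary contains placeholder values, not real identifiers.

```python
def build_project_payload(name, current_project, platform,
                          description="Temporary project demonstrating creation",
                          generator="L4 CPDall project"):
    """Build the project-creation payload, reusing the current project's storage."""
    storage = current_project["entity"]["storage"]
    crn = None
    if platform == "cpdaas":
        # Only CPDaaS projects carry a resource key CRN in their storage details
        crn = storage["properties"]["credentials"]["editor"]["resource_key_crn"]
    return {
        "name": name,
        "description": description,
        "generator": generator,
        "storage": {
            "type": storage["type"],
            "guid": storage["guid"],
            "resource_key_crn": crn,
        },
    }


# Placeholder project description for illustration only
sample_project = {"entity": {"storage": {"type": "assetfiles", "guid": "GUID-PLACEHOLDER"}}}
print(build_project_payload("Temp project", sample_project, "cpd"))
```

The result could then be passed to `cre8Project(headersAPI, payload, cpd_url)` as in the cell above.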

Load a file into the new project

Use the ibm_watson_studio_lib Python library to load a data file into the project.

# Get a connection to the new project
params = {
    'project_id': new_project_id,
    'url': cpd_url,
    'token': token
}
wslib2 = access_project_or_space(params)
print("Target project: {}".format(wslib2.here.get_name()))

Write a CSV file from GitHub to the project

At the completion of the file loading, notice there were three steps taken:

  • created file

  • created data asset

  • created attachment

url = "https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/TrustedAI-L3-Tech-Lab/evaluation_records.csv"
r = requests.get(url, allow_redirects=True)
newfile = wslib2.save_data("datafile.csv", r.content)
print(json.dumps(newfile, indent=2))

List the project assets

There should only be one data asset called datafile.csv

You can use a separate tab to open the project and look at the content of the file if you want to be sure the content is all there.

all_assets = wslib2.assets.list_assets("asset")
print("\n".join(["{}: {}".format(item['asset_type'], item['name']) for item in all_assets]))
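Instead of opening the project in a separate tab, the content could also be checked programmatically by comparing the shape of the downloaded bytes with the shape of the bytes read back from the project. This is a rough sketch using only the standard library; `csv_shape` is a hypothetical helper, and it assumes a UTF-8 file with a header row.

```python
import csv
from io import StringIO


def csv_shape(content, encoding="utf-8"):
    """Return (number of data rows, number of columns) for CSV bytes with a header row."""
    reader = csv.reader(StringIO(content.decode(encoding)))
    header = next(reader, [])
    data_rows = sum(1 for _ in reader)
    return data_rows, len(header)


# Small inline sample for illustration only
print(csv_shape(b"id,score\n1,0.5\n2,0.7\n"))
```

In the notebook, `csv_shape(r.content)` could be compared with `csv_shape(wslib2.load_data("datafile.csv").read())`, assuming `load_data` returns a file-like object as it does for the zip file loaded earlier.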

Delete the new project

resp = deleteProject(headersAPI, cpd_url, new_project_id)
if resp.status_code > 204:
    print("Status code: {}, reason: {}".format(resp.status_code, resp.reason))
else:
    print("Project '{}' deleted".format(project_name))
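If a cell fails between creation and deletion, the temporary project is left behind. One possible safeguard is a context manager that always runs the delete step. In this sketch, `temporary_project` is a hypothetical helper that takes the create and delete operations as callables, so it does not depend on the support library. The demo below uses stand-in functions.

```python
from contextlib import contextmanager


@contextmanager
def temporary_project(create, delete):
    """Create a project, yield its id, and always delete it afterwards."""
    project_id = create()
    try:
        yield project_id
    finally:
        # Runs even if the body of the with-block raises
        delete(project_id)


# Stand-in callables for illustration only
log = []
with temporary_project(lambda: "placeholder-id", lambda pid: log.append("deleted " + pid)) as pid:
    log.append("working in " + pid)
print(log)
```

In the notebook, the callables could be `lambda: cre8Project(headersAPI, new_project_info, cpd_url)` and `lambda pid: deleteProject(headersAPI, cpd_url, pid)`.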

Import a project

Installing cpdctl

At the time of writing, the latest cpdctl release is 1.4.0, which is the version downloaded in the next cell. See: cpdctl releases

# set environment variables for the use of cpdctl
os.environ['USER_ACCESS_TOKEN'] = token
os.environ['RUNTIME_ENV_APSX_URL'] = cpd_url

running_os = compute.system().lower()
running_cpu = compute.machine().lower()
if running_cpu == "x86_64":
    running_cpu = "amd"
cpd_tar_name = "cpdctl_{}_{}64.tar.gz".format(running_os, running_cpu)
url = "https://github.com/IBM/cpdctl/releases/download/v1.4.0/{}".format(cpd_tar_name)
urllib.request.urlretrieve(url, cpd_tar_name)
with tarfile.open(cpd_tar_name, 'r:gz') as tar:
    tar.extract('cpdctl')
!ls -l cpdctl
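The cell above only maps `x86_64` to `amd` before appending `64`, so a machine that reports `arm64` would produce a name ending in `arm6464`. A more explicit mapping might look like the sketch below. `cpdctl_archive_name` and `ARCH_ALIASES` are hypothetical names, and the existence of an archive for each OS and architecture combination is an assumption to check against the cpdctl releases page.

```python
import platform as compute

# Assumed aliases from platform.machine() values to release architecture names
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def cpdctl_archive_name(system=None, machine=None):
    """Build an archive name following the cpdctl_{os}_{arch}.tar.gz pattern used above."""
    system = (system or compute.system()).lower()
    machine = (machine or compute.machine()).lower()
    arch = ARCH_ALIASES.get(machine, machine)
    return "cpdctl_{}_{}.tar.gz".format(system, arch)


print(cpdctl_archive_name())
```

For `x86_64` Linux this yields the same name as the original cell, so the download URL is unchanged in that case.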

Get the project zipfile

Download the project zip file and read the project name from the project.json file it contains.

if platform == "cpdaas":
    url = "https://github.com/CloudPak-Outcomes/Outcomes-Projects/raw/main/Data-Fabric-Outcomes.zip"
else:
    url = "https://github.com/CloudPak-Outcomes/Outcomes-Projects/raw/main/L4assets/L4CPDProjectForImport.zip"
zipname = url.split('/')[-1]
urllib.request.urlretrieve(url, zipname)

compressed_file = zipfile.ZipFile(zipname)
compressed_file.extractall()
with open('project.json') as f:
    project_desc_json = json.load(f)
# print(json.dumps(project_desc_json, indent=2))
print("Project name: {}".format(project_desc_json['entity']['name']))
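Extracting the whole archive into the working directory is not strictly needed just to read the project name. As an alternative sketch, `project.json` can be read directly from the zip file. `read_project_name` is a hypothetical helper, and the in-memory archive below is a stand-in built for illustration, not the real project export.

```python
import json
import zipfile
from io import BytesIO


def read_project_name(zip_source):
    """Read entity.name from project.json inside a zip file, without extracting it."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open("project.json") as f:
            description = json.load(f)
    return description["entity"]["name"]


# Build a tiny stand-in archive in memory for illustration only
buffer = BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("project.json", json.dumps({"entity": {"name": "Example project"}}))
buffer.seek(0)
print(read_project_name(buffer))
```

In the notebook, the call would be `read_project_name(zipname)`, since `zipfile.ZipFile` accepts a file path as well as a file-like object.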

Create an empty project

cpdctl imports assets into an existing project, so we first create an empty project to receive them.

crn = None
if platform == "cpdaas":
    crn = current_project['entity']['storage']['properties']['credentials']['editor']['resource_key_crn']
new_project_info = {
    "name": project_desc_json['entity']['name'],
    "description": "Temporary project demonstrating creation",
    "generator": "L4 CPDall project",
    "storage": {
        "type": current_project['entity']['storage']['type'], # "assetfiles" or "bmcos_object_storage"
        "guid": current_project['entity']['storage']['guid'],
        "resource_key_crn": crn
    }
}
new_project_id = cre8Project(headersAPI, new_project_info, cpd_url)
print("New project id: {}".format(new_project_id))

Import the project

The zip file is available in the working directory since it was downloaded earlier.

!./cpdctl asset import start --import-file {zipname} --project-id {new_project_id}

List assets from the new project

params = {
    'project_id': new_project_id,
    'url': cpd_url,
    'token': token
}
wslib2 = access_project_or_space(params)
print("Target project: {}\n".format(wslib2.here.get_name()))

all_assets = wslib2.assets.list_assets("asset")
print("\n".join(["{}: {}".format(item['asset_type'], item['name']) for item in all_assets]))

Export a project

export = { 'all_assets': True }
export_json = json.dumps(export)
result = !./cpdctl asset export start --assets '{export_json}' --name project-export --project-id {new_project_id} --output json --jmes-query "metadata.id" --raw-output
export_id = result.s
print('Export ID: {}'.format(export_id))

# Get the export file into the working directory
!./cpdctl asset export download --project-id {new_project_id} --export-id {export_id} --output-file project-assets.zip --progress

!ls -l

!unzip -l project-assets.zip
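The `export start` and `import start` commands only start the operation, and the notebook moves on to the next step right away. If a later step fails because the operation has not finished, a generic polling helper such as the sketch below may help. `wait_until` is a hypothetical helper. How to query the status of an export or import is not shown in this notebook, so `check` is left as a placeholder callable that the reader has to supply.

```python
import time


def wait_until(check, timeout_seconds=300, interval_seconds=5,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout expires.

    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_seconds)


# Stand-in check that succeeds on the third call, for illustration only
attempts = iter([False, False, True])
print(wait_until(lambda: next(attempts), timeout_seconds=10, interval_seconds=0))
```

The `sleep` and `clock` parameters exist mainly so that the helper can be tested without real waiting.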

Delete the imported project

resp = deleteProject(headersAPI, cpd_url, new_project_id)
if resp.status_code > 204:
    print("Status code: {}, reason: {}".format(resp.status_code, resp.reason))
else:
    # Use the name read from project.json instead of querying the deleted project
    print("Project '{}' deleted".format(project_desc_json['entity']['name']))

Author

Jacques Roy is a member of the IBM Enablement for Data and AI

Copyright © 2023. This notebook and its source code are released under the terms of the MIT License.