The Tuner TFX Pipeline Component
The Tuner component tunes the hyperparameters for the model.
Tuner Component and KerasTuner Library
The Tuner component makes extensive use of the Python KerasTuner API for tuning hyperparameters.
Note: The KerasTuner library can be used for hyperparameter tuning regardless of the modeling API, not only for Keras models.
Component
Tuner takes:
tf.Examples used for training and eval.
A user-provided module file (or module fn) that defines the tuning logic, including the model definition, hyperparameter search space, objective, etc.
Protobuf definition of train args and eval args.
(Optional) Protobuf definition of tuning args.
(Optional) A transform graph produced by an upstream Transform component.
(Optional) A data schema created by a SchemaGen pipeline component and optionally altered by the developer.
With the given data, model, and objective, Tuner tunes the hyperparameters and emits the best result.
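To make "tunes the hyperparameters and emits the best result" concrete, here is a minimal, standard-library-only sketch of a random search loop. It is not how the Tuner component is implemented; the search space, the `toy_objective` formula, and the function names are all hypothetical stand-ins for a real model and metric.

```python
import random

# Hypothetical search space: each hyperparameter maps to its candidate values.
SEARCH_SPACE = {
    "learning_rate": [1e-1, 1e-2, 1e-3],
    "num_layers": [1, 2, 3],
}


def toy_objective(hparams):
    """Stand-in for 'train a model and report validation loss' (lower is better)."""
    # Invented formula so the example runs instantly.
    return abs(hparams["learning_rate"] - 1e-2) + abs(hparams["num_layers"] - 2)


def random_search(space, objective, num_trials, seed=0):
    """Sample num_trials configurations and return the best (hparams, score)."""
    rng = random.Random(seed)
    best_hparams, best_score = None, float("inf")
    for _ in range(num_trials):
        # One trial: draw a value for every hyperparameter, then score it.
        hparams = {name: rng.choice(values) for name, values in space.items()}
        score = objective(hparams)
        if score < best_score:
            best_hparams, best_score = hparams, score
    return best_hparams, best_score


best_hparams, best_score = random_search(SEARCH_SPACE, toy_objective, num_trials=20)
print(best_hparams, best_score)
```

In a real pipeline, each trial would train and evaluate a model on the input examples, and the search algorithm would come from KerasTuner rather than a hand-written loop.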
Instructions
A user module function tuner_fn with the following signature is required for Tuner:
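A minimal sketch of that shape follows, written with standard-library stand-ins so it runs without TFX or KerasTuner installed. `PlaceholderTuner`, the `fake_args` object, and the specific `fit_kwargs` keys are hypothetical; in a real module the tuner would be a KerasTuner tuner instance, `fn_args` would be supplied by the component, and `fit_kwargs` would usually carry datasets built from the input files rather than the file patterns themselves.

```python
from types import SimpleNamespace
from typing import Any, Dict, NamedTuple


class TunerFnResult(NamedTuple):
    """Stand-in result type: the tuner to run and the kwargs for its search."""
    tuner: Any                  # a KerasTuner tuner in a real module
    fit_kwargs: Dict[str, Any]  # arguments passed along when the search runs


class PlaceholderTuner:
    """Hypothetical stand-in for a KerasTuner tuner; it only records settings."""

    def __init__(self, objective: str, max_trials: int, directory: str):
        self.objective = objective
        self.max_trials = max_trials
        self.directory = directory


def tuner_fn(fn_args) -> TunerFnResult:
    """Builds the tuner and the arguments for its search from component inputs."""
    tuner = PlaceholderTuner(
        objective="val_accuracy",
        max_trials=6,
        directory=fn_args.working_dir,
    )
    fit_kwargs = {
        # Assumption: file patterns are passed through here only for brevity.
        "x": fn_args.train_files,
        "validation_data": fn_args.eval_files,
        "steps_per_epoch": fn_args.train_steps,
        "validation_steps": fn_args.eval_steps,
    }
    return TunerFnResult(tuner=tuner, fit_kwargs=fit_kwargs)


# Hypothetical arguments standing in for what the component would provide.
fake_args = SimpleNamespace(
    working_dir="/tmp/tuner",
    train_files=["train-*"],
    eval_files=["eval-*"],
    train_steps=20,
    eval_steps=5,
)
result = tuner_fn(fake_args)
print(result.tuner.max_trials, sorted(result.fit_kwargs))
```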
In this function, you define both the model and hyperparameter search spaces, and choose the objective and algorithm for tuning. The Tuner component takes this module code as input, tunes the hyperparameters, and emits the best result.
Trainer can take Tuner's output hyperparameters as input and use them in its user module code. The pipeline definition looks like this:
You might not want to tune the hyperparameters every time you retrain your model. Once you have used Tuner to determine a good set of hyperparameters, you can remove Tuner from your pipeline and use ImporterNode to import the Tuner artifact from a previous training run to feed to Trainer.
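Whether the hyperparameters arrive directly from Tuner or from an imported artifact, the training code ends up reading a set of named values, and it helps if that code still works when no tuned values are supplied. The sketch below illustrates that pattern with the standard library only. The JSON layout with a top-level `values` mapping, the `DEFAULTS` dictionary, and the `resolve_hparams` helper are assumptions made for illustration, not something this guide specifies.

```python
import json

# Hypothetical serialized hyperparameters; the "values" layout is an assumption.
serialized = json.dumps({"values": {"learning_rate": 0.001, "num_layers": 2}})

# Hypothetical defaults used when no tuned hyperparameters are provided.
DEFAULTS = {"learning_rate": 0.01, "num_layers": 1}


def resolve_hparams(payload, defaults):
    """Overlay tuned values on defaults; use defaults alone if payload is empty."""
    if not payload:
        return dict(defaults)
    tuned = json.loads(payload).get("values", {})
    return {**defaults, **tuned}


print(resolve_hparams(serialized, DEFAULTS))  # tuned values win
print(resolve_hparams(None, DEFAULTS))        # falls back to defaults
```

Keeping defaults in the training module means the same code runs with Tuner in the pipeline, with an imported artifact, or with neither.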
Tuning on Google Cloud Platform (GCP)
When running on the Google Cloud Platform (GCP), the Tuner component can take advantage of two services:
AI Platform Vizier (via CloudTuner implementation)
AI Platform Training (as a flock manager for distributed tuning)
AI Platform Vizier as the backend of hyperparameter tuning
AI Platform Vizier is a managed service that performs black box optimization, based on the Google Vizier technology.
CloudTuner is an implementation of KerasTuner that talks to the AI Platform Vizier service as the study backend. Since CloudTuner is a subclass of keras_tuner.Tuner, it can be used as a drop-in replacement in the tuner_fn module and executed as part of the TFX Tuner component.
Below is a code snippet which shows how to use CloudTuner. Note that configuring CloudTuner requires items that are specific to GCP, such as the project_id and region.
Parallel tuning on Cloud AI Platform Training distributed worker flock
KerasTuner, the underlying implementation of the Tuner component, can conduct hyperparameter search in parallel. The stock Tuner component cannot run more than one search worker in parallel, but the Google Cloud AI Platform extension Tuner component can, using an AI Platform Training Job as a distributed worker flock manager. TuneArgs is the configuration given to this component. The extension component is a drop-in replacement for the stock Tuner component.
The behavior and output of the extension Tuner component are the same as those of the stock Tuner component, except that multiple hyperparameter searches run in parallel on different worker machines, so the num_trials trials complete faster. This is particularly effective when the search algorithm is embarrassingly parallelizable, such as RandomSearch. However, if the search algorithm uses information from the results of prior trials, as the Google Vizier algorithm implemented in AI Platform Vizier does, an excessively parallel search would negatively affect the efficacy of the search.
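The difference between the two kinds of search can be sketched with the standard library. In the random search below, every candidate is drawn before any trial runs, so trials can be handed to any number of workers without changing the result. In the adaptive search, each candidate depends on the best result so far, so trials cannot be reordered freely. Both functions and `toy_objective` are hypothetical; the adaptive loop is a simple hill-climbing stand-in, not the Vizier algorithm, and threads stand in for worker machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_objective(learning_rate):
    """Stand-in for one trial; lower is better, with the minimum at 0.01."""
    return abs(learning_rate - 0.01)


def parallel_random_search(num_trials, num_workers, seed=0):
    """Draw all candidates up front, then score them on a pool of workers."""
    rng = random.Random(seed)
    candidates = [10 ** rng.uniform(-4, -1) for _ in range(num_trials)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Trials are independent, so worker count does not affect the scores.
        scores = list(pool.map(toy_objective, candidates))
    return min(zip(scores, candidates))  # (best_score, best_learning_rate)


def sequential_adaptive_search(num_trials, seed=0):
    """Propose each candidate near the best so far; trial N needs trial N-1."""
    rng = random.Random(seed)
    best_lr = 10 ** rng.uniform(-4, -1)
    best_score = toy_objective(best_lr)
    for _ in range(num_trials - 1):
        candidate = best_lr * 10 ** rng.uniform(-0.5, 0.5)
        score = toy_objective(candidate)
        if score < best_score:
            best_score, best_lr = score, candidate
    return best_score, best_lr


print(parallel_random_search(num_trials=16, num_workers=4))
print(sequential_adaptive_search(num_trials=16))
```

Running many adaptive trials at once means each proposal is made with less feedback than it would have had sequentially, which is the loss of efficacy described above.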
Note: Each trial in each parallel search is conducted on a single machine in the worker flock, i.e., each trial does not take advantage of multi-worker distributed training. If multi-worker distribution is desired for each trial, use DistributingCloudTuner instead of CloudTuner.
Note: CloudTuner and the Google Cloud AI Platform extension Tuner component can be used together, which allows distributed parallel tuning backed by AI Platform Vizier's hyperparameter search algorithm. To do so, however, the Cloud AI Platform Job must be given access to the AI Platform Vizier service. See this guide to set up a custom service account. After that, specify the custom service account for your training job in the pipeline code. For more details, see the E2E CloudTuner on GCP example.
Links
More details are available in the Tuner API reference.