tensorflow
GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/en-snapshot/datasets/cli.ipynb
Kernel: Python 3

TFDS CLI

TFDS CLI is a command-line tool that provides various commands to easily work with TensorFlow Datasets.

Copyright 2020 The TensorFlow Datasets Authors, Licensed under the Apache License, Version 2.0

Disable TF logs on import
%%capture
%env TF_CPP_MIN_LOG_LEVEL=1  # Disable logs on TF import

Installation

The CLI tool is installed with tensorflow-datasets (or tfds-nightly).

!pip install -q tfds-nightly
!tfds --version
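If you prefer to check the installation from Python rather than from a shell cell, the following is a minimal sketch using only the standard library. The helper names `find_tfds_cli` and `tfds_version` are hypothetical and not part of TFDS; the only assumption about the CLI itself is the `--version` flag shown above.

```python
import shutil
import subprocess


def find_tfds_cli():
    """Return the path of the `tfds` executable, or None if it is not on PATH."""
    return shutil.which("tfds")


def tfds_version(executable):
    """Run `<executable> --version` and return whatever it prints."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=False
    )
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


path = find_tfds_cli()
if path is None:
    print("tfds not found; install tensorflow-datasets or tfds-nightly first")
else:
    print(f"tfds found at {path}")
    print(tfds_version(path))
```

Checking `PATH` first avoids a confusing `FileNotFoundError` when the package was installed into a different environment than the running kernel.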

For the list of all CLI commands:

!tfds --help

tfds new: Implementing a new Dataset

This command helps you start writing a new Python dataset by creating a <dataset_name>/ directory containing default implementation files.

Usage:

!tfds new my_dataset

tfds new my_dataset will create:

ls -1 my_dataset/

An optional flag --data_format can be used to generate format-specific dataset builders (e.g., conll). If no data format is given, it will generate a template for a standard tfds.core.GeneratorBasedBuilder. Refer to the documentation for details on the available format-specific dataset builders.
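As an illustration, here is a small sketch of how the two forms of the command (with and without `--data_format`) could be assembled from Python, plus a rough equivalent of `ls -1` for inspecting the result. The helper names `tfds_new_command` and `generated_files` are hypothetical; only `tfds new`, the `--data_format` flag, and the `conll` value come from the text above. The sketch does not run the CLI and makes no claim about which files it generates.

```python
import shlex
from pathlib import Path


def tfds_new_command(dataset_name, data_format=None):
    """Build the argument list for `tfds new`, optionally with --data_format."""
    cmd = ["tfds", "new", dataset_name]
    if data_format is not None:
        cmd += ["--data_format", data_format]
    return cmd


def generated_files(dataset_dir):
    """Sorted entry names in dataset_dir (roughly `ls -1`); empty if it is missing."""
    root = Path(dataset_dir)
    return sorted(p.name for p in root.iterdir()) if root.is_dir() else []


print(shlex.join(tfds_new_command("my_dataset")))
# tfds new my_dataset
print(shlex.join(tfds_new_command("my_dataset", data_format="conll")))
# tfds new my_dataset --data_format conll

# Lists whatever exists under my_dataset/ once the command has been run.
print(generated_files("my_dataset"))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the command; keeping arguments as a list avoids shell quoting issues.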

See our writing dataset guide for more info.

Available options:

!tfds new --help

tfds build: Download and prepare a dataset

Use tfds build <my_dataset> to generate a new dataset. <my_dataset> can be:

  • A path to a dataset/ folder or dataset.py file (omit it to use the current directory):

    • tfds build datasets/my_dataset/

    • cd datasets/my_dataset/ && tfds build

    • cd datasets/my_dataset/ && tfds build my_dataset

    • cd datasets/my_dataset/ && tfds build my_dataset.py

  • A registered dataset:

    • tfds build mnist

    • tfds build my_dataset --imports my_project.datasets
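The variants listed above can also be driven from a Python script. The following is a hedged sketch: `tfds_build` is a hypothetical helper, not a TFDS API, and it only relies on the `tfds build` command and the `--imports` flag shown in the list. The `cwd` argument plays the role of `cd datasets/my_dataset/ && ...`, and nothing is executed unless `dry_run=False`.

```python
import shlex
import subprocess


def tfds_build(target=None, imports=None, cwd=None, dry_run=True):
    """Assemble (and optionally run) a `tfds build` invocation.

    target:  dataset folder, dataset .py file, or registered name;
             None means a bare `tfds build` (current directory).
    imports: module to import so a custom dataset is registered (--imports).
    cwd:     directory to run from, like `cd <cwd> && tfds build ...`.
    """
    cmd = ["tfds", "build"]
    if target is not None:
        cmd.append(target)
    if imports is not None:
        cmd += ["--imports", imports]
    if not dry_run:
        # Raises CalledProcessError if the build fails.
        subprocess.run(cmd, cwd=cwd, check=True)
    return cmd


print(shlex.join(tfds_build("datasets/my_dataset/")))
# tfds build datasets/my_dataset/
print(shlex.join(tfds_build(cwd="datasets/my_dataset/")))
# tfds build
print(shlex.join(tfds_build("my_dataset", imports="my_project.datasets")))
# tfds build my_dataset --imports my_project.datasets
```

The dry-run default makes it easy to log or review the exact command before committing to a potentially long download and preparation step.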

Note: tfds build has useful flags to help with prototyping and debugging. See the Debug & tests section below.

Available options:

!tfds build --help