GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/en-snapshot/tutorials/load_data/numpy.ipynb
²⁵¹¹⁸ views

Kernel: Python 3

Copyright 2019 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Load NumPy data

View on TensorFlow.org

Run in Google Colab

View source on GitHub

Download notebook

This tutorial provides an example of loading data from NumPy arrays into a tf.data.Dataset.

This example loads the MNIST dataset from a .npz file. However, the source of the NumPy arrays is not important.

Setup

In [ ]:

 
import numpy as np
import tensorflow as tf

Load from `.npz` file

In [ ]:

DATA_URL = 'https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz'

path = tf.keras.utils.get_file('mnist.npz', DATA_URL)
with np.load(path) as data:
  train_examples = data['x_train']
  train_labels = data['y_train']
  test_examples = data['x_test']
  test_labels = data['y_test']

Load NumPy arrays with `tf.data.Dataset`

Assuming you have an array of examples and a corresponding array of labels, pass the two arrays as a tuple into tf.data.Dataset.from_tensor_slices to create a tf.data.Dataset.

In [ ]:

train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, train_labels))
test_dataset = tf.data.Dataset.from_tensor_slices((test_examples, test_labels))

Use the datasets

Shuffle and batch the datasets

In [ ]:

BATCH_SIZE = 64
SHUFFLE_BUFFER_SIZE = 100

train_dataset = train_dataset.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_dataset = test_dataset.batch(BATCH_SIZE)

Build and train a model

In [ ]:

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])

model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['sparse_categorical_accuracy'])

In [ ]:

model.fit(train_dataset, epochs=10)

In [ ]:

model.evaluate(test_dataset)

Copyright 2019 The TensorFlow Authors.

Load NumPy data

Setup

Load from `.npz` file

Load NumPy arrays with `tf.data.Dataset`

Use the datasets

Shuffle and batch the datasets

Build and train a model

Product

Resources

Company

Copyright 2019 The TensorFlow Authors.

Load NumPy data

Setup

Load from .npz file

Load NumPy arrays with tf.data.Dataset

Use the datasets

Shuffle and batch the datasets

Build and train a model

Load from `.npz` file

Load NumPy arrays with `tf.data.Dataset`