GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/es-419/addons/tutorials/optimizers_lazyadam.ipynb

Kernel: Python 3

Copyright 2020 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

TensorFlow Addons Optimizers: LazyAdam

Overview

This notebook demonstrates how to use the LazyAdam optimizer from the TensorFlow Addons package.

LazyAdam

LazyAdam is a variant of the Adam optimizer that handles sparse updates more efficiently. The original Adam algorithm maintains two moving-average accumulators for each trainable variable, and it updates them at every step. This class handles gradient updates for sparse variables lazily: it updates the moving-average accumulators only for the sparse variable indices that appear in the current batch, rather than for all indices. Compared with the original Adam optimizer, it can provide large improvements in model training throughput for some applications. However, its semantics differ slightly from those of the original Adam algorithm, so it may lead to different empirical results.
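
To make the difference concrete, here is a minimal NumPy sketch of the two update rules. It is an illustration under stated assumptions, not the TensorFlow Addons implementation: the function names `dense_adam_step` and `lazy_adam_step` are hypothetical, the hyperparameter defaults are chosen only for the example, and the lazy version assumes the row indices in a batch are unique.

```python
import numpy as np

def dense_adam_step(param, m, v, grad, step,
                    lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Standard Adam: every row of m, v and param is touched at every step."""
    m[:] = beta1 * m + (1 - beta1) * grad
    v[:] = beta2 * v + (1 - beta2) * grad ** 2
    lr_t = lr * np.sqrt(1 - beta2 ** step) / (1 - beta1 ** step)
    param -= lr_t * m / (np.sqrt(v) + eps)

def lazy_adam_step(param, m, v, indices, grad_rows, step,
                   lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Lazy variant (hypothetical sketch): only rows in `indices` are touched.

    Assumes `indices` contains no duplicates.
    """
    m[indices] = beta1 * m[indices] + (1 - beta1) * grad_rows
    v[indices] = beta2 * v[indices] + (1 - beta2) * grad_rows ** 2
    lr_t = lr * np.sqrt(1 - beta2 ** step) / (1 - beta1 ** step)
    param[indices] -= lr_t * m[indices] / (np.sqrt(v[indices]) + eps)

# A toy 5-row "embedding table" trained with both rules.
rows, dim = 5, 2
dense_p, lazy_p = np.ones((rows, dim)), np.ones((rows, dim))
dense_m, dense_v = np.zeros((rows, dim)), np.zeros((rows, dim))
lazy_m, lazy_v = np.zeros((rows, dim)), np.zeros((rows, dim))

# Step 1: the batch only touches rows 0 and 3.
idx1, g1 = np.array([0, 3]), np.full((2, dim), 0.5)
full_g1 = np.zeros((rows, dim))
full_g1[idx1] = g1
dense_adam_step(dense_p, dense_m, dense_v, full_g1, step=1)
lazy_adam_step(lazy_p, lazy_m, lazy_v, idx1, g1, step=1)
after_step1 = lazy_p.copy()

# Step 2: the batch only touches row 1.
idx2, g2 = np.array([1]), np.full((1, dim), 0.5)
full_g2 = np.zeros((rows, dim))
full_g2[idx2] = g2
dense_adam_step(dense_p, dense_m, dense_v, full_g2, step=2)
lazy_adam_step(lazy_p, lazy_m, lazy_v, idx2, g2, step=2)

# Rows 0 and 3 were absent from the second batch.
print("lazy rows 0,3 unchanged in step 2: ",
      np.allclose(lazy_p[[0, 3]], after_step1[[0, 3]]))
print("dense rows 0,3 unchanged in step 2:",
      np.allclose(dense_p[[0, 3]], after_step1[[0, 3]]))
```

In the dense rule, rows 0 and 3 keep moving in step 2 because their momentum accumulator is still nonzero, even though their gradient is zero. In the lazy rule, those rows and their accumulators are left alone until they appear in a batch again. This saves work on large sparse variables and is also the source of the semantic difference mentioned above.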

Setup

In [ ]:

!pip install -U tensorflow-addons

In [ ]:

import tensorflow as tf
import tensorflow_addons as tfa

In [ ]:

# Hyperparameters
batch_size = 64
epochs = 10

Build the Model

In [ ]:

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'),
    tf.keras.layers.Dense(64, activation='relu', name='dense_2'),
    tf.keras.layers.Dense(10, activation='softmax', name='predictions'),
])

Prepare the Data

In [ ]:

# Load MNIST dataset as NumPy arrays
dataset = {}
num_validation = 10000
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Preprocess the data
x_train = x_train.reshape(-1, 784).astype('float32') / 255
x_test = x_test.reshape(-1, 784).astype('float32') / 255

Train and Evaluate

Simply replace the typical Keras optimizer with the new tfa optimizer.

In [ ]:

# Compile the model
model.compile(
    optimizer=tfa.optimizers.LazyAdam(0.001),  # Utilize TFA optimizer
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy'])

# Train the network
history = model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs)

In [ ]:

# Evaluate the network
print('Evaluate on test data:')
results = model.evaluate(x_test, y_test, batch_size=128, verbose=2)
print('Test loss = {0}, Test acc: {1}'.format(results[0], results[1]))