Path: blob/master/site/en-snapshot/tutorials/generative/deepdream.ipynb
25118 views
Copyright 2019 The TensorFlow Authors.
DeepDream
This tutorial contains a minimal implementation of DeepDream, as described in this blog post by Alexander Mordvintsev.
DeepDream is an experiment that visualizes the patterns learned by a neural network. Similar to when a child watches clouds and tries to interpret random shapes, DeepDream over-interprets and enhances the patterns it sees in an image.
It does so by forwarding an image through the network, then calculating the gradient of the image with respect to the activations of a particular layer. The image is then modified to increase these activations, enhancing the patterns seen by the network, and resulting in a dream-like image. This process was dubbed "Inceptionism" (a reference to InceptionNet, and the movie Inception).
Let's demonstrate how you can make a neural network "dream" and enhance the surreal patterns it sees in an image.
Choose an image to dream-ify
For this tutorial, let's use an image of a labrador.
Prepare the feature extraction model
Download and prepare a pre-trained image classification model. You will use InceptionV3 which is similar to the model originally used in DeepDream. Note that any pre-trained model will work, although you will have to adjust the layer names below if you change this.
The idea in DeepDream is to choose a layer (or layers) and maximize the "loss" in a way that the image increasingly "excites" the layers. The complexity of the features incorporated depends on layers chosen by you, i.e, lower layers produce strokes or simple patterns, while deeper layers give sophisticated features in images, or even whole objects.
The InceptionV3 architecture is quite large (for a graph of the model architecture see TensorFlow's research repo). For DeepDream, the layers of interest are those where the convolutions are concatenated. There are 11 of these layers in InceptionV3, named 'mixed0' though 'mixed10'. Using different layers will result in different dream-like images. Deeper layers respond to higher-level features (such as eyes and faces), while earlier layers respond to simpler features (such as edges, shapes, and textures). Feel free to experiment with the layers selected below, but keep in mind that deeper layers (those with a higher index) will take longer to train on since the gradient computation is deeper.
Calculate loss
The loss is the sum of the activations in the chosen layers. The loss is normalized at each layer so the contribution from larger layers does not outweigh smaller layers. Normally, loss is a quantity you wish to minimize via gradient descent. In DeepDream, you will maximize this loss via gradient ascent.
Gradient ascent
Once you have calculated the loss for the chosen layers, all that is left is to calculate the gradients with respect to the image, and add them to the original image.
Adding the gradients to the image enhances the patterns seen by the network. At each step, you will have created an image that increasingly excites the activations of certain layers in the network.
The method that does this, below, is wrapped in a tf.function
for performance. It uses an input_signature
to ensure that the function is not retraced for different image sizes or steps
/step_size
values. See the Concrete functions guide for details.
Main Loop
Taking it up an octave
Pretty good, but there are a few issues with this first attempt:
The output is noisy (this could be addressed with a
tf.image.total_variation
loss).The image is low resolution.
The patterns appear like they're all happening at the same granularity.
One approach that addresses all these problems is applying gradient ascent at different scales. This will allow patterns generated at smaller scales to be incorporated into patterns at higher scales and filled in with additional detail.
To do this you can perform the previous gradient ascent approach, then increase the size of the image (which is referred to as an octave), and repeat this process for multiple octaves.
Optional: Scaling up with tiles
One thing to consider is that as the image increases in size, so will the time and memory necessary to perform the gradient calculation. The above octave implementation will not work on very large images, or many octaves.
To avoid this issue you can split the image into tiles and compute the gradient for each tile.
Applying random shifts to the image before each tiled computation prevents tile seams from appearing.
Start by implementing the random shift:
Here is a tiled equivalent of the deepdream
function defined earlier:
Putting this together gives a scalable, octave-aware deepdream implementation:
Much better! Play around with the number of octaves, octave scale, and activated layers to change how your DeepDream-ed image looks.
Readers might also be interested in TensorFlow Lucid which expands on ideas introduced in this tutorial to visualize and interpret neural networks.