GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/tutorials/generative/dcgan.ipynb
³⁸³¹⁷ views

Kernel: Python 3

Copyright 2019 The TensorFlow Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

딥 컨볼루셔널 생성적 적대 신경망

이 튜토리얼은 딥 컨볼루셔널 생성적 적대 신경망(Deep Convolutional Generative Adversarial Networks, DCGAN)을 이용하여 손으로 쓴 숫자의 이미지를 생성하는 방법을 보여줍니다. 이 코드는 Keras Sequential API 및 tf.GradientTape 훈련 루프를 사용하여 작성되었습니다.

생성적 적대 신경망(GAN)이란?

생성적 적대 신경망(Generative Adversarial Networks, GAN)은 최근 컴퓨터 과학에서 가장 흥미로운 아이디어 중 하나입니다. 두 개의 모델이 적대적인 과정을 통해 동시에 훈련됩니다. 생성자("예술가")는 진짜처럼 보이는 이미지 생성을 학습하고, 감별자("예술비평가")는 가짜의 이미지를 진짜 이미지와 구별하는 방식을 학습합니다.

생성자와 감별자를 그린 도표

훈련 과정 동안 생성자는 점차 진짜 같은 이미지를 더욱 잘 생성하게 되며, 감별자는 점차 진짜와 가짜를 더 잘 구별하게 됩니다. 감별자가 가짜 이미지와 진짜 이미지를 더 이상 구별할 수 없을 때 이 과정은 평형 상태에 도달하게 됩니다.

생성자와 감별자를 그린 두번째 도표

이 노트북은 이 과정을 MNIST 데이터세트를 이용하여 보여줍니다. 아래의 애니메이션에는 50 epoch 동안 훈련한 생성자를 통해 생성된 연속적 이미지가 표시됩니다. 해당 이미지는 임의의 노이즈에서 시작되어, 점차 손으로 쓴 숫자를 닮아갑니다.

출력 예시

GAN에 대해 자세히 알아보려면 MIT의 Intro to Deep Learning 과정을 참조하십시오.

설정

In [ ]:

import tensorflow as tf

In [ ]:

tf.__version__

In [ ]:

# To generate GIFs
!pip install imageio
!pip install git+https://github.com/tensorflow/docs

In [ ]:

import glob
import imageio
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
from tensorflow.keras import layers
import time

from IPython import display

데이터세트 로드 및 준비하기

생성자와 감별자 훈련을 위해 MNIST 데이터세트가 사용됩니다. 생성자는 손글씨 숫자 데이터를 닮은 숫자를 생성할 것입니다.

In [ ]:

(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()

In [ ]:

train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5  # Normalize the images to [-1, 1]

In [ ]:

BUFFER_SIZE = 60000
BATCH_SIZE = 256

In [ ]:

# Batch and shuffle the data
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

모델 만들기

Keras Sequential API를 사용해 생성자와 감별자가 정의됩니다.

생성자

생성자는 시드값(임의의 노이즈)에서 이미지를 생성하기 위해, tf.keras.layers.Conv2DTranspose(업샘플링) 레이어를 이용합니다. 처음 Dense 레이어는 해당 시드값을 입력값으로 받습니다. 다음으로 원하는 크기인 28x28x1의 이미지에 도달할 때까지 업샘플링을 여러 번 수행합니다. tanh 함수를 사용하는 출력 레이어를 제외한 각 레이어에서 tf.keras.layers.LeakyReLU 활성화가 사용됩니다.

In [ ]:

def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model

(아직 훈련되지 않은) 생성자를 이용해 이미지를 생성해 보겠습니다.

In [ ]:

generator = make_generator_model()

noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)

plt.imshow(generated_image[0, :, :, 0], cmap='gray')

감별자

감별자는 컨볼루셔널 신경망(Convolutional Neural Network, CNN) 기반의 이미지 분류자입니다.

In [ ]:

def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same',
                                     input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))

    return model

(아직까지 훈련되지 않은) 감별자를 사용하여 생성된 이미지가 진짜인지 가짜인지 판별합니다. 진짜 이미지에는 양수의 값을, 가짜 이미지에는 음수의 값을 출력하도록 모델이 훈련됩니다.

In [ ]:

discriminator = make_discriminator_model()
decision = discriminator(generated_image)
print (decision)

손실과 옵티마이저 정의하기

두 모델의 손실 함수와 옵티마이저를 정의합니다.

In [ ]:

# This method returns a helper function to compute cross entropy loss
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

감별자 손실

이 메서드는 감별자가 가짜 이미지와 진짜 이미지를 얼마나 잘 판별하는지 수치화합니다. 진짜 이미지에 대한 감별자의 예측과 1로 이루어진 배열을 비교하고, 가짜로 (생성된) 이미지에 대한 감별자의 예측과 0으로 이루어진 배열을 비교합니다.

In [ ]:

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

생성자 손실

생성자의 손실은 판별자를 얼마나 잘 속일 수 있었는지를 수치화합니다. 직관적으로, 생성자가 잘 작동하면 판별자는 가짜 이미지를 실제(또는 1)로 분류합니다. 여기에서 생성된 이미지에 대한 판별자의 판단을 1의 배열과 비교합니다.

In [ ]:

def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

판별자와 생성자 옵티마이저는 두 네트워크를 별도로 훈련하기 때문에 서로 다릅니다.

In [ ]:

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

체크포인트 저장하기

이 노트북은 오랫동안 진행되는 훈련이 중단된 경우 유용한 모델의 저장 및 복구 방법을 보여줍니다.

In [ ]:

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(generator_optimizer=generator_optimizer,
                                 discriminator_optimizer=discriminator_optimizer,
                                 generator=generator,
                                 discriminator=discriminator)

훈련 루프 정의하기

In [ ]:

EPOCHS = 50
noise_dim = 100
num_examples_to_generate = 16

# You will reuse this seed overtime (so it's easier)
# to visualize progress in the animated GIF)
seed = tf.random.normal([num_examples_to_generate, noise_dim])

생성자가 입력으로 임의의 시드를 받으면 훈련 루프가 시작됩니다. 해당 시드는 이미지를 생성하는 데 사용됩니다. 감별자를 사용하여 (훈련 세트에서 가져온) 진짜 이미지와 (생성자를 통해 생성된) 가짜 이미지를 분류합니다. 각 모델의 손실을 계산하고, 그래디언트를 사용해 생성자와 감별자를 업데이트합니다.

In [ ]:

# Notice the use of `tf.function`
# This annotation causes the function to be "compiled".
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
      generated_images = generator(noise, training=True)

      real_output = discriminator(images, training=True)
      fake_output = discriminator(generated_images, training=True)

      gen_loss = generator_loss(fake_output)
      disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

In [ ]:

def train(dataset, epochs):
  for epoch in range(epochs):
    start = time.time()

    for image_batch in dataset:
      train_step(image_batch)

    # Produce images for the GIF as you go
    display.clear_output(wait=True)
    generate_and_save_images(generator,
                             epoch + 1,
                             seed)

    # Save the model every 15 epochs
    if (epoch + 1) % 15 == 0:
      checkpoint.save(file_prefix = checkpoint_prefix)

    print ('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))

  # Generate after the final epoch
  display.clear_output(wait=True)
  generate_and_save_images(generator,
                           epochs,
                           seed)

이미지 생성 및 저장하기

In [ ]:

def generate_and_save_images(model, epoch, test_input):
  # Notice `training` is set to False.
  # This is so all layers run in inference mode (batchnorm).
  predictions = model(test_input, training=False)

  fig = plt.figure(figsize=(4, 4))

  for i in range(predictions.shape[0]):
      plt.subplot(4, 4, i+1)
      plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
      plt.axis('off')

  plt.savefig('image_at_epoch_{:04d}.png'.format(epoch))
  plt.show()

모델 훈련하기

위에서 정의한 train() 메서드를 호출하여 생성자와 판별자를 동시에 훈련합니다. GAN 훈련은 까다로울 수 있습니다. 생성자와 판별자가 서로를 압도하지 않는 것이 중요합니다(예: 비슷한 속도로 훈련됨).

훈련 초기에 생성된 이미지는 무작위 노이즈처럼 보입니다. 훈련이 진행됨에 따라 생성된 숫자는 점점 더 실제처럼 보일 것입니다. 약 50 epoch 후에는 MNIST 숫자와 유사합니다. Colab의 기본 설정에서 이 작업은 epoch당 약 1분 정도 소요될 수 있습니다.

In [ ]:

train(train_dataset, EPOCHS)

마지막 체크포인트를 복구합니다.

In [ ]:

checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))

GIF 생성하기

In [ ]:

# Display a single image using the epoch number
def display_image(epoch_no):
  return PIL.Image.open('image_at_epoch_{:04d}.png'.format(epoch_no))

In [ ]:

display_image(EPOCHS)

imageio를 통해 훈련 중 저장된 이미지를 사용한 GIF 애니메이션을 만듭니다.

In [ ]:

anim_file = 'dcgan.gif'

with imageio.get_writer(anim_file, mode='I') as writer:
  filenames = glob.glob('image*.png')
  filenames = sorted(filenames)
  for filename in filenames:
    image = imageio.imread(filename)
    writer.append_data(image)
  image = imageio.imread(filename)
  writer.append_data(image)

In [ ]:

import tensorflow_docs.vis.embed as embed
embed.embed_file(anim_file)

다음 단계

이 튜토리얼은 GAN을 작성하고 훈련하는 데 필요한 전체 코드를 보여주었습니다. 다음 단계로 Kaggle에서 사용할 수 있는 대규모 유명인 얼굴 특성(CelebA) 데이터세트와 같은 다른 데이터세트를 실험해볼 수 있습니다. GAN에 대해 자세히 알아보려면 NIPS 2016 튜토리얼: 생성적 적대 네트워크를 참조하세요.