GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/tutorials/generative/adversarial_fgsm.ipynb
Kernel: Python 3

Licensed under the Apache License, Version 2.0 (the "License");

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

이 νŠœν† λ¦¬μ–Όμ—μ„œλŠ” Ian Goodfellow et al의 Explaining and Harnessing Adversarial Examples에 기술된 FGSM(Fast Gradient Signed Method)을 μ΄μš©ν•΄ μ λŒ€μ  μƒ˜ν”Œ(adversarial example)을 μƒμ„±ν•˜λŠ” 방법에 λŒ€ν•΄ μ†Œκ°œν•©λ‹ˆλ‹€. FGSM은 신경망 곡격 κΈ°μˆ λ“€ 쀑 μ΄ˆκΈ°μ— 발견된 λ°©λ²•μ΄μž κ°€μž₯ 유λͺ…ν•œ 방식 쀑 ν•˜λ‚˜μž…λ‹ˆλ‹€.

μ λŒ€μ  μƒ˜ν”Œμ΄λž€?

μ λŒ€μ  μƒ˜ν”Œμ΄λž€ 신경망을 ν˜Όλž€μ‹œν‚¬ λͺ©μ μœΌλ‘œ λ§Œλ“€μ–΄μ§„ νŠΉμˆ˜ν•œ μž…λ ₯으둜, μ‹ κ²½λ§μœΌλ‘œ ν•˜μ—¬κΈˆ μƒ˜ν”Œμ„ 잘λͺ» λΆ„λ₯˜ν•˜λ„둝 ν•©λ‹ˆλ‹€. 비둝 μΈκ°„μ—κ²Œ μ λŒ€μ  μƒ˜ν”Œμ€ 일반 μƒ˜ν”Œκ³Ό 큰 차이가 μ—†μ–΄λ³΄μ΄μ§€λ§Œ, 신경망은 μ λŒ€μ  μƒ˜ν”Œμ„ μ˜¬λ°”λ₯΄κ²Œ μ‹λ³„ν•˜μ§€ λͺ»ν•©λ‹ˆλ‹€. 이와 같은 신경망 κ³΅κ²©μ—λŠ” μ—¬λŸ¬ μ’…λ₯˜κ°€ μžˆλŠ”λ°, λ³Έ νŠœν† λ¦¬μ–Όμ—μ„œλŠ” ν™”μ΄νŠΈ λ°•μŠ€(white box) 곡격 κΈ°μˆ μ— μ†ν•˜λŠ” FGSM을 μ†Œκ°œν•©λ‹ˆλ‹€. ν™”μ΄νŠΈ λ°•μŠ€ κ³΅κ²©μ΄λž€ κ³΅κ²©μžκ°€ λŒ€μƒ λͺ¨λΈμ˜ λͺ¨λ“  νŒŒλΌλ―Έν„°κ°’μ— μ ‘κ·Όν•  수 μžˆλ‹€λŠ” κ°€μ • ν•˜μ— μ΄λ£¨μ–΄μ§€λŠ” 곡격을 μΌμ»«μŠ΅λ‹ˆλ‹€. μ•„λž˜ μ΄λ―Έμ§€λŠ” Goodfellow et al에 μ†Œκ°œλœ κ°€μž₯ 유λͺ…ν•œ μ λŒ€μ  μƒ˜ν”ŒμΈ νŒλ‹€μ˜ μ‚¬μ§„μž…λ‹ˆλ‹€.

Adversarial Example

By adding a specific small perturbation to the original image, the neural network can be made to misclassify the panda as a gibbon with high confidence. The following sections walk through this perturbation process.

FGSM

FGSM은 μ‹ κ²½λ§μ˜ κ·Έλž˜λ””μ–ΈνŠΈ(gradient)λ₯Ό μ΄μš©ν•΄ μ λŒ€μ  μƒ˜ν”Œμ„ μƒμ„±ν•˜λŠ” κΈ°λ²•μž…λ‹ˆλ‹€. λ§Œμ•½ λͺ¨λΈμ˜ μž…λ ₯이 이미지라면, μž…λ ₯ 이미지에 λŒ€ν•œ 손싀 ν•¨μˆ˜μ˜ κ·Έλž˜λ””μ–ΈνŠΈλ₯Ό κ³„μ‚°ν•˜μ—¬ κ·Έ 손싀을 μ΅œλŒ€ν™”ν•˜λŠ” 이미지λ₯Ό μƒμ„±ν•©λ‹ˆλ‹€. 이처럼 μƒˆλ‘­κ²Œ μƒμ„±λœ 이미지λ₯Ό μ λŒ€μ  이미지(adversarial image)라고 ν•©λ‹ˆλ‹€. 이 과정은 λ‹€μŒκ³Ό 같은 μˆ˜μ‹μœΌλ‘œ 정리할 수 μžˆμŠ΅λ‹ˆλ‹€:

$adv\_x = x + \epsilon \cdot \text{sign}(\nabla_x J(\theta, x, y))$

where:

  • adv_x : the adversarial image.

  • x : the original input image.

  • y : the original input label.

  • Ο΅ : a multiplier that keeps the perturbation small.

  • ΞΈ : the model parameters.

  • J : the loss function.

μ—¬κΈ°μ„œ ν₯미둜운 사싀은 μž…λ ₯ 이미지에 λŒ€ν•œ κ·Έλž˜λ””μ–ΈνŠΈκ°€ μ‚¬μš©λœλ‹€λŠ” μ μž…λ‹ˆλ‹€. μ΄λŠ” 손싀을 μ΅œλŒ€ν™”ν•˜λŠ” 이미지λ₯Ό μƒμ„±ν•˜λŠ” 것이 FGSM의 λͺ©μ μ΄κΈ° λ•Œλ¬Έμž…λ‹ˆλ‹€. μš”μ•½ν•˜μžλ©΄, μ λŒ€μ  μƒ˜ν”Œμ€ 각 ν”½μ…€μ˜ 손싀에 λŒ€ν•œ 기여도λ₯Ό κ·Έλž˜λ””μ–ΈνŠΈλ₯Ό 톡해 κ³„μ‚°ν•œ ν›„, κ·Έ 기여도에 따라 픽셀값에 μ™œκ³‘μ„ μΆ”κ°€ν•¨μœΌλ‘œμ¨ 생성할 수 μžˆμŠ΅λ‹ˆλ‹€. 각 ν”½μ…€μ˜ κΈ°μ—¬λ„λŠ” 연쇄 법칙(chain rule)을 μ΄μš©ν•΄ κ·Έλž˜λ””μ–ΈνŠΈλ₯Ό κ³„μ‚°ν•˜λŠ” κ²ƒμœΌλ‘œ λΉ λ₯΄κ²Œ νŒŒμ•…ν•  수 μžˆμŠ΅λ‹ˆλ‹€. 이것이 μž…λ ₯ 이미지에 λŒ€ν•œ κ·Έλž˜λ””μ–ΈνŠΈκ°€ μ“°μ΄λŠ” μ΄μœ μž…λ‹ˆλ‹€. λ˜ν•œ, λŒ€μƒ λͺ¨λΈμ€ 더 이상 ν•™μŠ΅ν•˜κ³  μžˆμ§€ μ•ŠκΈ° λ•Œλ¬Έμ— (λ”°λΌμ„œ μ‹ κ²½λ§μ˜ κ°€μ€‘μΉ˜μ— λŒ€ν•œ κ·Έλž˜λ””μ–ΈνŠΈλŠ” ν•„μš”ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€) λͺ¨λΈμ˜ κ°€μ€‘μΉ˜κ°’μ€ λ³€ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€. FGSM의 ꢁ극적인 λͺ©ν‘œλŠ” 이미 ν•™μŠ΅μ„ 마친 μƒνƒœμ˜ λͺ¨λΈμ„ ν˜Όλž€μ‹œν‚€λŠ” κ²ƒμž…λ‹ˆλ‹€.

import tensorflow as tf
import matplotlib as mpl
import matplotlib.pyplot as plt

mpl.rcParams['figure.figsize'] = (8, 8)
mpl.rcParams['axes.grid'] = False

사전 ν›ˆλ ¨λœ MobileNetV2 λͺ¨λΈκ³Ό ImageNet의 클래슀(class) 이름듀을 λΆˆλŸ¬μ˜΅λ‹ˆλ‹€.

pretrained_model = tf.keras.applications.MobileNetV2(include_top=True,
                                                     weights='imagenet')
pretrained_model.trainable = False

# ImageNet labels
decode_predictions = tf.keras.applications.mobilenet_v2.decode_predictions

# Helper function to preprocess the image so that it can be inputted in MobileNetV2
def preprocess(image):
  image = tf.cast(image, tf.float32)
  image = tf.image.resize(image, (224, 224))
  image = tf.keras.applications.mobilenet_v2.preprocess_input(image)
  image = image[None, ...]
  return image

# Helper function to extract labels from probability vector
def get_imagenet_label(probs):
  return decode_predictions(probs, top=1)[0][0]

Original image

Use a sample image of a Labrador Retriever by Mirko (CC-BY-SA 3.0) to create the adversarial examples. As a first step, preprocess the original image so it can be fed as input to the MobileNetV2 model.

image_path = tf.keras.utils.get_file('YellowLabradorLooking_new.jpg', 'https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg')
image_raw = tf.io.read_file(image_path)
image = tf.image.decode_image(image_raw)

image = preprocess(image)
image_probs = pretrained_model.predict(image)

Let's take a look at the image.

plt.figure()
plt.imshow(image[0] * 0.5 + 0.5)  # To change [-1, 1] to [0, 1]
_, image_class, class_confidence = get_imagenet_label(image_probs)
plt.title('{} : {:.2f}% Confidence'.format(image_class, class_confidence*100))
plt.show()

μ λŒ€μ  이미지 μƒμ„±ν•˜κΈ°

Running FGSM

The first step is to create the perturbation that will be applied to the original image to produce the adversarial example. As discussed earlier, the gradients with respect to the input image are used to create the perturbation.

loss_object = tf.keras.losses.CategoricalCrossentropy()

def create_adversarial_pattern(input_image, input_label):
  with tf.GradientTape() as tape:
    tape.watch(input_image)
    prediction = pretrained_model(input_image)
    loss = loss_object(input_label, prediction)

  # Get the gradients of the loss w.r.t. the input image.
  gradient = tape.gradient(loss, input_image)
  # Get the sign of the gradients to create the perturbation.
  signed_grad = tf.sign(gradient)
  return signed_grad

μƒμ„±ν•œ μ™œκ³‘μ„ μ‹œκ°ν™”ν•΄ λ³Ό 수 μžˆμŠ΅λ‹ˆλ‹€.

# Get the input label of the image.
labrador_retriever_index = 208
label = tf.one_hot(labrador_retriever_index, image_probs.shape[-1])
label = tf.reshape(label, (1, image_probs.shape[-1]))

perturbations = create_adversarial_pattern(image, label)
plt.imshow(perturbations[0] * 0.5 + 0.5);  # To change [-1, 1] to [0, 1]

Let's try different values of the perturbation multiplier epsilon. This simple experiment shows that as epsilon grows, fooling the network becomes easier, but this comes at the cost of perturbations that are increasingly visible in the image.
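Because the perturbation is the element-wise sign (±1) of the gradient, epsilon directly bounds the maximum per-pixel change (the L∞ norm of the perturbation), which is why larger values are more visible. A small sketch with made-up stand-in values illustrates this bound:

```python
import numpy as np

np.random.seed(0)
eps = 0.15
x = np.random.uniform(-1, 1, size=(4, 4))        # toy image in [-1, 1]
perturbation = np.sign(np.random.randn(4, 4))    # stand-in for sign(gradient)

adv_x = np.clip(x + eps * perturbation, -1, 1)   # clip keeps valid pixel range

# No pixel moves farther than eps from the original (clipping can only shrink
# the change, never enlarge it)
max_change = np.abs(adv_x - x).max()
```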

def display_images(image, description):
  _, label, confidence = get_imagenet_label(pretrained_model.predict(image))
  plt.figure()
  plt.imshow(image[0]*0.5+0.5)
  plt.title('{} \n {} : {:.2f}% Confidence'.format(description, label, confidence*100))
  plt.show()

epsilons = [0, 0.01, 0.1, 0.15]
descriptions = [('Epsilon = {:0.3f}'.format(eps) if eps else 'Input')
                for eps in epsilons]

for i, eps in enumerate(epsilons):
  adv_x = image + eps*perturbations
  adv_x = tf.clip_by_value(adv_x, -1, 1)
  display_images(adv_x, descriptions[i])

λ‹€μŒ 단계

이 νŠœν† λ¦¬μ–Όμ—μ„œ μ λŒ€μ  곡격에 λŒ€ν•΄μ„œ μ•Œμ•„λ³΄μ•˜μœΌλ‹ˆ, μ΄μ œλŠ” 이 기법을 λ‹€μ–‘ν•œ 데이터넷과 신경망 ꡬ쑰에 μ‹œν—˜ν•΄λ³Ό μ°¨λ‘€μž…λ‹ˆλ‹€. μƒˆλ‘œ λ§Œλ“  λͺ¨λΈμ— FGSM을 μ‹œλ„ν•΄λ³΄λŠ” 것도 κ°€λŠ₯ν•  κ²ƒμž…λ‹ˆλ‹€. μ—‘μ‹€λ‘  값을 λ°”κΏ”κ°€λ©° μ‹ κ²½λ§μ˜ μƒ˜ν”Œ 신뒰도가 μ–΄λ–»κ²Œ λ³€ν•˜λŠ”μ§€ μ‚΄νŽ΄λ³Ό μˆ˜λ„ μžˆμŠ΅λ‹ˆλ‹€.

While powerful on its own, FGSM was only the starting point for many subsequent, more effective adversarial attack techniques discovered in later research. Its discovery also spurred research not only into adversarial attacks but also into defenses aimed at building more robust machine learning models. For an overview of adversarial attacks and defenses, see this paper.

λ‹€μ–‘ν•œ μ λŒ€μ  곡격과 λ°©μ–΄ 기술의 κ΅¬ν˜„ 방법이 κΆκΈˆν•˜λ‹€λ©΄, μ λŒ€μ  μƒ˜ν”Œ 라이브러리 CleverHansλ₯Ό μ°Έκ³ ν•©λ‹ˆλ‹€.