GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/tutorials/customization/custom_training_walkthrough.ipynb
Kernel: Python 3
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

์‚ฌ์šฉ์ž ์ •์˜ ํ•™์Šต: ์ž์„ธํžˆ ๋‘˜๋Ÿฌ๋ณด๊ธฐ

์ด ํŠœํ† ๋ฆฌ์–ผ์€ ํŽญ๊ท„์„ ์ข…๋ณ„๋กœ ๋ถ„๋ฅ˜ํ•˜๊ธฐ ์œ„ํ•œ ์‚ฌ์šฉ์ž ์ •์˜ ํ›ˆ๋ จ ๋ฃจํ”„๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์„ ํ•™์Šต์‹œํ‚ค๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค. ์ด ๋…ธํŠธ๋ถ์—์„œ๋Š” TensorFlow๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋‹ค์Œ์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

  1. ๋ฐ์ดํ„ฐ์„ธํŠธ ๊ฐ€์ ธ์˜ค๊ธฐ

  2. ๊ฐ„๋‹จํ•œ ์„ ํ˜• ๋ชจ๋ธ ๊ตฌ์ถ•ํ•˜๊ธฐ

  3. ๋ชจ๋ธ ํ›ˆ๋ จํ•˜๊ธฐ

  4. ๋ชจ๋ธ์˜ ํšจ๊ณผ ํ‰๊ฐ€ํ•˜๊ธฐ

  5. ํ›ˆ๋ จ๋œ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ์˜ˆ์ธกํ•˜๊ธฐ

TensorFlow programming

์ด ํŠœํ† ๋ฆฌ์–ผ์€ ๋‹ค์Œ TensorFlow ํ”„๋กœ๊ทธ๋ž˜๋ฐ ์ž‘์—…์„ ๋ณด์—ฌ ์ค๋‹ˆ๋‹ค.

The penguin classification problem

์กฐ๋ฅ˜ํ•™์ž๊ฐ€ ํŽญ๊ท„์„ ์ž๋™์œผ๋กœ ๋ถ„๋ฅ˜ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ฐพ๊ณ  ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•ด ๋ด…์‹œ๋‹ค. ๋จธ์‹ ๋Ÿฌ๋‹์€ ํ†ต๊ณ„์ ์œผ๋กœ ํŽญ๊ท„์„ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ๋Š” ๋‹ค์–‘ํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, ์ •๊ตํ•œ ๋จธ์‹ ๋Ÿฌ๋‹ ํ”„๋กœ๊ทธ๋žจ์ด๋ผ๋ฉด ์‚ฌ์ง„์„ ํ†ตํ•ด ํŽญ๊ท„์„ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ์„ ๊ฒ๋‹ˆ๋‹ค. ์ด ํŠœํ† ๋ฆฌ์–ผ์—์„œ ๋นŒ๋“œํ•˜๋Š” ๋ชจ๋ธ์€ ์กฐ๊ธˆ ๋” ๊ฐ„๋‹จํ•ฉ๋‹ˆ๋‹ค. ํŠœํ† ๋ฆฌ์–ผ์—์„œ๋Š” ์ฒด์ค‘, ํŽญ๊ท„๋‚ ๊ฐœ ๊ธธ์ด, ๋ถ€๋ฆฌ, ํŠนํžˆ ํŽญ๊ท„ ๋ถ€๋ฆฌ์˜ ๊ธธ์ด์™€ ๋„ˆ๋น„ ์ธก์ •์„ ๊ธฐ์ค€์œผ๋กœ ํŽญ๊ท„์„ ๋ถ„๋ฅ˜ํ•ฉ๋‹ˆ๋‹ค.

18์ข…์˜ ํŽญ๊ท„์ด ์žˆ์ง€๋งŒ ์ด ํŠœํ† ๋ฆฌ์–ผ์—์„œ๋Š” ๋‹ค์Œ ์„ธ ์ข…์˜ ํŽญ๊ท„๋งŒ ๋ถ„๋ฅ˜ํ•˜๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

  • ํ„ฑ๋ˆ ํŽญ๊ท„

  • ์  ํˆฌ ํŽญ๊ท„

  • ์•„๋ธ๋ฆฌ ํŽญ๊ท„

Illustration of Chinstrap, Gentoo, and Adélie penguins
Figure 1. Chinstrap, Gentoo, and Adélie penguins (Artwork by @allison_horst, CC BY-SA 2.0).

Fortunately, a research team has already created and shared a dataset of 334 penguins with body weight, flipper length, beak measurements, and other data. This dataset is also conveniently available as the penguins TensorFlow Dataset.

Setup

ํŽญ๊ท„ ๋ฐ์ดํ„ฐ์„ธํŠธ์šฉ tfds-nightly ํŒจํ‚ค์ง€๋ฅผ ์„ค์น˜ํ•ฉ๋‹ˆ๋‹ค. tfds-nightly ํŒจํ‚ค์ง€๋Š” TFDS(TensorFlow ๋ฐ์ดํ„ฐ์„ธํŠธ)์˜ Nightly ์ถœ์‹œ ๋ฒ„์ „์ž…๋‹ˆ๋‹ค. TFDS์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ TensorFlow ๋ฐ์ดํ„ฐ์„ธํŠธ ๊ฐœ์š”๋ฅผ ์ฐธ์กฐํ•˜์„ธ์š”.

!pip install -q tfds-nightly

๊ทธ๋Ÿฐ ๋‹ค์Œ Colab ๋ฉ”๋‰ด์—์„œ ๋Ÿฐํƒ€์ž„ > ๋Ÿฐํƒ€์ž„ ๋‹ค์‹œ ์‹œ์ž‘์„ ์„ ํƒํ•˜์—ฌ Colab ๋Ÿฐํƒ€์ž„์„ ๋‹ค์‹œ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค.

๋Ÿฐํƒ€์ž„์„ ์šฐ์„ ์ ์œผ๋กœ ๋‹ค์‹œ ์‹œ์ž‘ํ•˜์ง€ ์•Š์€ ๊ฒฝ์šฐ ์ด ํŠœํ† ๋ฆฌ์–ผ์˜ ๋‚˜๋จธ์ง€ ๋ถ€๋ถ„์„ ์ง„ํ–‰ํ•˜์ง€ ๋งˆ์„ธ์š”.

TensorFlow ๋ฐ ๊ธฐํƒ€ ํ•„์ˆ˜ Python ๋ชจ๋“ˆ์„ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค.

import os
import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt

print("TensorFlow version: {}".format(tf.__version__))
print("TensorFlow Datasets version: ", tfds.__version__)

๋ฐ์ดํ„ฐ์„ธํŠธ ๊ฐ€์ ธ์˜ค๊ธฐ

๊ธฐ๋ณธ ํŽญ๊ท„/์ฒ˜๋ฆฌ๋จ TensorFlow ๋ฐ์ดํ„ฐ์„ธํŠธ๋Š” ์ด๋ฏธ ์ •๋ฆฌ๋˜๊ณ  ์ •๊ทœํ™”๋˜์—ˆ์œผ๋ฉฐ ๋ชจ๋ธ์„ ๋นŒ๋“œํ•  ์ค€๋น„๊ฐ€ ์™„๋ฃŒ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ์ฒ˜๋ฆฌ๋œ ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค์šด๋กœ๋“œํ•˜๊ธฐ ์ „์— ๋‹จ์ˆœํ™”๋œ ๋ฒ„์ „์„ ๋ฏธ๋ฆฌ ๋ณด๊ธฐํ•˜์—ฌ ์›๋ž˜์˜ ํŽญ๊ท„ ์กฐ์‚ฌ ๋ฐ์ดํ„ฐ์— ์ต์ˆ™ํ•ด์ง€์„ธ์š”.

๋ฐ์ดํ„ฐ ๋ฏธ๋ฆฌ ๋ณด๊ธฐ

TensorFlow ๋ฐ์ดํ„ฐ์„ธํŠธ tdfs.load ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํŽญ๊ท„ ๋ฐ์ดํ„ฐ์„ธํŠธ์˜ ๊ฐ„์†Œํ™”๋œ ๋ฒ„์ „(penguins/simple)์„ ๋‹ค์šด๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค. ์ด ๋ฐ์ดํ„ฐ์„ธํŠธ์—๋Š” 344๊ฐœ์˜ ๋ฐ์ดํ„ฐ ๋ ˆ์ฝ”๋“œ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์ฒ˜์Œ 5๊ฐœ์˜ ๋ ˆ์ฝ”๋“œ๋ฅผ DataFrame ๊ฐ์ฒด๋กœ ์ถ”์ถœํ•˜์—ฌ ์ด ๋ฐ์ดํ„ฐ์„ธํŠธ์— ์žˆ๋Š” ๊ฐ’์˜ ์ƒ˜ํ”Œ์„ ๊ฒ€์‚ฌํ•ฉ๋‹ˆ๋‹ค.

ds_preview, info = tfds.load('penguins/simple', split='train', with_info=True)
df = tfds.as_dataframe(ds_preview.take(5), info)
print(df)
print(info.features)

The numbered lines are data records, one *example* per line, where:

  • The first six fields are *features*: these are the characteristics of an example. Here, the fields hold numbers representing penguin measurements.

  • The last column is the *label*: this is the value you want to predict. For this dataset, it's an integer value of 0, 1, or 2 that corresponds to a penguin species name.

๋ฐ์ดํ„ฐ์„ธํŠธ์—์„œ ํŽญ๊ท„ ์ข…์˜ ๋ ˆ์ด๋ธ”์€ ๊ตฌ์ถ• ์ค‘์ธ ๋ชจ๋ธ์—์„œ ๋” ์‰ฝ๊ฒŒ ์ž‘์—…ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ˆซ์ž๋กœ ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค. ์ด ์ˆซ์ž๋Š” ๋‹ค์Œ ํŽญ๊ท„ ์ข…์— ์ƒ์‘ํ•ฉ๋‹ˆ๋‹ค.

  • 0: Adélie penguin

  • 1: Chinstrap penguin

  • 2: Gentoo penguin

ํŽญ๊ท„ ์ข…์˜ ์ด๋ฆ„์„ ์ด ์ˆœ์„œ๋กœ ํฌํ•จํ•˜๋Š” ๋ชฉ๋ก์„ ๋งŒ๋“ญ๋‹ˆ๋‹ค. ์ด ๋ชฉ๋ก์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ์ถœ๋ ฅ์„ ํ•ด์„ํ•ฉ๋‹ˆ๋‹ค.

class_names = ['Adรฉlie', 'Chinstrap', 'Gentoo']

ํŠน์„ฑ๊ณผ ๋ ˆ์ด๋ธ”์— ๊ด€ํ•œ ๋” ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋จธ์‹ ๋Ÿฌ๋‹ ๋‹จ๊ธฐ ์ง‘์ค‘ ๊ณผ์ •์˜ ML ์šฉ์–ด ์„น์…˜์„ ์ฐธ์กฐํ•˜์„ธ์š”.

์ „์ฒ˜๋ฆฌ๋œ ๋ฐ์ดํ„ฐ์„ธํŠธ ๋‹ค์šด๋กœ๋“œํ•˜๊ธฐ

Now, download the preprocessed penguins dataset (penguins/processed) with the tfds.load method, which returns a list of tf.data.Dataset objects. Note that the penguins/processed dataset doesn't come with its own test set, so use an 80:20 split to slice the full dataset into training and test sets. You will use the test dataset later to verify your model.

ds_split, info = tfds.load("penguins/processed", split=['train[:20%]', 'train[20%:]'], as_supervised=True, with_info=True)

ds_test = ds_split[0]
ds_train = ds_split[1]
assert isinstance(ds_test, tf.data.Dataset)

print(info.features)
df_test = tfds.as_dataframe(ds_test.take(5), info)
print("Test dataset sample: ")
print(df_test)

df_train = tfds.as_dataframe(ds_train.take(5), info)
print("Train dataset sample: ")
print(df_train)

ds_train_batch = ds_train.batch(32)

์ด ๋ฒ„์ „์˜ ๋ฐ์ดํ„ฐ์„ธํŠธ๋Š” 4๊ฐœ์˜ ์ •๊ทœํ™”๋œ ํŠน์„ฑ๊ณผ ์ข… ๋ ˆ์ด๋ธ”๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ์ถ•์†Œํ•˜์—ฌ ์ฒ˜๋ฆฌ๋˜์—ˆ์Œ์„ ์•Œ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ํ˜•์‹์—์„œ ๋ฐ์ดํ„ฐ๋Š” ์ถ”๊ฐ€ ์ฒ˜๋ฆฌ ์—†์ด ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•˜๋Š” ๋ฐ ๋น ๋ฅด๊ฒŒ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

features, labels = next(iter(ds_train_batch))

print(features)
print(labels)

๋ฐฐ์น˜์—์„œ ์ผ๋ถ€ ํŠน์„ฑ์„ ํ”Œ๋กฏํ•˜์—ฌ ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‹œ๊ฐํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

plt.scatter(features[:,0],
            features[:,2],
            c=labels,
            cmap='viridis')

plt.xlabel("Body Mass")
plt.ylabel("Culmen Length")
plt.show()

Build a simple linear model

์™œ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•ด์•ผํ•˜๋Š”๊ฐ€?

*๋ชจ๋ธ*์€ ํŠน์„ฑ๊ณผ ๋ ˆ์ด๋ธ” ๊ฐ„์˜ ๊ด€๊ณ„์ž…๋‹ˆ๋‹ค. ํŽญ๊ท„ ๋ถ„๋ฅ˜ ๋ฌธ์ œ์˜ ๊ฒฝ์šฐ, ๋ชจ๋ธ์€ ์ฒด์งˆ๋Ÿ‰๊ณผ ํŽญ๊ท„๋‚ ๊ฐœ ๋ฐ ํŽญ๊ท„ ๋ถ€๋ฆฌ ์ธก์ •์น˜์™€ ์˜ˆ์ธก๋œ ํŽญ๊ท„ ์ข… ๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. ์ผ๋ถ€ ๊ฐ„๋‹จํ•œ ๋ชจ๋ธ์€ ๋ช‡ ์ค„์˜ ๋Œ€์ˆ˜๋กœ ์„ค๋ช…ํ•  ์ˆ˜ ์žˆ์ง€๋งŒ, ๋ณต์žกํ•œ ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์—๋Š” ์š”์•ฝํ•˜๊ธฐ ์–ด๋ ค์šด ๋งค๊ฐœ๋ณ€์ˆ˜๊ฐ€ ๋งŽ์Šต๋‹ˆ๋‹ค.

๋จธ์‹ ๋Ÿฌ๋‹์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š๊ณ  4๊ฐ€์ง€ ํŠน์„ฑ๊ณผ ํŽญ๊ท„ ์ข… ๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ์„๊นŒ์š”? ์ฆ‰, ๊ธฐ์กด ํ”„๋กœ๊ทธ๋ž˜๋ฐ ๊ธฐ์ˆ (์˜ˆ: ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์กฐ๊ฑด๋ฌธ)์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ๋งŒ๋“ค ์ˆ˜ ์žˆ์„๊นŒ์š”? ํŠน์ • ์ข…์— ๋Œ€ํ•œ ์ฒด์งˆ๋Ÿ‰๊ณผ ํŽญ๊ท„ ๋ถ€๋ฆฌ ์ธก์ •์น˜ ๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ์„ ๋งŒํผ ์ถฉ๋ถ„ํžˆ ์˜ค๋žซ๋™์•ˆ ๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ ๋ถ„์„ํ•œ ๊ฒฝ์šฐ ๊ฐ€๋Šฅํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด๊ฒƒ์€ ๋” ๋ณต์žกํ•œ ๋ฐ์ดํ„ฐ์„ธํŠธ์—์„œ๋Š” ์–ด๋ ต๊ฑฐ๋‚˜ ๋ถˆ๊ฐ€๋Šฅํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ์ข‹์€ ๋จธ์‹ ๋Ÿฌ๋‹ ์ ‘๊ทผ ๋ฐฉ์‹์ด๋ผ๋ฉด ์ ์ ˆํ•œ ๋ชจ๋ธ์„ ์ œ์‹œํ•ด ์ค๋‹ˆ๋‹ค. ์ ์ ˆํ•œ ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ ํ˜•์‹์— ์ถฉ๋ถ„ํ•œ ๋Œ€ํ‘œ ์˜ˆ์ œ๋ฅผ ์ œ๊ณตํ•˜๋ฉด ํ”„๋กœ๊ทธ๋žจ์ด ๊ด€๊ณ„๋ฅผ ํŒŒ์•…ํ•ด ์ค๋‹ˆ๋‹ค.

๋ชจ๋ธ ์„ ์ •

ํ›ˆ๋ จํ•  ๋ชจ๋ธ์˜ ์ข…๋ฅ˜๋ฅผ ์„ ํƒํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋งŽ์€ ํ˜•์‹์˜ ๋ชจ๋ธ์ด ์žˆ์œผ๋ฉฐ ์ข‹์€ ๋ชจ๋ธ์„ ์„ ํƒํ•˜๋ ค๋ฉด ๊ฒฝํ—˜์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ์ด ํŠœํ† ๋ฆฌ์–ผ์—์„œ๋Š” ์‹ ๊ฒฝ๋ง์„ ์‚ฌ์šฉํ•˜์—ฌ ํŽญ๊ท„ ๋ถ„๋ฅ˜ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ์‹ ๊ฒฝ๋ง์€ ํŠน์„ฑ๊ณผ ๋ ˆ์ด๋ธ” ๊ฐ„์˜ ๋ณต์žกํ•œ ๊ด€๊ณ„๋ฅผ ์ฐพ์„ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ํ•˜๋‚˜ ์ด์ƒ์˜ ์ˆจ๊ฒจ์ง„ ๋ ˆ์ด์–ด๋กœ ๊ตฌ์„ฑ๋œ ๊ณ ๋„๋กœ ๊ตฌ์กฐํ™”๋œ ๊ทธ๋ž˜ํ”„์ž…๋‹ˆ๋‹ค. ๊ฐ ์ˆจ๊ฒจ์ง„ ๋ ˆ์ด์–ด๋Š” ํ•˜๋‚˜ ์ด์ƒ์˜ ์‹ ๊ฒฝ์œผ๋กœ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค. ์‹ ๊ฒฝ๋ง์—๋Š” ์—ฌ๋Ÿฌ ๋ฒ”์ฃผ๊ฐ€ ์žˆ์œผ๋ฉฐ, ์ด ํ”„๋กœ๊ทธ๋žจ์€ ์กฐ๋ฐ€ํ•˜๊ฑฐ๋‚˜ ์™„์ „ํžˆ ์—ฐ๊ฒฐ๋œ ์‹ ๊ฒฝ๋ง์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ํ•œ ๋ ˆ์ด์–ด์˜ ์‹ ๊ฒฝ์€ ์ด์ „ ๋ ˆ์ด์–ด์˜ ๋ชจ๋“  ์‹ ๊ฒฝ์—์„œ ์ž…๋ ฅ ์—ฐ๊ฒฐ์„ ๋ฐ›์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, ๊ทธ๋ฆผ 2๋Š” ์ž…๋ ฅ ๋ ˆ์ด์–ด, 2๊ฐœ์˜ ์ˆจ๊ฒจ์ง„ ๋ ˆ์ด์–ด ๋ฐ ์ถœ๋ ฅ ๋ ˆ์ด์–ด๋กœ ๊ตฌ์„ฑ๋œ ์กฐ๋ฐ€ํ•œ ์‹ ๊ฒฝ๋ง์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

A diagram of the network architecture: Inputs, 2 hidden layers, and outputs
Figure 2. A neural network with features, hidden layers, and predictions

๊ทธ๋ฆผ 2์˜ ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•˜๊ณ  ๋ ˆ์ด๋ธ”์ด ์ง€์ •๋˜์ง€ ์•Š์€ ์˜ˆ์ œ๋ฅผ ์ œ๊ณตํ•˜๋ฉด, ์ด ํŽญ๊ท„์ด ์ฃผ์–ด์ง„ ํŽญ๊ท„ ์ข…์ผ ๊ฐ€๋Šฅ์„ฑ์— ๋Œ€ํ•œ 3๊ฐ€์ง€ ์˜ˆ์ธก๊ฐ’์ด ์ƒ์„ฑ๋ฉ๋‹ˆ๋‹ค. ์ด ์˜ˆ์ธก์„ ์ถ”๋ก ์ด๋ผ๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ์ด ์˜ˆ์ œ์—์„œ ์ถœ๋ ฅ ์˜ˆ์ธก๊ฐ’์˜ ํ•ฉ๊ณ„๋Š” 1.0์ž…๋‹ˆ๋‹ค. ๊ทธ๋ฆผ 2์—์„œ ์ด ์˜ˆ์ธก์€ ์•„๋ธ๋ฆฌ ํŽญ๊ท„์˜ ๊ฒฝ์šฐ 0.02, ํ„ฑ๋ˆ ํŽญ๊ท„์˜ ๊ฒฝ์šฐ 0.95, ์  ํˆฌ ํŽญ๊ท„์˜ ๊ฒฝ์šฐ 0.03์ž…๋‹ˆ๋‹ค. ์ฆ‰, ๋ชจ๋ธ์€ 95% ํ™•๋ฅ ๋กœ ๋ ˆ์ด๋ธ”์ด ์ง€์ •๋˜์ง€ ์•Š์€ ์˜ˆ์ œ ํŽญ๊ท„์ด ํ„ฑ๋ˆ ํŽญ๊ท„์ด๋ผ๊ณ  ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

์ผ€๋ผ์Šค๋ฅผ ์‚ฌ์šฉํ•œ ๋ชจ๋ธ ์ƒ์„ฑ

TensorFlow์˜ tf.keras API๋Š” ๋ชจ๋ธ๊ณผ ๋ ˆ์ด์–ด๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋ฐ ์ฃผ๋กœ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. Keras๊ฐ€ ๋ชจ๋“  ๊ตฌ์„ฑ ์š”์†Œ ์—ฐ๊ฒฐ์— ๋Œ€ํ•œ ๋ณต์žก์„ฑ์„ ์ฒ˜๋ฆฌํ•ด ์ฃผ๊ธฐ ๋•Œ๋ฌธ์— ๋ชจ๋ธ์„ ๊ตฌ์ถ•ํ•˜๊ณ  ์‹คํ—˜ํ•˜๋Š” ๋ฐ ์šฉ์ดํ•ฉ๋‹ˆ๋‹ค.

The tf.keras.Sequential model is a linear stack of layers. Its constructor takes a list of layer instances, in this case two tf.keras.layers.Dense layers with 10 nodes each, and an output layer with 3 nodes representing your label predictions. The first layer's input_shape parameter corresponds to the number of features in the dataset, and is required:

model = tf.keras.Sequential([
  tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(4,)),  # input shape required
  tf.keras.layers.Dense(10, activation=tf.nn.relu),
  tf.keras.layers.Dense(3)
])

The *activation function* determines the output shape of each node in the layer. These non-linearities are important: without them the model would be equivalent to a single layer. There are many tf.keras.activations, but ReLU is the most common choice for hidden layers.

์ˆจ๊ฒจ์ง„ ๋ ˆ์ด์–ด์™€ ์‹ ๊ฒฝ์˜ ์ด์ƒ์ ์ธ ์ˆ˜๋Š” ๋ฌธ์ œ์™€ ๋ฐ์ดํ„ฐ์„ธํŠธ์— ๋”ฐ๋ผ ๋‹ค๋ฆ…๋‹ˆ๋‹ค. ๋จธ์‹ ๋Ÿฌ๋‹์˜ ์—ฌ๋Ÿฌ ์ธก๋ฉด๊ณผ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ์‹ ๊ฒฝ๋ง์˜ ์ตœ์ƒ์˜ ํ˜•ํƒœ๋ฅผ ๊ณ ๋ฅด๊ธฐ ์œ„ํ•ด์„œ๋Š” ์ง€์‹๊ณผ ์‹คํ—˜์ด ๋ชจ๋‘ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ๊ฒฝํ—˜์ƒ ์ˆจ๊ฒจ์ง„ ๋ ˆ์ด์–ด์™€ ์‹ ๊ฒฝ์˜ ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๋ฉด ์ผ๋ฐ˜์ ์œผ๋กœ ๋” ๊ฐ•๋ ฅํ•œ ๋ชจ๋ธ์ด ์ƒ์„ฑ๋˜๋ฉฐ ์ด๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ํ›ˆ๋ จํ•˜๋ ค๋ฉด ๋” ๋งŽ์€ ๋ฐ์ดํ„ฐ๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

๋ชจ๋ธ ์‚ฌ์šฉํ•˜๊ธฐ

์ด ๋ชจ๋ธ์ด ํŠน์„ฑ์˜ ๋ฐฐ์น˜์— ๋Œ€ํ•ด ์ˆ˜ํ–‰ํ•˜๋Š” ์ž‘์—…์„ ๊ฐ„๋‹จํžˆ ์‚ดํŽด๋ด…์‹œ๋‹ค.

predictions = model(features)
predictions[:5]

์—ฌ๊ธฐ์—์„œ ๊ฐ ์˜ˆ์ œ๋Š” ๊ฐ ํด๋ž˜์Šค์— ๋Œ€ํ•œ ๋กœ์ง“์„ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

์ด๋Ÿฌํ•œ ๋กœ์ง“์„ ๊ฐ ํด๋ž˜์Šค์˜ ํ™•๋ฅ ๋กœ ๋ณ€ํ™˜ํ•˜๋ ค๋ฉด softmax ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”.

tf.nn.softmax(predictions[:5])

ํด๋ž˜์Šค์—์„œ tf.math.argmax๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ์˜ˆ์ธก๋œ ํด๋ž˜์Šค ์ธ๋ฑ์Šค๊ฐ€ ์ œ๊ณต๋ฉ๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋ชจ๋ธ์€ ์•„์ง ํ›ˆ๋ จ๋˜์ง€ ์•Š์•˜์œผ๋ฏ€๋กœ ์ข‹์€ ์˜ˆ์ธก์„ ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

print("Prediction: {}".format(tf.math.argmax(predictions, axis=1)))
print("    Labels: {}".format(labels))

๋ชจ๋ธ ํ›ˆ๋ จํ•˜๊ธฐ

*Training* is the stage of machine learning when the model is gradually optimized, or the model learns the dataset. The goal is to learn enough about the structure of the training dataset to make predictions about unseen data. If you learn too much about the training dataset, then the predictions only work for the data it has seen and will not be generalizable. This problem is called overfitting: it's like memorizing the answers instead of understanding how to solve a problem.

The penguin classification problem is an example of supervised machine learning: the model is trained from examples that contain labels. In unsupervised machine learning, the examples don't contain labels. Instead, the model typically finds patterns among the features.

Define the loss and gradients function

ํ›ˆ๋ จ ๋ฐ ํ‰๊ฐ€ ๋‹จ๊ณ„ ๋ชจ๋‘์—์„œ ๋ชจ๋ธ ์†์‹ค์„ ๊ณ„์‚ฐํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๊ฒƒ์€ ๋ชจ๋ธ์˜ ์˜ˆ์ธก์ด ์›ํ•˜๋Š” ๋ ˆ์ด๋ธ”์—์„œ ์–ผ๋งˆ๋‚˜ ๋–จ์–ด์ ธ ์žˆ๋Š”์ง€, ์ฆ‰ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์ด ์–ผ๋งˆ๋‚˜ ์•ˆ ์ข‹์€์ง€๋ฅผ ์ธก์ •ํ•˜๋Š” ๊ฒƒ์œผ๋กœ, ๊ทธ ๊ฐ’์„ ์ตœ์†Œํ™”ํ•˜๊ฑฐ๋‚˜ ์ตœ์ ํ™”ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

๋ชจ๋ธ์˜ ์†์‹ค์€ tf.keras.losses.categorical_crossentropy ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•ด ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค. ์ด ํ•จ์ˆ˜๋Š” ๋ชจ๋ธ์˜ ํด๋ž˜์Šค ํ™•๋ฅ  ์˜ˆ์ธก๊ณผ ์›ํ•˜๋Š” ๋ ˆ์ด๋ธ”์„ ์ž…๋ ฅ์œผ๋กœ ๋ฐ›์•„ ์˜ˆ์˜ ํ‰๊ท  ์†์‹ค์„ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
def loss(model, x, y, training):
  # training=training is needed only if there are layers with different
  # behavior during training versus inference (e.g. Dropout).
  y_ = model(x, training=training)
  return loss_object(y_true=y, y_pred=y_)

l = loss(model, features, labels, training=False)
print("Loss test: {}".format(l))

๋ชจ๋ธ์„ ์ตœ์ ํ™”ํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉ๋˜๋Š” *๊ทธ๋ž˜๋””์–ธํŠธ*๋ฅผ ๊ณ„์‚ฐํ•˜๊ธฐ ์œ„ํ•ด tf.GradientTape ์ปจํ…์ŠคํŠธ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

def grad(model, inputs, targets):
  with tf.GradientTape() as tape:
    loss_value = loss(model, inputs, targets, training=True)
  return loss_value, tape.gradient(loss_value, model.trainable_variables)

์˜ตํ‹ฐ๋งˆ์ด์ € ์ƒ์„ฑ

์˜ตํ‹ฐ๋งˆ์ด์ €๋Š” ๊ณ„์‚ฐ๋œ ๊ฒฝ์‚ฌ๋ฅผ ๋ชจ๋ธ์˜ ๋งค๊ฐœ๋ณ€์ˆ˜์— ์ ์šฉํ•˜์—ฌ loss ํ•จ์ˆ˜๋ฅผ ์ตœ์†Œํ™”ํ•ฉ๋‹ˆ๋‹ค. ์†์‹ค ํ•จ์ˆ˜๋ฅผ ๊ณก๋ฉด์— ๋น„์œ ํ•œ๋‹ค๋ฉด(๊ทธ๋ฆผ 3 ์ฐธ์กฐ) ๊ณก๋ฉด์—์„œ ๊ฐ€์žฅ ๋‚ฎ์€ ์ง€์ ์„ ์ฐพ๋Š” ๊ฒƒ๊ณผ ๊ฐ™๋‹ค๊ณ  ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ฒฝ์‚ฌ๋Š” ๊ฐ€์žฅ ๊ฐ€ํŒŒ๋ฅธ ์ƒ์Šน ๋ฐฉํ–ฅ์„ ๊ฐ€๋ฆฌํ‚ค๋ฏ€๋กœ ๋ฐ˜๋Œ€ ๋ฐฉํ–ฅ์œผ๋กœ ์ด๋™ํ•˜์—ฌ ๋‚ด๋ ค๊ฐ€์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๊ฐ ๋ฐฐ์น˜์˜ ์†์‹ค๊ณผ ๊ฒฝ์‚ฌ๋ฅผ ๋ฐ˜๋ณต์ ์œผ๋กœ ๊ณ„์‚ฐํ•˜์—ฌ ํ›ˆ๋ จ์„ ํ†ตํ•ด ๋ชจ๋ธ์„ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ์€ ์ ์ฐจ ์†์‹ค์„ ์ตœ์†Œํ™”ํ•˜๊ธฐ ์œ„ํ•ด ๊ฐ€์ค‘์น˜์™€ ๋ฐ”์ด์–ด์Šค์˜ ์ตœ์ƒ์˜ ์กฐํ•ฉ์„ ์ฐพ์Šต๋‹ˆ๋‹ค. ์†์‹ค์ด ๋‚ฎ์„์ˆ˜๋ก ๋ชจ๋ธ์˜ ์˜ˆ์ธก๊ฐ’์€ ๋” ์ข‹์•„์ง‘๋‹ˆ๋‹ค.

Optimization algorithms visualized over time in 3D space.
Figure 3. Optimization algorithms visualized over time in 3D space.
(Source: Stanford class CS231n, MIT License, Image credit: Alec Radford)

TensorFlow์—๋Š” ํ›ˆ๋ จ์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋งŽ์€ ์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ํŠœํ† ๋ฆฌ์–ผ์—์„œ๋Š” ํ™•๋ฅ ์  ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•(SGD) ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ตฌํ˜„ํ•˜๋Š” tf.keras.optimizers.SGD๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. learning_rate ๋งค๊ฐœ๋ณ€์ˆ˜๋Š” ๊ฒฝ์‚ฌ ์•„๋ž˜๋กœ ๋ฐ˜๋ณตํ•  ๋•Œ๋งˆ๋‹ค ์ˆ˜ํ–‰ํ•  ๋‹จ๊ณ„ ํฌ๊ธฐ๋ฅผ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค. ์ด ๋น„์œจ์€ ๋” ๋‚˜์€ ๊ฒฐ๊ณผ๋ฅผ ์–ป๊ธฐ ์œ„ํ•ด ์ผ๋ฐ˜์ ์œผ๋กœ ์กฐ์ •ํ•˜๋Š” ํ•˜์ดํผ ๋งค๊ฐœ๋ณ€์ˆ˜์ž…๋‹ˆ๋‹ค.

ํ•™์Šต๋ฅ ์ด 0.01์ธ ์ตœ์ ํ™” ๋„๊ตฌ๋ฅผ ์ธ์Šคํ„ด์Šคํ™”ํ•ฉ๋‹ˆ๋‹ค. ์Šค์นผ๋ผ ๊ฐ’์€ ๊ฐ ํ›ˆ๋ จ ๋ฐ˜๋ณต์—์„œ ๊ธฐ์šธ๊ธฐ๋ฅผ ๊ณฑํ•œ ๊ฐ’์ž…๋‹ˆ๋‹ค.

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

๊ทธ๋Ÿฐ ๋‹ค์Œ ์ด ๊ฐœ์ฒด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋‹จ์ผ ์ตœ์ ํ™” ๋‹จ๊ณ„๋ฅผ ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค.

loss_value, grads = grad(model, features, labels)

print("Step: {}, Initial Loss: {}".format(optimizer.iterations.numpy(), loss_value.numpy()))

optimizer.apply_gradients(zip(grads, model.trainable_variables))

print("Step: {},         Loss: {}".format(optimizer.iterations.numpy(), loss(model, features, labels, training=True).numpy()))

Training loop

With all the pieces in place, the model is ready for training! A training loop feeds the dataset examples into the model to help it make better predictions. The following code block sets up these training steps:

  1. ๊ฐ epoch ๋ฐ˜๋ณต. Epoch๋Š” ๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ ํ†ต๊ณผ์‹œํ‚ค๋Š” ํšŸ์ˆ˜์ž…๋‹ˆ๋‹ค.

  2. ํ•˜๋‚˜์˜ Epoch ๋‚ด์—์„œ ํŠน์„ฑ(x)๊ณผ ๋ ˆ์ด๋ธ”(y)์ด ํฌํ•จ๋œ ํ›ˆ๋ จ Dataset์˜ ๊ฐ ์˜ˆ๋ฅผ ๋ฐ˜๋ณตํ•ฉ๋‹ˆ๋‹ค.

  3. ์˜ˆ์˜ ํŠน์„ฑ์„ ์‚ฌ์šฉํ•˜์—ฌ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•˜๊ณ  ๋ ˆ์ด๋ธ”๊ณผ ๋น„๊ตํ•ฉ๋‹ˆ๋‹ค. ์˜ˆ์ธก์˜ ๋ถ€์ •ํ™•์„ฑ์„ ์ธก์ •ํ•˜๊ณ  ์ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์˜ ์†์‹ค ๋ฐ ๊ทธ๋ž˜๋””์–ธํŠธ๋ฅผ ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค.

  4. optimizer๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์˜ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์—…๋ฐ์ดํŠธํ•ฉ๋‹ˆ๋‹ค.

  5. ์‹œ๊ฐํ™”๋ฅผ ์œ„ํ•ด ๋ช‡๊ฐ€์ง€ ๊ฐ’๋“ค์„ ์ €์žฅํ•ฉ๋‹ˆ๋‹ค.

  6. ๊ฐ epoch์— ๋Œ€ํ•ด ๋ฐ˜๋ณตํ•ฉ๋‹ˆ๋‹ค.

The num_epochs variable is the number of times to loop over the dataset collection. In the code below, num_epochs is set to 201, so this training loop runs 201 times. Counter-intuitively, training a model longer does not guarantee a better model. num_epochs is a hyperparameter that you can tune. Choosing the right number usually requires both experience and experimentation.

## Note: Rerunning this cell uses the same model parameters

# Keep results for plotting
train_loss_results = []
train_accuracy_results = []

num_epochs = 201

for epoch in range(num_epochs):
  epoch_loss_avg = tf.keras.metrics.Mean()
  epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

  # Training loop - using batches of 32
  for x, y in ds_train_batch:
    # Optimize the model
    loss_value, grads = grad(model, x, y)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    # Track progress
    epoch_loss_avg.update_state(loss_value)  # Add current batch loss
    # Compare predicted label to actual label
    # training=True is needed only if there are layers with different
    # behavior during training versus inference (e.g. Dropout).
    epoch_accuracy.update_state(y, model(x, training=True))

  # End epoch
  train_loss_results.append(epoch_loss_avg.result())
  train_accuracy_results.append(epoch_accuracy.result())

  if epoch % 50 == 0:
    print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(epoch, epoch_loss_avg.result(), epoch_accuracy.result()))

๋˜๋Š” ๋‚ด์žฅ๋œ Keras Model.fit(ds_train_batch) ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ํ›ˆ๋ จํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์‹œ๊ฐ„์— ๋”ฐ๋ฅธ ์†์‹คํ•จ์ˆ˜ ์‹œ๊ฐํ™”

๋ชจ๋ธ์˜ ํ›ˆ๋ จ ๊ณผ์ •์„ ์ถœ๋ ฅํ•˜๋Š” ๊ฒƒ๋„ ์œ ์šฉํ•˜์ง€๋งŒ TensorFlow์™€ ํ•จ๊ป˜ ์ œ๊ณต๋˜๋Š” ์‹œ๊ฐํ™” ๋ฐ ๋ฉ”ํŠธ๋ฆญ ๋„๊ตฌ์ธ ํ…์„œ๋ณด๋“œ(TensorBoard)๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ›ˆ๋ จ ๊ณผ์ •์„ ์‹œ๊ฐํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ๊ฐ„๋‹จํ•œ ์˜ˆ์ œ์—์„œ๋Š” matplotlib ๋ชจ๋“ˆ์„ ์‚ฌ์šฉํ•˜์—ฌ ๊ธฐ๋ณธ ์ฐจํŠธ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.

Interpreting these charts takes some experience, but in general you want to see the loss decrease and the accuracy increase:

fig, axes = plt.subplots(2, sharex=True, figsize=(12, 8))
fig.suptitle('Training Metrics')

axes[0].set_ylabel("Loss", fontsize=14)
axes[0].plot(train_loss_results)

axes[1].set_ylabel("Accuracy", fontsize=14)
axes[1].set_xlabel("Epoch", fontsize=14)
axes[1].plot(train_accuracy_results)
plt.show()

๋ชจ๋ธ ์œ ํšจ์„ฑ ํ‰๊ฐ€

์ด์ œ ๋ชจ๋ธ์ด ํ›ˆ๋ จ๋˜์—ˆ์œผ๋ฏ€๋กœ ์„ฑ๋Šฅ์— ๋Œ€ํ•œ ํ†ต๊ณ„๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

*Evaluating* means determining how effectively the model makes predictions. To determine the model's effectiveness at penguin classification, pass some measurements to the model and ask it to predict what penguin species they represent. Then compare the model's predictions against the actual labels. For example, a model that picked the correct species on half the input examples has an accuracy of 0.5. Figure 4 shows a slightly more effective model, getting 4 out of 5 predictions correct at 80% accuracy:

Example features            Label   Model prediction
5.9   3.0   4.3   1.5         1            1
6.9   3.1   5.4   2.1         2            2
5.1   3.3   1.7   0.5         0            0
6.0   3.4   4.5   1.6         1            2
5.5   2.5   4.0   1.3         1            1
Figure 4. A penguin classifier that is 80% accurate
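The 80% figure can be reproduced by hand from the label/prediction pairs in Figure 4, a plain-Python version of what an accuracy metric computes:

```python
labels      = [1, 2, 0, 1, 1]  # actual species from Figure 4
predictions = [1, 2, 0, 2, 1]  # model predictions from Figure 4

# Accuracy is the fraction of predictions that match the labels.
correct = sum(l == p for l, p in zip(labels, predictions))
accuracy = correct / len(labels)
print(accuracy)  # 0.8
```

Only the fourth example (label 1, predicted 2) is wrong, giving 4/5 = 0.8.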

Set up the test set

๋ชจ๋ธ ํ‰๊ฐ€๋Š” ๋ชจ๋ธ ํ›ˆ๋ จ๊ณผ ์œ ์‚ฌํ•ฉ๋‹ˆ๋‹ค. ๊ฐ€์žฅ ํฐ ์ฐจ์ด์ ์€ ์˜ˆ์ œ๊ฐ€ ํ›ˆ๋ จ ์„ธํŠธ๊ฐ€ ์•„๋‹Œ ๋ณ„๋„์˜ *ํ…Œ์ŠคํŠธ ์„ธํŠธ*์—์„œ ๋‚˜์˜จ๋‹ค๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋ชจ๋ธ์˜ ํšจ๊ณผ๋ฅผ ๊ณต์ •ํ•˜๊ฒŒ ํ‰๊ฐ€ํ•˜๋ ค๋ฉด ๋ชจ๋ธ์„ ํ‰๊ฐ€ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ์˜ˆ๊ฐ€ ๋ชจ๋ธ ํ›ˆ๋ จ์— ์‚ฌ์šฉ๋œ ์˜ˆ์™€ ๋‹ฌ๋ผ์•ผ ํ•ฉ๋‹ˆ๋‹ค.

ํŽญ๊ท„ ๋ฐ์ดํ„ฐ์„ธํŠธ์—๋Š” ๋ณ„๋„์˜ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์„ธํŠธ๊ฐ€ ์—†์œผ๋ฏ€๋กœ ์ด์ „ ๋ฐ์ดํ„ฐ์„ธํŠธ ๋‹ค์šด๋กœ๋“œ ์„น์…˜์—์„œ ์›๋ณธ ๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ ํ…Œ์ŠคํŠธ ๋ฐ ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์„ธํŠธ๋กœ ๋ถ„ํ• ํ–ˆ์Šต๋‹ˆ๋‹ค. ํ‰๊ฐ€๋ฅผ ์œ„ํ•ด ds_test_batch ๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ์„ธํŠธ๋ฅผ ์‚ฌ์šฉํ•œ ๋ชจ๋ธ ํ‰๊ฐ€

Unlike the training stage, the model only evaluates a single epoch of the test data. The following code iterates over each example in the test set and compares the model's prediction against the actual label. This comparison is used to measure the model's accuracy across the entire test set:

test_accuracy = tf.keras.metrics.Accuracy()
ds_test_batch = ds_test.batch(10)

for (x, y) in ds_test_batch:
  # training=False is needed only if there are layers with different
  # behavior during training versus inference (e.g. Dropout).
  logits = model(x, training=False)
  prediction = tf.math.argmax(logits, axis=1, output_type=tf.int64)
  test_accuracy(prediction, y)

print("Test set accuracy: {:.3%}".format(test_accuracy.result()))

You can also use the model.evaluate(ds_test, return_dict=True) Keras function to get accuracy information on your test dataset.
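A sketch of that call, with the caveat that Model.evaluate requires a compiled model (the tutorial's custom loop never calls compile); synthetic data stands in for ds_test here:

```python
import numpy as np
import tensorflow as tf

# Same layer stack as the tutorial's model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(3)])

# evaluate needs a loss (and optionally metrics) from compile.
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# Synthetic stand-in for ds_test: (features, label) pairs.
x = np.random.rand(20, 4).astype('float32')
y = np.random.randint(0, 3, size=(20,))
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(10)

results = model.evaluate(ds, return_dict=True, verbose=0)
print(results)  # dict with 'loss' and 'accuracy' entries
```

With return_dict=True, evaluate returns the metrics by name instead of as a bare list.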

By inspecting the last batch, for example, you can observe that the model predictions are usually correct:

tf.stack([y,prediction],axis=1)

ํ›ˆ๋ จ๋œ ๋ชจ๋ธ๋กœ ์˜ˆ์ธกํ•˜๊ธฐ

๋ชจ๋ธ์„ ํ›ˆ๋ จํ•˜๊ณ  ์ด ๋ชจ๋ธ์ด ํŽญ๊ท„ ์ข…์„ ๋ถ„๋ฅ˜ํ•˜๋Š” ๋ฐ ํ›Œ๋ฅญํ•จ์„ "์ฆ๋ช…"ํ–ˆ์ง€๋งŒ ์™„๋ฒฝํ•˜์ง€๋Š” ์•Š์Šต๋‹ˆ๋‹ค. ์ด์ œ ํ›ˆ๋ จ๋œ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ ˆ์ด๋ธ”์ด ์—†๋Š” ์˜ˆ์ œ์— ๋Œ€ํ•œ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ์ฆ‰, ํŠน์„ฑ์€ ํฌํ•จํ•˜์ง€๋งŒ ๋ ˆ์ด๋ธ”์€ ํฌํ•จํ•˜์ง€ ์•Š๋Š” ์˜ˆ์ œ์ž…๋‹ˆ๋‹ค.

In real life, the unlabeled examples could come from lots of different sources including apps, CSV files, and data feeds. For this tutorial, manually provide three unlabeled examples to predict their labels. Recall that the label numbers are mapped to species names as follows:

  • 0: Adélie penguin

  • 1: Chinstrap penguin

  • 2: Gentoo penguin

predict_dataset = tf.convert_to_tensor([
    [0.3, 0.8, 0.4, 0.5,],
    [0.4, 0.1, 0.8, 0.5,],
    [0.7, 0.9, 0.8, 0.4]
])

# training=False is needed only if there are layers with different
# behavior during training versus inference (e.g. Dropout).
predictions = model(predict_dataset, training=False)

for i, logits in enumerate(predictions):
  class_idx = tf.math.argmax(logits).numpy()
  p = tf.nn.softmax(logits)[class_idx]
  name = class_names[class_idx]
  print("Example {} prediction: {} ({:4.1f}%)".format(i, name, 100*p))