# Transformers installation
! pip install transformers datasets
# To install from source instead of the last release, comment the command above and uncomment the following one.
# ! pip install git+https://github.com/huggingface/transformers.git

Quick tour[[quick-tour]]

๐Ÿค— Transformer๋ฅผ ์‹œ์ž‘ํ•ด๋ด์š”! ๋‘˜๋Ÿฌ๋ณด๊ธฐ๋Š” ๊ฐœ๋ฐœ์ž์™€ ์ผ๋ฐ˜ ์‚ฌ์šฉ์ž ๋ชจ๋‘๋ฅผ ์œ„ํ•ด ์“ฐ์—ฌ์กŒ์Šต๋‹ˆ๋‹ค. pipeline()์œผ๋กœ ์ถ”๋ก ํ•˜๋Š” ๋ฐฉ๋ฒ•, AutoClass๋กœ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ๊ณผ ์ „์ฒ˜๋ฆฌ๊ธฐ๋ฅผ ์ ์žฌํ•˜๋Š” ๋ฐฉ๋ฒ•๊ณผ PyTorch ๋˜๋Š” TensorFlow๋กœ ์‹ ์†ํ•˜๊ฒŒ ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ค๋Š” ๋ฐฉ๋ฒ•์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค. ๊ธฐ๋ณธ์„ ๋ฐฐ์šฐ๊ณ  ์‹ถ๋‹ค๋ฉด ํŠœํ† ๋ฆฌ์–ผ์ด๋‚˜ course์—์„œ ์—ฌ๊ธฐ ์†Œ๊ฐœ๋œ ๊ฐœ๋…์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์„ค๋ช…์„ ํ™•์ธํ•˜์‹œ๊ธธ ๊ถŒ์žฅํ•ฉ๋‹ˆ๋‹ค.

Before you begin, make sure you have all the necessary libraries installed:

!pip install transformers datasets

You'll also need to install your preferred machine learning framework:

pip install torch
pip install tensorflow

Pipeline (ํŒŒ์ดํ”„๋ผ์ธ)

#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/tiZFewofSLM?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')

pipeline()์€ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•ด ์ถ”๋ก ํ•  ๋•Œ ์ œ์ผ ์‰ฌ์šด ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค. ์—ฌ๋Ÿฌ ๋ชจ๋‹ฌ๋ฆฌํ‹ฐ์˜ ์ˆ˜๋งŽ์€ ํƒœ์Šคํฌ์— pipeline()์„ ์ฆ‰์‹œ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ง€์›ํ•˜๋Š” ํƒœ์Šคํฌ์˜ ์˜ˆ์‹œ๋Š” ์•„๋ž˜ ํ‘œ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.

ํƒœ์Šคํฌ์„ค๋ช…๋ชจ๋‹ฌ๋ฆฌํ‹ฐํŒŒ์ดํ”„๋ผ์ธ ID
ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ํ…์ŠคํŠธ์— ์•Œ๋งž์€ ๋ผ๋ฒจ ๋ถ™์ด๊ธฐ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="sentiment-analysis")
ํ…์ŠคํŠธ ์ƒ์„ฑ์ฃผ์–ด์ง„ ๋ฌธ์ž์—ด ์ž…๋ ฅ๊ณผ ์ด์–ด์ง€๋Š” ํ…์ŠคํŠธ ์ƒ์„ฑํ•˜๊ธฐ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="text-generation")
๊ฐœ์ฒด๋ช… ์ธ์‹๋ฌธ์ž์—ด์˜ ๊ฐ ํ† ํฐ๋งˆ๋‹ค ์•Œ๋งž์€ ๋ผ๋ฒจ ๋ถ™์ด๊ธฐ (์ธ๋ฌผ, ์กฐ์ง, ์žฅ์†Œ ๋“ฑ๋“ฑ)์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="ner")
์งˆ์˜์‘๋‹ต์ฃผ์–ด์ง„ ๋ฌธ๋งฅ๊ณผ ์งˆ๋ฌธ์— ๋”ฐ๋ผ ์˜ฌ๋ฐ”๋ฅธ ๋Œ€๋‹ตํ•˜๊ธฐ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="question-answering")
๋นˆ์นธ ์ฑ„์šฐ๊ธฐ๋ฌธ์ž์—ด์˜ ๋นˆ์นธ์— ์•Œ๋งž์€ ํ† ํฐ ๋งž์ถ”๊ธฐ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="fill-mask")
์š”์•ฝํ…์ŠคํŠธ๋‚˜ ๋ฌธ์„œ๋ฅผ ์š”์•ฝํ•˜๊ธฐ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="summarization")
๋ฒˆ์—ญํ…์ŠคํŠธ๋ฅผ ํ•œ ์–ธ์–ด์—์„œ ๋‹ค๋ฅธ ์–ธ์–ด๋กœ ๋ฒˆ์—ญํ•˜๊ธฐ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)pipeline(task="translation")
์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜์ด๋ฏธ์ง€์— ์•Œ๋งž์€ ๋ผ๋ฒจ ๋ถ™์ด๊ธฐ์ปดํ“จํ„ฐ ๋น„์ „(CV)pipeline(task="image-classification")
์ด๋ฏธ์ง€ ๋ถ„ํ• ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€๋งˆ๋‹ค ๋ผ๋ฒจ ๋ถ™์ด๊ธฐ(์‹œ๋งจํ‹ฑ, ํŒŒ๋†‰ํ‹ฑ ๋ฐ ์ธ์Šคํ„ด์Šค ๋ถ„ํ•  ํฌํ•จ)์ปดํ“จํ„ฐ ๋น„์ „(CV)pipeline(task="image-segmentation")
๊ฐ์ฒด ํƒ์ง€์ด๋ฏธ์ง€ ์† ๊ฐ์ฒด์˜ ๊ฒฝ๊ณ„ ์ƒ์ž๋ฅผ ๊ทธ๋ฆฌ๊ณ  ํด๋ž˜์Šค๋ฅผ ์˜ˆ์ธกํ•˜๊ธฐ์ปดํ“จํ„ฐ ๋น„์ „(CV)pipeline(task="object-detection")
์˜ค๋””์˜ค ๋ถ„๋ฅ˜์˜ค๋””์˜ค ํŒŒ์ผ์— ์•Œ๋งž์€ ๋ผ๋ฒจ ๋ถ™์ด๊ธฐ์˜ค๋””์˜คpipeline(task="audio-classification")
์ž๋™ ์Œ์„ฑ ์ธ์‹์˜ค๋””์˜ค ํŒŒ์ผ ์† ์Œ์„ฑ์„ ํ…์ŠคํŠธ๋กœ ๋ฐ”๊พธ๊ธฐ์˜ค๋””์˜คpipeline(task="automatic-speech-recognition")
์‹œ๊ฐ ์งˆ์˜์‘๋‹ต์ฃผ์–ด์ง„ ์ด๋ฏธ์ง€์™€ ์ด๋ฏธ์ง€์— ๋Œ€ํ•œ ์งˆ๋ฌธ์— ๋”ฐ๋ผ ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ๋Œ€๋‹ตํ•˜๊ธฐ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌpipeline(task="vqa")

๋จผ์ € pipeline()์˜ ์ธ์Šคํ„ด์Šค๋ฅผ ๋งŒ๋“ค์–ด ์ ์šฉํ•  ํƒœ์Šคํฌ๋ฅผ ๊ณ ๋ฅด์„ธ์š”. ์œ„ ํƒœ์Šคํฌ๋“ค์€ ๋ชจ๋‘ pipeline()์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ณ , ์ง€์›ํ•˜๋Š” ํƒœ์Šคํฌ์˜ ์ „์ฒด ๋ชฉ๋ก์„ ๋ณด๋ ค๋ฉด pipeline API ๋ ˆํผ๋Ÿฐ์Šค๋ฅผ ํ™•์ธํ•ด์ฃผ์„ธ์š”. ๊ฐ„๋‹จํ•œ ์˜ˆ์‹œ๋กœ ๊ฐ์ • ๋ถ„์„ ํƒœ์Šคํฌ์— pipeline()๋ฅผ ์ ์šฉํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")

pipeline()์€ ๊ธฐ๋ณธ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ(์˜์–ด)์™€ ๊ฐ์ • ๋ถ„์„์„ ํ•˜๊ธฐ ์œ„ํ•œ tokenizer๋ฅผ ๋‹ค์šด๋กœ๋“œํ•˜๊ณ  ์บ์‹œํ•ด๋†“์Šต๋‹ˆ๋‹ค. ์ด์ œ ์›ํ•˜๋Š” ํ…์ŠคํŠธ์— classifier๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

classifier("We are very happy to show you the ๐Ÿค— Transformers library.")
[{'label': 'POSITIVE', 'score': 0.9998}]

์ž…๋ ฅ์ด ์—ฌ๋Ÿฌ ๊ฐœ๋ผ๋ฉด, ์ž…๋ ฅ์„ pipeline()์— ๋ฆฌ์ŠคํŠธ๋กœ ์ „๋‹ฌํ•ด์„œ ๋”•์…”๋„ˆ๋ฆฌ๋กœ ๋œ ๋ฆฌ์ŠคํŠธ๋ฅผ ๋ฐ›์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

results = classifier(["We are very happy to show you the ๐Ÿค— Transformers library.", "We hope you don't hate it."])
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.5309

pipeline()์€ ํŠน์ • ํƒœ์Šคํฌ์šฉ ๋ฐ์ดํ„ฐ์…‹๋ฅผ ์ „๋ถ€ ์ˆœํšŒํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ์ž๋™ ์Œ์„ฑ ์ธ์‹ ํƒœ์Šคํฌ์— ์ ์šฉํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

import torch
from transformers import pipeline

speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

Next, load the audio dataset you'd like to iterate over (see the ๐Ÿค— Datasets Quick Start for more details). For example, let's try the MInDS-14 dataset:

from datasets import load_dataset, Audio

dataset = load_dataset("PolyAI/minds14", name="en-US", split="train")

๋ฐ์ดํ„ฐ์…‹์˜ ์ƒ˜ํ”Œ๋ง ๋ ˆ์ดํŠธ๊ฐ€ facebook/wav2vec2-base-960h์˜ ํ›ˆ๋ จ ๋‹น์‹œ ์ƒ˜ํ”Œ๋ง ๋ ˆ์ดํŠธ์™€ ์ผ์น˜ํ•ด์•ผ๋งŒ ํ•ฉ๋‹ˆ๋‹ค.

dataset = dataset.cast_column("audio", Audio(sampling_rate=speech_recognizer.feature_extractor.sampling_rate))

์˜ค๋””์˜ค ํŒŒ์ผ์€ "audio" ์—ด์„ ํ˜ธ์ถœํ•  ๋•Œ ์ž๋™์œผ๋กœ ์ ์žฌ๋˜๊ณ  ๋‹ค์‹œ ์ƒ˜ํ”Œ๋ง๋ฉ๋‹ˆ๋‹ค. ์ฒ˜์Œ 4๊ฐœ ์ƒ˜ํ”Œ์—์„œ ์Œ์„ฑ์„ ์ถ”์ถœํ•˜์—ฌ ํŒŒ์ดํ”„๋ผ์ธ์— ๋ฆฌ์ŠคํŠธ ํ˜•ํƒœ๋กœ ์ „๋‹ฌํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

result = speech_recognizer(dataset[:4]["audio"])
print([d["text"] for d in result])
['I WOULD LIKE TO SET UP A JOINT ACCOUNT WITH MY PARTNER HOW DO I PROCEED WITH DOING THAT', "FODING HOW I'D SET UP A JOIN TO HET WITH MY WIFE AND WHERE THE AP MIGHT BE", "I I'D LIKE TOY SET UP A JOINT ACCOUNT WITH MY PARTNER I'M NOT SEEING THE OPTION TO DO IT ON THE AP SO I CALLED IN TO GET SOME HELP CAN I JUST DO IT OVER THE PHONE WITH YOU AND GIVE YOU THE INFORMATION OR SHOULD I DO IT IN THE AP AND I'M MISSING SOMETHING UQUETTE HAD PREFERRED TO JUST DO IT OVER THE PHONE OF POSSIBLE THINGS", 'HOW DO I THURN A JOIN A COUNT']

(์Œ์„ฑ์ด๋‚˜ ๋น„์ „์ฒ˜๋Ÿผ) ์ž…๋ ฅ์ด ํฐ ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ์…‹์˜ ๊ฒฝ์šฐ, ๋ฉ”๋ชจ๋ฆฌ์— ์ ์žฌ์‹œํ‚ค๊ธฐ ์œ„ํ•ด ๋ฆฌ์ŠคํŠธ ๋Œ€์‹  ์ œ๋„ˆ๋ ˆ์ดํ„ฐ๋กœ ์ž…๋ ฅ์„ ๋ชจ๋‘ ์ „๋‹ฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ pipeline API ๋ ˆํผ๋Ÿฐ์Šค๋ฅผ ํ™•์ธํ•ด์ฃผ์„ธ์š”.

ํŒŒ์ดํ”„๋ผ์ธ์—์„œ ๋‹ค๋ฅธ ๋ชจ๋ธ์ด๋‚˜ tokenizer ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•[[use-another-model-and-tokenizer-in-the-pipeline]]

pipeline()์€ Hub ์† ๋ชจ๋“  ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์–ด, ์–ผ๋งˆ๋“ ์ง€ pipeline()์„ ์‚ฌ์šฉํ•˜๊ณ  ์‹ถ์€๋Œ€๋กœ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ๋ฅผ ๋‹ค๋ฃฐ ์ˆ˜ ์žˆ๋Š” ๋ชจ๋ธ์„ ๋งŒ๋“œ๋ ค๋ฉด, Hub์˜ ํƒœ๊ทธ๋กœ ์ ์ ˆํ•œ ๋ชจ๋ธ์„ ์ฐพ์•„๋ณด์„ธ์š”. ์ƒ์œ„ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋กœ ๋œฌ ๊ฐ์ • ๋ถ„์„์„ ์œ„ํ•ด ํŒŒ์ธํŠœ๋‹๋œ ๋‹ค๊ตญ์–ด BERT ๋ชจ๋ธ์ด ํ”„๋ž‘์Šค์–ด๋ฅผ ์ง€์›ํ•˜๋Š”๊ตฐ์š”.

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

Use AutoModelForSequenceClassification and AutoTokenizer to load the pretrained model and its associated tokenizer (more on the AutoClass in the next section):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

Use TFAutoModelForSequenceClassification and AutoTokenizer to load the pretrained model and its associated tokenizer (more on the TFAutoClass in the next section):

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

pipeline()์—์„œ ์‚ฌ์šฉํ•  ๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ž…๋ ฅํ•˜๋ฉด ์ด์ œ (๊ฐ์ • ๋ถ„์„๊ธฐ์ธ) classifier๋ฅผ ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ์— ์ ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
classifier("Nous sommes trรจs heureux de vous prรฉsenter la bibliothรจque ๐Ÿค— Transformers.")
[{'label': '5 stars', 'score': 0.7273}]

ํ•˜๊ณ ์‹ถ์€ ๊ฒƒ์— ์ ์šฉํ•  ๋งˆ๋•…ํ•œ ๋ชจ๋ธ์ด ์—†๋‹ค๋ฉด, ๊ฐ€์ง„ ๋ฐ์ดํ„ฐ๋กœ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์„ ํŒŒ์ธํŠœ๋‹ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋ฐฉ๋ฒ•์€ ํŒŒ์ธํŠœ๋‹ ํŠœํ† ๋ฆฌ์–ผ์„ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”. ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์˜ ํŒŒ์ธํŠœ๋‹์„ ๋งˆ์น˜์…จ์œผ๋ฉด, ๋ˆ„๊ตฌ๋‚˜ ๋จธ์‹ ๋Ÿฌ๋‹์„ ํ•  ์ˆ˜ ์žˆ๋„๋ก ๊ณต์œ ํ•˜๋Š” ๊ฒƒ์„ ๊ณ ๋ คํ•ด์ฃผ์„ธ์š”. ๐Ÿค—

AutoClass

#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/AhChOFRegn4?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')

๋‚ด๋ถ€์ ์œผ๋กœ ๋“ค์–ด๊ฐ€๋ฉด ์œ„์—์„œ ์‚ฌ์šฉํ–ˆ๋˜ pipeline()์€ AutoModelForSequenceClassification๊ณผ AutoTokenizer ํด๋ž˜์Šค๋กœ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. AutoClass๋ž€ ์ด๋ฆ„์ด๋‚˜ ๊ฒฝ๋กœ๋ฅผ ๋ฐ›์œผ๋ฉด ๊ทธ์— ์•Œ๋งž๋Š” ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์„ ๊ฐ€์ ธ์˜ค๋Š” '๋ฐ”๋กœ๊ฐ€๊ธฐ'๋ผ๊ณ  ๋ณผ ์ˆ˜ ์žˆ๋Š”๋ฐ์š”. ์›ํ•˜๋Š” ํƒœ์Šคํฌ์™€ ์ „์ฒ˜๋ฆฌ์— ์ ํ•ฉํ•œ AutoClass๋ฅผ ๊ณ ๋ฅด๊ธฐ๋งŒ ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.

์ „์— ์‚ฌ์šฉํ–ˆ๋˜ ์˜ˆ์‹œ๋กœ ๋Œ์•„๊ฐ€์„œ AutoClass๋กœ pipeline()๊ณผ ๋™์ผํ•œ ๊ฒฐ๊ณผ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ์•Œ์•„๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

AutoTokenizer

ํ† ํฌ๋‚˜์ด์ €๋Š” ์ „์ฒ˜๋ฆฌ๋ฅผ ๋‹ด๋‹นํ•˜๋ฉฐ, ํ…์ŠคํŠธ๋ฅผ ๋ชจ๋ธ์ด ๋ฐ›์„ ์ˆซ์ž ๋ฐฐ์—ด๋กœ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค. ํ† ํฐํ™” ๊ณผ์ •์—๋Š” ๋‹จ์–ด๋ฅผ ์–ด๋””์—์„œ ๋Š์„์ง€, ์–ผ๋งŒํผ ๋‚˜๋ˆŒ์ง€ ๋“ฑ์„ ํฌํ•จํ•œ ์—ฌ๋Ÿฌ ๊ทœ์น™์ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ ํ† ํฌ๋‚˜์ด์ € ์š”์•ฝ๋ฅผ ํ™•์ธํ•ด์ฃผ์„ธ์š”. ์ œ์ผ ์ค‘์š”ํ•œ ์ ์€ ๋ชจ๋ธ์ด ํ›ˆ๋ จ๋์„ ๋•Œ์™€ ๋™์ผํ•œ ํ† ํฐํ™” ๊ทœ์น™์„ ์“ฐ๋„๋ก ๋™์ผํ•œ ๋ชจ๋ธ ์ด๋ฆ„์œผ๋กœ ํ† ํฌ๋‚˜์ด์ € ์ธ์Šคํ„ด์Šค๋ฅผ ๋งŒ๋“ค์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

Load a tokenizer with AutoTokenizer:

from transformers import AutoTokenizer

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)

ํ† ํฌ๋‚˜์ด์ €์— ํ…์ŠคํŠธ๋ฅผ ์ œ๊ณตํ•˜์„ธ์š”.

encoding = tokenizer("We are very happy to show you the ๐Ÿค— Transformers library.")
print(encoding)
{'input_ids': [101, 11312, 10320, 12495, 19308, 10114, 11391, 10855, 10103, 100, 58263, 13299, 119, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

๊ทธ๋Ÿฌ๋ฉด ๋‹ค์Œ์„ ํฌํ•จํ•œ ๋”•์…”๋„ˆ๋ฆฌ๊ฐ€ ๋ฐ˜ํ™˜๋ฉ๋‹ˆ๋‹ค.

  • input_ids: ์ˆซ์ž๋กœ ํ‘œํ˜„๋œ ํ† ํฐ๋“ค

  • attention_mask: ์ฃผ์‹œํ•  ํ† ํฐ๋“ค

ํ† ํฌ๋‚˜์ด์ €๋Š” ์ž…๋ ฅ์„ ๋ฆฌ์ŠคํŠธ๋กœ๋„ ๋ฐ›์„ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ํ…์ŠคํŠธ๋ฅผ ํŒจ๋“œํ•˜๊ฑฐ๋‚˜ ์ž˜๋ผ๋‚ด์–ด ๊ท ์ผํ•œ ๊ธธ์ด์˜ ๋ฐฐ์น˜๋ฅผ ๋ฐ˜ํ™˜ํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค.

pt_batch = tokenizer(
    ["We are very happy to show you the ๐Ÿค— Transformers library.", "We hope you don't hate it."],
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
tf_batch = tokenizer(
    ["We are very happy to show you the ๐Ÿค— Transformers library.", "We hope you don't hate it."],
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="tf",
)

์ „์ฒ˜๋ฆฌ ํŠœํ† ๋ฆฌ์–ผ์„ ๋ณด์‹œ๋ฉด ํ† ํฐํ™”์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์„ค๋ช…๊ณผ ํ•จ๊ป˜ ์ด๋ฏธ์ง€, ์˜ค๋””์˜ค์™€ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ์ž…๋ ฅ์„ ์ „์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•œ AutoFeatureExtractor๊ณผ AutoProcessor์˜ ์‚ฌ์šฉ๋ฐฉ๋ฒ•๋„ ์•Œ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

AutoModel

๐Ÿค— Transformers provides a simple and unified way to load pretrained instances. This means you can load an AutoModel just like you would load an AutoTokenizer. The only difference is selecting the AutoModel that fits the task. For text (or sequence) classification, you should load AutoModelForSequenceClassification:

from transformers import AutoModelForSequenceClassification

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
pt_model = AutoModelForSequenceClassification.from_pretrained(model_name)

AutoModel ํด๋ž˜์Šค์—์„œ ์ง€์›ํ•˜๋Š” ํƒœ์Šคํฌ๋“ค์€ ํƒœ์Šคํฌ ์ •๋ฆฌ ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”.

์ด์ œ ์ „์ฒ˜๋ฆฌ๋œ ์ž…๋ ฅ ๋ฐฐ์น˜๋ฅผ ๋ชจ๋ธ๋กœ ์ง์ ‘ ๋ณด๋‚ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์•„๋ž˜์ฒ˜๋Ÿผ **๋ฅผ ์•ž์— ๋ถ™์—ฌ ๋”•์…”๋„ˆ๋ฆฌ๋ฅผ ํ’€์–ด์ฃผ๊ธฐ๋งŒ ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.

pt_outputs = pt_model(**pt_batch)

๋ชจ๋ธ์˜ activation ๊ฒฐ๊ณผ๋Š” logits ์†์„ฑ์— ๋‹ด๊ฒจ์žˆ์Šต๋‹ˆ๋‹ค. logits์— Softmax ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•ด์„œ ํ™•๋ฅ  ํ˜•ํƒœ๋กœ ๋ฐ›์œผ์„ธ์š”.

from torch import nn

pt_predictions = nn.functional.softmax(pt_outputs.logits, dim=-1)
print(pt_predictions)
tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
        [0.2084, 0.1826, 0.1969, 0.1755, 0.2365]], grad_fn=<SoftmaxBackward0>)

๐Ÿค— Transformers provides a simple and unified way to load pretrained instances. This means you can load a TFAutoModel just like you would load an AutoTokenizer. The only difference is selecting the TFAutoModel that fits the task. For text (or sequence) classification, you should load TFAutoModelForSequenceClassification:

from transformers import TFAutoModelForSequenceClassification

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
tf_model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

AutoModel ํด๋ž˜์Šค์—์„œ ์ง€์›ํ•˜๋Š” ํƒœ์Šคํฌ๋“ค์€ ํƒœ์Šคํฌ ์ •๋ฆฌ ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”.

์ด์ œ ์ „์ฒ˜๋ฆฌ๋œ ์ž…๋ ฅ ๋ฐฐ์น˜๋ฅผ ๋ชจ๋ธ๋กœ ์ง์ ‘ ๋ณด๋‚ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋”•์…”๋„ˆ๋ฆฌ์˜ ํ‚ค๋ฅผ ํ…์„œ์— ์ง์ ‘ ๋„ฃ์–ด์ฃผ๊ธฐ๋งŒ ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.

tf_outputs = tf_model(tf_batch)

๋ชจ๋ธ์˜ activation ๊ฒฐ๊ณผ๋Š” logits ์†์„ฑ์— ๋‹ด๊ฒจ์žˆ์Šต๋‹ˆ๋‹ค. logits์— Softmax ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•ด์„œ ํ™•๋ฅ  ํ˜•ํƒœ๋กœ ๋ฐ›์œผ์„ธ์š”.

import tensorflow as tf

tf_predictions = tf.nn.softmax(tf_outputs.logits, axis=-1)
tf_predictions

All ๐Ÿค— Transformers models (PyTorch or TensorFlow) output the tensors before the final activation function (like softmax), because the final activation function is often fused with the loss. Model outputs are special dataclasses, so their attributes are autocompleted in an IDE. They also behave like a tuple or a dictionary (you can index them with an integer, a slice, or a string), in which case the attributes that are None are ignored.
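
Concretely, with the pt_outputs computed above, the following three accesses return the same tensor (here the loss field is None because no labels were passed, so tuple indexing skips it):

logits = pt_outputs.logits       # attribute access, autocompleted in an IDE
logits = pt_outputs["logits"]    # dictionary-style access
logits = pt_outputs[0]           # tuple-style access; None fields (like loss) are skipped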

๋ชจ๋ธ ์ €์žฅํ•˜๊ธฐ[[save-a-model]]

๋ชจ๋ธ์„ ํŒŒ์ธํŠœ๋‹ํ•œ ๋’ค์—๋Š” PreTrainedModel.save_pretrained()๋กœ ๋ชจ๋ธ์„ ํ† ํฌ๋‚˜์ด์ €์™€ ํ•จ๊ป˜ ์ €์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

pt_save_directory = "./pt_save_pretrained"
tokenizer.save_pretrained(pt_save_directory)
pt_model.save_pretrained(pt_save_directory)

๋ชจ๋ธ์„ ๋‹ค์‹œ ์‚ฌ์šฉํ•  ๋•Œ๋Š” PreTrainedModel.from_pretrained()๋กœ ๋‹ค์‹œ ๋ถˆ๋Ÿฌ์˜ค๋ฉด ๋ฉ๋‹ˆ๋‹ค.

pt_model = AutoModelForSequenceClassification.from_pretrained("./pt_save_pretrained")

๋ชจ๋ธ์„ ํŒŒ์ธํŠœ๋‹ํ•œ ๋’ค์—๋Š” TFPreTrainedModel.save_pretrained()๋กœ ๋ชจ๋ธ์„ ํ† ํฌ๋‚˜์ด์ €์™€ ํ•จ๊ป˜ ์ €์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

tf_save_directory = "./tf_save_pretrained"
tokenizer.save_pretrained(tf_save_directory)
tf_model.save_pretrained(tf_save_directory)

๋ชจ๋ธ์„ ๋‹ค์‹œ ์‚ฌ์šฉํ•  ๋•Œ๋Š” TFPreTrainedModel.from_pretrained()๋กœ ๋‹ค์‹œ ๋ถˆ๋Ÿฌ์˜ค๋ฉด ๋ฉ๋‹ˆ๋‹ค.

tf_model = TFAutoModelForSequenceClassification.from_pretrained("./tf_save_pretrained")

One particularly cool ๐Ÿค— Transformers feature is the ability to save a model and reload it as either a PyTorch or a TensorFlow model. The from_pt or from_tf parameter converts the model from one framework to the other:

from transformers import AutoModel

tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
from transformers import TFAutoModel

tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)

์ปค์Šคํ…€ ๋ชจ๋ธ ๊ตฌ์ถ•ํ•˜๊ธฐ[[custom-model-builds]]

๋ชจ๋ธ์˜ ๊ตฌ์„ฑ ํด๋ž˜์Šค๋ฅผ ์ˆ˜์ •ํ•˜์—ฌ ๋ชจ๋ธ์˜ ๊ตฌ์กฐ๋ฅผ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์€๋‹‰์ธต, ์–ดํ…์…˜ ํ—ค๋“œ ์ˆ˜์™€ ๊ฐ™์€ ๋ชจ๋ธ์˜ ์†์„ฑ์„ ๊ตฌ์„ฑ์—์„œ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ์ปค์Šคํ…€ ๊ตฌ์„ฑ ํด๋ž˜์Šค์—์„œ ๋ชจ๋ธ์„ ๋งŒ๋“ค๋ฉด ์ฒ˜์Œ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ ์†์„ฑ์€ ๋žœ๋คํ•˜๊ฒŒ ์ดˆ๊ธฐํ™”๋˜๋ฏ€๋กœ ์˜๋ฏธ ์žˆ๋Š” ๊ฒฐ๊ณผ๋ฅผ ์–ป์œผ๋ ค๋ฉด ๋จผ์ € ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ฌ ํ•„์š”๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.

Start by importing AutoConfig, and then load the pretrained model you want to modify. Within AutoConfig.from_pretrained(), you can specify the attribute you want to change, such as the number of attention heads:

from transformers import AutoConfig

my_config = AutoConfig.from_pretrained("distilbert-base-uncased", n_heads=12)

Create a model from your custom configuration with AutoModel.from_config():

from transformers import AutoModel

my_model = AutoModel.from_config(my_config)

Create a model from your custom configuration with TFAutoModel.from_config():

from transformers import TFAutoModel

my_model = TFAutoModel.from_config(my_config)

์ปค์Šคํ…€ ๊ตฌ์„ฑ์„ ์ž‘์„ฑํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ ์ปค์Šคํ…€ ์•„ํ‚คํ…์ฒ˜ ๋งŒ๋“ค๊ธฐ ๊ฐ€์ด๋“œ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.

Trainer - a PyTorch optimized training loop[[trainer-a-pytorch-optimized-training-loop]]

๋ชจ๋“  ๋ชจ๋ธ์€ torch.nn.Module์ด์–ด์„œ ๋Œ€๋‹ค์ˆ˜์˜ ํ›ˆ๋ จ ๋ฐ˜๋ณต ๋ฃจํ”„์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์‚ฌ์šฉ์ž๊ฐ€ ์ง์ ‘ ํ›ˆ๋ จ ๋ฐ˜๋ณต ๋ฃจํ”„๋ฅผ ์ž‘์„ฑํ•ด๋„ ๋˜์ง€๋งŒ, ๐Ÿค— Transformers๋Š” PyTorch์šฉ Trainer ํด๋ž˜์Šค๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ์ ์ธ ํ›ˆ๋ จ ๋ฐ˜ํญ ๋ฃจํ”„๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ๊ณ , ๋ถ„์‚ฐ ํ›ˆ๋ จ์ด๋‚˜ ํ˜ผํ•ฉ ์ •๋ฐ€๋„ ๋“ฑ์˜ ์ถ”๊ฐ€ ๊ธฐ๋Šฅ๋„ ์žˆ์Šต๋‹ˆ๋‹ค.

ํƒœ์Šคํฌ์— ๋”ฐ๋ผ ๋‹ค๋ฅด์ง€๋งŒ, ์ผ๋ฐ˜์ ์œผ๋กœ ๋‹ค์Œ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ Trainer์— ์ „๋‹ฌํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

  1. Start with a PreTrainedModel or a torch.nn.Module:

    >>> from transformers import AutoModelForSequenceClassification
    >>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
  2. TrainingArguments contains the model hyperparameters you can change, such as the learning rate, the batch size, and the number of epochs to train for. The default values are used if you don't specify any training arguments:

    >>> from transformers import TrainingArguments
    >>> training_args = TrainingArguments(
    ...     output_dir="path/to/save/folder/",
    ...     learning_rate=2e-5,
    ...     per_device_train_batch_size=8,
    ...     per_device_eval_batch_size=8,
    ...     num_train_epochs=2,
    ... )
  3. ํ† ํฌ๋‚˜์ด์ €, ํŠน์ง•์ถ”์ถœ๊ธฐ(feature extractor), ์ „์ฒ˜๋ฆฌ๊ธฐ(processor) ํด๋ž˜์Šค ๋“ฑ์œผ๋กœ ์ „์ฒ˜๋ฆฌ๋ฅผ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

    >>> from transformers import AutoTokenizer
    >>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
  4. ๋ฐ์ดํ„ฐ์…‹๋ฅผ ์ ์žฌํ•ฉ๋‹ˆ๋‹ค.

    >>> from datasets import load_dataset
    >>> dataset = load_dataset("rotten_tomatoes")  # doctest: +IGNORE_RESULT
  5. ๋ฐ์ดํ„ฐ์…‹์„ ํ† ํฐํ™”ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ๋งŒ๋“ค๊ณ  map์œผ๋กœ ์ „์ฒด ๋ฐ์ดํ„ฐ์…‹์— ์ ์šฉ์‹œํ‚ต๋‹ˆ๋‹ค.

    >>> def tokenize_dataset(dataset):
    ...     return tokenizer(dataset["text"])
    >>> dataset = dataset.map(tokenize_dataset, batched=True)
  6. Use DataCollatorWithPadding to create batches of examples from your dataset:

    >>> from transformers import DataCollatorWithPadding
    >>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

์ด์ œ ์œ„์˜ ๋ชจ๋“  ํด๋ž˜์Šค๋ฅผ Trainer๋กœ ๋ชจ์œผ์„ธ์š”.

from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)  # doctest: +SKIP

์ค€๋น„๋˜์—ˆ์œผ๋ฉด train()์œผ๋กœ ํ›ˆ๋ จ์„ ์‹œ์ž‘ํ•˜์„ธ์š”.

trainer.train()

For tasks that use a sequence-to-sequence model, such as translation or summarization, use the Seq2SeqTrainer and Seq2SeqTrainingArguments classes instead.

Trainer ๋‚ด๋ถ€์˜ ๋ฉ”์„œ๋“œ๋ฅผ ๊ตฌํ˜„ ์ƒ์†(subclassing)ํ•ด์„œ ํ›ˆ๋ จ ๋ฐ˜๋ณต ๋ฃจํ”„๋ฅผ ๊ฐœ์กฐํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌ๋ฉด loss ํ•จ์ˆ˜, optimizer, scheduler ๋“ฑ์˜ ๊ธฐ๋Šฅ๋„ ๊ฐœ์กฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์–ด๋–ค ๋ฉ”์„œ๋“œ๋ฅผ ๊ตฌํ˜„ ์ƒ์†ํ•  ์ˆ˜ ์žˆ๋Š”์ง€ ์•Œ์•„๋ณด๋ ค๋ฉด Trainer๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.

The other way to customize the training loop is with Callbacks. You can use callbacks to integrate with other libraries, or to inspect the training loop regularly to report on progress or stop the training early. Callbacks don't modify anything in the training loop itself; to customize something like the loss function, you need to subclass Trainer instead.
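
For example, a callback that stops training after a fixed number of steps might look like this (a minimal sketch; the class name and threshold are hypothetical):

from transformers import TrainerCallback

class StopAfterStepsCallback(TrainerCallback):
    def __init__(self, stop_step=500):
        self.stop_step = stop_step

    def on_step_end(self, args, state, control, **kwargs):
        # Inspect the loop's state without modifying the loop itself
        if state.global_step >= self.stop_step:
            control.should_training_stop = True

# Register it when building the Trainer:
# trainer = Trainer(..., callbacks=[StopAfterStepsCallback(stop_step=500)])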

Train with TensorFlow[[train-with-tensorflow]]

๋ชจ๋“  ๋ชจ๋ธ์€ tf.keras.Model์ด์–ด์„œ Keras API๋ฅผ ํ†ตํ•ด TensorFlow์—์„œ ํ›ˆ๋ จ์‹œํ‚ฌ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๐Ÿค— Transformers์—์„œ ๋ฐ์ดํ„ฐ์…‹๋ฅผ tf.data.Dataset ํ˜•ํƒœ๋กœ ์‰ฝ๊ฒŒ ์ ์žฌํ•  ์ˆ˜ ์žˆ๋Š” prepare_tf_dataset() ๋ฉ”์„œ๋“œ๋ฅผ ์ œ๊ณตํ•˜๊ธฐ ๋•Œ๋ฌธ์—, Keras์˜ compile ๋ฐ fit ๋ฉ”์„œ๋“œ๋กœ ์ฆ‰์‹œ ํ›ˆ๋ จ์„ ์‹œ์ž‘ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  1. Start with a TFPreTrainedModel or a tf.keras.Model:

    >>> from transformers import TFAutoModelForSequenceClassification
    >>> model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
  2. ํ† ํฌ๋‚˜์ด์ €, ํŠน์ง•์ถ”์ถœ๊ธฐ(feature extractor), ์ „์ฒ˜๋ฆฌ๊ธฐ(processor) ํด๋ž˜์Šค ๋“ฑ์œผ๋กœ ์ „์ฒ˜๋ฆฌ๋ฅผ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

    >>> from transformers import AutoTokenizer
    >>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
  3. ๋ฐ์ดํ„ฐ์…‹์„ ํ† ํฐํ™”ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.

    >>> def tokenize_dataset(dataset):
    ...     return tokenizer(dataset["text"])  # doctest: +SKIP
  4. Apply the tokenizer over the entire dataset with map, then pass the dataset and tokenizer to prepare_tf_dataset(). You can also change the batch size and shuffle the dataset here if you like:

    >>> dataset = dataset.map(tokenize_dataset)  # doctest: +SKIP
    >>> tf_dataset = model.prepare_tf_dataset(
    ...     dataset, batch_size=16, shuffle=True, tokenizer=tokenizer
    ... )  # doctest: +SKIP
  5. ์ค€๋น„๋˜์—ˆ์œผ๋ฉด compile๊ณผ fit์œผ๋กœ ํ›ˆ๋ จ์„ ์‹œ์ž‘ํ•˜์„ธ์š”.

    >>> from tensorflow.keras.optimizers import Adam
    >>> model.compile(optimizer=Adam(3e-5))
    >>> model.fit(tf_dataset)  # doctest: +SKIP

What's next?[[whats-next]]

Now that you've completed the ๐Ÿค— Transformers quick tour, check out the guides to learn how to do more specific things, such as writing a custom model, finetuning a model for a task, and training a model with a script. If you're interested in learning more about ๐Ÿค— Transformers' core concepts, grab a cup of coffee and take a look at the Conceptual Guides!