
GitHub Repository: huggingface/notebooks
Path: blob/main/course/videos/tf_lr_scheduling.ipynb

This notebook contains the code samples from the video below, which is part of the Hugging Face course.

#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/cpzq6ESSM5c?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')

Install the Transformers and Datasets libraries to run this notebook.

! pip install datasets transformers[sentencepiece]

The code below follows this second video, also part of the Hugging Face course.

#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/eKv4rRcCNX0?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')

from datasets import load_dataset
from transformers import AutoTokenizer
import numpy as np

raw_datasets = load_dataset("glue", "mrpc")
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize_dataset(dataset):
    # Tokenize both sentences together; truncation keeps each pair within
    # the model's maximum sequence length.
    encoded = tokenizer(
        dataset["sentence1"],
        dataset["sentence2"],
        truncation=True,
    )
    return encoded.data

tokenized_datasets = raw_datasets.map(tokenize_dataset)

train_dataset = tokenized_datasets["train"].to_tf_dataset(
    columns=["input_ids", "attention_mask", "token_type_ids"],
    label_cols=["label"],
    shuffle=True,
    batch_size=8,
)
validation_dataset = tokenized_datasets["validation"].to_tf_dataset(
    columns=["input_ids", "attention_mask", "token_type_ids"],
    label_cols=["label"],
    shuffle=False,  # no need to shuffle the validation data
    batch_size=8,
)
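As a quick sanity check (an illustrative sketch, not part of the course code), you can pull one batch out of the resulting tf.data pipeline to see what to_tf_dataset yields; if your version of Datasets cannot stack the variable-length sequences without a collator, pass transformers' DataCollatorWithPadding as collate_fn to to_tf_dataset.

# Illustrative sketch, not from the video: inspect one batch from the pipeline.
batch, labels = next(iter(train_dataset))
print({name: tensor.shape for name, tensor in batch.items()})
print(labels.shape)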
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

# Load the model from the same checkpoint as the tokenizer above.
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
# The model outputs raw logits, so tell the loss not to expect probabilities.
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
from tensorflow.keras.optimizers.schedules import PolynomialDecay

num_epochs = 3
# len(train_dataset) is the number of batches per epoch, so this is the
# total number of optimizer steps across all epochs.
num_train_steps = len(train_dataset) * num_epochs
lr_scheduler = PolynomialDecay(
    initial_learning_rate=5e-5,
    end_learning_rate=0.0,
    decay_steps=num_train_steps,
)
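The schedule is just a callable mapping a step number to a learning rate, and with the default power of 1.0, PolynomialDecay is a straight linear decay from 5e-5 down to 0. As an illustration (not part of the course code), you can evaluate it at a few steps:

# Illustrative only: sample the schedule at the start, middle, and end of training.
for step in [0, num_train_steps // 2, num_train_steps]:
    print(step, float(lr_scheduler(step)))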
from tensorflow.keras.optimizers import Adam

opt = Adam(learning_rate=lr_scheduler)
model.compile(loss=loss, optimizer=opt)
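With the model compiled, training is a single fit() call. The call itself is not shown in the cell above, so treat this as a minimal sketch using the datasets and num_epochs defined earlier:

# Minimal sketch, assuming the datasets and num_epochs defined above.
model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=num_epochs,
)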