Path: blob/main/a4/model_embeddings.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
CS224N 2022-23: Homework 4
model_embeddings.py: Embeddings for the NMT model
Pencheng Yin <[email protected]>
Sahil Chopra <[email protected]>
Anand Dhoot <[email protected]>
Vera Lin <[email protected]>
Siyan Li <[email protected]>
"""

import torch.nn as nn


class ModelEmbeddings(nn.Module):
    """
    Class that converts input words to their embeddings.
    """
    def __init__(self, embed_size, vocab):
        """
        Init the Embedding layers.

        @param embed_size (int): Embedding size (dimensionality)
        @param vocab (Vocab): Vocabulary object containing src and tgt languages
                              See vocab.py for documentation.
        """
        super(ModelEmbeddings, self).__init__()
        self.embed_size = embed_size

        # default values
        self.source = None
        self.target = None

        src_pad_token_idx = vocab.src['<pad>']
        tgt_pad_token_idx = vocab.tgt['<pad>']

        ### YOUR CODE HERE (~2 Lines)
        ### TODO - Initialize the following variables:
        ###     self.source (Embedding Layer for source language)
        ###     self.target (Embedding Layer for target language)
        ###
        ### Note:
        ###     1. `vocab` object contains two vocabularies:
        ###            `vocab.src` for source
        ###            `vocab.tgt` for target
        ###     2. You can get the length of a specific vocabulary by running:
        ###            `len(vocab.<specific_vocabulary>)`
        ###     3. Remember to include the padding token for the specific vocabulary
        ###        when creating your Embedding.
        ###
        ### Use the following docs to properly initialize these variables:
        ###     Embedding Layer:
        ###         https://pytorch.org/docs/stable/nn.html#torch.nn.Embedding
        self.source = nn.Embedding(len(vocab.src), embed_size, padding_idx=src_pad_token_idx)
        self.target = nn.Embedding(len(vocab.tgt), embed_size, padding_idx=tgt_pad_token_idx)

        ### END YOUR CODE
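A quick standalone sketch (not part of the assignment file) of how `nn.Embedding` behaves with `padding_idx`, using a hypothetical five-word vocabulary where index 0 plays the role of `<pad>`. The embedding row for the padding index is initialized to zeros and receives no gradient updates, which is why the pad token index is passed when the source and target layers are created above.

```python
import torch
import torch.nn as nn

# Hypothetical tiny vocabulary: indices 0..4, with index 0 acting as '<pad>'.
pad_idx = 0
embed = nn.Embedding(num_embeddings=5, embedding_dim=3, padding_idx=pad_idx)

# A batch of two "sentences" of word indices; the first is padded with 0.
batch = torch.tensor([[1, 2, 0], [3, 4, 4]])
out = embed(batch)

print(out.shape)                  # torch.Size([2, 3, 3]): (batch, seq_len, embed_size)
assert torch.all(out[0, 2] == 0)  # the padding_idx row is all zeros
```

In the NMT model, the same lookup maps each source or target word index to its `embed_size`-dimensional vector while keeping padded positions at a constant zero vector.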