{
"<h1>Trying out Sampling Techniques for Language Models</h1>\n<ul><li><a href=\"greedy.html\">Greedy Sampling</a> </li>\n<li><a href=\"temperature.html\">Temperature Sampling</a> </li>\n<li><a href=\"top_k.html\">Top-k Sampling</a> </li>\n<li><a href=\"nucleus.html\">Nucleus Sampling</a></li></ul>\n<p>This experiment uses the above sampling techniques, on HuggingFace&#x27;s GPT2 model.</p>\n": "<h1>\u8a00\u8a9e\u30e2\u30c7\u30eb\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u306e\u8a66\u307f</h1>\n<ul><li><a href=\"greedy.html\">\u6b32\u5f35\u308a\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></li>\n<li><a href=\"temperature.html\">\u6e29\u5ea6\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></li>\n<li><a href=\"top_k.html\">\u30c8\u30c3\u30d7k\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></li>\n<li><a href=\"nucleus.html\">\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></li></ul>\n<p>\u3053\u306e\u5b9f\u9a13\u3067\u306f\u3001HuggingFace\u306eGPT2\u30e2\u30c7\u30eb\u3067\u4e0a\u8a18\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u3092\u4f7f\u7528\u3057\u3066\u3044\u307e\u3059\u3002</p>\n",
"<h2>Sample from model</h2>\n<ul><li><span translate=no>_^_0_^_</span> is the model to sample from </li>\n<li><span translate=no>_^_1_^_</span> is the tokenizer to use </li>\n<li><span translate=no>_^_2_^_</span> is the sampler to use </li>\n<li><span translate=no>_^_3_^_</span> is the number of samples to generate </li>\n<li><span translate=no>_^_4_^_</span> is the number of tokens to generate </li>\n<li><span translate=no>_^_5_^_</span> is the maximum sequence length for the model </li>\n<li><span translate=no>_^_6_^_</span> is the starting prompt</li></ul>\n": "<h2>\u30e2\u30c7\u30eb\u304b\u3089\u306e\u30b5\u30f3\u30d7\u30eb</h2>\n<ul><li><span translate=no>_^_0_^_</span>\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u5143\u306e\u30e2\u30c7\u30eb\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u4f7f\u7528\u3059\u308b\u30c8\u30fc\u30af\u30ca\u30a4\u30b6\u30fc\u3067\u3059</li>\n<li><span translate=no>_^_2_^_</span>\u4f7f\u7528\u3059\u308b\u30b5\u30f3\u30d7\u30e9\u30fc\u3067\u3059</li>\n<li><span translate=no>_^_3_^_</span>\u306f\u751f\u6210\u3059\u308b\u30b5\u30f3\u30d7\u30eb\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_4_^_</span>\u306f\u751f\u6210\u3059\u308b\u30c8\u30fc\u30af\u30f3\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_5_^_</span>\u30e2\u30c7\u30eb\u306e\u6700\u5927\u30b7\u30fc\u30b1\u30f3\u30b9\u9577\u3067\u3059</li>\n<li><span translate=no>_^_6_^_</span>\u306f\u958b\u59cb\u30d7\u30ed\u30f3\u30d7\u30c8\u3067\u3059</li></ul>\n",
"<h3>Try different sampling techniques</h3>\n": "<h3>\u3055\u307e\u3056\u307e\u306a\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u3092\u8a66\u3057\u3066\u304f\u3060\u3055\u3044</h3>\n",
"<p> </p>\n": "<p></p>\n",
"<p><a href=\"greedy.html\">Greedy Sampling</a> </p>\n": "<p><a href=\"greedy.html\">\u6b32\u5f35\u308a\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></p>\n",
"<p><a href=\"nucleus.html\">Nucleus Sampling</a> </p>\n": "<p><a href=\"nucleus.html\">\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></p>\n",
"<p><a href=\"temperature.html\">Temperature Sampling</a> </p>\n": "<p><a href=\"temperature.html\">\u6e29\u5ea6\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></p>\n",
"<p><a href=\"top_k.html\">Top-k Sampling</a> </p>\n": "<p><a href=\"top_k.html\">\u30c8\u30c3\u30d7k\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a></p>\n",
"<p>Add the sampled token to the data </p>\n": "<p>\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3057\u305f\u30c8\u30fc\u30af\u30f3\u3092\u30c7\u30fc\u30bf\u306b\u8ffd\u52a0\u3057\u307e\u3059</p>\n",
"<p>Collect output for printing </p>\n": "<p>\u5370\u5237\u7528\u306e\u51fa\u529b\u3092\u53ce\u96c6</p>\n",
"<p>Decode and add the sampled token for logging </p>\n": "<p>\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3057\u305f\u30c8\u30fc\u30af\u30f3\u3092\u30c7\u30b3\u30fc\u30c9\u3057\u3066\u30ed\u30ae\u30f3\u30b0\u7528\u306b\u8ffd\u52a0</p>\n",
"<p>Get the <span translate=no>_^_0_^_</span> of the last token </p>\n": "<p>\u6700\u5f8c\u306e\u30c8\u30fc\u30af\u30f3\u306e<span translate=no>_^_0_^_</span>\u3092\u53d6\u5f97</p>\n",
"<p>Get the model output. The &#x27;logits&#x27; has shape <span translate=no>_^_0_^_</span> </p>\n": "<p>\u30e2\u30c7\u30eb\u51fa\u529b\u3092\u53d6\u5f97\u3057\u307e\u3059\u3002\u300c\u30ed\u30b8\u30c3\u30c8\u300d\u306e\u5f62\u72b6\u306f <span translate=no>_^_0_^_</span> \u3067\u3059</p>\n",
"<p>Load the model and tokenizer </p>\n": "<p>\u30e2\u30c7\u30eb\u3068\u30c8\u30fc\u30af\u30ca\u30a4\u30b6\u30fc\u3092\u30ed\u30fc\u30c9</p>\n",
"<p>Print the sampled outputs </p>\n": "<p>\u30b5\u30f3\u30d7\u30eb\u51fa\u529b\u3092\u5370\u5237</p>\n",
"<p>Prompts to use for sampling </p>\n": "<p>\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306b\u4f7f\u7528\u3059\u308b\u30d7\u30ed\u30f3\u30d7\u30c8</p>\n",
"<p>Sample <span translate=no>_^_0_^_</span> </p>\n": "<p>\u30b5\u30f3\u30d7\u30eb <span translate=no>_^_0_^_</span></p>\n",
"<p>Sample from the <span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span> \u304b\u3089\u306e\u30b5\u30f3\u30d7\u30eb</p>\n",
"<p>Set the model to eval mode </p>\n": "<p>\u30e2\u30c7\u30eb\u3092 eval \u30e2\u30fc\u30c9\u306b\u8a2d\u5b9a</p>\n",
"<p>Tokenize the <span translate=no>_^_0_^_</span> and make <span translate=no>_^_1_^_</span> copies of it </p>\n": "<p><span translate=no>_^_0_^_</span>\u3092\u30c8\u30fc\u30af\u30f3\u5316\u3057\u3001<span translate=no>_^_1_^_</span>\u500b\u306e\u30b3\u30d4\u30fc\u3092\u4f5c\u6210</p>\n",
"<p>Truncate the data to the maximum sequence length </p>\n": "<p>\u30c7\u30fc\u30bf\u3092\u6700\u5927\u30b7\u30fc\u30b1\u30f3\u30b9\u9577\u307e\u3067\u5207\u308a\u6368\u3066\u308b</p>\n",
"Trying out Sampling Techniques for Language Models": "\u8a00\u8a9e\u30e2\u30c7\u30eb\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u306e\u8a66\u307f",
"We try out different sampling techniques for language models on HuggingFace's GPT2 model.": "HuggingFace\u306eGPT2\u30e2\u30c7\u30eb\u3067\u3001\u8a00\u8a9e\u30e2\u30c7\u30eb\u7528\u306b\u3055\u307e\u3056\u307e\u306a\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u3092\u8a66\u3057\u3066\u3044\u307e\u3059\u3002"
}