Path: blob/master/translate_cache/sampling/nucleus.ja.json
{
  "<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://arxiv.org/abs/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</h1>\n<p>\u3053\u308c\u306f\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306e\u5b9f\u88c5\u3067\u3001\u8ad6\u6587\u300c<a href=\"https://arxiv.org/abs/1904.09751\">\u795e\u7d4c\u30c6\u30ad\u30b9\u30c8\u5909\u6027\u306e\u5947\u5999\u306a\u4e8b\u4f8b</a>\u300d\u3067\u7d39\u4ecb\u3055\u308c\u3066\u3044\u307e\u3059\u3002</p>\n<p>\u3053\u306e\u8ad6\u6587\u3067\u306f\u3001\u30d3\u30fc\u30e0\u30b5\u30fc\u30c1\u3001<a href=\"temperature.html\">\u30d4\u30e5\u30a2\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3001\u6e29\u5ea6\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a><a href=\"temperature.html\">\u3001<a href=\"top_k.html\">TOP-K\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306a\u3069\u306e\u4ed6\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u65b9\u6cd5\u306e\u554f\u984c\u306b\u3064\u3044\u3066\u8aac\u660e\u3057\u307e\u3059</a></a>\u3002\u3053\u306e\u8ad6\u6587\u3067\u306f\u3001\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306e\u30a2\u30a4\u30c7\u30a2\u3092\u7d39\u4ecb\u3057\u3066\u3044\u307e\u3059\u3002\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306f\u3001\u30c6\u30ad\u30b9\u30c8\u751f\u6210\u306b\u304a\u3044\u3066\u4ed6\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u65b9\u6cd5\u3088\u308a\u3082\u5b9f\u8cea\u7684\u306b\u512a\u308c\u3066\u3044\u307e\u3059</p>\u3002\n<p>Nucleus \u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3067\u306f\u3001\u6700\u521d\u306b\u30dc\u30ad\u30e3\u30d6\u30e9\u30ea\u306e\u30b5\u30d6\u30bb\u30c3\u30c8\u3092\u9078\u629e\u3057\u307e\u3059\u3002\u3053\u3053\u3067<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u306f\u6b21\u306e\u3088\u3046\u306a\u30c8\u30fc\u30af\u30f3\u306e\u6700\u5c0f\u30bb\u30c3\u30c8\u3092\u9078\u629e\u3057\u307e\u3059\u3002</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u3064\u307e\u308a\u3001\u78ba\u7387\u306e\u5408\u8a08\u304c\u305d\u308c\u3088\u308a\u5c0f\u3055\u304f\u306a\u308b\u307e\u3067\u3001\u6700\u3082\u53ef\u80fd\u6027\u306e\u9ad8\u3044\u30c8\u30fc\u30af\u30f3\u3092\u9078\u629e\u3057\u307e\u3059\u3002<span translate=no>_^_3_^_</span></p>\n<p>\u6b21\u306b\u3001\u9078\u629e\u3057\u305f\u30c8\u30fc\u30af\u30f3\u304b\u3089\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3057\u307e\u3059\u3002</p>\n<p>\u3053\u308c\u306f\u3001<a href=\"experiment.html\">\u3053\u308c\u3089\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u3092\u4f7f\u7528\u3057\u305f\u5b9f\u9a13\u3067\u3059</a>\u3002</p>\n",
  "<h2>Nucleus Sampler</h2>\n": "<h2>\u6838\u30b5\u30f3\u30d7\u30e9\u30fc</h2>\n",
  "<p> </p>\n": "<p></p>\n",
  "<p> Sample from logits with Nucleus Sampling</p>\n": "<p>Nucleus \u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306b\u3088\u308b\u30ed\u30b8\u30c3\u30c8\u304b\u3089\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</p>\n",
  "<p>Find the cumulative sums less than <span translate=no>_^_0_^_</span>. </p>\n": "<p>\u3088\u308a\u5c0f\u3055\u3044\u7d2f\u7a4d\u548c\u3092\u6c42\u3081\u307e\u3059\u3002<span translate=no>_^_0_^_</span></p>\n",
  "<p>Get log probabilities and mask out the non-nucleus </p>\n": "<p>\u5bfe\u6570\u78ba\u7387\u3092\u53d6\u5f97\u3057\u3066\u975e\u6838\u3092\u30de\u30b9\u30af\u3059\u308b</p>\n",
  "<p>Get probabilities <span translate=no>_^_0_^_</span> </p>\n": "<p>\u78ba\u7387\u3092\u53d6\u5f97 <span translate=no>_^_0_^_</span></p>\n",
  "<p>Get the actual indexes </p>\n": "<p>\u5b9f\u969b\u306e\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u53d6\u5f97</p>\n",
  "<p>Get the cumulative sum of probabilities in the sorted order </p>\n": "<p>\u78ba\u7387\u306e\u7d2f\u7a4d\u5408\u8a08\u3092\u30bd\u30fc\u30c8\u3055\u308c\u305f\u9806\u5e8f\u3067\u6c42\u3081\u308b</p>\n",
  "<p>Prepend ones so that we add one token after the minimum number of tokens with cumulative probability less that <span translate=no>_^_0_^_</span>. </p>\n": "<p>\u7d2f\u7a4d\u78ba\u7387\u304c\u305d\u308c\u3088\u308a\u5c0f\u3055\u3044\u30c8\u30fc\u30af\u30f3\u306e\u6700\u5c0f\u6570\u306e\u5f8c\u306b\u30c8\u30fc\u30af\u30f3\u30921\u3064\u8ffd\u52a0\u3059\u308b\u3088\u3046\u306b\u30011\u3092\u5148\u982d\u306b\u8ffd\u52a0\u3057\u307e\u3059\u3002<span translate=no>_^_0_^_</span></p>\n",
  "<p>Sample from the sampler </p>\n": "<p>\u30b5\u30f3\u30d7\u30e9\u30fc\u304b\u3089\u306e\u30b5\u30f3\u30d7\u30eb</p>\n",
  "<p>Softmax to compute <span translate=no>_^_0_^_</span> from the logits </p>\n": "<p><span translate=no>_^_0_^_</span>\u30ed\u30b8\u30c3\u30c8\u304b\u3089\u8a08\u7b97\u3059\u308b\u30bd\u30d5\u30c8\u30de\u30c3\u30af\u30b9</p>\n",
  "<p>Sort probabilities in descending order </p>\n": "<p>\u78ba\u7387\u3092\u964d\u9806\u306b\u4e26\u3079\u66ff\u3048\u308b</p>\n",
  "<ul><li><span translate=no>_^_0_^_</span> is the sum of probabilities of tokens to pick <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> is the sampler to use for the selected tokens</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u30d4\u30c3\u30af\u3059\u308b\u30c8\u30fc\u30af\u30f3\u306e\u78ba\u7387\u306e\u5408\u8a08\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u9078\u629e\u3057\u305f\u30c8\u30fc\u30af\u30f3\u306b\u4f7f\u7528\u3059\u308b\u30b5\u30f3\u30d7\u30e9\u30fc\u3067\u3059</li></ul>\n",
  "A PyTorch implementation of nucleus sampling from language models.": "\u8a00\u8a9e\u30e2\u30c7\u30eb\u304b\u3089\u306e\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306ePyTorch\u5b9f\u88c5\u3002",
  "Nucleus Sampling": "\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0"
}