GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/sampling/nucleus.zh.json
{
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://arxiv.org/abs/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here&#x27;s an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u6838\u91c7\u6837</h1>\n<p>\u8fd9\u662f\u6838\u91c7\u6837\u7684\u4e00\u79cd\u5b9e\u73b0\uff0c\u5728\u8bba\u6587<a href=\"https://arxiv.org/abs/1904.09751\">\u300a\u795e\u7ecf\u6587\u672c\u53d8\u6027\u7684\u597d\u5947\u6848\u4f8b\u300b</a>\u4e2d\u8fdb\u884c\u4e86\u4ecb\u7ecd\u3002</p>\n<p>\u8be5\u8bba\u6587\u8ba8\u8bba\u4e86\u5176\u4ed6\u91c7\u6837\u65b9\u6cd5\uff08\u4f8b\u5982\u96c6\u675f\u641c\u7d22\u3001<a href=\"temperature.html\">\u7eaf\u91c7\u6837</a>\u3001<a href=\"temperature.html\">\u6e29\u5ea6\u91c7\u6837</a>\u548c <a href=\"top_k.html\">Top-k \u91c7\u6837</a>\uff09\u5b58\u5728\u7684\u95ee\u9898\u3002\u8be5\u8bba\u6587\u4ecb\u7ecd\u4e86\u6838\u91c7\u6837\u7684\u6982\u5ff5\uff0c\u5728\u6587\u672c\u751f\u6210\u65b9\u9762\uff0c\u6838\u91c7\u6837\u7684\u6548\u679c\u5b9e\u9645\u4e0a\u6bd4\u5176\u4ed6\u91c7\u6837\u65b9\u6cd5\u8981\u597d\u3002</p>\n<p>\u6838\u91c7\u6837\u9996\u5148\u9009\u62e9\u8bcd\u6c47\u7684\u4e00\u4e2a\u5b50\u96c6<span translate=no>_^_0_^_</span>\uff0c\u5176\u4e2d<span translate=no>_^_1_^_</span>\u662f\u6700\u5c0f\u7684\u4ee4\u724c\u96c6\u5408</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u4e5f\u5c31\u662f\u8bf4\uff0c\u6211\u4eec\u9009\u62e9\u53ef\u80fd\u6027\u6700\u9ad8\u7684\u4ee4\u724c\uff0c\u76f4\u5230\u5b83\u4eec\u7684\u6982\u7387\u603b\u548c\u5c0f\u4e8e<span translate=no>_^_3_^_</span>\u4e3a\u6b62\u3002</p>\n<p>\u7136\u540e\u6211\u4eec\u4ece\u9009\u5b9a\u7684\u4ee4\u724c\u4e2d\u62bd\u6837\u3002</p>\n<p>\u8fd9\u662f\u4e00\u4e2a\u4f7f\u7528\u8fd9\u4e9b\u91c7\u6837\u6280\u672f\u7684<a href=\"experiment.html\">\u5b9e\u9a8c</a>\u3002</p>\n",
"<h2>Nucleus Sampler</h2>\n": "<h2>Nucleus \u91c7\u6837\u5668</h2>\n",
"<p> </p>\n": "<p></p>\n",
"<p> Sample from logits with Nucleus Sampling</p>\n": "<p>\u4f7f\u7528 Nucleus \u91c7\u6837\u4ece logits \u4e2d\u63d0\u53d6\u6837\u672c</p>\n",
"<p>Find the cumulative sums less than <span translate=no>_^_0_^_</span>. </p>\n": "<p>\u627e\u51fa\u5c0f\u4e8e<span translate=no>_^_0_^_</span>\u7684\u7d2f\u8ba1\u603b\u548c\u3002</p>\n",
"<p>Get log probabilities and mask out the non-nucleus </p>\n": "<p>\u83b7\u53d6\u5bf9\u6570\u6982\u7387\u5e76\u63a9\u76d6\u975e\u6838</p>\n",
"<p>Get probabilities <span translate=no>_^_0_^_</span> </p>\n": "<p>\u83b7\u53d6\u6982\u7387<span translate=no>_^_0_^_</span></p>\n",
"<p>Get the actual indexes </p>\n": "<p>\u83b7\u53d6\u5b9e\u9645\u7d22\u5f15</p>\n",
"<p>Get the cumulative sum of probabilities in the sorted order </p>\n": "<p>\u6309\u6392\u5e8f\u987a\u5e8f\u83b7\u53d6\u6982\u7387\u7684\u7d2f\u79ef\u603b\u548c</p>\n",
"<p>Prepend ones so that we add one token after the minimum number of tokens with cumulative probability less that <span translate=no>_^_0_^_</span>. </p>\n": "<p>\u5728\u524d\u9762\u52a0\u4e0a 1\uff0c\u8fd9\u6837\u6211\u4eec\u5c31\u53ef\u4ee5\u5728\u7d2f\u79ef\u6982\u7387\u5c0f\u4e8e<span translate=no>_^_0_^_</span>\u7684\u6700\u5c0f\u4ee4\u724c\u6570\u91cf\u4e4b\u540e\u6dfb\u52a0\u4e00\u4e2a\u4ee4\u724c\u3002</p>\n",
"<p>Sample from the sampler </p>\n": "<p>\u4ece\u91c7\u6837\u5668\u4e2d\u91c7\u6837</p>\n",
"<p>Softmax to compute <span translate=no>_^_0_^_</span> from the logits </p>\n": "<p>\u4f7f\u7528 softmax \u6839\u636e logits \u8ba1\u7b97<span translate=no>_^_0_^_</span></p>\n",
"<p>Sort probabilities in descending order </p>\n": "<p>\u6309\u964d\u5e8f\u5bf9\u6982\u7387\u8fdb\u884c\u6392\u5e8f</p>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the sum of probabilities of tokens to pick <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> is the sampler to use for the selected tokens</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u662f\u8981\u9009\u62e9\u7684\u4ee4\u724c\u6982\u7387\u4e4b\u548c<span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u662f\u7528\u4e8e\u9009\u5b9a\u4ee4\u724c\u7684\u91c7\u6837\u5668</li></ul>\n",
"A PyTorch implementation of nucleus sampling from language models.": "\u4ece\u8bed\u8a00\u6a21\u578b\u8fdb\u884c\u6838\u91c7\u6837\u7684 PyTorch \u5b9e\u73b0\u3002",
"Nucleus Sampling": "\u6838\u91c7\u6837"
}