Path: blob/master/translate_cache/optimizers/configs.zh.json
{
    "<h1>Configurable Optimizer</h1>\n": "<h1>\u53ef\u914d\u7f6e\u7684\u4f18\u5316\u5668</h1>\n",
    "<p> <a id=\"OptimizerConfigs\"></a></p>\n<h2>Optimizer Configurations</h2>\n": "<p><a id=\"OptimizerConfigs\"></a></p>\n<h2>\u4f18\u5316\u5668\u914d\u7f6e</h2>\n",
    "<p>Beta values <span translate=no>_^_0_^_</span> for Adam </p>\n": "<p>Adam<span translate=no>_^_0_^_</span> \u7684 Beta \u503c</p>\n",
    "<p>Epsilon <span translate=no>_^_0_^_</span> for adam </p>\n": "<p>Adam \u7684 Epsilon<span translate=no>_^_0_^_</span></p>\n",
    "<p>Learning rate <span translate=no>_^_0_^_</span> </p>\n": "<p>\u5b66\u4e60\u7387<span translate=no>_^_0_^_</span></p>\n",
    "<p>Model embedding size for Noam optimizer </p>\n": "<p>Noam \u4f18\u5316\u5668\u7684\u6a21\u578b\u5d4c\u5165\u5927\u5c0f</p>\n",
    "<p>Momentum for SGD </p>\n": "<p>SGD \u7684\u52a8\u91cf</p>\n",
    "<p>Number of warmup optimizer steps </p>\n": "<p>\u9884\u70ed\u4f18\u5316\u5668\u6b65\u9aa4\u6570</p>\n",
    "<p>Optimizer </p>\n": "<p>\u4f18\u5316\u5668</p>\n",
    "<p>Parameters to be optimized </p>\n": "<p>\u8981\u4f18\u5316\u7684\u53c2\u6570</p>\n",
    "<p>Total number of optimizer steps (for cosine decay) </p>\n": "<p>\u4f18\u5316\u5668\u6b65\u9aa4\u603b\u6570\uff08\u4f59\u5f26\u8870\u51cf\uff09</p>\n",
    "<p>Weight decay </p>\n": "<p>\u6743\u91cd\u8870\u51cf</p>\n",
    "<p>Whether the adam update is optimized (different epsilon) </p>\n": "<p>adam \u66f4\u65b0\u662f\u5426\u7ecf\u8fc7\u4f18\u5316\uff08\u4e0d\u540c\u7684 epsilon\uff09</p>\n",
    "<p>Whether to degenerate to SGD in AdaBelief </p>\n": "<p>\u662f\u5426\u5728 AdaBelief \u4e2d\u9000\u5316\u4e3a SGD</p>\n",
    "<p>Whether to use AMSGrad </p>\n": "<p>\u662f\u5426\u4f7f\u7528 AMSGrad</p>\n",
    "<p>Whether to use Rectified Adam in AdaBelief </p>\n": "<p>\u662f\u5426\u5728 AdaBelief \u4e2d\u4f7f\u7528 Rectified Adam</p>\n",
    "<p>Whether weight decay is absolute or should be multiplied by learning rate </p>\n": "<p>\u6743\u91cd\u8870\u51cf\u662f\u7edd\u5bf9\u7684\u8fd8\u662f\u5e94\u8be5\u4e58\u4ee5\u5b66\u4e60\u7387</p>\n",
    "<p>Whether weight decay is decoupled; i.e. weight decay is not added to gradients </p>\n": "<p>\u6743\u91cd\u8870\u51cf\u662f\u5426\u89e3\u8026\uff1b\u5373\u6743\u91cd\u8870\u51cf\u4e0d\u6dfb\u52a0\u5230\u68af\u5ea6\u4e2d</p>\n",
    "Configurable optimizer module": "\u53ef\u914d\u7f6e\u7684\u4f18\u5316\u5668\u6a21\u5757",
    "This implements a configurable module for optimizers.": "\u8fd9\u4e3a\u4f18\u5316\u5668\u5b9e\u73b0\u4e86\u4e00\u4e2a\u53ef\u914d\u7f6e\u7684\u6a21\u5757\u3002"
}