Path: blob/master/translate_cache/optimizers/noam.ja.json
{1"<h1>Noam Optimizer</h1>\n<p>This is the <a href=\"https://pytorch.org\">PyTorch</a> implementation of optimizer introduced in the paper <a href=\"https://arxiv.org/abs/1706.03762\">Attention Is All You Need</a>.</p>\n": "<h1>\u30ce\u30fc\u30e0\u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u30fc</h1>\n<p>\u3053\u308c\u306f\u3001\u8ad6\u6587\u300c<a href=\"https://arxiv.org/abs/1706.03762\">\u5fc5\u8981\u306a\u306e\u306f\u6ce8\u610f\u3060\u3051\u300d<a href=\"https://pytorch.org\">\u3067\u7d39\u4ecb\u3055\u308c\u3066\u3044\u308b\u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u30fc\u306ePyTorch\u5b9f\u88c5\u3067\u3059</a></a>\u3002</p>\n",2"<h2>Noam Optimizer</h2>\n<p>This class extends from Adam optimizer defined in <a href=\"adam.html\"><span translate=no>_^_0_^_</span></a>.</p>\n": "<h2>\u30ce\u30fc\u30e0\u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u30fc</h2>\n<p>\u3053\u306e\u30af\u30e9\u30b9\u306f\u3001\u3067\u5b9a\u7fa9\u3055\u308c\u3066\u3044\u308b Adam \u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u3092\u62e1\u5f35\u3057\u305f\u3082\u306e\u3067\u3059\u3002<a href=\"adam.html\"><span translate=no>_^_0_^_</span></a></p>\n",3"<h3>Get learning-rate</h3>\n<p><span translate=no>_^_0_^_</span> where <span translate=no>_^_1_^_</span> is the number of warmup steps.</p>\n": "<h3>\u5b66\u7fd2\u7387\u3092\u53d6\u5f97</h3>\n<p><span translate=no>_^_0_^_</span><span translate=no>_^_1_^_</span>\u306f\u30a6\u30a9\u30fc\u30e0\u30a2\u30c3\u30d7\u30b9\u30c6\u30c3\u30d7\u306e\u6570\u3067\u3059\u3002</p>\n",4"<h3>Initialize the optimizer</h3>\n<ul><li><span translate=no>_^_0_^_</span> is the list of parameters </li>\n<li><span translate=no>_^_1_^_</span> is the learning rate <span translate=no>_^_2_^_</span> </li>\n<li><span translate=no>_^_3_^_</span> is a tuple of (<span translate=no>_^_4_^_</span>, <span translate=no>_^_5_^_</span>) </li>\n<li><span translate=no>_^_6_^_</span> is <span translate=no>_^_7_^_</span> or <span translate=no>_^_8_^_</span> based on <span translate=no>_^_9_^_</span> </li>\n<li><span translate=no>_^_10_^_</span> is an instance of class <span translate=no>_^_11_^_</span> defined in <a href=\"index.html\"><span translate=no>_^_12_^_</span></a> </li>\n<li>'optimized_update' is a flag whether to optimize the bias correction of the second moment by doing it after adding <span translate=no>_^_13_^_</span> </li>\n<li><span translate=no>_^_14_^_</span> is a flag indicating whether to use AMSGrad or fallback to plain Adam </li>\n<li><span translate=no>_^_15_^_</span> number of warmup steps </li>\n<li><span translate=no>_^_16_^_</span> model size; i.e. number of dimensions in the transformer </li>\n<li><span translate=no>_^_17_^_</span> is a dictionary of default for group values. 
This is useful when you want to extend the class <span translate=no>_^_18_^_</span>.</li></ul>\n": "<h3>\u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u3092\u521d\u671f\u5316</h3>\n<ul><li><span translate=no>_^_0_^_</span>\u306f\u30d1\u30e9\u30e1\u30fc\u30bf\u306e\u30ea\u30b9\u30c8\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u5b66\u7fd2\u7387 <span translate=no>_^_2_^_</span></li>\n<li><span translate=no>_^_3_^_</span>(,) <span translate=no>_^_4_^_</span> \u306e\u30bf\u30d7\u30eb\u3067\u3059 <span translate=no>_^_5_^_</span></li>\n<li><span translate=no>_^_6_^_</span><span translate=no>_^_7_^_</span><span translate=no>_^_8_^_</span>\u307e\u305f\u306f\u305d\u308c\u306b\u57fa\u3065\u3044\u3066\u3044\u308b <span translate=no>_^_9_^_</span></li>\n<li><span translate=no>_^_10_^_</span><span translate=no>_^_11_^_</span>\u3067\u5b9a\u7fa9\u3055\u308c\u3066\u3044\u308b\u30af\u30e9\u30b9\u306e\u30a4\u30f3\u30b9\u30bf\u30f3\u30b9\u3067\u3059 <a href=\"index.html\"><span translate=no>_^_12_^_</span></a></li>\n<li>'optimized_update'\u306f\u8ffd\u52a0\u5f8c\u306b\u884c\u3046\u3053\u3068\u3067\u30bb\u30ab\u30f3\u30c9\u30e2\u30fc\u30e1\u30f3\u30c8\u306e\u30d0\u30a4\u30a2\u30b9\u88dc\u6b63\u3092\u6700\u9069\u5316\u3059\u308b\u304b\u3069\u3046\u304b\u306e\u30d5\u30e9\u30b0\u3067\u3059 <span translate=no>_^_13_^_</span></li>\n<li><span translate=no>_^_14_^_</span>amsGrad\u3092\u4f7f\u7528\u3059\u308b\u304b\u3001\u30d7\u30ec\u30fc\u30f3\u306aAdam\u306b\u30d5\u30a9\u30fc\u30eb\u30d0\u30c3\u30af\u3059\u308b\u304b\u3092\u793a\u3059\u30d5\u30e9\u30b0\u3067\u3059</li>\n<li><span translate=no>_^_15_^_</span>\u30a6\u30a9\u30fc\u30e0\u30a2\u30c3\u30d7\u30b9\u30c6\u30c3\u30d7\u6570</li>\n<li><span translate=no>_^_16_^_</span>\u30e2\u30c7\u30eb\u30b5\u30a4\u30ba\u3001\u3064\u307e\u308a\u5909\u5727\u5668\u306e\u6b21\u5143\u6570</li>\n<li><span translate=no>_^_17_^_</span>\u30b0\u30eb\u30fc\u30d7\u5024\u306e\u30c7\u30d5\u30a9\u30eb\u30c8\u8f9e\u66f8\u3067\u3059\u3002\u3053\u308c\u306f\u3001\u30af\u30e9\u30b9\u3092\u62e1\u5f35\u3059\u308b\u5834\u5408\u306b\u4fbf\u5229\u3067\u3059<span translate=no>_^_18_^_</span>\u3002</li></ul>\n",5"<h3>Plot learning rate for different warmups and model sizes</h3>\n<p><span translate=no>_^_0_^_</span></p>\n": "<h3>\u3055\u307e\u3056\u307e\u306a\u30a6\u30a9\u30fc\u30e0\u30a2\u30c3\u30d7\u3068\u30e2\u30c7\u30eb\u30b5\u30a4\u30ba\u306e\u5b66\u7fd2\u7387\u3092\u30d7\u30ed\u30c3\u30c8</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",6"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span></p>\n",7"Noam optimizer from Attention is All You Need paper": "\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30fb\u30a4\u30ba\u30fb\u30aa\u30fc\u30eb\u30fb\u30e6\u30fc\u30fb\u30cb\u30fc\u30c9\u8ad6\u6587\u306e Noam \u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u30fc",8"This is a tutorial/implementation of Noam optimizer. Noam optimizer has a warm-up period and then an exponentially decaying learning rate.": "\u3053\u308c\u306fNoam\u30aa\u30d7\u30c6\u30a3\u30de\u30a4\u30b6\u30fc\u306e\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb/\u5b9f\u88c5\u3067\u3059\u3002Noam Optimizer\u306b\u306f\u30a6\u30a9\u30fc\u30e0\u30a2\u30c3\u30d7\u671f\u9593\u304c\u3042\u308a\u3001\u305d\u306e\u5f8c\u306f\u5b66\u7fd2\u7387\u304c\u6307\u6570\u95a2\u6570\u7684\u306b\u4f4e\u4e0b\u3057\u307e\u3059\u3002"9}1011
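Aside (not part of the cache file above): the strings in this cache describe the Noam learning-rate schedule from Attention Is All You Need, which warms up linearly and then decays with the inverse square root of the step. A minimal plain-Python sketch of that schedule follows; the function name and default values are illustrative, not taken from this file.

```python
# Illustrative sketch of the Noam schedule documented by the strings above:
# lr = d_model^-0.5 * min(step^-0.5, step * warmup^-1.5)

def noam_lr(step: int, d_model: int = 512, warmup: int = 4000) -> float:
    """Learning rate at a given optimizer step (steps start at 1)."""
    step = max(step, 1)  # guard against division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

if __name__ == "__main__":
    # The rate rises linearly during warmup, then decays as step^-0.5.
    for s in (100, 4000, 20000):
        print(s, noam_lr(s))
```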