GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/optimizers/adam_warmup.si.json
{
"<h1>Adam Optimizer with Warmup</h1>\n<p>This extends <a href=\"amsgrad.html\">AMSGrad optimizer</a> and adds a warmup stage.</p>\n": "<h1>\u0d8b\u0dab\u0dd4\u0dc3\u0dd4\u0db8\u0dca\u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0dc3\u0db8\u0d9f \u0d87\u0da9\u0db8\u0dca \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0d9a\u0dbb\u0dab\u0dba</h1>\n<p>\u0db8\u0dd9\u0dba <a href=\"amsgrad.html\">AMSGrad \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0d9a\u0dbb\u0dab\u0dba</a> \u0db4\u0dd4\u0dc5\u0dd4\u0dbd\u0dca \u0d9a\u0dbb\u0db1 \u0d85\u0dad\u0dbb \u0d8b\u0db1\u0dd4\u0dc3\u0dd4\u0db8\u0dca \u0d85\u0dc0\u0db0\u0dd2\u0dba\u0d9a\u0dca \u0d91\u0d9a\u0dca \u0d9a\u0dbb\u0dba\u0dd2. </p>\n",
"<h2>Adam Optimizer with Warmup</h2>\n<p>This class extends from AMSGrad optimizer defined in <a href=\"amsgrad.html\"><span translate=no>_^_0_^_</span></a>.</p>\n": "<h2>\u0d8b\u0dab\u0dd4\u0dc3\u0dd4\u0db8\u0dca\u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0dc3\u0db8\u0d9f \u0d87\u0da9\u0db8\u0dca \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0d9a\u0dbb\u0dab\u0dba</h2>\n<p>\u0db8\u0dd9\u0db8\u0db4\u0db1\u0dca\u0dad\u0dd2\u0dba AMSGrad \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0d9a\u0dbb\u0dab\u0dba\u0dd9\u0db1\u0dca \u0d85\u0dbb\u0dca\u0dae \u0daf\u0d9a\u0dca\u0dc0\u0dcf \u0d87\u0dad <a href=\"amsgrad.html\"><span translate=no>_^_0_^_</span></a>. </p>\n",
"<h3>Get learning-rate</h3>\n<p><span translate=no>_^_0_^_</span> where <span translate=no>_^_1_^_</span> is the number of warmup steps.</p>\n": "<h3>\u0d89\u0d9c\u0dd9\u0db1\u0dd3\u0db8-\u0d85\u0db1\u0dd4\u0db4\u0dcf\u0dad\u0dba\u0dbd\u0db6\u0dcf \u0d9c\u0db1\u0dca\u0db1</h3>\n<p><span translate=no>_^_0_^_</span> \u0d8b\u0dab\u0dd4\u0dc3\u0dd4\u0db8\u0dca \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0dda \u0db4\u0dd2\u0dba\u0dc0\u0dbb \u0d9c\u0dab\u0db1 <span translate=no>_^_1_^_</span> \u0d9a\u0ddc\u0dc4\u0dda\u0daf? </p>\n",
"<h3>Initialize the optimizer</h3>\n<ul><li><span translate=no>_^_0_^_</span> is the list of parameters </li>\n<li><span translate=no>_^_1_^_</span> is the learning rate <span translate=no>_^_2_^_</span> </li>\n<li><span translate=no>_^_3_^_</span> is a tuple of (<span translate=no>_^_4_^_</span>, <span translate=no>_^_5_^_</span>) </li>\n<li><span translate=no>_^_6_^_</span> is <span translate=no>_^_7_^_</span> or <span translate=no>_^_8_^_</span> based on <span translate=no>_^_9_^_</span> </li>\n<li><span translate=no>_^_10_^_</span> is an instance of class <span translate=no>_^_11_^_</span> defined in <a href=\"index.html\"><span translate=no>_^_12_^_</span></a> </li>\n<li>&#x27;optimized_update&#x27; is a flag whether to optimize the bias correction of the second moment by doing it after adding <span translate=no>_^_13_^_</span> </li>\n<li><span translate=no>_^_14_^_</span> is a flag indicating whether to use AMSGrad or fallback to plain Adam </li>\n<li><span translate=no>_^_15_^_</span> number of warmup steps </li>\n<li><span translate=no>_^_16_^_</span> is a dictionary of default for group values. 
This is useful when you want to extend the class <span translate=no>_^_17_^_</span>.</li></ul>\n": "<h3>\u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0d9a\u0dbb\u0dab\u0dba\u0d86\u0dbb\u0db8\u0dca\u0db7 \u0d9a\u0dbb\u0db1\u0dca\u0db1</h3>\n<ul><li><span translate=no>_^_0_^_</span> \u0dba\u0db1\u0dd4 \u0db4\u0dbb\u0dcf\u0db8\u0dd2\u0dad\u0dd3\u0db1\u0dca \u0dbd\u0dd0\u0dba\u0dd2\u0dc3\u0dca\u0dad\u0dd4\u0dc0\u0dba\u0dd2 </li>\n<li><span translate=no>_^_1_^_</span> \u0dba\u0db1\u0dd4 \u0d89\u0d9c\u0dd9\u0db1\u0dd4\u0db8\u0dca \u0d85\u0db1\u0dd4\u0db4\u0dcf\u0dad\u0dba\u0dba\u0dd2 <span translate=no>_^_2_^_</span> </li>\n<li><span translate=no>_^_3_^_</span> (<span translate=no>_^_4_^_</span>, <span translate=no>_^_5_^_</span>) \u0d9a tuple \u0dc0\u0dda </li>\n<li><span translate=no>_^_6_^_</span> <span translate=no>_^_7_^_</span> \u0dc4\u0ddd \u0db8\u0dad <span translate=no>_^_8_^_</span> \u0db4\u0daf\u0db1\u0db8\u0dca \u0dc0\u0dda <span translate=no>_^_9_^_</span> </li>\n<li><span translate=no>_^_10_^_</span> <span translate=no>_^_11_^_</span> \u0d85\u0dbb\u0dca\u0dae \u0daf\u0d9a\u0dca\u0dc0\u0dcf \u0d87\u0dad\u0dd2 \u0db4\u0db1\u0dca\u0dad\u0dd2\u0dba\u0dda \u0d85\u0dc0\u0dc3\u0dca\u0dae\u0dcf\u0dc0\u0d9a\u0dd2 <a href=\"index.html\"><span translate=no>_^_12_^_</span></a> </li>\n<li>'optimized_update'\u0dba\u0db1\u0dd4 \u0d91\u0d9a\u0dad\u0dd4 \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0dd9\u0db1\u0dca \u0db4\u0dc3\u0dd4 \u0d91\u0dba \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0dd9\u0db1\u0dca \u0daf\u0dd9\u0dc0\u0db1 \u0db8\u0ddc\u0dc4\u0ddc\u0dad\u0dda \u0db4\u0d9a\u0dca\u0dc2\u0d9c\u0dca\u0dbb\u0dcf\u0dc4\u0dd3\u0dc0 \u0db1\u0dd2\u0dc0\u0dd0\u0dbb\u0daf\u0dd2 \u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad \u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0dc3\u0db3\u0dc4\u0dcf \u0daf \u0dba\u0db1\u0dca\u0db1 \u0db0\u0da2\u0dba\u0d9a\u0dd2 <span translate=no>_^_13_^_</span> </li>\n<li><span translate=no>_^_14_^_</span> \u0d86\u0daf\u0db8\u0dca \u0dc3\u0dbb\u0dbd 
\u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0dc3\u0db3\u0dc4\u0dcf AMSGrad \u0dc4\u0ddd \u0dc0\u0dd0\u0da7\u0dd3\u0db8 \u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf \u0d9a\u0dc5 \u0dba\u0dd4\u0dad\u0dd4\u0daf \u0dba\u0db1\u0dca\u0db1 \u0daf\u0dd0\u0d9a\u0dca\u0dc0\u0dd9\u0db1 \u0db0\u0da2\u0dba\u0d9a\u0dd2 </li>\n<li><span translate=no>_^_15_^_</span> \u0d8b\u0db1\u0dd4\u0dc3\u0dd4\u0db8\u0dca \u0db4\u0dd2\u0dba\u0dc0\u0dbb \u0d9c\u0dab\u0db1 </li>\n<li><span translate=no>_^_16_^_</span> \u0d9a\u0dab\u0dca\u0da9\u0dcf\u0dba\u0db8\u0dca \u0d85\u0d9c\u0dba\u0db1\u0dca \u0dc3\u0db3\u0dc4\u0dcf \u0db4\u0dd9\u0dbb\u0db1\u0dd2\u0db8\u0dd2 \u0dc1\u0db6\u0dca\u0daf \u0d9a\u0ddd\u0dc2\u0dba\u0d9a\u0dd2. \u0d94\u0db6\u0da7 \u0db4\u0db1\u0dca\u0dad\u0dd2\u0dba \u0daf\u0dd3\u0dbb\u0dca extend \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0da7 \u0d85\u0dc0\u0dc1\u0dca\u0dba \u0dc0\u0dd2\u0da7 \u0db8\u0dd9\u0dba \u0db4\u0dca\u0dbb\u0dba\u0ddd\u0da2\u0db1\u0dc0\u0dad\u0dca <span translate=no>_^_17_^_</span>\u0dc0\u0dda. </li></ul>\n",
"<p>A linearly increasing learning rate from <span translate=no>_^_0_^_</span> to <span translate=no>_^_1_^_</span> </p>\n": "<p>\u0dc3\u0dd2\u0da7 <span translate=no>_^_0_^_</span> \u0dbb\u0dda\u0d9b\u0dd3\u0dba\u0dc0 \u0dc0\u0dd0\u0da9\u0dd2\u0dc0\u0db1 \u0d89\u0d9c\u0dd9\u0db1\u0dd4\u0db8\u0dca \u0d85\u0db1\u0dd4\u0db4\u0dcf\u0dad\u0dba <span translate=no>_^_1_^_</span> </p>\n",
"<p>Constant learning rate <span translate=no>_^_0_^_</span> </p>\n": "<p>\u0db1\u0dd2\u0dbb\u0db1\u0dca\u0dad\u0dbb\u0d89\u0d9c\u0dd9\u0db1\u0dd4\u0db8\u0dca \u0d85\u0db1\u0dd4\u0db4\u0dcf\u0dad\u0dba <span translate=no>_^_0_^_</span> </p>\n",
"<p>If we are in warmup stage </p>\n": "<p>\u0d85\u0db4\u0dd2\u0d8b\u0db1\u0dd4\u0dc3\u0dd4\u0db8\u0dca \u0d85\u0dc0\u0db0\u0dd2\u0dba\u0d9a \u0dc3\u0dd2\u0da7\u0dd3 \u0db1\u0db8\u0dca </p>\n",
"A simple PyTorch implementation/tutorial of Adam optimizer with warm-up.": "\u0d8b\u0dab\u0dd4\u0dc3\u0dd4\u0db8\u0dca \u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0dc3\u0db8\u0d9f \u0d86\u0daf\u0db8\u0dca \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0dd2\u0d9a\u0dbb\u0dab\u0dba\u0dda \u0dc3\u0dbb\u0dbd \u0db4\u0dba\u0dd2\u0da7\u0ddd\u0da0\u0dca \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0dad\u0dca\u0db8\u0d9a \u0d9a\u0dd2\u0dbb\u0dd3\u0db8/\u0db1\u0dd2\u0db6\u0db1\u0dca\u0db0\u0db1\u0dba.",
"Adam optimizer with warm-up": "\u0d8b\u0dab\u0dd4\u0dc3\u0dd4\u0db8 \u0dc3\u0dc4\u0dd2\u0dad \u0d86\u0daf\u0db8\u0dca \u0db4\u0dca\u0dbb\u0dc1\u0dc3\u0dca\u0dad\u0d9a\u0dbb\u0dab\u0dba"
}