GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/rl/dqn/model.zh.json
{
"<h1>Deep Q Network (DQN) Model</h1>\n<p><a href=\"https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/rl/dqn/experiment.ipynb\"><span translate=no>_^_0_^_</span></a></p>\n": "<h1>\u6df1\u5ea6 Q \u7f51\u7edc (DQN) \u6a21\u578b</h1>\n<p><a href=\"https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/rl/dqn/experiment.ipynb\"><span translate=no>_^_0_^_</span></a></p>\n",
"<h2>Dueling Network \u2694\ufe0f Model for <span translate=no>_^_0_^_</span> Values</h2>\n<p>We are using a <a href=\"https://arxiv.org/abs/1511.06581\">dueling network</a> to calculate Q-values. Intuition behind dueling network architecture is that in most states the action doesn&#x27;t matter, and in some states the action is significant. Dueling network allows this to be represented very well.</p>\n<span translate=no>_^_1_^_</span><p>So we create two networks for <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> and get <span translate=no>_^_4_^_</span> from them. <span translate=no>_^_5_^_</span> We share the initial layers of the <span translate=no>_^_6_^_</span> and <span translate=no>_^_7_^_</span> networks.</p>\n": "<h2>\u51b3\u6597\u7f51\u7edc \u2694\ufe0f<span translate=no>_^_0_^_</span> \u4ef7\u503c\u89c2\u6a21\u578b</h2>\n<p>\u6211\u4eec\u6b63\u5728\u4f7f\u7528\u51b3<a href=\"https://arxiv.org/abs/1511.06581\">\u6597\u7f51\u7edc</a>\u6765\u8ba1\u7b97 Q \u503c\u3002\u51b3\u6597\u7f51\u7edc\u67b6\u6784\u80cc\u540e\u7684\u76f4\u89c9\u662f\uff0c\u5728\u5927\u591a\u6570\u5dde\uff0c\u884c\u52a8\u65e0\u5173\u7d27\u8981\uff0c\u800c\u5728\u67d0\u4e9b\u5dde\uff0c\u884c\u52a8\u610f\u4e49\u91cd\u5927\u3002\u51b3\u6597\u7f51\u7edc\u53ef\u4ee5\u5f88\u597d\u5730\u4f53\u73b0\u8fd9\u4e00\u70b9\u3002</p>\n<span translate=no>_^_1_^_</span><p>\u56e0\u6b64\uff0c\u6211\u4eec\u4e3a<span translate=no>_^_2_^_</span>\u548c\u521b\u5efa\u4e86\u4e24\u4e2a\u7f51\u7edc\uff0c<span translate=no>_^_3_^_</span>\u7136\u540e<span translate=no>_^_4_^_</span>\u4ece\u4e2d\u83b7\u53d6\u3002<span translate=no>_^_5_^_</span>\u6211\u4eec\u5171\u4eab<span translate=no>_^_6_^_</span>\u548c<span translate=no>_^_7_^_</span>\u7f51\u7edc\u7684\u521d\u59cb\u5c42\u3002</p>\n",
"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span></p>\n",
"<p>A fully connected layer takes the flattened frame from third convolution layer, and outputs <span translate=no>_^_0_^_</span> features </p>\n": "<p>\u5b8c\u5168\u8fde\u63a5\u7684\u56fe\u5c42\u4ece\u7b2c\u4e09\u4e2a\u5377\u79ef\u56fe\u5c42\u83b7\u53d6\u5c55\u5e73\u7684\u5e27\uff0c\u5e76\u8f93\u51fa<span translate=no>_^_0_^_</span>\u8981\u7d20</p>\n",
"<p>Convolution </p>\n": "<p>\u5377\u79ef</p>\n",
"<p>Linear layer </p>\n": "<p>\u7ebf\u6027\u5c42</p>\n",
"<p>Reshape for linear layers </p>\n": "<p>\u7ebf\u6027\u56fe\u5c42\u7684\u6574\u5f62</p>\n",
"<p>The first convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>\u7b2c\u4e00\u4e2a\u5377\u79ef\u5c42\u9700\u8981\u4e00\u4e2a<span translate=no>_^_0_^_</span>\u5e27\u5e76\u751f\u6210\u4e00\u4e2a<span translate=no>_^_1_^_</span>\u5e27</p>\n",
"<p>The second convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>\u7b2c\u4e8c\u4e2a\u5377\u79ef\u5c42\u83b7\u53d6\u4e00\u4e2a<span translate=no>_^_0_^_</span>\u5e27\u5e76\u751f\u6210\u4e00\u4e2a<span translate=no>_^_1_^_</span>\u5e27</p>\n",
"<p>The third convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>\u7b2c\u4e09\u4e2a\u5377\u79ef\u5c42\u83b7\u53d6\u4e00\u4e2a<span translate=no>_^_0_^_</span>\u5e27\u5e76\u751f\u6210\u4e00\u4e2a<span translate=no>_^_1_^_</span>\u5e27</p>\n",
"<p>This head gives the action value <span translate=no>_^_0_^_</span> </p>\n": "<p>\u8fd9\u4e2a\u5934\u7ed9\u51fa\u4e86\u52a8\u4f5c\u503c<span translate=no>_^_0_^_</span></p>\n",
"<p>This head gives the state value <span translate=no>_^_0_^_</span> </p>\n": "<p>\u8fd9\u4e2a\u5934\u7ed9\u51fa\u4e86\u72b6\u6001\u503c<span translate=no>_^_0_^_</span></p>\n",
"Deep Q Network (DQN) Model": "\u6df1\u5ea6 Q \u7f51\u7edc (DQN) \u6a21\u578b",
"Implementation of neural network model for Deep Q Network (DQN).": "\u6df1\u5ea6 Q \u7f51\u7edc (DQN) \u795e\u7ecf\u7f51\u7edc\u6a21\u578b\u7684\u5b9e\u73b0\u3002"
}