GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/rl/dqn/model.ja.json
{
"<h1>Deep Q Network (DQN) Model</h1>\n<p><a href=\"https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/rl/dqn/experiment.ipynb\"><span translate=no>_^_0_^_</span></a></p>\n": "<h1>\u30c7\u30a3\u30fc\u30d7Q\u30cd\u30c3\u30c8\u30ef\u30fc\u30af (DQN) \u30e2\u30c7\u30eb</h1>\n<p><a href=\"https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/rl/dqn/experiment.ipynb\"><span translate=no>_^_0_^_</span></a></p>\n",
"<h2>Dueling Network \u2694\ufe0f Model for <span translate=no>_^_0_^_</span> Values</h2>\n<p>We are using a <a href=\"https://arxiv.org/abs/1511.06581\">dueling network</a> to calculate Q-values. Intuition behind dueling network architecture is that in most states the action doesn&#x27;t matter, and in some states the action is significant. Dueling network allows this to be represented very well.</p>\n<span translate=no>_^_1_^_</span><p>So we create two networks for <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> and get <span translate=no>_^_4_^_</span> from them. <span translate=no>_^_5_^_</span> We share the initial layers of the <span translate=no>_^_6_^_</span> and <span translate=no>_^_7_^_</span> networks.</p>\n": "<h2>\u30c7\u30e5\u30a8\u30eb\u30cd\u30c3\u30c8\u30ef\u30fc\u30af \u2694\ufe0f \u4fa1\u5024\u30e2\u30c7\u30eb <span translate=no>_^_0_^_</span></h2>\n<p><a href=\"https://arxiv.org/abs/1511.06581\">Q\u5024\u306e\u8a08\u7b97\u306b\u306f\u30c7\u30e5\u30a8\u30eb\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u3092\u4f7f\u7528\u3057\u3066\u3044\u307e\u3059</a>\u3002\u30c7\u30e5\u30a8\u30eb\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u30a2\u30fc\u30ad\u30c6\u30af\u30c1\u30e3\u306e\u80cc\u5f8c\u306b\u3042\u308b\u76f4\u611f\u306f\u3001\u307b\u3068\u3093\u3069\u306e\u5dde\u3067\u306f\u30a2\u30af\u30b7\u30e7\u30f3\u306f\u91cd\u8981\u3067\u306f\u306a\u304f\u3001\u4e00\u90e8\u306e\u5dde\u3067\u306f\u30a2\u30af\u30b7\u30e7\u30f3\u304c\u91cd\u8981\u3067\u3042\u308b\u3068\u3044\u3046\u3053\u3068\u3067\u3059\u3002\u30c7\u30e5\u30a8\u30eb\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u3067\u306f\u3001\u3053\u308c\u3092\u975e\u5e38\u306b\u3088\u304f\u8868\u73fe\u3067\u304d\u307e\u3059</p>\u3002\n<span translate=no>_^_1_^_</span><p>\u305d\u3053\u3067\u3001<span translate=no>_^_2_^_</span><span translate=no>_^_3_^_</span>\u3068\u304b\u3089\u306e 2 \u3064\u306e\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u3092\u4f5c\u6210\u3057\u3066\u3001\u305d\u306e 2 <span translate=no>_^_4_^_</span> \u3064\u306e\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u304b\u3089\u53d6\u5f97\u3057\u307e\u3059\u3002<span translate=no>_^_5_^_</span><span translate=no>_^_6_^_</span><span translate=no>_^_7_^_</span>\u3068\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u306e\u521d\u671f\u30ec\u30a4\u30e4\u30fc\u3092\u5171\u6709\u3057\u307e\u3059\u3002</p>\n",
"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span></p>\n",
"<p>A fully connected layer takes the flattened frame from third convolution layer, and outputs <span translate=no>_^_0_^_</span> features </p>\n": "<p>\u5b8c\u5168\u306b\u63a5\u7d9a\u3055\u308c\u305f\u30ec\u30a4\u30e4\u30fc\u306f\u30013 \u756a\u76ee\u306e\u30b3\u30f3\u30dc\u30ea\u30e5\u30fc\u30b7\u30e7\u30f3\u30ec\u30a4\u30e4\u30fc\u304b\u3089\u30d5\u30e9\u30c3\u30c8\u5316\u3055\u308c\u305f\u30d5\u30ec\u30fc\u30e0\u3092\u53d6\u308a\u51fa\u3057\u3001\u30d5\u30a3\u30fc\u30c1\u30e3\u3092\u51fa\u529b\u3057\u307e\u3059\u3002<span translate=no>_^_0_^_</span></p>\n",
"<p>Convolution </p>\n": "<p>\u30b3\u30f3\u30dc\u30ea\u30e5\u30fc\u30b7\u30e7\u30f3</p>\n",
"<p>Linear layer </p>\n": "<p>\u30ea\u30cb\u30a2\u30ec\u30a4\u30e4\u30fc</p>\n",
"<p>Reshape for linear layers </p>\n": "<p>\u7dda\u5f62\u30ec\u30a4\u30e4\u30fc\u306e\u5f62\u72b6\u3092\u5909\u66f4</p>\n",
"<p>The first convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p><span translate=no>_^_0_^_</span>\u6700\u521d\u306e\u7573\u307f\u8fbc\u307f\u5c64\u306f\u30d5\u30ec\u30fc\u30e0\u3092\u53d6\u308a\u3001\u30d5\u30ec\u30fc\u30e0\u3092\u751f\u6210\u3057\u307e\u3059\u3002<span translate=no>_^_1_^_</span></p>\n",
"<p>The second convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>2 \u756a\u76ee\u306e\u7573\u307f\u8fbc\u307f\u5c64\u306f\u3001<span translate=no>_^_0_^_</span>\u30d5\u30ec\u30fc\u30e0\u3092\u53d6\u5f97\u3057\u3066\u30d5\u30ec\u30fc\u30e0\u3092\u751f\u6210\u3057\u307e\u3059\u3002<span translate=no>_^_1_^_</span></p>\n",
"<p>The third convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>3 \u756a\u76ee\u306e\u7573\u307f\u8fbc\u307f\u5c64\u306f\u3001<span translate=no>_^_0_^_</span>\u30d5\u30ec\u30fc\u30e0\u3092\u53d6\u5f97\u3057\u3066\u30d5\u30ec\u30fc\u30e0\u3092\u751f\u6210\u3057\u307e\u3059\u3002<span translate=no>_^_1_^_</span></p>\n",
"<p>This head gives the action value <span translate=no>_^_0_^_</span> </p>\n": "<p>\u3053\u306e\u30d8\u30c3\u30c9\u306f\u30a2\u30af\u30b7\u30e7\u30f3\u5024\u3092\u4e0e\u3048\u307e\u3059 <span translate=no>_^_0_^_</span></p>\n",
"<p>This head gives the state value <span translate=no>_^_0_^_</span> </p>\n": "<p>\u3053\u306e\u30d8\u30c3\u30c9\u306f\u72b6\u614b\u5024\u3092\u4e0e\u3048\u307e\u3059 <span translate=no>_^_0_^_</span></p>\n",
"Deep Q Network (DQN) Model": "\u30c7\u30a3\u30fc\u30d7Q\u30cd\u30c3\u30c8\u30ef\u30fc\u30af (DQN) \u30e2\u30c7\u30eb",
"Implementation of neural network model for Deep Q Network (DQN).": "\u30c7\u30a3\u30fc\u30d7Q\u30cd\u30c3\u30c8\u30ef\u30fc\u30af (DQN) \u7528\u306e\u30cb\u30e5\u30fc\u30e9\u30eb\u30cd\u30c3\u30c8\u30ef\u30fc\u30af\u30e2\u30c7\u30eb\u306e\u5b9f\u88c5\u3002"
}