GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/rl/dqn/model.si.json
{
"<h1>Deep Q Network (DQN) Model</h1>\n<p><a href=\"https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/rl/dqn/experiment.ipynb\"><span translate=no>_^_0_^_</span></a> <a href=\"https://app.labml.ai/run/fe1ad986237511ec86e8b763a2d3f710\"><span translate=no>_^_1_^_</span></a></p>\n": "<h1>\u0d9c\u0dd0\u0db9\u0dd4\u0dbb\u0dd4Q \u0da2\u0dcf\u0dbd (DQN) \u0d86\u0d9a\u0dd8\u0dad\u0dd2\u0dba</h1>\n<p><a href=\"https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/rl/dqn/experiment.ipynb\"><span translate=no>_^_0_^_</span></a> <a href=\"https://app.labml.ai/run/fe1ad986237511ec86e8b763a2d3f710\"> <span translate=no>_^_1_^_</span></a></p>\n",
"<h2>Dueling Network \u2694\ufe0f Model for <span translate=no>_^_0_^_</span> Values</h2>\n<p>We are using a <a href=\"https://arxiv.org/abs/1511.06581\">dueling network</a> to calculate Q-values. Intuition behind dueling network architecture is that in most states the action doesn&#x27;t matter, and in some states the action is significant. Dueling network allows this to be represented very well.</p>\n<span translate=no>_^_1_^_</span><p>So we create two networks for <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> and get <span translate=no>_^_4_^_</span> from them. <span translate=no>_^_5_^_</span> We share the initial layers of the <span translate=no>_^_6_^_</span> and <span translate=no>_^_7_^_</span> networks.</p>\n": "<h2>\u0da2\u0dcf\u0dbd\u0dbaDueling \u2694\ufe0f <span translate=no>_^_0_^_</span> \u0dc0\u0da7\u0dd2\u0db1\u0dcf\u0d9a\u0db8\u0dca \u0dc3\u0db3\u0dc4\u0dcf \u0d86\u0d9a\u0dd8\u0dad\u0dd2\u0dba</h2>\n<p>Q-\u0d85\u0d9c\u0dba\u0db1\u0dca\u0d9c\u0dab\u0db1\u0dba \u0d9a\u0dd2\u0dbb\u0dd3\u0db8 \u0dc3\u0db3\u0dc4\u0dcf \u0d85\u0db4\u0dd2 <a href=\"https://arxiv.org/abs/1511.06581\">\u0da9\u0dd6\u0dbd\u0dd2\u0d82 \u0da2\u0dcf\u0dbd\u0dba\u0d9a\u0dca</a> \u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf \u0d9a\u0dbb\u0db8\u0dd4. \u0da2\u0dcf\u0dbd \u0d9c\u0dd8\u0dc4 \u0db1\u0dd2\u0dbb\u0dca\u0db8\u0dcf\u0dab \u0dc1\u0dd2\u0dbd\u0dca\u0db4\u0dba dueling \u0db4\u0dd2\u0da7\u0dd4\u0db4\u0dc3 \u0d87\u0dad\u0dd2 \u0db4\u0dca\u0dbb\u0dad\u0dd2\u0db7\u0dcf\u0db1\u0dba \u0db1\u0db8\u0dca, \u0db6\u0ddc\u0dc4\u0ddd \u0db4\u0dca\u0dbb\u0dcf\u0db1\u0dca\u0dad\u0dc0\u0dbd \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0dc0 \u0dc0\u0dd0\u0daf\u0d9c\u0dad\u0dca \u0db1\u0ddc\u0dc0\u0db1 \u0d85\u0dad\u0dbb \u0dc3\u0db8\u0dc4\u0dbb \u0db4\u0dca\u0dbb\u0dcf\u0db1\u0dca\u0dad\u0dc0\u0dbd \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0dc0 \u0dc3\u0dd0\u0dbd\u0d9a\u0dd2\u0dba \u0dba\u0dd4\u0dad\u0dd4 \u0dba. 
Dueling \u0da2\u0dcf\u0dbd\u0dba \u0db8\u0dd9\u0dba \u0d89\u0dad\u0dcf \u0dc4\u0ddc\u0db3\u0dd2\u0db1\u0dca \u0db1\u0dd2\u0dbb\u0dd6\u0db4\u0dab\u0dba \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0da7 \u0d89\u0da9 \u0daf\u0dd9\u0dba\u0dd2. </p>\n<span translate=no>_^_1_^_</span><p>\u0d91\u0db6\u0dd0\u0dc0\u0dd2\u0db1\u0dca\u0d85\u0db4\u0dd2 \u0da2\u0dcf\u0dbd \u0daf\u0dd9\u0d9a\u0d9a\u0dca \u0db1\u0dd2\u0dbb\u0dca\u0db8\u0dcf\u0dab\u0dba <span translate=no>_^_2_^_</span> <span translate=no>_^_3_^_</span> \u0d9a\u0dbb <span translate=no>_^_4_^_</span> \u0d94\u0dc0\u0dd4\u0db1\u0dca\u0d9c\u0dd9\u0db1\u0dca \u0dbd\u0db6\u0dcf \u0d9c\u0db1\u0dd2\u0db8\u0dd4. <span translate=no>_^_5_^_</span> \u0d85\u0db4\u0dd2 <span translate=no>_^_6_^_</span> \u0dc3\u0dc4 <span translate=no>_^_7_^_</span> \u0da2\u0dcf\u0dbd \u0dc0\u0dbd \u0d86\u0dbb\u0db8\u0dca\u0db7\u0d9a \u0dc3\u0dca\u0dae\u0dbb \u0db6\u0dd9\u0daf\u0dcf \u0d9c\u0db1\u0dd2\u0db8\u0dd4. </p>\n",
"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span> </p>\n",
"<p>A fully connected layer takes the flattened frame from third convolution layer, and outputs <span translate=no>_^_0_^_</span> features </p>\n": "<p>\u0dc3\u0db8\u0dca\u0db4\u0dd4\u0dbb\u0dca\u0dab\u0dba\u0dd9\u0db1\u0dca\u0db8\u0dc3\u0db8\u0dca\u0db6\u0db1\u0dca\u0db0\u0dd2\u0dad \u0dad\u0da7\u0dca\u0da7\u0dd4\u0dc0\u0d9a\u0dca \u0db4\u0dd0\u0dad\u0dbd\u0dd2 \u0dbb\u0dcf\u0db8\u0dd4\u0dc0 \u0dad\u0dd9\u0dc0\u0db1 \u0d9a\u0dd0\u0da7\u0dd2 \u0d9c\u0dd0\u0dc3\u0dd4\u0dab\u0dd4 \u0dc3\u0dca\u0dae\u0dbb\u0dba\u0dd9\u0db1\u0dca \u0d9c\u0db1\u0dca\u0db1\u0dcf \u0d85\u0dad\u0dbb <span translate=no>_^_0_^_</span> \u0dc0\u0dd2\u0dc1\u0dda\u0dc2\u0dcf\u0d82\u0d9c \u0db4\u0dca\u0dbb\u0dad\u0dd2\u0daf\u0dcf\u0db1\u0dba \u0d9a\u0dbb\u0dba\u0dd2 </p>\n",
"<p>Convolution </p>\n": "<p>\u0dc3\u0d82\u0dc0\u0dbd\u0dd2\u0dad </p>\n",
"<p>Linear layer </p>\n": "<p>\u0dbb\u0dda\u0d9b\u0dd3\u0dba\u0dc3\u0dca\u0dae\u0dbb\u0dba </p>\n",
"<p>Reshape for linear layers </p>\n": "<p>\u0dbb\u0dda\u0d9b\u0dd3\u0dba\u0dc3\u0dca\u0dae\u0dbb \u0dc3\u0db3\u0dc4\u0dcf \u0db1\u0dd0\u0dc0\u0dad \u0dc3\u0d9a\u0dc3\u0dca \u0d9a\u0dbb\u0db1\u0dca\u0db1 </p>\n",
"<p>The first convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>\u0db4\u0dc5\u0db8\u0dd4\u0d9a\u0dd0\u0da7\u0dd2 \u0d9c\u0dd0\u0dc3\u0dd4\u0dab\u0dd4 \u0dc3\u0dca\u0dad\u0dbb\u0dba <span translate=no>_^_0_^_</span> \u0dbb\u0dcf\u0db8\u0dd4\u0dc0\u0d9a\u0dca \u0d9c\u0dd9\u0db1 <span translate=no>_^_1_^_</span> \u0dbb\u0dcf\u0db8\u0dd4\u0dc0\u0d9a\u0dca \u0db1\u0dd2\u0db4\u0daf\u0dc0\u0dba\u0dd2 </p>\n",
"<p>The second convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>\u0daf\u0dd9\u0dc0\u0db1\u0d9a\u0dd0\u0da7\u0dd2 \u0d9c\u0dd0\u0dc3\u0dd4\u0dab\u0dd4 \u0dc3\u0dca\u0dad\u0dbb\u0dba <span translate=no>_^_0_^_</span> \u0dbb\u0dcf\u0db8\u0dd4\u0dc0\u0d9a\u0dca \u0d9c\u0dd9\u0db1 <span translate=no>_^_1_^_</span> \u0dbb\u0dcf\u0db8\u0dd4\u0dc0\u0d9a\u0dca \u0db1\u0dd2\u0db4\u0daf\u0dc0\u0dba\u0dd2 </p>\n",
"<p>The third convolution layer takes a <span translate=no>_^_0_^_</span> frame and produces a <span translate=no>_^_1_^_</span> frame </p>\n": "<p>\u0dad\u0dd9\u0dc0\u0db1\u0d9a\u0dd0\u0da7\u0dd2 \u0d9c\u0dd0\u0dc3\u0dd4\u0dab\u0dd4 \u0dc3\u0dca\u0dad\u0dbb\u0dba <span translate=no>_^_0_^_</span> \u0dbb\u0dcf\u0db8\u0dd4\u0dc0\u0d9a\u0dca \u0d9c\u0dd9\u0db1 <span translate=no>_^_1_^_</span> \u0dbb\u0dcf\u0db8\u0dd4\u0dc0\u0d9a\u0dca \u0db1\u0dd2\u0db4\u0daf\u0dc0\u0dba\u0dd2 </p>\n",
"<p>This head gives the action value <span translate=no>_^_0_^_</span> </p>\n": "<p>\u0db8\u0dd9\u0db8\u0dc4\u0dd2\u0dc3 \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0d9a\u0dcf\u0dbb\u0dd3 \u0d85\u0d9c\u0dba \u0dbd\u0db6\u0dcf \u0daf\u0dd9\u0dba\u0dd2 <span translate=no>_^_0_^_</span> </p>\n",
"<p>This head gives the state value <span translate=no>_^_0_^_</span> </p>\n": "<p>\u0db8\u0dd9\u0db8\u0dc4\u0dd2\u0dc3 \u0dbb\u0dcf\u0da2\u0dca\u0dba \u0dc0\u0da7\u0dd2\u0db1\u0dcf\u0d9a\u0db8 \u0dbd\u0db6\u0dcf \u0daf\u0dd9\u0dba\u0dd2 <span translate=no>_^_0_^_</span> </p>\n",
"Deep Q Network (DQN) Model": "\u0d9c\u0dd0\u0db9\u0dd4\u0dbb\u0dd4 Q \u0da2\u0dcf\u0dbd (DQN) \u0d86\u0d9a\u0dd8\u0dad\u0dd2\u0dba",
"Implementation of neural network model for Deep Q Network (DQN).": "Deep Q Network (DQN) \u0dc3\u0db3\u0dc4\u0dcf \u0dc3\u0dca\u0db1\u0dcf\u0dba\u0dd4\u0d9a \u0da2\u0dcf\u0dbd \u0d86\u0d9a\u0dd8\u0dad\u0dd2\u0dba \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0dad\u0dca\u0db8\u0d9a \u0d9a\u0dd2\u0dbb\u0dd3\u0db8."
}