GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/RWKV/experiment.zh.json
{
"<h2>Configurations</h2>\n<p>This inherits from <a href=\"../../experiments/nlp_autoregression.html#NLPAutoRegressionConfigs\"><span translate=no>_^_0_^_</span></a></p>\n": "<h2>Configurations</h2>\n<p>This inherits from <a href=\"../../experiments/nlp_autoregression.html#NLPAutoRegressionConfigs\"><span translate=no>_^_0_^_</span></a></p>\n",
"<h3>RWKV configurations</h3>\n": "<h3>RWKV configurations</h3>\n",
"<p> </p>\n": "<p> </p>\n",
"<p> Create RWKV model and initialize weights</p>\n": "<p> Create RWKV model and initialize weights</p>\n",
"<p>Apply custom weight initialization </p>\n": "<p>Apply custom weight initialization </p>\n",
"<p>Batch size <span translate=no>_^_0_^_</span> </p>\n": "<p>Batch size <span translate=no>_^_0_^_</span> </p>\n",
"<p>Create AdamW optimizer and use the fused version if it is available </p>\n": "<p>Create AdamW optimizer and use the fused version if it is available </p>\n",
"<p>Create configs </p>\n": "<p>Create configs </p>\n",
"<p>Create experiment </p>\n": "<p>Create experiment </p>\n",
"<p>Custom optimizer </p>\n": "<p>Custom optimizer </p>\n",
"<p>Override configurations </p>\n": "<p>Override configurations </p>\n",
"<p>Prompt separator is blank </p>\n": "<p>Prompt separator is blank </p>\n",
"<p>RWKV model </p>\n": "<p>RWKV model </p>\n",
"<p>Run training </p>\n": "<p>Run training </p>\n",
"<p>Set models for saving and loading </p>\n": "<p>Set models for saving and loading </p>\n",
"<p>Set the vocabulary sizes for embeddings and generating logits </p>\n": "<p>Set the vocabulary sizes for embeddings and generating logits </p>\n",
"<p>Start the experiment </p>\n": "<p>Start the experiment </p>\n",
"<p>Starting prompt for sampling </p>\n": "<p>Starting prompt for sampling </p>\n",
"<p>Switch between training and validation for <span translate=no>_^_0_^_</span> times per epoch </p>\n": "<p>Switch between training and validation for <span translate=no>_^_0_^_</span> times per epoch </p>\n",
"<p>Train for <span translate=no>_^_0_^_</span> epochs </p>\n": "<p>Train for <span translate=no>_^_0_^_</span> epochs </p>\n",
"<p>Use Tiny Shakespeare dataset </p>\n": "<p>Use Tiny Shakespeare dataset </p>\n",
"<p>Use a context size of <span translate=no>_^_0_^_</span> </p>\n": "<p>Use a context size of <span translate=no>_^_0_^_</span> </p>\n",
"<p>Use character level tokenizer </p>\n": "<p>Use character level tokenizer </p>\n",
"<p>We use our <a href=\"../configs.html#RWKVConfigs\">configurable RWKV implementation</a> </p>\n": "<p>We use our <a href=\"../configs.html#RWKVConfigs\">configurable RWKV implementation</a> </p>\n",
"<p>create optim groups. Any parameters that is 2D will be weight decayed, otherwise no. i.e. all weight tensors in matmuls + embeddings decay, all biases and layernorms don&#x27;t. </p>\n": "<p>Create optimizer groups. Any parameter that is 2D will be weight decayed; all others will not. That is, all weight tensors in matmuls and embeddings decay, while biases and layernorms don&#x27;t. </p>\n",
"<p>filter out those that do not require grad </p>\n": "<p>filter out those that do not require grad </p>\n",
"<p>initialize Vector Parameters in TimeMixing </p>\n": "<p>initialize Vector Parameters in TimeMixing </p>\n",
"<p>model </p>\n": "<p>model </p>\n",
"<p>number of warmup iterations </p>\n": "<p>number of warmup iterations </p>\n",
"<p>start with all of the candidate parameters </p>\n": "<p>start with all of the candidate parameters </p>\n",
"<p>total number of training iterations </p>\n": "<p>total number of training iterations </p>\n",
"<p>weight decay </p>\n": "<p>weight decay </p>\n",
"experiment.py": "experiment.py"
}