GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/labml_nn/__init__.py
"""
# [Annotated Research Paper Implementations: Transformers, StyleGAN, Stable Diffusion, DDPM/DDIM, LayerNorm, Nucleus Sampling and more](index.html)

This is a collection of simple PyTorch implementations of
neural networks and related algorithms.
[These implementations](https://github.com/labmlai/annotated_deep_learning_paper_implementations) are documented with explanations,
and the [website](index.html)
renders these as side-by-side formatted notes.
We believe these will help you understand these algorithms better.

![Screenshot](dqn-light.png)

We are actively maintaining this repo and adding new
implementations.
[![Twitter](https://img.shields.io/twitter/follow/labmlai?style=social)](https://twitter.com/labmlai) for updates.

## Translations

### **[English (original)](https://nn.labml.ai)**
### **[Chinese (translated)](https://nn.labml.ai/zh/)**
### **[Japanese (translated)](https://nn.labml.ai/ja/)**

## Paper Implementations

#### ✨ [Transformers](transformers/index.html)

* [JAX implementation](transformers/jax_transformer/index.html)
* [Multi-headed attention](transformers/mha.html)
* [Triton Flash Attention](transformers/flash/index.html)
* [Transformer building blocks](transformers/models.html)
* [Transformer XL](transformers/xl/index.html)
* [Relative multi-headed attention](transformers/xl/relative_mha.html)
* [Rotary Positional Embeddings (RoPE)](transformers/rope/index.html)
* [Attention with Linear Biases (ALiBi)](transformers/alibi/index.html)
* [RETRO](transformers/retro/index.html)
* [Compressive Transformer](transformers/compressive/index.html)
* [GPT Architecture](transformers/gpt/index.html)
* [GLU Variants](transformers/glu_variants/simple.html)
* [kNN-LM: Generalization through Memorization](transformers/knn/index.html)
* [Feedback Transformer](transformers/feedback/index.html)
* [Switch Transformer](transformers/switch/index.html)
* [Fast Weights Transformer](transformers/fast_weights/index.html)
* [FNet](transformers/fnet/index.html)
* [Attention Free Transformer](transformers/aft/index.html)
* [Masked Language Model](transformers/mlm/index.html)
* [MLP-Mixer: An all-MLP Architecture for Vision](transformers/mlp_mixer/index.html)
* [Pay Attention to MLPs (gMLP)](transformers/gmlp/index.html)
* [Vision Transformer (ViT)](transformers/vit/index.html)
* [Primer EZ](transformers/primer_ez/index.html)
* [Hourglass](transformers/hour_glass/index.html)

#### ✨ [Low-Rank Adaptation (LoRA)](lora/index.html)

#### ✨ [Eleuther GPT-NeoX](neox/index.html)
* [Generate on a 48GB GPU](neox/samples/generate.html)
* [Finetune on two 48GB GPUs](neox/samples/finetune.html)
* [LLM.int8()](neox/utils/llm_int8.html)

#### ✨ [Diffusion models](diffusion/index.html)

* [Denoising Diffusion Probabilistic Models (DDPM)](diffusion/ddpm/index.html)
* [Denoising Diffusion Implicit Models (DDIM)](diffusion/stable_diffusion/sampler/ddim.html)
* [Latent Diffusion Models](diffusion/stable_diffusion/latent_diffusion.html)
* [Stable Diffusion](diffusion/stable_diffusion/index.html)

#### ✨ [Generative Adversarial Networks](gan/index.html)
* [Original GAN](gan/original/index.html)
* [GAN with deep convolutional network](gan/dcgan/index.html)
* [Cycle GAN](gan/cycle_gan/index.html)
* [Wasserstein GAN](gan/wasserstein/index.html)
* [Wasserstein GAN with Gradient Penalty](gan/wasserstein/gradient_penalty/index.html)
* [StyleGAN 2](gan/stylegan/index.html)

#### ✨ [Recurrent Highway Networks](recurrent_highway_networks/index.html)

#### ✨ [LSTM](lstm/index.html)

#### ✨ [HyperNetworks - HyperLSTM](hypernetworks/hyper_lstm.html)

#### ✨ [ResNet](resnet/index.html)

#### ✨ [ConvMixer](conv_mixer/index.html)

#### ✨ [Capsule Networks](capsule_networks/index.html)

#### ✨ [U-Net](unet/index.html)

#### ✨ [Sketch RNN](sketch_rnn/index.html)

#### ✨ Graph Neural Networks

* [Graph Attention Networks (GAT)](graphs/gat/index.html)
* [Graph Attention Networks v2 (GATv2)](graphs/gatv2/index.html)

#### ✨ [Reinforcement Learning](rl/index.html)
* [Proximal Policy Optimization](rl/ppo/index.html) with
[Generalized Advantage Estimation](rl/ppo/gae.html)
* [Deep Q Networks](rl/dqn/index.html) with
[Dueling Network](rl/dqn/model.html),
[Prioritized Replay](rl/dqn/replay_buffer.html)
and Double Q Network.

#### ✨ [Counterfactual Regret Minimization (CFR)](cfr/index.html)

Solving games with incomplete information such as poker with CFR.

* [Kuhn Poker](cfr/kuhn/index.html)

#### ✨ [Optimizers](optimizers/index.html)
* [Adam](optimizers/adam.html)
* [AMSGrad](optimizers/amsgrad.html)
* [Adam Optimizer with warmup](optimizers/adam_warmup.html)
* [Noam Optimizer](optimizers/noam.html)
* [Rectified Adam Optimizer](optimizers/radam.html)
* [AdaBelief Optimizer](optimizers/ada_belief.html)
* [Sophia-G Optimizer](optimizers/sophia.html)

#### ✨ [Normalization Layers](normalization/index.html)
* [Batch Normalization](normalization/batch_norm/index.html)
* [Layer Normalization](normalization/layer_norm/index.html)
* [Instance Normalization](normalization/instance_norm/index.html)
* [Group Normalization](normalization/group_norm/index.html)
* [Weight Standardization](normalization/weight_standardization/index.html)
* [Batch-Channel Normalization](normalization/batch_channel_norm/index.html)
* [DeepNorm](normalization/deep_norm/index.html)

#### ✨ [Distillation](distillation/index.html)

#### ✨ [Adaptive Computation](adaptive_computation/index.html)

* [PonderNet](adaptive_computation/ponder_net/index.html)

#### ✨ [Uncertainty](uncertainty/index.html)

* [Evidential Deep Learning to Quantify Classification Uncertainty](uncertainty/evidence/index.html)

#### ✨ [Activations](activations/index.html)

* [Fuzzy Tiling Activations](activations/fta/index.html)

#### ✨ [Language Model Sampling Techniques](sampling/index.html)
* [Greedy Sampling](sampling/greedy.html)
* [Temperature Sampling](sampling/temperature.html)
* [Top-k Sampling](sampling/top_k.html)
* [Nucleus Sampling](sampling/nucleus.html)
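
As a rough orientation, the four sampling techniques listed above can be sketched in plain Python over a list of logits. This is a minimal illustration and not the code from the linked pages (which use PyTorch); all function names here are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    # Divide logits by the temperature, then normalize (max subtracted for stability).
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    # Greedy sampling: always pick the highest-scoring token.
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_candidates(logits, k):
    # Top-k: keep only the indices of the k highest logits.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]


def nucleus_candidates(logits, p):
    # Nucleus (top-p): smallest set of most likely tokens whose
    # cumulative probability reaches p.
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept


def sample(logits, candidates=None, temperature=1.0, rng=random):
    # Temperature sampling over the (optionally restricted) candidate set.
    if candidates is None:
        candidates = list(range(len(logits)))
    probs = softmax([logits[i] for i in candidates], temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

For example, `sample(logits, candidates=nucleus_candidates(logits, 0.9), temperature=0.7)` combines nucleus filtering with temperature scaling. A lower temperature sharpens the distribution towards the greedy choice.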

#### ✨ [Scalable Training/Inference](scaling/index.html)
* [Zero3 memory optimizations](scaling/zero3/index.html)

### Installation

```bash
pip install labml-nn
```
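
To confirm the install worked, a small standard-library check along these lines may help. It assumes the distribution is named `labml-nn` (as in the `pip` command) and the importable package is `labml_nn`; the helper names are illustrative only.

```python
from importlib import metadata, util


def installed_version(dist_name):
    # Return the installed version string of a distribution, or None if it is absent.
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module_name):
    # True if Python can locate the top-level module, without importing it.
    return util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("labml-nn version:", installed_version("labml-nn"))
    print("labml_nn importable:", is_importable("labml_nn"))
```

If the version prints as `None`, the package is not installed in the active environment.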
"""