Path: blob/master/translate_cache/conv_mixer/readme.zh.json
{1" Patches Are All You Need?": " \u8865\u4e01\u662f\u4f60\u6240\u9700\u8981\u7684\u5417\uff1f",2"<h1><a href=\"https://nn.labml.ai/conv_mixer/index.html\">Patches Are All You Need?</a></h1>\n<p>This is a <a href=\"https://pytorch.org\">PyTorch</a> implementation of the paper <a href=\"https://arxiv.org/abs/2201.09792\">Patches Are All You Need?</a>.</p>\n<p>ConvMixer is Similar to <a href=\"https://nn.labml.ai/transformers/mlp_mixer/index.html\">MLP-Mixer</a>. MLP-Mixer separates mixing of spatial and channel dimensions, by applying an MLP across spatial dimension and then an MLP across the channel dimension (spatial MLP replaces the <a href=\"https://nn.labml.ai/transformers/vit/index.html\">ViT</a> attention and channel MLP is the <a href=\"https://nn.labml.ai/transformers/feed_forward.html\">FFN</a> of ViT).</p>\n<p>ConvMixer uses a 1x1 convolution for channel mixing and a depth-wise convolution for spatial mixing. Since it's a convolution instead of a full MLP across the space, it mixes only the nearby batches in contrast to ViT or MLP-Mixer. Also, the MLP-mixer uses MLPs of two layers for each mixing and ConvMixer uses a single layer for each mixing.</p>\n<p>The paper recommends removing the residual connection across the channel mixing (point-wise convolution) and having only a residual connection over the spatial mixing (depth-wise convolution). They also use <a href=\"https://nn.labml.ai/normalization/batch_norm/index.html\">Batch normalization</a> instead of <a href=\"../normalization/layer_norm/index.html\">Layer normalization</a>.</p>\n<p>Here's <a href=\"https://nn.labml.ai/conv_mixer/experiment.html\">an experiment</a> that trains ConvMixer on CIFAR-10. </p>\n": "<h1><a href=\"https://nn.labml.ai/conv_mixer/index.html\">\u4f60\u53ea\u9700\u8981\u8865\u4e01\u5417\uff1f</a></h1>\n<p>\u8fd9\u662f <a href=\"https://pytorch.org\">PyTorch</a> \u5bf9\u8bba\u6587\u300a<a href=\"https://arxiv.org/abs/2201.09792\">\u8865\u4e01\u5c31\u662f\u4f60\u6240\u9700\u8981\u7684\uff1f</a>\u300b\u7684\u5b9e\u73b0</p>\u3002\n<p>convMixer \u7c7b\u4f3c\u4e8e <a href=\"https://nn.labml.ai/transformers/mlp_mixer/index.html\">MLP \u6df7\u97f3\u5668</a>\u3002MLP-Mixer \u901a\u8fc7\u5728\u7a7a\u95f4\u7ef4\u5ea6\u4e0a\u5e94\u7528 MLP\uff0c\u7136\u540e\u5728\u4fe1\u9053\u7ef4\u5ea6\u4e0a\u5e94\u7528 MLP \u6765\u5206\u79bb\u7a7a\u95f4\u7ef4\u5ea6\u548c\u4fe1\u9053\u7ef4\u5ea6\u7684\u6df7\u97f3\uff08\u7a7a\u95f4 MLP \u53d6\u4ee3 <a href=\"https://nn.labml.ai/transformers/vit/index.html\">vIT</a> \u6ce8\u610f\u529b\uff0c\u4fe1\u9053 MLP \u662f ViT \u7684 <a href=\"https://nn.labml.ai/transformers/feed_forward.html\">FFN</a>\uff09\u3002</p>\n<p>ConvMixer \u4f7f\u7528 1x1 \u5377\u79ef\u8fdb\u884c\u901a\u9053\u6df7\u5408\uff0c\u4f7f\u7528\u6df1\u5ea6\u5377\u79ef\u8fdb\u884c\u7a7a\u95f4\u6df7\u5408\u3002\u7531\u4e8e\u5b83\u662f\u5377\u79ef\u800c\u4e0d\u662f\u6574\u4e2a\u7a7a\u95f4\u7684\u5b8c\u6574\u7684 MLP\uff0c\u56e0\u6b64\u4e0e vIT \u6216 MLP-Mixer \u76f8\u6bd4\uff0c\u5b83\u53ea\u6df7\u5408\u9644\u8fd1\u7684\u6279\u6b21\u3002\u6b64\u5916\uff0cMLP-Mixer \u6bcf\u6b21\u6df7\u5408\u4f7f\u7528\u4e24\u5c42 MLP\uff0cConvMixer \u6bcf\u6b21\u6df7\u5408\u4f7f\u7528\u5355\u5c42\u3002</p>\n<p>\u8be5\u8bba\u6587\u5efa\u8bae\u5220\u9664\u4fe1\u9053\u6df7\u5408\uff08\u9010\u70b9\u5377\u79ef\uff09\u4e0a\u7684\u5269\u4f59\u8fde\u63a5\uff0c\u5728\u7a7a\u95f4\u6df7\u5408\uff08\u6df1\u5ea6\u5377\u79ef\uff09\u4e0a\u4ec5\u4f7f\u7528\u6b8b\u5dee\u8fde\u63a5\u3002\u4ed6\u4eec\u8fd8\u4f7f\u7528<a 
href=\"https://nn.labml.ai/normalization/batch_norm/index.html\">\u6279\u91cf\u6807\u51c6\u5316</a>\u800c\u4e0d\u662f<a href=\"../normalization/layer_norm/index.html\">\u56fe\u5c42\u6807\u51c6\u5316</a>\u3002</p>\n<p>\u8fd9\u662f<a href=\"https://nn.labml.ai/conv_mixer/experiment.html\">\u4e00\u9879\u5728 CIFAR-10 \u4e0a\u8bad\u7ec3 ConvMixer \u7684\u5b9e\u9a8c</a>\u3002</p>\n"3}45