GitHub Repository: labmlai/annotated_deep_learning_paper_implementations
Path: blob/master/translate_cache/diffusion/ddpm/unet.zh.json
{
"<h1>U-Net model for <a href=\"index.html\">Denoising Diffusion Probabilistic Models (DDPM)</a></h1>\n<p>This is a <a href=\"../../unet/index.html\">U-Net</a> based model to predict noise <span translate=no>_^_0_^_</span>.</p>\n<p>U-Net gets its name from the U shape in the model diagram. It processes a given image by progressively lowering (halving) the feature map resolution and then increasing the resolution. There are pass-through connections at each resolution.</p>\n<p><span translate=no>_^_1_^_</span></p>\n<p>This implementation contains a bunch of modifications to the original U-Net (residual blocks, multi-head attention) and also adds time-step embeddings <span translate=no>_^_2_^_</span>.</p>\n": "<h1>用于<a href=\"index.html\">去噪扩散概率模型 (DDPM)</a> 的 U-Net 模型</h1>\n<p>这是一个基于 <a href=\"../../unet/index.html\">U-Net</a> 的模型，用于预测噪声 <span translate=no>_^_0_^_</span>。</p>\n<p>U-Net 因模型结构图中的 U 形而得名。它通过先逐步降低（减半）特征图分辨率、再逐步提高分辨率来处理给定图像。每个分辨率上都有直通连接。</p>\n<p><span translate=no>_^_1_^_</span></p>\n<p>此实现对原始 U-Net 做了一些修改（残差块、多头注意力），并加入了时间步嵌入 <span translate=no>_^_2_^_</span>。</p>\n",
"<h2>U-Net</h2>\n": "<h2>U-Net</h2>\n",
"<h3>Attention block</h3>\n<p>This is similar to <a href=\"../../transformers/mha.html\">transformer multi-head attention</a>.</p>\n": "<h3>注意力块</h3>\n<p>这类似于 <a href=\"../../transformers/mha.html\">Transformer 多头注意力</a>。</p>\n",
"<h3>Down block</h3>\n<p>This combines <span translate=no>_^_0_^_</span> and <span translate=no>_^_1_^_</span>. These are used in the first half of U-Net at each resolution.</p>\n": "<h3>下行块</h3>\n<p>它将 <span translate=no>_^_0_^_</span> 和 <span translate=no>_^_1_^_</span> 组合在一起。它们在 U-Net 前半部分的每个分辨率上使用。</p>\n",
"<h3>Embeddings for <span translate=no>_^_0_^_</span></h3>\n": "<h3>\u5d4c\u5165\u7528\u4e8e<span translate=no>_^_0_^_</span></h3>\n",
"<h3>Middle block</h3>\n<p>It combines a <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span>, followed by another <span translate=no>_^_2_^_</span>. This block is applied at the lowest resolution of the U-Net.</p>\n": "<h3>中间块</h3>\n<p>它依次组合了一个 <span translate=no>_^_0_^_</span>、一个 <span translate=no>_^_1_^_</span>，后接另一个 <span translate=no>_^_2_^_</span>。该块作用于 U-Net 的最低分辨率。</p>\n",
"<h3>Residual block</h3>\n<p>A residual block has two convolution layers with group normalization. Each resolution is processed with two residual blocks.</p>\n": "<h3>残差块</h3>\n<p>残差块包含两个带组归一化的卷积层。每个分辨率都用两个残差块进行处理。</p>\n",
"<h3>Scale down the feature map by <span translate=no>_^_0_^_</span></h3>\n": "<h3>将特征图按 <span translate=no>_^_0_^_</span> 缩小</h3>\n",
"<h3>Scale up the feature map by <span translate=no>_^_0_^_</span></h3>\n": "<h3>将特征图按 <span translate=no>_^_0_^_</span> 放大</h3>\n",
"<h3>Swish activation function</h3>\n<p><span translate=no>_^_0_^_</span></p>\n": "<h3>Swish 激活函数</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
"<h3>Up block</h3>\n<p>This combines <span translate=no>_^_0_^_</span> and <span translate=no>_^_1_^_</span>. These are used in the second half of U-Net at each resolution.</p>\n": "<h3>上行块</h3>\n<p>它将 <span translate=no>_^_0_^_</span> 和 <span translate=no>_^_1_^_</span> 组合在一起。它们在 U-Net 后半部分的每个分辨率上使用。</p>\n",
"<h4>First half of U-Net - decreasing resolution</h4>\n": "<h4>U-Net \u7684\u524d\u534a\u90e8\u5206-\u5206\u8fa8\u7387\u964d\u4f4e</h4>\n",
"<h4>Second half of U-Net - increasing resolution</h4>\n": "<h4>U-Net \u7684\u540e\u534a\u90e8\u5206-\u63d0\u9ad8\u5206\u8fa8\u7387</h4>\n",
"<p> </p>\n": "<p></p>\n",
"<p><span translate=no>_^_0_^_</span> at the same resolution </p>\n": "<p><span translate=no>_^_0_^_</span>\u4ee5\u76f8\u540c\u7684\u5206\u8fa8\u7387</p>\n",
"<p><span translate=no>_^_0_^_</span> is not used, but it&#x27;s kept in the arguments so that the attention layer function signature matches with <span translate=no>_^_1_^_</span>. </p>\n": "<p><span translate=no>_^_0_^_</span> 未被使用，但保留在参数中，以便注意力层的函数签名与 <span translate=no>_^_1_^_</span> 保持一致。</p>\n",
"<p><span translate=no>_^_0_^_</span> will store outputs at each resolution for skip connection </p>\n": "<p><span translate=no>_^_0_^_</span> 将存储每个分辨率下的输出，用于跳跃连接</p>\n",
"<p>Activation </p>\n": "<p>\u6fc0\u6d3b</p>\n",
"<p>Add <span translate=no>_^_0_^_</span> </p>\n": "<p>\u6dfb\u52a0<span translate=no>_^_0_^_</span></p>\n",
"<p>Add skip connection </p>\n": "<p>添加跳跃连接</p>\n",
"<p>Add the shortcut connection and return </p>\n": "<p>加上捷径连接并返回</p>\n",
"<p>Add time embeddings </p>\n": "<p>\u6dfb\u52a0\u65f6\u95f4\u5d4c\u5165</p>\n",
"<p>Calculate scaled dot-product <span translate=no>_^_0_^_</span> </p>\n": "<p>\u8ba1\u7b97\u7f29\u653e\u7684\u70b9\u79ef<span translate=no>_^_0_^_</span></p>\n",
"<p>Change <span translate=no>_^_0_^_</span> to shape <span translate=no>_^_1_^_</span> </p>\n": "<p>将 <span translate=no>_^_0_^_</span> 变形为 <span translate=no>_^_1_^_</span></p>\n",
"<p>Change to shape <span translate=no>_^_0_^_</span> </p>\n": "<p>变形为 <span translate=no>_^_0_^_</span></p>\n",
"<p>Combine the set of modules </p>\n": "<p>\u7ec4\u5408\u8fd9\u7ec4\u6a21\u5757</p>\n",
"<p>Create sinusoidal position embeddings <a href=\"../../transformers/positional_encoding.html\">same as those from the transformer</a></p>\n<span translate=no>_^_0_^_</span><p>where <span translate=no>_^_1_^_</span> is <span translate=no>_^_2_^_</span> </p>\n": "<p>创建<a href=\"../../transformers/positional_encoding.html\">与 Transformer 中相同的</a>正弦位置嵌入</p>\n<span translate=no>_^_0_^_</span><p>其中 <span translate=no>_^_1_^_</span> 为 <span translate=no>_^_2_^_</span></p>\n",
"<p>Default <span translate=no>_^_0_^_</span> </p>\n": "<p>\u9ed8\u8ba4<span translate=no>_^_0_^_</span></p>\n",
"<p>Down sample at all resolutions except the last </p>\n": "<p>\u9664\u6700\u540e\u4e00\u4e2a\u5206\u8fa8\u7387\u4e4b\u5916\u7684\u6240\u6709\u5206\u8fa8\u7387\u90fd\u5411\u4e0b\u91c7\u6837</p>\n",
"<p>Final block to reduce the number of channels </p>\n": "<p>用于减少通道数的最后一个块</p>\n",
"<p>Final normalization and convolution </p>\n": "<p>\u6700\u7ec8\u5f52\u4e00\u5316\u548c\u5377\u79ef</p>\n",
"<p>Final normalization and convolution layer </p>\n": "<p>\u6700\u7ec8\u5f52\u4e00\u5316\u548c\u5377\u79ef\u5c42</p>\n",
"<p>First convolution layer </p>\n": "<p>\u7b2c\u4e00\u4e2a\u5377\u79ef\u5c42</p>\n",
"<p>First half of U-Net </p>\n": "<p>U-Net 的前半部分</p>\n",
"<p>First linear layer </p>\n": "<p>\u7b2c\u4e00\u4e2a\u7ebf\u6027\u5c42</p>\n",
"<p>For each resolution </p>\n": "<p>\u5bf9\u4e8e\u6bcf\u79cd\u5206\u8fa8\u7387</p>\n",
"<p>Get image projection </p>\n": "<p>\u83b7\u53d6\u56fe\u50cf\u6295\u5f71</p>\n",
"<p>Get query, key, and values (concatenated) and shape it to <span translate=no>_^_0_^_</span> </p>\n": "<p>\u83b7\u53d6\u67e5\u8be2\u3001\u952e\u548c\u503c\uff08\u4e32\u8054\uff09\u5e76\u5c06\u5176\u8c03\u6574\u4e3a<span translate=no>_^_0_^_</span></p>\n",
"<p>Get shape </p>\n": "<p>获取形状</p>\n",
"<p>Get the skip connection from first half of U-Net and concatenate </p>\n": "<p>取出 U-Net 前半部分的跳跃连接并进行拼接</p>\n",
"<p>Get time-step embeddings </p>\n": "<p>获取时间步嵌入</p>\n",
"<p>Group normalization and the first convolution layer </p>\n": "<p>\u7ec4\u5f52\u4e00\u5316\u548c\u7b2c\u4e00\u4e2a\u5377\u79ef\u5c42</p>\n",
"<p>Group normalization and the second convolution layer </p>\n": "<p>\u7ec4\u5f52\u4e00\u5316\u548c\u7b2c\u4e8c\u4e2a\u5377\u79ef\u5c42</p>\n",
"<p>If the number of input channels is not equal to the number of output channels we have to project the shortcut connection </p>\n": "<p>如果输入通道数不等于输出通道数，则必须对捷径连接做投影</p>\n",
"<p>Linear layer for final transformation </p>\n": "<p>\u7528\u4e8e\u6700\u7ec8\u53d8\u6362\u7684\u7ebf\u6027\u5c42</p>\n",
"<p>Linear layer for time embeddings </p>\n": "<p>\u7528\u4e8e\u65f6\u95f4\u5d4c\u5165\u7684\u7ebf\u6027\u5c42</p>\n",
"<p>Middle (bottom) </p>\n": "<p>\u4e2d\u95f4\uff08\u5e95\u90e8\uff09</p>\n",
"<p>Middle block </p>\n": "<p>中间块</p>\n",
"<p>Multiply by values </p>\n": "<p>\u4e58\u4ee5\u503c</p>\n",
"<p>Normalization layer </p>\n": "<p>\u5f52\u4e00\u5316\u5c42</p>\n",
"<p>Number of channels </p>\n": "<p>通道数量</p>\n",
"<p>Number of output channels at this resolution </p>\n": "<p>该分辨率下的输出通道数</p>\n",
"<p>Number of resolutions </p>\n": "<p>\u5206\u8fa8\u7387\u6570\u91cf</p>\n",
"<p>Project image into feature map </p>\n": "<p>将图像投影为特征图</p>\n",
"<p>Projections for query, key and values </p>\n": "<p>\u67e5\u8be2\u3001\u952e\u548c\u503c\u7684\u6295\u5f71</p>\n",
"<p>Reshape to <span translate=no>_^_0_^_</span> </p>\n": "<p>\u91cd\u5851\u4e3a<span translate=no>_^_0_^_</span></p>\n",
"<p>Scale for dot-product attention </p>\n": "<p>点积注意力的缩放因子</p>\n",
"<p>Second convolution layer </p>\n": "<p>\u7b2c\u4e8c\u4e2a\u5377\u79ef\u5c42</p>\n",
"<p>Second half of U-Net </p>\n": "<p>U-Net 的后半部分</p>\n",
"<p>Second linear layer </p>\n": "<p>\u7b2c\u4e8c\u4e2a\u7ebf\u6027\u5c42</p>\n",
"<p>Softmax along the sequence dimension <span translate=no>_^_0_^_</span> </p>\n": "<p>沿序列维度 <span translate=no>_^_0_^_</span> 做 Softmax</p>\n",
"<p>Split query, key, and values. Each of them will have shape <span translate=no>_^_0_^_</span> </p>\n": "<p>拆分查询、键和值。它们的形状均为 <span translate=no>_^_0_^_</span></p>\n",
"<p>The input has <span translate=no>_^_0_^_</span> because we concatenate the output of the same resolution from the first half of the U-Net </p>\n": "<p>输入有 <span translate=no>_^_0_^_</span>，因为我们把来自 U-Net 前半部分相同分辨率的输出拼接了进来</p>\n",
"<p>Time embedding layer. Time embedding has <span translate=no>_^_0_^_</span> channels </p>\n": "<p>时间嵌入层。时间嵌入有 <span translate=no>_^_0_^_</span> 个通道</p>\n",
"<p>Transform to <span translate=no>_^_0_^_</span> </p>\n": "<p>\u53d8\u6362\u4e3a<span translate=no>_^_0_^_</span></p>\n",
"<p>Transform with the MLP </p>\n": "<p>用 MLP 进行变换</p>\n",
"<p>Up sample at all resolutions except last </p>\n": "<p>\u9664\u6700\u540e\u4e00\u4e2a\u4ee5\u5916\u7684\u6240\u6709\u5206\u8fa8\u7387\u5411\u4e0a\u91c7\u6837</p>\n",
"<ul><li><span translate=no>_^_0_^_</span> has shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> has shape <span translate=no>_^_3_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span> 的形状为 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span> 的形状为 <span translate=no>_^_3_^_</span></li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the image. <span translate=no>_^_1_^_</span> for RGB. </li>\n<li><span translate=no>_^_2_^_</span> is number of channels in the initial feature map that we transform the image into </li>\n<li><span translate=no>_^_3_^_</span> is the list of channel numbers at each resolution. The number of channels is <span translate=no>_^_4_^_</span> </li>\n<li><span translate=no>_^_5_^_</span> is a list of booleans that indicate whether to use attention at each resolution </li>\n<li><span translate=no>_^_6_^_</span> is the number of <span translate=no>_^_7_^_</span> at each resolution</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span> 是图像的通道数。RGB 图像为 <span translate=no>_^_1_^_</span>。</li>\n<li><span translate=no>_^_2_^_</span> 是我们把图像变换成的初始特征图的通道数</li>\n<li><span translate=no>_^_3_^_</span> 是每个分辨率下的通道数列表。通道数为 <span translate=no>_^_4_^_</span></li>\n<li><span translate=no>_^_5_^_</span> 是一个布尔值列表，表示是否在每个分辨率下使用注意力</li>\n<li><span translate=no>_^_6_^_</span> 是每个分辨率下 <span translate=no>_^_7_^_</span> 的数量</li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input </li>\n<li><span translate=no>_^_1_^_</span> is the number of heads in multi-head attention </li>\n<li><span translate=no>_^_2_^_</span> is the number of dimensions in each head </li>\n<li><span translate=no>_^_3_^_</span> is the number of groups for <a href=\"../../normalization/group_norm/index.html\">group normalization</a></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span> 是输入的通道数</li>\n<li><span translate=no>_^_1_^_</span> 是多头注意力中注意力头的数量</li>\n<li><span translate=no>_^_2_^_</span> 是每个注意力头的维度数</li>\n<li><span translate=no>_^_3_^_</span> 是<a href=\"../../normalization/group_norm/index.html\">组归一化</a>的组数</li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of dimensions in the embedding</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u662f\u5d4c\u5165\u4e2d\u7684\u7ef4\u6570</li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of input channels </li>\n<li><span translate=no>_^_1_^_</span> is the number of output channels </li>\n<li><span translate=no>_^_2_^_</span> is the number of channels in the time step (<span translate=no>_^_3_^_</span>) embeddings </li>\n<li><span translate=no>_^_4_^_</span> is the number of groups for <a href=\"../../normalization/group_norm/index.html\">group normalization</a> </li>\n<li><span translate=no>_^_5_^_</span> is the dropout rate</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span> 是输入通道数</li>\n<li><span translate=no>_^_1_^_</span> 是输出通道数</li>\n<li><span translate=no>_^_2_^_</span> 是时间步 (<span translate=no>_^_3_^_</span>) 嵌入的通道数</li>\n<li><span translate=no>_^_4_^_</span> 是<a href=\"../../normalization/group_norm/index.html\">组归一化</a>的组数</li>\n<li><span translate=no>_^_5_^_</span> 是 dropout 比率</li></ul>\n",
"U-Net model for Denoising Diffusion Probabilistic Models (DDPM)": "\u7528\u4e8e\u53bb\u566a\u6269\u6563\u6982\u7387\u6a21\u578b (DDPM) \u7684 U-Net \u6a21\u578b",
"UNet model for Denoising Diffusion Probabilistic Models (DDPM)": "用于去噪扩散概率模型 (DDPM) 的 UNet 模型"
}
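The strings above document this repository's DDPM U-Net module (diffusion/ddpm/unet). As a hedged illustration of the time-step embedding those entries describe (sinusoidal position embeddings, the same as the transformer's, transformed by an MLP with two linear layers and a Swish activation), here is a minimal PyTorch sketch. It mirrors the structure the entries outline rather than reproducing the repository code verbatim; the channel-size choices (a sinusoidal part of n_channels // 4 dimensions, half of them used as frequencies) are assumptions.

```python
import math

import torch
from torch import nn


class Swish(nn.Module):
    """Swish activation: x * sigmoid(x)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(x)


class TimeEmbedding(nn.Module):
    """Sinusoidal time-step embeddings followed by a two-layer MLP.

    Assumes ``n_channels`` is divisible by 8 so the sin/cos halves
    concatenate to ``n_channels // 4`` features.
    """

    def __init__(self, n_channels: int):
        super().__init__()
        self.n_channels = n_channels
        # First linear layer (expands the sinusoidal features)
        self.lin1 = nn.Linear(n_channels // 4, n_channels)
        # Activation
        self.act = Swish()
        # Second linear layer
        self.lin2 = nn.Linear(n_channels, n_channels)

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # Create sinusoidal position embeddings, same as the transformer's
        half_dim = self.n_channels // 8
        emb = math.log(10_000) / (half_dim - 1)
        emb = torch.exp(torch.arange(half_dim, device=t.device) * -emb)
        emb = t[:, None] * emb[None, :]
        emb = torch.cat((emb.sin(), emb.cos()), dim=1)
        # Transform with the MLP
        return self.lin2(self.act(self.lin1(emb)))
```

Applied to a batch of time steps ``t`` of shape ``[batch_size]``, a ``TimeEmbedding(n_channels=256)`` returns embeddings of shape ``[batch_size, 256]``; per the "Add time embeddings" entries above, the residual blocks add such an embedding to their feature maps after the first convolution.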