
Commit e03dbc1

pengchzn committed: Refine Chinese translation

1 parent 1446bb1 · commit e03dbc1

File tree: 2 files changed (+53, -53 lines)

Lines changed: 30 additions & 30 deletions
@@ -1,20 +1,20 @@
  {
- "<h1>Configurable Transformer Components</h1>\n": "<h1>\u53ef\u914d\u7f6e\u53d8\u538b\u5668\u7ec4\u4ef6</h1>\n",
- "<h2>GLU Variants</h2>\n<p>These are variants with gated hidden layers for the FFN as introduced in paper <a href=\"https://arxiv.org/abs/2002.05202\">GLU Variants Improve Transformer</a>. We have omitted the bias terms as specified in the paper. </p>\n": "<h2>GLU \u53d8\u4f53</h2>\n<p>\u8fd9\u4e9b\u662f\u7528\u4e8eFFN\u7684\u5c01\u95ed\u9690\u85cf\u5c42\u7684\u53d8\u4f53\uff0c\u5982\u7eb8\u8d28 <a href=\"https://arxiv.org/abs/2002.05202\">GLU\u53d8\u4f53\u6539\u8fdb\u53d8\u538b\u5668</a>\u4e2d\u6240\u8ff0\u3002\u6211\u4eec\u7701\u7565\u4e86\u672c\u6587\u4e2d\u6307\u5b9a\u7684\u504f\u5dee\u672f\u8bed\u3002</p>\n",
+ "<h1>Configurable Transformer Components</h1>\n": "<h1>\u53ef\u914d\u7f6e\u7684 Transformer \u7ec4\u4ef6</h1>\n",
+ "<h2>GLU Variants</h2>\n<p>These are variants with gated hidden layers for the FFN as introduced in paper <a href=\"https://arxiv.org/abs/2002.05202\">GLU Variants Improve Transformer</a>. We have omitted the bias terms as specified in the paper. </p>\n": "<h2>GLU \u53d8\u4f53</h2>\n<p>\u8fd9\u4e9b\u662f\u5728\u8bba\u6587 <a href=\"https://arxiv.org/abs/2002.05202\">\u300a GLU Variants Improve Transformer \u300b</a>\u4e2d\u5305\u542b\u7684\u5404\u79cd\u5e26\u95e8\u63a7\u9690\u85cf\u5c42\u7684 ffn \u53d8\u4f53\u3002\u6211\u4eec\u5df2\u6309\u7167\u8bba\u6587\u89c4\u5b9a\u7701\u7565\u4e86\u504f\u7f6e\u9879\u3002</p>\n",
  "<h3>FFN with Bilinear hidden layer</h3>\n<p><span translate=no>_^_0_^_</span> </p>\n": "<h3>\u5e26\u53cc\u7ebf\u6027\u9690\u85cf\u5c42\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
- "<h3>FFN with GELU gate</h3>\n<p><span translate=no>_^_0_^_</span> </p>\n": "<h3>\u5e26\u6709 GELU \u95e8\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
+ "<h3>FFN with GELU gate</h3>\n<p><span translate=no>_^_0_^_</span> </p>\n": "<h3>\u5e26 GELU \u95e8\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
  "<h3>FFN with Gated Linear Units</h3>\n<p><span translate=no>_^_0_^_</span> </p>\n": "<h3>\u5e26\u95e8\u63a7\u7ebf\u6027\u5355\u5143\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
- "<h3>FFN with ReLU gate</h3>\n<p><span translate=no>_^_0_^_</span> </p>\n": "<h3>\u5e26\u6709 ReLU \u95e8\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
- "<h3>FFN with Swish gate</h3>\n<p><span translate=no>_^_0_^_</span> where <span translate=no>_^_1_^_</span> </p>\n": "<h3>FFN \u5e26 Swish gate</h3>\n<p><span translate=no>_^_0_^_</span>\u5728\u54ea\u91cc<span translate=no>_^_1_^_</span></p>\n",
+ "<h3>FFN with ReLU gate</h3>\n<p><span translate=no>_^_0_^_</span> </p>\n": "<h3>\u5e26 ReLU \u95e8\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
+ "<h3>FFN with Swish gate</h3>\n<p><span translate=no>_^_0_^_</span> where <span translate=no>_^_1_^_</span> </p>\n": "<h3>\u5e26 Swish \u95e8\u7684 FFN</h3>\n<p><span translate=no>_^_0_^_</span>\u5176\u4e2d\uff0c<span translate=no>_^_1_^_</span></p>\n",
  "<h3>Fixed Positional Embeddings</h3>\n<p>Source embedding with fixed positional encodings</p>\n": "<h3>\u56fa\u5b9a\u4f4d\u7f6e\u5d4c\u5165</h3>\n<p>\u4f7f\u7528\u56fa\u5b9a\u4f4d\u7f6e\u7f16\u7801\u8fdb\u884c\u6e90\u5d4c\u5165</p>\n",
- "<h3>GELU activation</h3>\n<p><span translate=no>_^_0_^_</span> where <span translate=no>_^_1_^_</span></p>\n<p>It was introduced in paper <a href=\"https://arxiv.org/abs/1606.08415\">Gaussian Error Linear Units</a>.</p>\n": "<h3>GELU \u6fc0\u6d3b</h3>\n<p><span translate=no>_^_0_^_</span>\u5728\u54ea\u91cc<span translate=no>_^_1_^_</span></p>\n<p>\u5b83\u662f\u5728\u8bba\u6587\u4e2d\u4ecb\u7ecd\u7684 \u201c<a href=\"https://arxiv.org/abs/1606.08415\">\u9ad8\u65af\u8bef\u5dee\u7ebf\u6027\u5355\u4f4d</a>\u201d\u3002</p>\n",
- "<h3>Learned Positional Embeddings</h3>\n<p>Source embedding with learned positional encodings</p>\n": "<h3>\u5b66\u4e60\u8fc7\u7684\u4f4d\u7f6e\u5d4c\u5165</h3>\n<p>\u4f7f\u7528\u5b66\u4e60\u7684\u4f4d\u7f6e\u7f16\u7801\u8fdb\u884c\u6e90\u5d4c\u5165</p>\n",
- "<h3>Multi-head Attention</h3>\n": "<h3>\u591a\u5934\u6ce8\u610f</h3>\n",
- "<h3>No Positional Embeddings</h3>\n<p>Source embedding without positional encodings</p>\n": "<h3>\u6ca1\u6709\u4f4d\u7f6e\u5d4c\u5165</h3>\n<p>\u4e0d\u5e26\u4f4d\u7f6e\u7f16\u7801\u7684\u6e90\u4ee3\u7801\u5d4c\u5165</p>\n",
- "<h3>ReLU activation</h3>\n<p><span translate=no>_^_0_^_</span></p>\n": "<h3>\u6fc0\u6d3b ReLU</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
+ "<h3>GELU activation</h3>\n<p><span translate=no>_^_0_^_</span> where <span translate=no>_^_1_^_</span></p>\n<p>It was introduced in paper <a href=\"https://arxiv.org/abs/1606.08415\">Gaussian Error Linear Units</a>.</p>\n": "<h3>GELU \u6fc0\u6d3b\u51fd\u6570</h3>\n<p><span translate=no>_^_0_^_</span>\u5176\u4e2d\uff0c<span translate=no>_^_1_^_</span></p>\n<p>\u8fd9\u662f\u5728\u8bba\u6587<a href=\"https://arxiv.org/abs/1606.08415\">\u300a Gaussian Error Linear Units \u300b</a>\u4e2d\u4ecb\u7ecd\u7684\u3002</p>\n",
+ "<h3>Learned Positional Embeddings</h3>\n<p>Source embedding with learned positional encodings</p>\n": "<h3>\u53ef\u5b66\u4e60\u7684\u4f4d\u7f6e\u5d4c\u5165</h3>\n<p>\u4f7f\u7528\u53ef\u5b66\u4e60\u7684\u4f4d\u7f6e\u7f16\u7801\u8fdb\u884c\u5d4c\u5165</p>\n",
+ "<h3>Multi-head Attention</h3>\n": "<h3>\u591a\u5934\u6ce8\u610f\u529b</h3>\n",
+ "<h3>No Positional Embeddings</h3>\n<p>Source embedding without positional encodings</p>\n": "<h3>\u65e0\u4f4d\u7f6e\u5d4c\u5165</h3>\n<p>\u6ca1\u6709\u4f4d\u7f6e\u7f16\u7801\u7684\u6e90\u5d4c\u5165</p>\n",
+ "<h3>ReLU activation</h3>\n<p><span translate=no>_^_0_^_</span></p>\n": "<h3>ReLU \u6fc0\u6d3b\u51fd\u6570</h3>\n<p><span translate=no>_^_0_^_</span></p>\n",
  "<h3>Relative Multi-head Attention</h3>\n": "<h3>\u76f8\u5bf9\u591a\u5934\u6ce8\u610f\u529b</h3>\n",
- "<p> <a id=\"FFN\"></a></p>\n<h2>FFN Configurations</h2>\n<p>Creates a Position-wise FeedForward Network defined in <a href=\"feed_forward.html\"><span translate=no>_^_0_^_</span></a>.</p>\n": "<p><a id=\"FFN\"></a></p>\n<h2>FFN \u914d\u7f6e</h2>\n<p>\u521b\u5efa\u5728\u4e2d\u5b9a\u4e49\u7684\u4f4d\u7f6e\u524d\u9988\u7f51\u7edc<a href=\"feed_forward.html\"><span translate=no>_^_0_^_</span></a>\u3002</p>\n",
- "<p> <a id=\"TransformerConfigs\"></a></p>\n<h2>Transformer Configurations</h2>\n<p>This defines configurations for a transformer. The configurations are calculate using option functions. These are lazy loaded and therefore only the necessary modules are calculated.</p>\n": "<p><a id=\"TransformerConfigs\"></a></p>\n<h2>\u53d8\u538b\u5668\u914d\u7f6e</h2>\n<p>\u8fd9\u5b9a\u4e49\u4e86\u53d8\u538b\u5668\u7684\u914d\u7f6e\u3002\u914d\u7f6e\u662f\u4f7f\u7528\u9009\u9879\u51fd\u6570\u8ba1\u7b97\u7684\u3002\u8fd9\u4e9b\u662f\u5ef6\u8fdf\u52a0\u8f7d\u7684\uff0c\u56e0\u6b64\u53ea\u8ba1\u7b97\u5fc5\u8981\u7684\u6a21\u5757\u3002</p>\n",
+ "<p> <a id=\"FFN\"></a></p>\n<h2>FFN Configurations</h2>\n<p>Creates a Position-wise FeedForward Network defined in <a href=\"feed_forward.html\"><span translate=no>_^_0_^_</span></a>.</p>\n": "<p><a id=\"FFN\"></a></p>\n<h2>FFN \u914d\u7f6e</h2>\n<p>\u5728<a href=\"feed_forward.html\"><span translate=no>_^_0_^_</span></a>\u4e2d\u5b9a\u4e49\u4e86\u4e00\u4e2a\u4f4d\u7f6e\u524d\u9988\u7f51\u7edc\u3002</p>\n",
+ "<p> <a id=\"TransformerConfigs\"></a></p>\n<h2>Transformer Configurations</h2>\n<p>This defines configurations for a transformer. The configurations are calculate using option functions. These are lazy loaded and therefore only the necessary modules are calculated.</p>\n": "<p><a id=\"TransformerConfigs\"></a></p>\n<h2>Transformer \u914d\u7f6e</h2>\n<p>\u8fd9\u5b9a\u4e49\u4e86 Transformer \u7684\u914d\u7f6e\u3002\u8fd9\u4e9b\u914d\u7f6e\u662f\u901a\u8fc7\u53ef\u9009\u62e9\u7684\u51fd\u6570\u8fdb\u884c\u8ba1\u7b97\u7684\u3002\u5b83\u4eec\u662f\u60f0\u6027\u52a0\u8f7d\u7684\uff0c\u56e0\u6b64\u53ea\u6709\u5fc5\u8981\u7684\u6a21\u5757\u624d\u4f1a\u88ab\u8ba1\u7b97\u3002</p>\n",
  "<p> Create feedforward layer configurations</p>\n": "<p>\u521b\u5efa\u524d\u9988\u5c42\u914d\u7f6e</p>\n",
  "<p> Decoder layer</p>\n": "<p>\u89e3\u7801\u5668\u5c42</p>\n",
  "<p> Decoder</p>\n": "<p>\u89e3\u7801\u5668</p>\n",
@@ -23,34 +23,34 @@
  "<p> Initialize a <a href=\"feed_forward.html\">feed forward network</a></p>\n": "<p>\u521d\u59cb\u5316<a href=\"feed_forward.html\">\u524d\u9988\u7f51\u7edc</a></p>\n",
  "<p> Logit generator</p>\n": "<p>Logit \u751f\u6210\u5668</p>\n",
  "<p> Target embedding with fixed positional encodings</p>\n": "<p>\u4f7f\u7528\u56fa\u5b9a\u4f4d\u7f6e\u7f16\u7801\u8fdb\u884c\u76ee\u6807\u5d4c\u5165</p>\n",
- "<p> Target embedding with learned positional encodings</p>\n": "<p>\u4f7f\u7528\u5b66\u4e60\u7684\u4f4d\u7f6e\u7f16\u7801\u8fdb\u884c\u76ee\u6807\u5d4c\u5165</p>\n",
- "<p>Activation in position-wise feedforward layer </p>\n": "<p>\u5728\u4f4d\u7f6e\u524d\u9988\u5c42\u6fc0\u6d3b</p>\n",
+ "<p> Target embedding with learned positional encodings</p>\n": "<p>\u4f7f\u7528\u53ef\u5b66\u4e60\u7684\u4f4d\u7f6e\u7f16\u7801\u8fdb\u884c\u76ee\u6807\u5d4c\u5165</p>\n",
+ "<p>Activation in position-wise feedforward layer </p>\n": "<p>\u4f4d\u7f6e\u524d\u9988\u5c42\u4e2d\u7684\u6fc0\u6d3b\u51fd\u6570</p>\n",
  "<p>Configurable Feedforward Layer </p>\n": "<p>\u53ef\u914d\u7f6e\u7684\u524d\u9988\u5c42</p>\n",
  "<p>Decoder layer </p>\n": "<p>\u89e3\u7801\u5668\u5c42</p>\n",
- "<p>Dropout probability </p>\n": "<p>\u8f8d\u5b66\u6982\u7387</p>\n",
- "<p>Embedding layer for source </p>\n": "<p>\u6e90\u7684\u5d4c\u5165\u5c42</p>\n",
- "<p>Embedding layer for target (for decoder) </p>\n": "<p>\u76ee\u6807\u5d4c\u5165\u5c42\uff08\u7528\u4e8e\u89e3\u7801\u5668\uff09</p>\n",
+ "<p>Dropout probability </p>\n": "<p>Dropout \u7387</p>\n",
+ "<p>Embedding layer for source </p>\n": "<p>\u6e90\u6570\u636e\u7684\u5d4c\u5165\u5c42</p>\n",
+ "<p>Embedding layer for target (for decoder) </p>\n": "<p>\u76ee\u6807\u6570\u636e\u7684\u5d4c\u5165\u5c42\uff08\u7528\u4e8e\u89e3\u7801\u5668\uff09</p>\n",
  "<p>Encoder consisting of multiple decoder layers </p>\n": "<p>\u7531\u591a\u4e2a\u89e3\u7801\u5668\u5c42\u7ec4\u6210\u7684\u7f16\u7801\u5668</p>\n",
  "<p>Encoder consisting of multiple encoder layers </p>\n": "<p>\u7531\u591a\u4e2a\u7f16\u7801\u5668\u5c42\u7ec4\u6210\u7684\u7f16\u7801\u5668</p>\n",
  "<p>Encoder layer </p>\n": "<p>\u7f16\u7801\u5668\u5c42</p>\n",
  "<p>Encoder-decoder </p>\n": "<p>\u7f16\u7801\u5668-\u89e3\u7801\u5668</p>\n",
  "<p>Logit generator for prediction </p>\n": "<p>\u7528\u4e8e\u9884\u6d4b\u7684 Logit \u751f\u6210\u5668</p>\n",
- "<p>Number of attention heads </p>\n": "<p>\u6ce8\u610f\u5934\u6570\u91cf</p>\n",
- "<p>Number of features in in the hidden layer </p>\n": "<p>\u9690\u85cf\u56fe\u5c42\u4e2d\u7684\u8981\u7d20\u6570\u91cf</p>\n",
- "<p>Number of features in the embedding </p>\n": "<p>\u5d4c\u5165\u4e2d\u7684\u8981\u7d20\u6570\u91cf</p>\n",
+ "<p>Number of attention heads </p>\n": "<p>\u6ce8\u610f\u529b\u5934\u6570\u91cf</p>\n",
+ "<p>Number of features in in the hidden layer </p>\n": "<p>\u9690\u85cf\u5c42\u4e2d\u7684\u7279\u5f81\u6570\u91cf</p>\n",
+ "<p>Number of features in the embedding </p>\n": "<p>\u5d4c\u5165\u7684\u7279\u5f81\u6570\u91cf</p>\n",
  "<p>Number of layers </p>\n": "<p>\u5c42\u6570</p>\n",
- "<p>Number of tokens in the source vocabulary (for token embeddings) </p>\n": "<p>\u6e90\u8bcd\u6c47\u8868\u4e2d\u7684\u6807\u8bb0\u6570\uff08\u7528\u4e8e\u4ee4\u724c\u5d4c\u5165\uff09</p>\n",
- "<p>Number of tokens in the target vocabulary (to generate logits for prediction) </p>\n": "<p>\u76ee\u6807\u8bcd\u6c47\u8868\u4e2d\u7684\u6807\u8bb0\u6570\uff08\u7528\u4e8e\u751f\u6210\u9884\u6d4b\u7684\u5bf9\u6570\uff09</p>\n",
+ "<p>Number of tokens in the source vocabulary (for token embeddings) </p>\n": "<p>\u6e90\u8bcd\u6c47\u8868\u4e2d\u7684 token \u6570\u91cf\uff08\u7528\u4e8e token \u5d4c\u5165\uff09</p>\n",
+ "<p>Number of tokens in the target vocabulary (to generate logits for prediction) </p>\n": "<p>\u76ee\u6807\u8bcd\u6c47\u8868\u4e2d\u7684 token \u6570\u91cf\uff08\u7528\u4e8e\u751f\u6210\u9884\u6d4b\u7684 logits \uff09</p>\n",
  "<p>Position-wise feedforward layer </p>\n": "<p>\u4f4d\u7f6e\u524d\u9988\u5c42</p>\n",
  "<p>Predefined GLU variants </p>\n": "<p>\u9884\u5b9a\u4e49\u7684 GLU \u53d8\u4f53</p>\n",
- "<p>The decoder memory attention </p>\n": "<p>\u89e3\u7801\u5668\u5185\u5b58\u6ce8\u610f\u4e8b\u9879</p>\n",
- "<p>The decoder self attention </p>\n": "<p>\u89e3\u7801\u5668\u81ea\u6211\u6ce8\u610f</p>\n",
- "<p>The encoder self attention </p>\n": "<p>\u7f16\u7801\u5668\u81ea\u6211\u6ce8\u610f</p>\n",
- "<p>Transformer embedding size </p>\n": "<p>\u53d8\u538b\u5668\u5d4c\u5165\u5c3a\u5bf8</p>\n",
+ "<p>The decoder memory attention </p>\n": "<p>\u89e3\u7801\u5668\u8bb0\u5fc6\u4e0e\u6ce8\u610f\u529b</p>\n",
+ "<p>The decoder self attention </p>\n": "<p>\u89e3\u7801\u5668\u81ea\u6ce8\u610f\u529b</p>\n",
+ "<p>The encoder self attention </p>\n": "<p>\u7f16\u7801\u5668\u81ea\u6ce8\u610f\u529b</p>\n",
+ "<p>Transformer embedding size </p>\n": "<p>Transformer \u5d4c\u5165\u5927\u5c0f</p>\n",
  "<p>Whether the FFN layer should be gated </p>\n": "<p>\u662f\u5426\u5e94\u5bf9 FFN \u5c42\u8fdb\u884c\u95e8\u63a7</p>\n",
- "<p>Whether the first fully connected layer should have a learnable bias </p>\n": "<p>\u7b2c\u4e00\u4e2a\u5b8c\u5168\u8fde\u63a5\u7684\u5c42\u662f\u5426\u5e94\u8be5\u6709\u53ef\u5b66\u4e60\u7684\u504f\u5dee</p>\n",
- "<p>Whether the fully connected layer for the gate should have a learnable bias </p>\n": "<p>\u6805\u6781\u7684\u5168\u8fde\u63a5\u5c42\u662f\u5426\u5e94\u5177\u6709\u53ef\u5b66\u4e60\u7684\u504f\u5dee</p>\n",
- "<p>Whether the second fully connected layer should have a learnable bias </p>\n": "<p>\u7b2c\u4e8c\u4e2a\u5168\u8fde\u63a5\u5c42\u662f\u5426\u5e94\u8be5\u6709\u53ef\u5b66\u4e60\u7684\u504f\u5dee</p>\n",
- "Configurable Transformer Components": "\u53ef\u914d\u7f6e\u53d8\u538b\u5668\u7ec4\u4ef6",
+ "<p>Whether the first fully connected layer should have a learnable bias </p>\n": "<p>\u7b2c\u4e00\u4e2a\u5168\u8fde\u63a5\u5c42\u662f\u5426\u5177\u6709\u53ef\u5b66\u4e60\u7684\u504f\u7f6e</p>\n",
+ "<p>Whether the fully connected layer for the gate should have a learnable bias </p>\n": "<p>\u95e8\u63a7\u7684\u5168\u8fde\u63a5\u5c42\u662f\u5426\u5177\u6709\u53ef\u5b66\u4e60\u7684\u504f\u7f6e</p>\n",
+ "<p>Whether the second fully connected layer should have a learnable bias </p>\n": "<p>\u7b2c\u4e8c\u4e2a\u5168\u8fde\u63a5\u5c42\u662f\u5426\u5177\u6709\u53ef\u5b66\u4e60\u7684\u504f\u7f6e</p>\n",
+ "Configurable Transformer Components": "\u53ef\u914d\u7f6e Transformer \u7ec4\u4ef6",
  "These are configurable components that can be re-used quite easily.": "\u8fd9\u4e9b\u662f\u53ef\u914d\u7f6e\u7684\u7ec4\u4ef6\uff0c\u53ef\u4ee5\u5f88\u5bb9\u6613\u5730\u91cd\u590d\u4f7f\u7528\u3002"
  }
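The keys in this file are the English HTML fragments from the annotated page and the values are their Chinese renderings; the <span translate=no>_^_N_^_</span> placeholders stand in for math and code that must survive translation untouched. A minimal sketch of how a refinement like this one could be sanity-checked is below; the file name configs.zh.json and the check itself are assumptions for illustration, not part of this commit or its tooling.

    import json
    import re

    # Placeholders such as _^_0_^_ mark no-translate spans and must be preserved verbatim.
    PLACEHOLDER = re.compile(r"_\^_\d+_\^_")

    # File name is assumed for illustration; the commit does not show the actual path.
    with open("configs.zh.json", encoding="utf-8") as f:
        translations = json.load(f)

    for source, translated in translations.items():
        # Every placeholder in the English source string should also appear in its translation.
        if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(translated)):
            print("Placeholder mismatch:", source[:60])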
