fastnlp
diff --git a/.gitignore b/.gitignore
Lines changed: 16 additions & 0 deletions
diff --git a/.travis.yml b/.travis.yml
Lines changed: 1 addition & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 45 additions & 32 deletions
diff --git a/docs/Makefile b/docs/Makefile
Lines changed: 3 additions & 0 deletions
diff --git a/docs/README.md b/docs/README.md
Lines changed: 41 additions & 0 deletions
diff --git a/docs/make.bat b/docs/make.bat
Lines changed: 0 additions & 36 deletions
diff --git a/docs/quick_tutorial.md b/docs/quick_tutorial.md
Lines changed: 0 additions & 2 deletions
diff --git a/docs/source/conf.py b/docs/source/conf.py
Lines changed: 2 additions & 2 deletions
diff --git a/docs/source/fastNLP.core.batch.rst b/docs/source/fastNLP.core.batch.rst
Lines changed: 3 additions & 3 deletions
diff --git a/docs/source/fastNLP.core.callback.rst b/docs/source/fastNLP.core.callback.rst
Lines changed: 3 additions & 3 deletions
@@ -0,0 +1,16 @@
+.gitignore
+
+.DS_Store
+.ipynb_checkpoints
+*.pyc
+__pycache__
+*.swp
+.vscode/
+.idea/**
+
+caches
+
+# fitlog
+.fitlog
+logs/
+.fitconfig
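To see which paths the new ignore rules cover, the patterns can be approximated in a few lines of Python. This is a rough sketch, not git's actual matcher: `is_ignored` and `PATTERNS` are illustrative names, every input is assumed to be a file path relative to the repository root, and only the three pattern shapes used in this `.gitignore` (plain names or globs, trailing `/`, trailing `/**`) are handled.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitignore added in this diff (comment line omitted).
PATTERNS = [
    ".gitignore", ".DS_Store", ".ipynb_checkpoints", "*.pyc", "__pycache__",
    "*.swp", ".vscode/", ".idea/**", "caches", ".fitlog", "logs/", ".fitconfig",
]


def is_ignored(file_path, patterns=PATTERNS):
    """Approximate whether a repo-relative file path is ignored (assumption: simplified rules)."""
    parts = PurePosixPath(file_path).parts
    for pattern in patterns:
        if pattern.endswith("/**"):
            # Pattern with an inner slash: anchored at the repo root, covers everything inside.
            if len(parts) > 1 and parts[0] == pattern[:-3]:
                return True
        elif pattern.endswith("/"):
            # Directory-only pattern: compare against parent directories, never the file name.
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Slash-free pattern: may match a file or directory name at any depth.
            return True
    return False
```

For example, `is_ignored("logs/run.txt")` is true because `logs/` matches the parent directory, while a plain file named `logs` is not covered by that directory-only rule.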
@@ -8,7 +8,7 @@ install:
   - pip install pytest-cov
 # command to run tests
 script:
-  - pytest --cov=./
+  - pytest --cov=./ test/
 
 after_success:
   - bash <(curl -s https://codecov.io/bash)
@@ -6,48 +6,69 @@
 ![Hex.pm](https://img.shields.io/hexpm/l/plug.svg)
 [![Documentation Status](https://readthedocs.org/projects/fastnlp/badge/?version=latest)](http://fastnlp.readthedocs.io/?badge=latest)
 
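-fastNLP is a lightweight NLP toolkit. You can use it to quickly complete a named entity recognition (NER), Chinese word segmentation, or text classification task; you can also use it to build many complex network models for research. It has the following features:
+fastNLP is a lightweight NLP toolkit. You can use it to quickly complete tasks such as sequence labeling ([NER](reproduction/seqence_labelling/ner), POS-Tagging, etc.), Chinese word segmentation, [text classification](reproduction/text_classification), [Matching](reproduction/matching), [coreference resolution](reproduction/coreference_resolution), and [summarization](reproduction/Summarization); you can also use it to build many complex network models for research. It has the following features: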
 
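-- A unified tabular data container that keeps data preprocessing clear and concise. Built-in DataSet Loaders for many datasets save you from writing preprocessing code.
-- Various convenient NLP tools, such as pretrained embedding loading and caching of intermediate data;
-- Detailed Chinese documentation for reference;
+- A unified tabular data container that keeps data preprocessing clear and concise. Built-in DataSet Loaders for many datasets save you from writing preprocessing code;
+- A variety of training and testing components, such as the Trainer, the Tester, and various evaluation metrics;
+- Various convenient NLP tools, such as pretrained embedding loading (including ELMo and BERT) and caching of intermediate data;
+- Detailed Chinese [documentation](https://fastnlp.readthedocs.io/) and [tutorials](https://fastnlp.readthedocs.io/zh/latest/user/tutorials.html) for reference;
 - Many advanced modules, such as Variational LSTM, Transformer, and CRF;
-- Wrapped models such as CNNText and Biaffine, ready to use;
+- Ready-to-use models for sequence labeling, Chinese word segmentation, text classification, Matching, coreference resolution, summarization, and other tasks; see the [reproduction](reproduction) section for details;
 - A convenient and extensible trainer, with many built-in callback functions for experiment logging, exception capture, and more.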
 
 
 ## Installation Guide
 
-fastNLP depends on the packages below:
+fastNLP depends on the following packages:
 
-+ numpy
-+ torch>=0.4.0
-+ tqdm
-+ nltk
++ numpy>=1.14.2
++ torch>=1.0.0
++ tqdm>=4.28.1
++ nltk>=3.4.1
++ requests
++ spacy
 
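-How you install torch may depend on your operating system and CUDA version; see the PyTorch website.
-With the dependencies installed, you can run the following command on the command line to complete the installation
+How you install torch may depend on your operating system and CUDA version; see the [PyTorch website](https://pytorch.org/).
+After the dependencies are installed, you can run the following command on the command line to complete the installation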
 
 ```shell
 pip install fastNLP
+python -m spacy download en
 ```
 
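+The version of fastNLP currently installed via pip is 0.4.1, which still lacks many recent features; the master branch has the latest content.
+fastNLP 0.5.0 will be released soon, so stay tuned.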
 
-## References
 
-- [Documentation](https://fastnlp.readthedocs.io/zh/latest/)
-- [Source code](https://github.com/fastnlp/fastNLP)
+## fastNLP Tutorials
+
+- [0. Quick Start](https://fastnlp.readthedocs.io/zh/latest/user/quickstart.html)
+- [1. Preprocessing text with DataSet](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_1_data_preprocess.html)
+- [2. Loading datasets with DataSetLoader](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_2_load_dataset.html)
+- [3. Converting text to vectors with the Embedding module](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_3_embedding.html)
+- [4. Building a text classifier I: fast training and testing with Trainer and Tester](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_4_loss_optimizer.html)
+- [5. Building a text classifier II: a custom training loop with DataSetIter](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_5_datasetiter.html)
+- [6. Quickly implementing a sequence labeling model](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_6_seq_labeling.html)
+- [7. Quickly building custom models with Modules and Models](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_7_modules_models.html)
+- [8. Quickly evaluating your model with Metric](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_8_metrics.html)
+- [9. Customizing your training process with Callback](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_9_callback.html)
+- [10. Using fitlog to assist research with fastNLP](https://fastnlp.readthedocs.io/zh/latest/tutorials/tutorial_10_fitlog.html)
 
 
 
 ## Built-in Components
 
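-Most neural networks used for NLP tasks can be viewed as composed of three kinds of modules: encoder, aggregator, and decoder.
+Most neural networks used for NLP tasks can be viewed as composed of word embeddings (embeddings) and two kinds of modules: the encoder and the decoder.
+
+Taking text classification as an example, the figure below shows the model flow of a text classifier implemented with BiLSTM+Attention: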
 
 
 ![](./docs/source/figures/text_classification.png)
 
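-fastNLP ships many components for these three kinds of modules in its modules module, helping users quickly build the networks they need. The functions and common components of the three kinds of modules are as follows:
+fastNLP provides several kinds of embeddings in its embeddings module: static embeddings (GloVe, word2vec), contextual embeddings
+(ELMo, BERT), and character embeddings (CNN- or LSTM-based CharEmbedding)
+
+In addition, fastNLP ships many components for these two kinds of modules in its modules module, helping users quickly build the networks they need. The functions and common components of the two kinds of modules are as follows: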
 
 <table>
 <tr>
@@ -57,29 +78,17 @@ fastNLP ships many components for these three kinds of modules in its modules module, helping
 </tr>
 <tr>
     <td> encoder </td>
-    <td> Encodes the input into vectors with  representational power </td>
+    <td> Encodes the input into vectors with representational power </td>
     <td> embedding, RNN, CNN, transformer
 </tr>
-<tr>
-    <td> aggregator </td>
-    <td> Aggregates information from multiple vectors </td>
-    <td> self-attention, max-pooling </td>
-</tr>
 <tr>
     <td> decoder </td>
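-    <td> Decodes vectors that carry some representational meaning  into the required  output form </td>
+    <td> Decodes vectors that carry some representational meaning into the required output form </td>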
     <td> MLP, CRF </td>
 </tr>
 </table>
 
 
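-## Complete Models
-fastNLP implements many complete models for different NLP tasks, all of which have been trained and tested.
-
-You can find related information in the following two places
-- [Introduction](reproduction/)
-- [Source code](fastNLP/models/)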
-
 ## Project Structure
 
 ![](./docs/source/figures/workflow.png)
@@ -93,7 +102,7 @@ The general workflow of fastNLP is shown in the figure above, and the project structure is as follows:
 </tr>
 <tr>
     <td><b> fastNLP.core </b></td>
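-    <td> Implements the core functionality, including data processing components, the trainer, the speed tester, etc. </td>
+    <td> Implements the core functionality, including data processing components, the trainer, the tester, etc. </td>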
 </tr>
 <tr>
     <td><b> fastNLP.models </b></td>
@@ -103,6 +112,10 @@ The general workflow of fastNLP is shown in the figure above, and the project structure is as follows:
     <td><b> fastNLP.modules </b></td>
     <td> Implements many components for building neural network models </td>
 </tr>
+<tr>
+    <td><b> fastNLP.embeddings </b></td>
+    <td> Implements converting index sequences into vector sequences, including loading pretrained embeddings </td>
+</tr>
 <tr>
     <td><b> fastNLP.io </b></td>
     <td> Implements reading and writing, including data loading and model reading/writing </td>
 
@@ -19,6 +19,9 @@ apidoc:
 server:
 	cd build/html && python -m http.server
 
+dev:
+	rm -rf build/html && make html && make server
+
 .PHONY: help Makefile
 
 # Catch-all target: route all unknown targets to Sphinx using the new
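For readers on systems without `rm` or `make` chaining, the three steps of the new `dev` target (clean, build, serve) can be sketched in Python. This is an illustrative equivalent rather than part of the patch: `clean_html` and `dev` are hypothetical helper names, and the sketch assumes `make html` is still available and that it is run from the `docs` directory.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def clean_html(docs_dir="."):
    """Mirror `rm -rf build/html`: remove the directory, ignoring a missing one."""
    html_dir = Path(docs_dir) / "build" / "html"
    shutil.rmtree(html_dir, ignore_errors=True)
    return html_dir


def dev(docs_dir=".", port=8000):
    """Mirror `rm -rf build/html && make html && make server` (blocks while serving)."""
    html_dir = clean_html(docs_dir)
    # Rebuild the HTML pages; check=True stops here if the build fails, like `&&`.
    subprocess.run(["make", "html"], cwd=docs_dir, check=True)
    # Serve the freshly built pages, as the existing `server` target does.
    subprocess.run([sys.executable, "-m", "http.server", str(port)], cwd=html_dir, check=True)
```

Calling `dev()` from the `docs` directory would then behave like `make dev`; the `port` parameter is an addition of this sketch, since the Makefile target relies on the default port of `http.server`.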
 
@@ -0,0 +1,41 @@
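+# Quick Start for Writing fastNLP Documentation
+
+This tutorial is written for fastNLP documentation writers, who include collaborating developers and documentation maintainers. In most cases you belong to the former group
+and only need to understand part of the overall framework.
+
+## Collaborating Developers
+
+fastNLP's documentation is generated with [Sphinx](http://sphinx.pocoo.org/), which is based on the
+[reStructuredText markup language](http://docutils.sourceforge.net/rst.html), and is built and maintained automatically by [Read the Docs](https://readthedocs.org/).
+Developers generally only need to write documentation that follows reStructuredText syntax and get it merged through a [PR](https://help.github.com/en/articles/about-pull-requests)
+to contribute to fastNLP's documentation.
+
+If you want to build the documentation locally and write large sections of it, you need to install Sphinx and the sphinx-rtd-theme theme: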
+```bash
+fastNLP/docs> pip install sphinx
+fastNLP/docs> pip install sphinx-rtd-theme
+```
+Then run `make dev` in this directory. The command only supports Linux and macOS, and you should see output like the following:
+```bash
+fastNLP/docs> make dev
+rm -rf build/html && make html && make server
+Running Sphinx v1.5.6
+making output directory...
+......
+Build finished. The HTML pages are in build/html.
+cd build/html && python -m http.server
+Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
+```
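+You can now open http://localhost:8000/ in your browser to view the documentation. If you are working on a remote server, the address is http://{server IP address}:8000/ .
+However, you must make sure that port 8000 on the server is open. If port 8000 on your computer or the remote server is already in use, the program will move on to ports 8001, 8002, and so on.
+When you are done, press Control (Ctrl) + C to stop the process.
+
+[Here](./source/user/example.rst) we list the reStructuredText syntax most often used in fastNLP's documentation (use Raw mode when viewing it on the web);
+reading it will get you up to speed quickly. Most of fastNLP's documentation is written in the code and extracted by Sphinx,
+and you can also refer to this [unfinished article](./source/user/docs_in_code.rst) for the conventions on writing in-code documentation.
+
+## Documentation Maintainers
+
+Documentation maintainers need to understand what every command in the Makefile does, and be aware that the current documentation structure
+was obtained by manually modifying the output extracted automatically by sphinx-apidoc.
+Documentation maintainers should further improve the automation of the whole framework and make sure collaborating developers do not break the overall structure of the documentation project.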
@@ -24,9 +24,9 @@
 author = 'xpqiu'
 
 # The short X.Y version
-version = '0.4'
+version = '0.4.5'
 # The full version, including alpha/beta/rc tags
-release = '0.4'
+release = '0.4.5'
 
 # -- General configuration ---------------------------------------------------
 
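The comment in `conf.py` describes `version` as the short X.Y form, while this change sets both values to `'0.4.5'`. If the short form is wanted, one common approach is to derive it from `release` so the two cannot drift apart. The sketch below is only a suggestion under that assumption; `short_version` is a hypothetical helper and is not part of the patch.

```python
# The full version, including alpha/beta/rc tags (value taken from this diff).
release = "0.4.5"


def short_version(full_release):
    """Return the X.Y prefix of a dotted release string."""
    return ".".join(full_release.split(".")[:2])


# The short X.Y version, derived instead of being written out twice.
version = short_version(release)
```

With this layout only `release` needs to be edited on the next bump.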
 
@@ -2,6 +2,6 @@ fastNLP.core.batch
 ==================
 
 .. automodule:: fastNLP.core.batch
-    :members:
-    :undoc-members:
-    :show-inheritance:
+   :members:
+   :undoc-members:
+   :show-inheritance:
@@ -2,6 +2,6 @@ fastNLP.core.callback
 =====================
 
 .. automodule:: fastNLP.core.callback
-    :members:
-    :undoc-members:
-    :show-inheritance:
+   :members:
+   :undoc-members:
+   :show-inheritance: