
Commit 33cbc9b

[Docs] Inferencer docs (#1744)
* [Enhancement] Support batch visualization & dumping in Inferencer
* fix empty det output
* Update mmocr/apis/inferencers/base_mmocr_inferencer.py
  Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
* [Docs] Inferencer docs
* fix
* Support weight_list
* add req
* improve md
* inferencers.md
* update
* add tab
* refine
* polish
* add cn docs
* js
* js
* js
* fix ch docs
* translate
* translate
* finish
* fix
* fix
* fix
* update
* standard inferencer
* update docs
* update docs
* update docs
* update docs
* update docs
* update docs
* en
* update
* update
* update
* update
* fix
* apply sugg

---------

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
1 parent cc78866 commit 33cbc9b

28 files changed: +1554 -394 lines changed


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@ instance/
 # Sphinx documentation
 docs/en/_build/
 docs/zh_cn/_build/
+docs/*/api/generated/

 # PyBuilder
 target/

configs/kie/sdmgr/metafile.yml

Lines changed: 1 addition & 1 deletion
@@ -48,5 +48,5 @@ Models:
         Metrics:
           macro_f1: 0.931
           micro_f1: 0.940
-          edgee_micro_f1: 0.792
+          edge_micro_f1: 0.792
     Weights: https://download.openmmlab.com/mmocr/kie/sdmgr/sdmgr_novisual_60e_wildreceipt-openset/sdmgr_novisual_60e_wildreceipt-openset_20220831_200807-dedf15ec.pth
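Metric keys like this are read programmatically, e.g. by tooling that builds model and weight lists from metafiles (cf. the commit's "Support weight_list" item), so the `edgee_micro_f1` typo fix, like the `hmean` to `hmean-iou` renames further down, is more than cosmetic. A minimal sketch of reading the corrected key, assuming PyYAML and a local MMOCR checkout:

```python
import yaml

# Read the corrected `edge_micro_f1` metric from the SDMGR metafile.
# Assumes the script runs from the root of an MMOCR checkout.
with open('configs/kie/sdmgr/metafile.yml') as f:
    metafile = yaml.safe_load(f)

for model in metafile['Models']:
    for result in model.get('Results', []):
        metrics = result.get('Metrics', {})
        if 'edge_micro_f1' in metrics:
            print(model['Name'], metrics['edge_micro_f1'])
```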

configs/textdet/drrg/metafile.yml

Lines changed: 0 additions & 12 deletions
@@ -26,15 +26,3 @@ Models:
         Metrics:
           hmean-iou: 0.8467
     Weights: https://download.openmmlab.com/mmocr/textdet/drrg/drrg_resnet50_fpn-unet_1200e_ctw1500/drrg_resnet50_fpn-unet_1200e_ctw1500_20220827_105233-d5c702dd.pth
-
-  - Name: drrg_resnet50-oclip_fpn-unet_1200e_ctw1500
-    In Collection: DRRG
-    Config: configs/textdet/drrg/drrg_resnet50-oclip_fpn-unet_1200e_ctw1500.py
-    Metadata:
-      Training Data: CTW1500
-    Results:
-      - Task: Text Detection
-        Dataset: CTW1500
-        Metrics:
-          hmean-iou:
-    Weights:

configs/textdet/maskrcnn/metafile.yml

Lines changed: 4 additions & 4 deletions
@@ -26,7 +26,7 @@ Models:
       - Task: Text Detection
         Dataset: CTW1500
         Metrics:
-          hmean: 0.7458
+          hmean-iou: 0.7458
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_ctw1500/mask-rcnn_resnet50_fpn_160e_ctw1500_20220826_154755-ce68ee8e.pth

   - Name: mask-rcnn_resnet50-oclip_fpn_160e_ctw1500
@@ -38,7 +38,7 @@ Models:
       - Task: Text Detection
         Dataset: CTW1500
         Metrics:
-          hmean: 0.7562
+          hmean-iou: 0.7562
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50-oclip_fpn_160e_ctw1500/mask-rcnn_resnet50-oclip_fpn_160e_ctw1500_20221101_154448-6e9e991c.pth

   - Name: mask-rcnn_resnet50_fpn_160e_icdar2015
@@ -51,7 +51,7 @@ Models:
       - Task: Text Detection
         Dataset: ICDAR2015
         Metrics:
-          hmean: 0.8182
+          hmean-iou: 0.8182
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_icdar2015/mask-rcnn_resnet50_fpn_160e_icdar2015_20220826_154808-ff5c30bf.pth

   - Name: mask-rcnn_resnet50-oclip_fpn_160e_icdar2015
@@ -64,5 +64,5 @@ Models:
       - Task: Text Detection
         Dataset: ICDAR2015
         Metrics:
-          hmean: 0.8513
+          hmean-iou: 0.8513
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50-oclip_fpn_160e_icdar2015/mask-rcnn_resnet50-oclip_fpn_160e_icdar2015_20221101_131357-a19f7802.pth

configs/textrecog/master/README.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ Attention-based scene text recognizers have gained huge success, which leverages

 ```bibtex
 @article{Lu2021MASTER,
-  title={{MASTER}: Multi-Aspect Non-local Network for Scene Text Recognition},
+  title={MASTER: Multi-Aspect Non-local Network for Scene Text Recognition},
   author={Ning Lu and Wenwen Yu and Xianbiao Qi and Yihao Chen and Ping Gong and Rong Xiao and Xiang Bai},
   journal={Pattern Recognition},
   year={2021}

docs/en/_static/js/table.js

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+$(document).ready(function () {
+  table = $('.model-summary').DataTable({
+    "stateSave": false,
+    "lengthChange": false,
+    "pageLength": 10,
+    "order": [],
+    "scrollX": true,
+    "columnDefs": [
+      { "type": "summary", targets: '_all' },
+    ]
+  });
+  // Override the default sorting for the summary columns, which
+  // never takes the "-" character into account.
+  jQuery.extend(jQuery.fn.dataTableExt.oSort, {
+    "summary-asc": function (str1, str2) {
+      if (str1 == "<p>-</p>")
+        return 1;
+      if (str2 == "<p>-</p>")
+        return -1;
+      return ((str1 < str2) ? -1 : ((str1 > str2) ? 1 : 0));
+    },
+
+    "summary-desc": function (str1, str2) {
+      if (str1 == "<p>-</p>")
+        return 1;
+      if (str2 == "<p>-</p>")
+        return -1;
+      return ((str1 < str2) ? 1 : ((str1 > str2) ? -1 : 0));
+    }
+  });
+})
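The two comparators above pin placeholder cells (rendered as `<p>-</p>`) to the bottom of the table regardless of sort direction, which the default string sort would not do. An illustrative Python analogue of that ordering rule (not part of the commit):

```python
# Placeholder cells always sort last, whether ascending or descending,
# mirroring the "summary-asc"/"summary-desc" comparators in table.js.
PLACEHOLDER = '<p>-</p>'

def sort_summary(cells, descending=False):
    real = sorted((c for c in cells if c != PLACEHOLDER), reverse=descending)
    return real + [c for c in cells if c == PLACEHOLDER]

print(sort_summary(['<p>0.92</p>', '<p>-</p>', '<p>0.85</p>']))
# ['<p>0.85</p>', '<p>0.92</p>', '<p>-</p>']
```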

docs/en/basic_concepts/structures.md

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@ The conventions for the fields in `InstanceData` in MMOCR are shown in the table
 |             |                                    |             |
 | ----------- | ---------------------------------- | ----------- |
 | Field       | Type                               | Description |
-| bboxes      | `torch.FloatTensor`                | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes      | `torch.FloatTensor`                | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
 | labels      | `torch.LongTensor`                 | Instance label with the shape `(N, )`. By default, MMOCR uses `0` to represent the "text" class. |
 | polygons    | `list[np.array(dtype=np.float32)]` | Polygonal bounding boxes with the shape `(N, )`. |
 | scores      | `torch.Tensor`                     | Confidence scores of the predictions of bounding boxes. `(N, )`. |
@@ -99,7 +99,7 @@ The fields of [`InstanceData`](#instancedata) that will be used are:
 |          |                                    |             |
 | -------- | ---------------------------------- | ----------- |
 | Field    | Type                               | Description |
-| bboxes   | `torch.FloatTensor`                | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes   | `torch.FloatTensor`                | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
 | labels   | `torch.LongTensor`                 | Instance label with the shape `(N, )`. By default, MMOCR uses `0` to represent the "text" class. |
 | polygons | `list[np.array(dtype=np.float32)]` | Polygonal bounding boxes with the shape `(N, )`. |
 | scores   | `torch.Tensor`                     | Confidence scores of the predictions of bounding boxes. `(N, )`. |
@@ -182,7 +182,7 @@ The [`InstanceData`](#text-detection-instancedata) fields that will be used by t
 |             |                     |             |
 | ----------- | ------------------- | ----------- |
 | Field       | Type                | Description |
-| bboxes      | `torch.FloatTensor` | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes      | `torch.FloatTensor` | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
 | labels      | `torch.LongTensor`  | Instance label with the shape `(N, )`. |
 | texts       | `list[str]`         | The text content of each instance with the shape `(N, )`, used for e2e text spotting or KIE task. |
 | edge_labels | `torch.IntTensor`   | The node adjacency matrix with the shape `(N, N)`. In the KIE task, the optional values for the state between nodes are `-1` (ignored, not involved in loss calculation), `0` (disconnected) and `1` (connected). |
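The corrected `[x1, y1, x2, y2]` order is the top-left/bottom-right corner convention used throughout OpenMMLab. A minimal sketch of populating the documented fields, assuming `mmengine`'s `InstanceData` (the container behind these abstract data elements):

```python
import torch
from mmengine.structures import InstanceData

# Two text instances following the documented field conventions; boxes
# use the corrected [x1, y1, x2, y2] (top-left, bottom-right) order.
pred_instances = InstanceData()
pred_instances.bboxes = torch.FloatTensor([[10., 20., 110., 60.],
                                           [15., 70., 230., 120.]])
pred_instances.labels = torch.LongTensor([0, 0])  # 0 = "text" in MMOCR
pred_instances.scores = torch.Tensor([0.95, 0.88])
print(pred_instances.bboxes.shape)  # torch.Size([2, 4])
```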

docs/en/conf.py

Lines changed: 14 additions & 2 deletions
@@ -48,6 +48,7 @@
     'sphinx.ext.autodoc.typehints',
     'sphinx.ext.autosummary',
     'sphinx.ext.autosectionlabel',
+    'sphinx_tabs.tabs',
 ]
 autodoc_typehints = 'description'
 autodoc_mock_imports = ['mmcv._ext']
@@ -57,6 +58,8 @@
 copybutton_prompt_text = r'>>> |\.\.\. '
 copybutton_prompt_is_regexp = True

+myst_enable_extensions = ['colon_fence']
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']

@@ -149,8 +152,17 @@
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ['_static']
-html_css_files = ['css/readthedocs.css']
-html_js_files = ['js/collapsed.js']
+
+html_css_files = [
+    'https://cdn.datatables.net/1.13.2/css/dataTables.bootstrap5.min.css',
+    'css/readthedocs.css'
+]
+html_js_files = [
+    'https://cdn.datatables.net/1.13.2/js/jquery.dataTables.min.js',
+    'https://cdn.datatables.net/1.13.2/js/dataTables.bootstrap5.min.js',
+    'js/collapsed.js',
+    'js/table.js',
+]

 myst_heading_anchors = 4
docs/en/get_started/install.md

Lines changed: 75 additions & 32 deletions
@@ -27,43 +27,45 @@ conda activate openmmlab

 **Step 2.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.

-On GPU platforms:
+````{tabs}

-```shell
+```{code-tab} shell GPU Platform
 conda install pytorch torchvision -c pytorch
 ```

-On CPU platforms:
-
-```shell
+```{code-tab} shell CPU Platform
 conda install pytorch torchvision cpuonly -c pytorch
 ```

+````
+
 ## Installation Steps

 We recommend that users follow our best practices to install MMOCR. However, the whole process is highly customizable. See [Customize Installation](#customize-installation) section for more information.

 ### Best Practices

-**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine) and [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
+**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine), [MMCV](https://github.com/open-mmlab/mmcv) and [MMDetection](https://github.com/open-mmlab/mmdetection) using [MIM](https://github.com/open-mmlab/mim).

 ```shell
 pip install -U openmim
 mim install mmengine
 mim install 'mmcv>=2.0.0rc1'
+mim install 'mmdet>=3.0.0rc0'
 ```

-**Step 1.** Install [MMDetection](https://github.com/open-mmlab/mmdetection) as a dependency.
+**Step 1.** Install MMOCR.

-```shell
-pip install 'mmdet>=3.0.0rc0'
-```
+If you wish to run and develop MMOCR directly, install it from **source** (recommended).

-**Step 2.** Install MMOCR.
+If you use MMOCR as a dependency or third-party package, install it with **MIM**.

-Case A: If you wish to run and develop MMOCR directly, install it from source:
+`````{tabs}
+
+````{group-tab} Install from Source

 ```shell
+
 git clone https://github.com/open-mmlab/mmocr.git
 cd mmocr
 git checkout 1.x
@@ -72,58 +74,99 @@ pip install -v -e .
 # "-v" increases pip's verbosity.
 # "-e" means installing the project in editable mode,
 # That is, any local modifications on the code will take effect immediately.
+
 ```

-Case B: If you use MMOCR as a dependency or third-party package, install it with pip:
+````
+
+````{group-tab} Install via MIM

 ```shell
-pip install 'mmocr>=1.0.0rc0'
+
+mim install 'mmocr>=1.0.0rc0'
+
 ```

-**Step 3. (Optional)** If you wish to use any transform involving `albumentations` (For example, `Albu` in ABINet's pipeline), install the dependency using the following command:
+````
+
+`````
+
+**Step 2. (Optional)** If you wish to use any transform involving `albumentations` (For example, `Albu` in ABINet's pipeline), install the dependency using the following command:
+
+`````{tabs}
+
+````{group-tab} Install from Source

 ```shell
-# If MMOCR is installed from source
 pip install -r requirements/albu.txt
-# If MMOCR is installed via pip
+```
+
+````
+
+````{group-tab} Install via MIM
+
+```shell
 pip install albumentations>=1.1.0 --no-binary qudida,albumentations
 ```

+````
+
+`````
+
 ```{note}

 We recommend checking the environment after installing `albumentations` to
 ensure that `opencv-python` and `opencv-python-headless` are not installed together, otherwise it might cause unexpected issues. If that's unfortunately the case, please uninstall `opencv-python-headless` to make sure MMOCR's visualization utilities can work.

 Refer
-to ['albumentations`'s official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies) for more details.
+to [albumentations's official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies) for more details.

 ```

 ### Verify the installation

-We provide a method to verify the installation via inference demo, depending on your installation method. You should be able to see a pop-up image and the inference result upon successful verification.
+You may verify the installation via this inference demo.
+
+`````{tabs}
+
+````{tab} Python
+
+Run the following code in a Python interpreter:
+
+```python
+>>> from mmocr.apis import MMOCRInferencer
+>>> ocr = MMOCRInferencer(det='DBNet', rec='CRNN')
+>>> ocr('demo/demo_text_ocr.jpg', show=True, print_result=True)
+```
+````
+
+````{tab} Shell
+
+If you installed MMOCR from source, you can run the following in MMOCR's root directory:
+
+```shell
+python tools/infer.py demo/demo_text_ocr.jpg --det DBNet --rec CRNN --show --print-result
+```
+````
+
+`````
+
+You should be able to see a pop-up image and the inference result printed out in the console upon successful verification.

 <div align="center">
 <img src="https://user-images.githubusercontent.com/24622904/187825445-d30cbfa6-5549-4358-97fe-245f08f4ed94.jpg" height="250"/>
 </div>
+<br />

 ```bash
 # Inference result
-{'rec_texts': ['cbanke', 'docece', 'sroumats', 'chounsonse', 'doceca', 'c', '', 'sond', 'abrandso', 'sretane', '1', 'tosl', 'roundi', 'slen', 'yet', 'ally', 's', 'sue', 'salle', 'v'], 'rec_scores': [...], 'det_polygons': [...], 'det_scores': tensor([...])}
-```
-
-Run the following in MMOCR's directory:
-
-```bash
-python mmocr/ocr.py --det DB_r18 --recog CRNN demo/demo_text_ocr.jpg --show
+{'predictions': [{'rec_texts': ['cbanks', 'docecea', 'grouf', 'pwate', 'chobnsonsg', 'soxee', 'oeioh', 'c', 'sones', 'lbrandec', 'sretalg', '11', 'to8', 'round', 'sale', 'year',
+'ally', 'sie', 'sall'], 'rec_scores': [...], 'det_polygons': [...], 'det_scores':
+[...]}]}
 ```

-Also can run the following codes in your Python interpreter:
-
-```python
-from mmocr.ocr import MMOCR
-ocr = MMOCR(recog='CRNN', det='DB_r18')
-ocr.readtext('demo_text_ocr.jpg', show=True)
+```{note}
+If you are running MMOCR on a server without GUI or via SSH tunnel with X11 forwarding disabled, you may not see the pop-up window.
 ```

 ## Customize Installation
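
Beyond this verification demo, the commit message also notes batch visualization & dumping support in the Inferencer. A minimal sketch of that usage, assuming the `out_dir`, `save_vis` and `save_pred` parameters described in the new inferencer docs (check them against your installed MMOCR version):

```python
from mmocr.apis import MMOCRInferencer

# Sketch of batch inference with result dumping, per the commit's
# "Support batch visualization & dumping in Inferencer" change.
# `out_dir`, `save_vis` and `save_pred` follow the new docs; verify
# them against the MMOCR version you have installed.
ocr = MMOCRInferencer(det='DBNet', rec='CRNN')
results = ocr(
    ['demo/demo_text_ocr.jpg', 'demo/demo_text_det.jpg'],  # batched inputs
    out_dir='outputs/',  # root directory for dumped files
    save_vis=True,       # save rendered visualizations
    save_pred=True,      # save predictions as JSON
)
print(results['predictions'][0]['rec_texts'])
```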
