
Commit 33cbc9b

[Docs] Inferencer docs (#1744)
* [Enhancement] Support batch visualization & dumping in Inferencer
* fix empty det output
* Update mmocr/apis/inferencers/base_mmocr_inferencer.py
  Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
* [Docs] Inferencer docs
* fix
* Support weight_list
* add req
* improve md
* inferencers.md
* update
* add tab
* refine
* polish
* add cn docs
* js
* js
* js
* fix ch docs
* translate
* translate
* finish
* fix
* fix
* fix
* update
* standard inferencer
* update docs
* update docs
* update docs
* update docs
* update docs
* update docs
* en
* update
* update
* update
* update
* fix
* apply sugg

---------

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
1 parent cc78866 commit 33cbc9b

28 files changed: +1554 -394 lines changed


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@ instance/
 # Sphinx documentation
 docs/en/_build/
 docs/zh_cn/_build/
+docs/*/api/generated/

 # PyBuilder
 target/

configs/kie/sdmgr/metafile.yml

Lines changed: 1 addition & 1 deletion
@@ -48,5 +48,5 @@ Models:
         Metrics:
           macro_f1: 0.931
           micro_f1: 0.940
-          edgee_micro_f1: 0.792
+          edge_micro_f1: 0.792
     Weights: https://download.openmmlab.com/mmocr/kie/sdmgr/sdmgr_novisual_60e_wildreceipt-openset/sdmgr_novisual_60e_wildreceipt-openset_20220831_200807-dedf15ec.pth
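Metric keys like this are read programmatically, e.g. by tooling that builds model and weight lists from metafiles (cf. the commit's "Support weight_list" item), so the `edgee_micro_f1` typo fix, like the `hmean` to `hmean-iou` renames further down, is more than cosmetic. A minimal sketch of reading the corrected key, assuming PyYAML and a local MMOCR checkout:

```python
import yaml

# Read the corrected `edge_micro_f1` metric from the SDMGR metafile.
# Assumes the script runs from the root of an MMOCR checkout.
with open('configs/kie/sdmgr/metafile.yml') as f:
    metafile = yaml.safe_load(f)

for model in metafile['Models']:
    for result in model.get('Results', []):
        metrics = result.get('Metrics', {})
        if 'edge_micro_f1' in metrics:
            print(model['Name'], metrics['edge_micro_f1'])
```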

configs/textdet/drrg/metafile.yml

Lines changed: 0 additions & 12 deletions
@@ -26,15 +26,3 @@ Models:
         Metrics:
           hmean-iou: 0.8467
     Weights: https://download.openmmlab.com/mmocr/textdet/drrg/drrg_resnet50_fpn-unet_1200e_ctw1500/drrg_resnet50_fpn-unet_1200e_ctw1500_20220827_105233-d5c702dd.pth
-
-  - Name: drrg_resnet50-oclip_fpn-unet_1200e_ctw1500
-    In Collection: DRRG
-    Config: configs/textdet/drrg/drrg_resnet50-oclip_fpn-unet_1200e_ctw1500.py
-    Metadata:
-      Training Data: CTW1500
-    Results:
-      - Task: Text Detection
-        Dataset: CTW1500
-        Metrics:
-          hmean-iou:
-    Weights:

configs/textdet/maskrcnn/metafile.yml

Lines changed: 4 additions & 4 deletions
@@ -26,7 +26,7 @@ Models:
       - Task: Text Detection
         Dataset: CTW1500
         Metrics:
-          hmean: 0.7458
+          hmean-iou: 0.7458
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_ctw1500/mask-rcnn_resnet50_fpn_160e_ctw1500_20220826_154755-ce68ee8e.pth

   - Name: mask-rcnn_resnet50-oclip_fpn_160e_ctw1500
@@ -38,7 +38,7 @@ Models:
       - Task: Text Detection
         Dataset: CTW1500
         Metrics:
-          hmean: 0.7562
+          hmean-iou: 0.7562
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50-oclip_fpn_160e_ctw1500/mask-rcnn_resnet50-oclip_fpn_160e_ctw1500_20221101_154448-6e9e991c.pth

   - Name: mask-rcnn_resnet50_fpn_160e_icdar2015
@@ -51,7 +51,7 @@ Models:
       - Task: Text Detection
         Dataset: ICDAR2015
         Metrics:
-          hmean: 0.8182
+          hmean-iou: 0.8182
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_icdar2015/mask-rcnn_resnet50_fpn_160e_icdar2015_20220826_154808-ff5c30bf.pth

   - Name: mask-rcnn_resnet50-oclip_fpn_160e_icdar2015
@@ -64,5 +64,5 @@ Models:
       - Task: Text Detection
         Dataset: ICDAR2015
         Metrics:
-          hmean: 0.8513
+          hmean-iou: 0.8513
     Weights: https://download.openmmlab.com/mmocr/textdet/maskrcnn/mask-rcnn_resnet50-oclip_fpn_160e_icdar2015/mask-rcnn_resnet50-oclip_fpn_160e_icdar2015_20221101_131357-a19f7802.pth

configs/textrecog/master/README.md

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ Attention-based scene text recognizers have gained huge success, which leverages

 ```bibtex
 @article{Lu2021MASTER,
-  title={{MASTER}: Multi-Aspect Non-local Network for Scene Text Recognition},
+  title={MASTER: Multi-Aspect Non-local Network for Scene Text Recognition},
   author={Ning Lu and Wenwen Yu and Xianbiao Qi and Yihao Chen and Ping Gong and Rong Xiao and Xiang Bai},
   journal={Pattern Recognition},
   year={2021}

docs/en/_static/js/table.js

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+$(document).ready(function () {
+  table = $('.model-summary').DataTable({
+    "stateSave": false,
+    "lengthChange": false,
+    "pageLength": 10,
+    "order": [],
+    "scrollX": true,
+    "columnDefs": [
+      { "type": "summary", targets: '_all' },
+    ]
+  });
+  // Override the default sorting for the summary columns, which
+  // never takes the "-" character into account.
+  jQuery.extend(jQuery.fn.dataTableExt.oSort, {
+    "summary-asc": function (str1, str2) {
+      if (str1 == "<p>-</p>")
+        return 1;
+      if (str2 == "<p>-</p>")
+        return -1;
+      return ((str1 < str2) ? -1 : ((str1 > str2) ? 1 : 0));
+    },
+
+    "summary-desc": function (str1, str2) {
+      if (str1 == "<p>-</p>")
+        return 1;
+      if (str2 == "<p>-</p>")
+        return -1;
+      return ((str1 < str2) ? 1 : ((str1 > str2) ? -1 : 0));
+    }
+  });
+})
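The two comparators above pin placeholder cells (rendered as `<p>-</p>`) to the bottom of the table regardless of sort direction, which the default string sort would not do. An illustrative Python analogue of that ordering rule (not part of the commit):

```python
# Placeholder cells always sort last, whether ascending or descending,
# mirroring the "summary-asc"/"summary-desc" comparators in table.js.
PLACEHOLDER = '<p>-</p>'

def sort_summary(cells, descending=False):
    real = sorted((c for c in cells if c != PLACEHOLDER), reverse=descending)
    return real + [c for c in cells if c == PLACEHOLDER]

print(sort_summary(['<p>0.92</p>', '<p>-</p>', '<p>0.85</p>']))
# ['<p>0.85</p>', '<p>0.92</p>', '<p>-</p>']
```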

docs/en/basic_concepts/structures.md

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@ The conventions for the fields in `InstanceData` in MMOCR are shown in the table
 |             |                                    |             |
 | ----------- | ---------------------------------- | ----------- |
 | Field       | Type                               | Description |
-| bboxes      | `torch.FloatTensor`                | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes      | `torch.FloatTensor`                | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
 | labels      | `torch.LongTensor`                 | Instance label with the shape `(N, )`. By default, MMOCR uses `0` to represent the "text" class. |
 | polygons    | `list[np.array(dtype=np.float32)]` | Polygonal bounding boxes with the shape `(N, )`. |
 | scores      | `torch.Tensor`                     | Confidence scores of the predictions of bounding boxes. `(N, )`. |
@@ -99,7 +99,7 @@ The fields of [`InstanceData`](#instancedata) that will be used are:
 |          |                                    |             |
 | -------- | ---------------------------------- | ----------- |
 | Field    | Type                               | Description |
-| bboxes   | `torch.FloatTensor`                | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes   | `torch.FloatTensor`                | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
 | labels   | `torch.LongTensor`                 | Instance label with the shape `(N, )`. By default, MMOCR uses `0` to represent the "text" class. |
 | polygons | `list[np.array(dtype=np.float32)]` | Polygonal bounding boxes with the shape `(N, )`. |
 | scores   | `torch.Tensor`                     | Confidence scores of the predictions of bounding boxes. `(N, )`. |
@@ -182,7 +182,7 @@ The [`InstanceData`](#text-detection-instancedata) fields that will be used by t
 |             |                     |             |
 | ----------- | ------------------- | ----------- |
 | Field       | Type                | Description |
-| bboxes      | `torch.FloatTensor` | Bounding boxes `[x1, x2, y1, y2]` with the shape `(N, 4)`. |
+| bboxes      | `torch.FloatTensor` | Bounding boxes `[x1, y1, x2, y2]` with the shape `(N, 4)`. |
 | labels      | `torch.LongTensor`  | Instance label with the shape `(N, )`. |
 | texts       | `list[str]`         | The text content of each instance with the shape `(N, )`, used for e2e text spotting or KIE task. |
 | edge_labels | `torch.IntTensor`   | The node adjacency matrix with the shape `(N, N)`. In the KIE task, the optional values for the state between nodes are `-1` (ignored, not involved in loss calculation), `0` (disconnected) and `1` (connected). |
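The corrected `[x1, y1, x2, y2]` order is the top-left/bottom-right corner convention used throughout OpenMMLab. A minimal sketch of populating the documented fields, assuming `mmengine`'s `InstanceData` (the container behind these abstract data elements):

```python
import torch
from mmengine.structures import InstanceData

# Two text instances following the documented field conventions; boxes
# use the corrected [x1, y1, x2, y2] (top-left, bottom-right) order.
pred_instances = InstanceData()
pred_instances.bboxes = torch.FloatTensor([[10., 20., 110., 60.],
                                           [15., 70., 230., 120.]])
pred_instances.labels = torch.LongTensor([0, 0])  # 0 = "text" in MMOCR
pred_instances.scores = torch.Tensor([0.95, 0.88])
print(pred_instances.bboxes.shape)  # torch.Size([2, 4])
```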

docs/en/conf.py

Lines changed: 14 additions & 2 deletions
@@ -48,6 +48,7 @@
     'sphinx.ext.autodoc.typehints',
     'sphinx.ext.autosummary',
     'sphinx.ext.autosectionlabel',
+    'sphinx_tabs.tabs',
 ]
 autodoc_typehints = 'description'
 autodoc_mock_imports = ['mmcv._ext']
@@ -57,6 +58,8 @@
 copybutton_prompt_text = r'>>> |\.\.\. '
 copybutton_prompt_is_regexp = True

+myst_enable_extensions = ['colon_fence']
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']

@@ -149,8 +152,17 @@
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ['_static']
-html_css_files = ['css/readthedocs.css']
-html_js_files = ['js/collapsed.js']
+
+html_css_files = [
+    'https://cdn.datatables.net/1.13.2/css/dataTables.bootstrap5.min.css',
+    'css/readthedocs.css'
+]
+html_js_files = [
+    'https://cdn.datatables.net/1.13.2/js/jquery.dataTables.min.js',
+    'https://cdn.datatables.net/1.13.2/js/dataTables.bootstrap5.min.js',
+    'js/collapsed.js',
+    'js/table.js',
+]

 myst_heading_anchors = 4
docs/en/get_started/install.md

Lines changed: 75 additions & 32 deletions
@@ -27,43 +27,45 @@ conda activate openmmlab

 **Step 2.** Install PyTorch following [official instructions](https://pytorch.org/get-started/locally/), e.g.

-On GPU platforms:
+````{tabs}

-```shell
+```{code-tab} shell GPU Platform
 conda install pytorch torchvision -c pytorch
 ```

-On CPU platforms:
-
-```shell
+```{code-tab} shell CPU Platform
 conda install pytorch torchvision cpuonly -c pytorch
 ```

+````
+
 ## Installation Steps

 We recommend that users follow our best practices to install MMOCR. However, the whole process is highly customizable. See [Customize Installation](#customize-installation) section for more information.

 ### Best Practices

-**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine) and [MMCV](https://github.com/open-mmlab/mmcv) using [MIM](https://github.com/open-mmlab/mim).
+**Step 0.** Install [MMEngine](https://github.com/open-mmlab/mmengine), [MMCV](https://github.com/open-mmlab/mmcv) and [MMDetection](https://github.com/open-mmlab/mmdetection) using [MIM](https://github.com/open-mmlab/mim).

 ```shell
 pip install -U openmim
 mim install mmengine
 mim install 'mmcv>=2.0.0rc1'
+mim install 'mmdet>=3.0.0rc0'
 ```

-**Step 1.** Install [MMDetection](https://github.com/open-mmlab/mmdetection) as a dependency.
+**Step 1.** Install MMOCR.

-```shell
-pip install 'mmdet>=3.0.0rc0'
-```
+If you wish to run and develop MMOCR directly, install it from **source** (recommended).

-**Step 2.** Install MMOCR.
+If you use MMOCR as a dependency or third-party package, install it with **MIM**.

-Case A: If you wish to run and develop MMOCR directly, install it from source:
+`````{tabs}
+
+````{group-tab} Install from Source

 ```shell
+
 git clone https://github.com/open-mmlab/mmocr.git
 cd mmocr
 git checkout 1.x
@@ -72,58 +74,99 @@ pip install -v -e .
 # "-v" increases pip's verbosity.
 # "-e" means installing the project in editable mode,
 # That is, any local modifications on the code will take effect immediately.
+
 ```

-Case B: If you use MMOCR as a dependency or third-party package, install it with pip:
+````
+
+````{group-tab} Install via MIM

 ```shell
-pip install 'mmocr>=1.0.0rc0'
+
+mim install 'mmocr>=1.0.0rc0'
+
 ```

-**Step 3. (Optional)** If you wish to use any transform involving `albumentations` (For example, `Albu` in ABINet's pipeline), install the dependency using the following command:
+````
+
+`````
+
+**Step 2. (Optional)** If you wish to use any transform involving `albumentations` (For example, `Albu` in ABINet's pipeline), install the dependency using the following command:
+
+`````{tabs}
+
+````{group-tab} Install from Source

 ```shell
-# If MMOCR is installed from source
 pip install -r requirements/albu.txt
-# If MMOCR is installed via pip
+```
+
+````
+
+````{group-tab} Install via MIM
+
+```shell
 pip install albumentations>=1.1.0 --no-binary qudida,albumentations
 ```

+````
+
+`````
+
 ```{note}

 We recommend checking the environment after installing `albumentations` to
 ensure that `opencv-python` and `opencv-python-headless` are not installed together, otherwise it might cause unexpected issues. If that's unfortunately the case, please uninstall `opencv-python-headless` to make sure MMOCR's visualization utilities can work.

 Refer
-to ['albumentations`'s official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies) for more details.
+to [albumentations's official documentation](https://albumentations.ai/docs/getting_started/installation/#note-on-opencv-dependencies) for more details.

 ```

 ### Verify the installation

-We provide a method to verify the installation via inference demo, depending on your installation method. You should be able to see a pop-up image and the inference result upon successful verification.
+You may verify the installation via this inference demo.
+
+`````{tabs}
+
+````{tab} Python
+
+Run the following code in a Python interpreter:
+
+```python
+>>> from mmocr.apis import MMOCRInferencer
+>>> ocr = MMOCRInferencer(det='DBNet', rec='CRNN')
+>>> ocr('demo/demo_text_ocr.jpg', show=True, print_result=True)
+```
+````
+
+````{tab} Shell
+
+If you installed MMOCR from source, you can run the following in MMOCR's root directory:
+
+```shell
+python tools/infer.py demo/demo_text_ocr.jpg --det DBNet --rec CRNN --show --print-result
+```
+````
+
+`````
+
+You should be able to see a pop-up image and the inference result printed out in the console upon successful verification.

 <div align="center">
 <img src="https://user-images.githubusercontent.com/24622904/187825445-d30cbfa6-5549-4358-97fe-245f08f4ed94.jpg" height="250"/>
 </div>
+<br />

 ```bash
 # Inference result
-{'rec_texts': ['cbanke', 'docece', 'sroumats', 'chounsonse', 'doceca', 'c', '', 'sond', 'abrandso', 'sretane', '1', 'tosl', 'roundi', 'slen', 'yet', 'ally', 's', 'sue', 'salle', 'v'], 'rec_scores': [...], 'det_polygons': [...], 'det_scores': tensor([...])}
-```
-
-Run the following in MMOCR's directory:
-
-```bash
-python mmocr/ocr.py --det DB_r18 --recog CRNN demo/demo_text_ocr.jpg --show
+{'predictions': [{'rec_texts': ['cbanks', 'docecea', 'grouf', 'pwate', 'chobnsonsg', 'soxee', 'oeioh', 'c', 'sones', 'lbrandec', 'sretalg', '11', 'to8', 'round', 'sale', 'year',
+'ally', 'sie', 'sall'], 'rec_scores': [...], 'det_polygons': [...], 'det_scores':
+[...]}]}
 ```

-Also can run the following codes in your Python interpreter:
-
-```python
-from mmocr.ocr import MMOCR
-ocr = MMOCR(recog='CRNN', det='DB_r18')
-ocr.readtext('demo_text_ocr.jpg', show=True)
+```{note}
+If you are running MMOCR on a server without GUI or via SSH tunnel with X11 forwarding disabled, you may not see the pop-up window.
 ```

 ## Customize Installation
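
Beyond this verification demo, the commit message also notes batch visualization & dumping support in the Inferencer. A minimal sketch of that usage, assuming the `out_dir`, `save_vis` and `save_pred` parameters described in the new inferencer docs (check them against your installed MMOCR version):

```python
from mmocr.apis import MMOCRInferencer

# Sketch of batch inference with result dumping, per the commit's
# "Support batch visualization & dumping in Inferencer" change.
# `out_dir`, `save_vis` and `save_pred` follow the new docs; verify
# them against the MMOCR version you have installed.
ocr = MMOCRInferencer(det='DBNet', rec='CRNN')
results = ocr(
    ['demo/demo_text_ocr.jpg', 'demo/demo_text_det.jpg'],  # batched inputs
    out_dir='outputs/',  # root directory for dumped files
    save_vis=True,       # save rendered visualizations
    save_pred=True,      # save predictions as JSON
)
print(results['predictions'][0]['rec_texts'])
```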
