README.md
Lines changed: 35 additions & 13 deletions
@@ -19,7 +19,18 @@

 > ### Memory note
 >
-> The lite model runs offline and is memory-friendly; the full model is larger and offers higher accuracy. Choose the model that best fits your constraints.
+> The lite model runs offline and is memory-friendly; the full model is larger and offers higher accuracy.
+>
+> Approximate memory usage (RSS after load):
+> - Lite: ~45–60 MB
+> - Full: ~170–210 MB
+> - Auto: tries full first, falls back to lite only on MemoryError.
+>
+> Notes:
+> - Measurements vary by Python version, OS, allocator, and import graph; treat these as practical ranges.
+> - Validate on your system if constrained; see `examples/memory_usage_check.py` (credit: script by github@JackyHe398).
+>
+> Choose the model that best fits your constraints.

 ## Installation 💻

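For readers who want to check the ranges in the memory note on their own machine, here is a minimal standard-library sketch. It is not the repository's `examples/memory_usage_check.py`; the helper names `peak_rss_mb` and `rss_growth_mb` are hypothetical, the `resource` module is Unix-only, and growth in peak RSS is only a rough proxy for "RSS after load".

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def rss_growth_mb(load) -> float:
    """Call `load()` and return how much the peak RSS grew, in MiB."""
    before = peak_rss_mb()
    load()
    return peak_rss_mb() - before


# Hypothetical usage, assuming fast_langdetect is installed:
# from fast_langdetect import detect
# print(rss_growth_mb(lambda: detect("Hello", model="lite")))
```

Measure each model in a fresh process; peak RSS never decreases, so a second measurement in the same process would under-report.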
@@ -75,30 +86,35 @@ from fast_langdetect import LangDetectConfig, detect
# Set a default model via config and let calls omit model
+cfg_lite = LangDetectConfig(model="lite")
+print(detect("Hello", config=cfg_lite)) # uses lite by default
+print(detect("Bonjour", config=cfg_lite)) # uses lite by default
+print(detect("Hello", model='full', config=cfg_lite)) # per-call override to full
+
 ```

 ### Native API (Recommended)

 ```python
-from fast_langdetect import detect, LangDetector, LangDetectConfig, DetectError
+from fast_langdetect import detect, LangDetector, LangDetectConfig

-# Simple detection (auto behavior)
-print(detect("Hello, world!", model='auto', k=1))
+# Simple detection (uses config default if not provided; defaults to 'auto')
+print(detect("Hello, world!", k=1))
 # Output: [{'lang': 'en', 'score': 0.98}]

 # Using full model for better accuracy
 print(detect("Hello, world!", model='full', k=1))
 # Output: [{'lang': 'en', 'score': 0.99}]

 # Custom configuration
-config = LangDetectConfig(cache_dir="/custom/cache/path") # Custom model cache directory
+config = LangDetectConfig(cache_dir="/custom/cache/path", model="auto") # Custom cache + default model
 detector = LangDetector(config)

-try:
-    result = detector.detect("Hello world", model='full', k=1)
-    print(result) # [{'lang': 'en', 'score': 0.98}]
-except DetectError as e:
-    print(f"Detection failed: {e}")
+# Omit model to use config.model; pass model to override
+result = detector.detect("Hello world", k=1)
+print(result) # [{'lang': 'en', 'score': 0.98}]

 # Multiline text is handled automatically (newlines are replaced)
 multiline_text ="Hello, world!\nThis is a multiline text."
@@ -121,10 +137,16 @@ print(results)

 #### Fallback Policy (Keep It Simple)

-- Only MemoryError triggers fallback (in `model='auto'`): when loading the full model runs out of memory, it falls back to the lite model.
-- I/O/network/permission/path/integrity errors raise `DetectError` (with original exception); no silent fallback.
+- Only `MemoryError` triggers fallback (in `model='auto'`): when loading the full model runs out of memory, it falls back to the lite model.
+- I/O/network/permission/path/integrity errors raise standard exceptions (e.g., `FileNotFoundError`, `PermissionError`) or library-specific errors where applicable; no silent fallback.
 - `model='lite'` and `model='full'` never fall back, by design.

+#### Errors
+
+- Base error: `FastLangdetectError` (library-specific failures).
+- Model loading failures: `ModelLoadError`.
+- Standard Python exceptions (e.g., `ValueError`, `TypeError`, `FileNotFoundError`, `MemoryError`) propagate when they are not library-specific.
+
 ### Convenient `detect_language` Function

 ```python
@@ -177,7 +199,7 @@ print(detector.detect("Some very long text..."))
 ### Cache Directory Behavior

 - Default cache: if `cache_dir` is not set, models are stored under a system temp-based directory specified by `FTLANG_CACHE` or an internal default. This directory is created automatically when needed.
-- User-provided cache_dir: if you set `LangDetectConfig(cache_dir=...)` to a path that does not exist, the library raises `DetectError` instead of silently creating or using another location. Create the directory yourself if that’s intended.
+- User-provided cache_dir: if you set `LangDetectConfig(cache_dir=...)` to a path that does not exist, the library raises `FileNotFoundError` instead of silently creating or using another location. Create the directory yourself if that’s intended.
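Because a user-provided `cache_dir` must already exist, callers who want a custom location can create it before building the config. A minimal sketch, assuming a hypothetical helper name `ensure_cache_dir` and an illustrative path (neither comes from the library):

```python
from pathlib import Path


def ensure_cache_dir(path: str) -> str:
    """Create `path` (and any missing parents) if needed and return it as a string."""
    cache_dir = Path(path).expanduser()
    # exist_ok=True makes the call a no-op when the directory is already there.
    cache_dir.mkdir(parents=True, exist_ok=True)
    return str(cache_dir)


# Hypothetical usage, assuming fast_langdetect is installed:
# from fast_langdetect import LangDetectConfig
# config = LangDetectConfig(cache_dir=ensure_cache_dir("~/.cache/my-langdetect-models"))
```

Creating the directory explicitly keeps the choice of location with the caller, which matches the README's stance that the library should not silently create or substitute one.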
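The fallback policy in the README change (only `MemoryError` triggers fallback, and only in `model='auto'`) can be illustrated with a small self-contained sketch. This is not the library's implementation; `load_auto` and the two loader callables are hypothetical stand-ins for whatever loads the full and lite models.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_auto(load_full: Callable[[], T], load_lite: Callable[[], T]) -> T:
    """Try the full model first; fall back to lite only on MemoryError."""
    try:
        return load_full()
    except MemoryError:
        # Only out-of-memory triggers the fallback. Any other exception
        # (FileNotFoundError, PermissionError, ...) propagates to the caller.
        return load_lite()
```

Catching only `MemoryError` keeps I/O, permission, and integrity problems visible instead of hiding them behind a quieter, less accurate model.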