not-hanjo-mei
diff --git a/.gitignore b/.gitignore
Lines changed: 3 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 2 additions & 0 deletions
diff --git a/changes.txt b/changes.txt
15.5 KB
diff --git a/docs/deploy_on_android.md b/docs/deploy_on_android.md
Lines changed: 182 additions & 0 deletions
diff --git a/docs/install.md b/docs/install.md
Lines changed: 10 additions & 3 deletions
diff --git a/docs/training.md b/docs/training.md
Lines changed: 5 additions & 5 deletions
diff --git a/docs/webapi.md b/docs/webapi.md
Lines changed: 87 additions & 0 deletions
diff --git a/melo/api.py b/melo/api.py
Lines changed: 4 additions & 2 deletions
diff --git a/melo/text/chinese_mix.py b/melo/text/chinese_mix.py
Lines changed: 1 addition & 1 deletion
@@ -6,6 +6,9 @@ multilingual_ckpts
 basetts_outputs_package/
 build/
 *.egg-info/
+melo/data/Teto/
 
 *.zip
+*.mp3
+*.flac
 *.wav
@@ -27,6 +27,8 @@ Some other features include:
 ## Usage
 - [Use without Installation](docs/quick_use.md)
 - [Install and Use Locally](docs/install.md)
+- [Deploy on Android](docs/deploy_on_android.md)
+- [OpenAI-compatible Web API](docs/webapi.md)
 - [Training on Custom Dataset](docs/training.md)
 
 The Python API and model cards can be found in [this repo](https://github.com/myshell-ai/MeloTTS/blob/main/docs/install.md#python-api) or on [HuggingFace](https://huggingface.co/myshell-ai).
 
@@ -0,0 +1,182 @@
+# Tutorial: Deploying MeloTTS on Android using Termux + proot Debian + Micromamba
+
+MeloTTS is designed to be lightweight and efficient, making it a viable choice for deployment on Android devices.
+
+This guide details how to install and run MeloTTS on an Android device using Termux, a Debian environment managed by `proot-distro`, and Micromamba for isolated Python environment management.
+
+**Disclaimer:** This setup is experimental. Do not use it in production.
+
+## Prerequisites
+
+1. **Android Device:** A reasonably capable Android phone (a flagship model released after 2022 is recommended).
+   - **Important for Android 12+ users:** Before installing Termux, you should disable phantom process killing to prevent background processes from being terminated unexpectedly. This requires either root access or ADB with proper permissions. See the [Troubleshooting section](#troubleshooting-and-tips) for detailed instructions.
+2. **Termux App:** Installed on your device. Download the latest release from the [official GitHub repository](https://github.com/termux/termux-app/releases)
+3. **Internet Connection**
+4. **Storage Space:** Sufficient free space for Termux, Debian rootfs, Micromamba environments, Python dependencies (PyTorch), and MeloTTS source/models
+5. **Basic Linux Command Line Familiarity:** Helpful
+
+## Step 1: Install and Prepare Termux
+
+1. **Download and Install Termux:** Go to the [Termux GitHub Releases page](https://github.com/termux/termux-app/releases), download the latest `.apk` file appropriate for your device's architecture (usually `arm64-v8a`), and install it. Enable installation from unknown sources in Android settings if needed.
+2. **Open Termux.**
+3. **Update and upgrade Termux packages:** Run this command and answer `Y` (yes) to any prompts.
+   ```bash
+   pkg update && pkg upgrade -y
+   ```
+4. **Install `proot-distro`, `git`, and `curl`:** `proot-distro` manages Linux distributions, `git` clones repositories, and `curl` downloads files.
+   ```bash
+   pkg install proot-distro git curl -y
+   ```
+5. **Grant Storage Access:** Allows Termux/Debian to access your phone's shared storage.
+   ```bash
+   termux-setup-storage
+   ```
+   Confirm the permission request from Android. Shared storage is typically at `~/storage/shared/`.
+
+## Step 2: Install Debian Environment
+
+1. **Install Debian using `proot-distro`:** Downloads the Debian filesystem.
+   ```bash
+   proot-distro install debian
+   ```
+
+## Step 3: Enter Debian, Install Micromamba and System Dependencies
+
+1. **Log in to Debian:**
+   ```bash
+   proot-distro login debian
+   ```
+   Your prompt should change (e.g., `root@localhost:~#`). **All subsequent commands in Steps 3-5 are run inside this Debian environment unless stated otherwise.**
+2. **Update Debian's package list and upgrade packages:**
+   ```bash
+   apt update && apt upgrade -y
+   ```
+3. **Install essential build tools and runtime dependencies:**
+   ```bash
+   apt install -y build-essential libsndfile1 ffmpeg curl bzip2 git nano mecab libmecab-dev mecab-ipadic-utf8
+   ```
+4. **Install Micromamba:** Run the official installation script.
+   ```bash
+   "${SHELL}" <(curl -L micro.mamba.pm/install.sh)
+   ```
+   *Follow any on-screen instructions. Defaults are usually fine.*
+5. **Initialize Shell for Micromamba:** Ensure the `micromamba` command is accessible.
+   ```bash
+   source ~/.bashrc
+   ```
+   *(Or exit and re-login to Debian: `exit`, then `proot-distro login debian`)*.
+6. **Verify Micromamba Installation:**
+   ```bash
+   micromamba --version
+   ```
+
+## Step 4: Create Micromamba Environment and Install MeloTTS
+
+1. **Create a dedicated environment:** Use Python 3.10.
+   ```bash
+   micromamba create -n melotts python=3.10 -c conda-forge -y
+   ```
+2. **Activate the environment:** **Crucial step before proceeding.**
+   ```bash
+   micromamba activate melotts
+   ```
+   Your prompt should now be prefixed with `(melotts)`.
+3. **Clone the MeloTTS Repository:** Navigate to a suitable directory (e.g., `~/`) and clone the repo.
+   ```bash
+   # Example: Clone into ~/MeloTTS
+   cd ~
+   git clone https://github.com/not-hanjo-mei/MeloTTS.git
+   ```
+4. **Navigate into the Cloned Directory:**
+   ```bash
+   cd MeloTTS
+   ```
+5. **Install MeloTTS and Dependencies:**
+   ```bash
+   pip install -e .
+   ```
+6. **Download Japanese Dictionary Data (UniDic):**
+   ```bash
+   python -m unidic download
+   ```
+7. ***(Optional) Install eunjeon (for Korean support):***
+   ```bash
+   pip install eunjeon python-mecab-ko python-mecab-ko-dic
+   ```
+8. **Download NLTK tagger:**
+   ```bash
+   python -m nltk.downloader averaged_perceptron_tagger_eng
+   ```
+   If the NLTK download fails, try this alternative method:
+   ```bash
+   # Ensure you're in the MeloTTS directory and have NLTK installed
+   python webapi/nltk_res.py
+   ```
+## Step 5: Use MeloTTS (Inside Activated Environment)
+
+**IMPORTANT:** Ensure the `melotts` environment is active (`micromamba activate melotts`). Check for the `(melotts)` prefix in your prompt.
+
+**Note:** Example scripts and test resources are available in the `test/` directory of the MeloTTS repository. Use them (for example, `test_base_model_tts_package.py` and the sample text files) to verify your installation or experiment with the TTS functionality.
+
+For the latest and most detailed usage instructions (including WebUI, CLI, and Python API), please refer to the **Usage** section in [install.md](./install.md#usage).
+
+That Usage section covers how to:
+
+- Launch and use the WebUI
+- Use the command-line interface (CLI) for TTS
+- Use the Python API for programmatic access
+- Find example scripts and test resources
+
+## Step 6: Accessing Output Files from Android (Inside Debian)
+
+1. **Identify File Location:** Use `pwd` (e.g., `~/MeloTTS/`).
+2. **Copy Files to Shared Storage:**
+   * Example: Copy `output.wav` to Downloads folder:
+     ```bash
+     # Adjust path if needed
+     cp ./output.wav /sdcard/Download/output.wav
+     ```
+3. **Access on Android:** Use a File Manager app.
+
+## Step 7: Exiting and Re-entering (Inside Debian)
+
+1. **To fully exit:** Run `exit` to leave Debian, then run `exit` again at the Termux prompt (or close the app).
+2. **To re-enter:**
+   * Open Termux.
+   * `proot-distro login debian`
+   * `micromamba activate melotts`
+   * `cd ~/MeloTTS` (if needed)
+   * Run commands.
+   * `micromamba deactivate`, `exit` when done.
+
+## Troubleshooting and Tips
+
+* **Check Environment Activation:** Always ensure `(melotts)` prefix is present.
+* **MeCab Issues:** If you see "RuntimeError: Could not configure working env. Have you installed MeCab?" during `pip install -e .` or runtime, this could be due to:
+  - Missing system packages: Ensure `mecab`, `libmecab-dev`, and `mecab-ipadic-utf8` are properly installed via `apt`.
+  - Python version mismatch: Make sure you created the Micromamba environment with Python 3.10 as specified in [Step 4](#step-4-create-micromamba-environment-and-install-melotts).
+* **Phantom Process Killing (Android 12+):** Android 12 and newer versions limit background processes, which can affect Termux. If you experience processes being killed unexpectedly:
+  - **Using ADB from a PC (Recommended):** Connect your Android device to a PC with ADB installed and run:
+    ```bash
+    # Disable phantom process killing
+    adb shell settings put global settings_enable_monitor_phantom_procs 0
+
+    # Set max_phantom_processes to maximum value to permanently disable killing of phantom processes
+    adb shell "/system/bin/device_config put activity_manager max_phantom_processes 2147483647"
+    ```
+  - Alternatively, for Android 12+, you can disable phantom process killing by running these commands in Termux (not in Debian):
+    ```bash
+    # Disable phantom process killing
+    settings put global settings_enable_monitor_phantom_procs 0
+    
+    # Disable device config sync to prevent settings from being reset
+    device_config set_sync_disabled_for_tests persistent
+    
+    # Verify settings
+    settings get global settings_enable_monitor_phantom_procs
+    device_config get_sync_disabled_for_tests
+    ```
+  - These commands require either root access or ADB with proper permissions.
+  - This is especially important for long-running processes or when multiple processes are spawned.
+  - For more detailed instructions on disabling phantom process killing, refer to [this comprehensive guide](https://github.com/agnostic-apollo/Android-Docs/blob/master/en/docs/apps/processes/phantom-cached-and-empty-processes.md#commands-to-disable-phantom-process-killing-and-tldr).
+* **Performance/RAM:** Mobile devices remain significantly constrained in CPU performance and RAM, so expect slower synthesis than on a desktop machine.
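The first two troubleshooting checks in the guide above (missing system packages, Python version mismatch) can be automated before running `pip install -e .`. The following is a minimal sketch, not part of the MeloTTS repository: the helper names `check_tools` and `python_version_ok` are hypothetical, and the tool list is an assumption based on the `apt install` line in Step 3.

```python
import shutil
import sys

# Executables this guide installs with apt in Step 3 (assumed list).
REQUIRED_TOOLS = ("git", "curl", "ffmpeg", "mecab")


def check_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Map each tool name to True if an executable with that name is on PATH."""
    return {tool: which(tool) is not None for tool in tools}


def python_version_ok(version_info=sys.version_info, expected=(3, 10)):
    """Return True if the interpreter's major.minor matches the guide's Python 3.10."""
    return tuple(version_info[:2]) == expected


if __name__ == "__main__":
    # Print one status line per check so missing pieces are easy to spot.
    for tool, found in check_tools().items():
        print(f"{tool}: {'ok' if found else 'MISSING'}")
    print(f"python 3.10: {'ok' if python_version_ok() else 'MISMATCH'}")
```

Run it inside the Debian environment with the `melotts` environment active. Any `MISSING` or `MISMATCH` line points back to Step 3 or Step 4 respectively.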
@@ -5,13 +5,16 @@
 - [Docker Install for Windows and macOS](#docker-install)
 - [Usage](#usage)
   - [Web UI](#webui)
+  - [Web API (OpenAI Compatible)](#web-api-openai-compatible)
   - [CLI](#cli)
   - [Python API](#python-api)
 
 ### Linux and macOS Install
-The repo is developed and tested on `Ubuntu 20.04` and `Python 3.9`.
+**Tested Environments:**
+- [Original repository](https://github.com/myshell-ai/MeloTTS): Ubuntu 20.04 + Python 3.9
+- [This fork](https://github.com/not-hanjo-mei/MeloTTS): Ubuntu 24.04 + Python 3.10 (conda 24.9.2), Debian 12 + Python 3.10 (Micromamba 2.1.0)
 ```bash
-git clone https://github.com/myshell-ai/MeloTTS.git
+git clone https://github.com/not-hanjo-mei/MeloTTS.git
 cd MeloTTS
 pip install -e .
 python -m unidic download
@@ -25,7 +28,7 @@ To avoid compatibility issues, for Windows users and some macOS users, we sugges
 
 This could take a few minutes.
 ```bash
-git clone https://github.com/myshell-ai/MeloTTS.git
+git clone https://github.com/not-hanjo-mei/MeloTTS.git
 cd MeloTTS
 docker build -t melotts . 
 ```
@@ -51,6 +54,10 @@ melo-ui
 # Or: python melo/app.py
 ```
 
+### Web API (OpenAI Compatible)
+
+See [webapi.md](./webapi.md) for more details.
+
 ### CLI
 
 You may use the MeloTTS CLI to interact with MeloTTS. The CLI may be invoked using either `melotts` or `melo`. Here are some examples:
 
@@ -1,8 +1,9 @@
 ## Training
 
-Before training, please install MeloTTS in dev mode and go to the `melo` folder. 
-```
+Before training, install MeloTTS in dev mode along with the required dependencies, then go to the `melo` folder. Note: the training process assumes a proper Linux environment. If installation fails, the [deploy_on_android.md](deploy_on_android.md) guide has additional troubleshooting tips.
+```bash
 pip install -e .
+pip install matplotlib==3.5.3
 cd melo
 ```
 
@@ -16,14 +17,14 @@ path/to/audio_002.wav |<speaker_name>|<language_code>|<text_002>
 The transcribed text can be obtained by ASR model, (e.g., [whisper](https://github.com/openai/whisper)). An example metadata can be found in `data/example/metadata.list`
 
 We can then run the preprocessing code:
-```
+```bash
 python preprocess_text.py --metadata data/example/metadata.list 
 ```
 A config file `data/example/config.json` will be generated. Feel free to edit some hyper-parameters in that config file (for example, you may decrease the batch size if you have encountered the CUDA out-of-memory issue).
 
 ### Training
 The training can be launched by:
-```
+```bash
 bash train.sh <path/to/config.json> <num_of_gpus>
 ```
 
@@ -34,4 +35,3 @@ Simply run:
 ```
 python infer.py --text "<some text here>" -m /path/to/checkpoint/G_<iter>.pth -o <output_dir>
 ```
-
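The `docs/training.md` hunks above reference a metadata file whose lines follow the `path/to/audio.wav |<speaker_name>|<language_code>|<text>` layout. As an illustration of that layout, here is a small sketch that splits one line into its four fields. It is not the repository's own parser (`preprocess_text.py` may handle this differently), and `MetadataEntry` and `parse_metadata_line` are hypothetical names.

```python
from typing import NamedTuple


class MetadataEntry(NamedTuple):
    audio_path: str
    speaker: str
    language: str
    text: str


def parse_metadata_line(line: str) -> MetadataEntry:
    """Split one `path|speaker|language|text` line into its four fields."""
    # maxsplit=3 keeps any further "|" characters inside the text field.
    parts = line.rstrip("\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    # Strip surrounding whitespace, e.g. the space after the audio path.
    audio_path, speaker, language, text = (part.strip() for part in parts)
    return MetadataEntry(audio_path, speaker, language, text)
```

A check like this can catch malformed lines before preprocessing starts. The sample path and speaker values used when trying it out are placeholders, not files shipped with the repository.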
@@ -0,0 +1,87 @@
+# MeloTTS Web API
+
+[This fork](https://github.com/not-hanjo-mei/MeloTTS) of MeloTTS provides an OpenAI-compatible web API for text-to-speech conversion, allowing you to use MeloTTS with the OpenAI Python SDK or any other client that supports the OpenAI API format.
+
+## Starting the Web API Server
+
+To start the web API server, run the following command from the MeloTTS root directory:
+
+```bash
+python webapi/webapi.py
+```
+
+This starts the server on port 18000 by default. Simple API documentation is available at `http://localhost:18000/docs`.
+
+## API Endpoint
+
+The API implements the OpenAI-compatible endpoint for text-to-speech:
+
+```
+POST /v1/audio/speech
+```
+
+### Request Parameters
+
+| Parameter | Type | Description | Default |
+|-----------|------|-------------|--------|
+| `model` | string | The model to use for text-to-speech; currently ignored | `"tts-1"` |
+| `input` | string | The text to convert to speech | Required |
+| `voice` | string | The voice to use, format can be `"lang/speaker"` or just a speaker ID | `"EN/EN-Default"` |
+| `response_format` | string | The format of the response (mp3, flac, wav) | `"mp3"` |
+| `speed` | float | The speed of the speech | `1.0` |
+
+### Voice Format
+
+The `voice` parameter can be specified in two formats:
+
+1. `"language/speaker"` - e.g., `"EN/EN-Default"`, `"ZH/ZH"`, etc.
+2. Just the speaker ID - e.g., `"EN-Default"`, `"ZH"`, etc.
+
+If only the speaker ID is provided, the language will be auto-detected from the input text.
+
+### Supported Languages and Voices
+
+The API supports the following languages and voices:
+
+- English (EN): `EN-Default`, `EN-US`, `EN-BR`, `EN_INDIA`, `EN-AU`
+- Spanish (ES): `ES`
+- French (FR): `FR`
+- Chinese (ZH): `ZH` (supports mixed Chinese and English)
+- Japanese (JP): `JP`
+- Korean (KR): `KR`
+
+## Example Usage with OpenAI Python SDK
+
+You can use the MeloTTS web API with the OpenAI Python SDK as follows:
+
+```python
+# You may want to run this file in a separate environment that has the OpenAI Python SDK installed
+
+from pathlib import Path
+import openai
+
+client = openai.OpenAI(api_key="sk-xxx", base_url="http://localhost:18000/v1")
+
+speech_file_path = Path(__file__).parent / "speech.mp3"
+
+with client.audio.speech.with_streaming_response.create(
+  model="tts-1",
+  voice="EN/EN-Default",
+  input="Dirty deeds done dirt cheap.",
+) as response:
+  response.stream_to_file(speech_file_path)
+```
+
+## Language Auto-detection
+
+The API includes automatic language detection. If the language is not specified in the `voice` parameter, it will be detected from the input text. If the detected language doesn't match the specified language, the API will use the appropriate model for the detected language.
+
+## Error Handling
+
+If an error occurs during speech generation, the API returns an HTTP 500 response with details about the error.
+
+## Notes
+
+- The API automatically selects the appropriate hardware (CPU/GPU) for inference.
+- Temporary files are automatically cleaned up after streaming.
+- For best performance, specify the language in the `voice` parameter.
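The request parameters and the two `voice` formats documented in `docs/webapi.md` above can also be exercised without the OpenAI SDK. The sketch below uses only the Python standard library and assumes the server is running on the default port 18000. The helper names (`split_voice`, `build_speech_request`, `save_speech`) are hypothetical, and `split_voice` only mirrors the documented formats on the client side; it is not the server's actual parsing code.

```python
import json
import urllib.request

# Assumption: the server started by `python webapi/webapi.py` listens on
# the default port 18000 described in the documentation.
BASE_URL = "http://localhost:18000/v1"


def split_voice(voice):
    """Split a voice string into (language, speaker).

    Language is None when only a speaker ID is given, in which case the
    documentation says the server detects the language from the input text.
    """
    if "/" in voice:
        language, speaker = voice.split("/", 1)
        return language, speaker
    return None, voice


def build_speech_request(text, voice="EN/EN-Default", response_format="mp3", speed=1.0):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {
        "model": "tts-1",  # documented as currently ignored by the server
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        BASE_URL + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

With the server running, `save_speech("Hello.", "hello.wav", voice="EN/EN-US", response_format="wav")` would write the audio to `hello.wav`. Building the request separately from sending it makes the payload easy to inspect when debugging.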
@@ -121,8 +121,10 @@ def tts_to_file(self, text, speaker_id, output_path=None, sdp_ratio=0.2, noise_s
                         length_scale=1. / speed,
                     )[0][0, 0].data.cpu().float().numpy()
                 del x_tst, tones, lang_ids, bert, ja_bert, x_tst_lengths, speakers
-                # 
-            audio_list.append(audio)
+                
+            # Ref:
+            # https://github.com/myshell-ai/MeloTTS/pull/221
+            audio_list.append(utils.fix_loudness(audio, self.hps.data.sampling_rate))
         torch.cuda.empty_cache()
         audio = self.audio_numpy_concat(audio_list, sr=self.hps.data.sampling_rate, speed=speed)
 
 
@@ -238,7 +238,7 @@ def _g2p_v2(segments):
 
     text = "NFT啊！chemistry 但是《原神》是由,米哈\游自主，  [研发]的一款全.新开放世界.冒险游戏"
     text = '我最近在学习machine learning，希望能够在未来的artificial intelligence领域有所建树。'
-    text = '今天下午，我们准备去shopping mall购物，然后晚上去看一场movie。'
+    text = '你们有一个好，全世界跑到什么地方，你们比其他的西方记者啊，跑得还快。但是呢，问来问去的问题啊，都 too simple ， sometimes naive !'
     text = '我们现在 also 能够 help 很多公司 use some machine learning 的 algorithms 啊!'
     text = text_normalize(text)
     print(text)