A high-performance, scalable tool to download and archive your entire IMAP mailbox.
This script connects to any IMAP email account, downloads all messages, and optionally converts them into the standard .mbox format.
It is heavily optimized for very large mailboxes (100 GB+) and designed for reliability, using a SQLite database to track progress and prevent duplicate downloads or missing emails across folders.
This is an enhanced fork of the original imaptombox by BeforeMyCompileFails.
This version includes major architectural improvements for performance and reliability:
- ✅ Connects to any IMAP server (e.g., Gmail, Outlook, self-hosted) with SSL/TLS.
- 🧠 Intelligent Duplicate Detection: Uses a global Message-ID cache to prevent downloading the same email multiple times, even if it's in different folders (e.g., INBOX and 'All Mail').
- ⚡ High-Performance Syncing: Uses efficient UID set comparisons and batched IMAP requests to quickly identify and download only new messages, drastically reducing sync time.
- 🗃️ Scalable SQLite Backend: Replaces the original JSON metadata file with a robust SQLite database, ensuring high performance and data integrity for mailboxes with millions of messages.
- 📂 Flexible Output: Save emails in a structure that mirrors your server's folders or consolidate them all into a single directory.
- 📊 Visual Progress Bars: Uses tqdm to show real-time progress, so you always know what the script is doing.
- 🔐 Secure Password Handling: Prompts for your password securely and does not store it in your command history.
- 🔄 Optional MBOX Conversion: Download emails as individual .eml files and convert them to a single .mbox file, either in one step or separately.
- 🔧 Robust Error Handling: Gracefully handles connection issues and problematic emails.
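The global Message-ID cache described above could be sketched roughly as follows. This is a minimal illustration, not the script's actual schema: the table name `seen_messages`, its columns, and the helper names `open_cache` and `record_if_new` are all assumptions made for this example.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a tracking database. Table and column names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_messages ("
        " message_id TEXT PRIMARY KEY,"
        " folder TEXT NOT NULL,"
        " uid INTEGER NOT NULL)"
    )
    return conn


def record_if_new(conn, message_id, folder, uid):
    """Return True if this Message-ID was unseen and has now been recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_messages (message_id, folder, uid)"
        " VALUES (?, ?, ?)",
        (message_id, folder, uid),
    )
    conn.commit()
    # rowcount is 1 when a row was inserted, 0 when the primary key already existed.
    return cur.rowcount == 1


conn = open_cache()
first = record_if_new(conn, "<abc@example.com>", "INBOX", 12345)
second = record_if_new(conn, "<abc@example.com>", "All Mail", 987)
```

Because the Message-ID is the primary key, the second call (the same message seen in another folder) is ignored and can be skipped by the downloader.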
Make sure you have Python 3.7+ installed.
- Clone the repository:
  git clone https://github.com/ConstantinHvber/imaptombox.git
  cd imaptombox
- Install the recommended dependency: The script works without any third-party dependencies, but tqdm is highly recommended for progress bars.
  pip install -r requirements.txt
This command connects to the server, downloads all new emails using high-performance batching, and then converts the entire local cache to a single .mbox file.
python3 imaptombox.py \
--host imap.gmail.com \
--username your.email@gmail.com \
--batch-requests \
--convert
Run this daily or weekly to quickly fetch only the newest emails without creating a new .mbox file each time.
python3 imaptombox.py \
--host imap.mail.com \
--username user@mail.com \
--batch-requests
A great option for creating a simple, searchable archive of .eml files. The original folder name is prepended to each filename.
python3 imaptombox.py \
--host imap.office365.com \
--username user@outlook.com \
--batch-requests \
--single-folder
If you have already downloaded your emails, you can run the conversion step at any time.
python3 imaptombox.py --skip-download --convert --output-dir emails/
Option | Description |
---|---|
--host | (Required) IMAP server hostname. |
--username | (Required) Email account username. |
--password | Password for the email account. If not provided, you will be prompted securely. |
--port | IMAP server port (default: 993). |
--no-ssl | Disable SSL connection (not recommended). |
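The connection options map naturally onto Python's standard `imaplib` and `getpass` modules. The sketch below is an assumption about how such options might be wired together, not the script's actual code; the function names `resolve_password` and `connect` are hypothetical.

```python
import getpass
import imaplib


def resolve_password(cli_password=None, prompt=getpass.getpass):
    """Use the --password value if given, otherwise prompt without echoing."""
    if cli_password is not None:
        return cli_password
    return prompt("IMAP password: ")


def connect(host, username, password, port=993, use_ssl=True):
    """Open an IMAP session; use_ssl=False corresponds to --no-ssl (not recommended)."""
    if use_ssl:
        client = imaplib.IMAP4_SSL(host, port)
    else:
        client = imaplib.IMAP4(host, port)
    client.login(username, password)
    return client


# Example (needs a reachable server, so it is left commented out):
# client = connect("imap.gmail.com", "your.email@gmail.com", resolve_password())
```

Prompting through `getpass` keeps the password out of the shell history, which is why omitting `--password` is the safer choice.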
Option | Description |
---|---|
--output-dir | Directory to save emails and metadata (default: emails). |
--folders | Download from specific folders only (e.g., --folders INBOX Sent). Defaults to all. |
--inbox-only | A shortcut to only download from the INBOX folder. |
--batch-requests | (Recommended) Use batched IMAP requests to dramatically speed up checking for new emails. |
--batch-size | Number of messages to check in each batch request (default: 250). |
--single-folder | Save all emails into a single all_emails directory instead of mirroring the server's folder structure. |
--download-all | Force re-checking of all server emails against the database. Use if you suspect the local metadata is out of sync. |
--no-progress | Disable the visual progress bar (useful for logging or cron jobs). |
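To illustrate what UID set comparison and batching mean in practice, here is a minimal sketch. The helper names are hypothetical and the script's real logic may differ; only the default batch size of 250 comes from the table above.

```python
def new_uids(server_uids, known_uids):
    """UIDs present on the server but not yet recorded locally, in ascending order."""
    return sorted(set(server_uids) - set(known_uids))


def batches(uids, batch_size=250):
    """Yield consecutive slices of at most batch_size UIDs."""
    for start in range(0, len(uids), batch_size):
        yield uids[start:start + batch_size]


def to_uid_set(batch):
    """Format a batch as a comma-separated IMAP sequence set, e.g. '3,7,9'."""
    return ",".join(str(uid) for uid in batch)


pending = new_uids(server_uids=[1, 2, 3, 4, 5], known_uids=[2, 4])
requests = [to_uid_set(b) for b in batches(pending, batch_size=2)]
```

Each resulting string could then be passed to a single `imaplib` call such as `client.uid("FETCH", uid_set, "(BODY.PEEK[HEADER.FIELDS (MESSAGE-ID)])")`, so one round trip covers a whole batch instead of one message.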
Option | Description |
---|---|
--convert | Convert downloaded .eml files to .mbox format after the download is complete. |
--skip-download | Skip the download process and only run the converter on existing files. |
--convert-folder | Convert only a specific sub-folder from your output directory. |
--mbox-file | Set a custom output filename for the .mbox file. |
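For readers curious how an .eml-to-.mbox conversion can work, the standard library's `mailbox` module is enough for a basic version. The following is a sketch under that assumption; `convert_eml_dir` is a hypothetical name and the script's own converter may handle more edge cases.

```python
import mailbox
from pathlib import Path


def convert_eml_dir(eml_dir, mbox_path):
    """Append every .eml file under eml_dir (recursively) to one mbox file.

    Returns the number of messages written.
    """
    box = mailbox.mbox(str(mbox_path))
    box.lock()
    count = 0
    try:
        for eml in sorted(Path(eml_dir).rglob("*.eml")):
            # mailbox.mbox.add accepts raw bytes and writes the "From " separator line.
            box.add(eml.read_bytes())
            count += 1
    finally:
        # close() flushes pending changes and releases the lock.
        box.close()
    return count
```

Calling `convert_eml_dir("emails/INBOX", "inbox.mbox")` would be roughly comparable to converting a single sub-folder into a custom output file.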
Option | Description |
---|---|
--debug | Enable verbose debug logging for troubleshooting. |
The script will create an output directory (defaulting to emails/) with the following structure:
emails/
├── INBOX/
│ ├── 12345_Hello_World.eml
│ └── ...
├── Sent/
│ ├── 12346_Sent_Message.eml
│ └── ...
├── metadata.db <-- SQLite database for tracking emails
└── all_emails_... .mbox <-- Generated mbox file (if --convert is used)
If --single-folder is used, the structure will be:
emails/
├── all_emails/
│ ├── INBOX_12345_Hello_World.eml
│ ├── Sent_12346_Sent_Message.eml
│ └── ...
├── metadata.db
└── ...
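The two layouts above differ only in how each file path is built. A rough sketch of that naming scheme follows; the sanitizing rule (non-alphanumeric runs become underscores) and the names `safe_part` and `eml_path` are assumptions inferred from the example filenames, not the script's confirmed behavior.

```python
import re
from pathlib import Path


def safe_part(text, max_len=50):
    """Collapse anything that is not a letter or digit into underscores (assumed rule)."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")
    return cleaned[:max_len] or "no_subject"


def eml_path(output_dir, folder, uid, subject, single_folder=False):
    """Build the target path for one message in either layout."""
    name = f"{uid}_{safe_part(subject)}.eml"
    if single_folder:
        # Flat layout: the folder name is prepended to the filename.
        return Path(output_dir) / "all_emails" / f"{safe_part(folder)}_{name}"
    # Mirrored layout: one directory per server folder.
    return Path(output_dir) / folder / name


mirrored = eml_path("emails", "INBOX", 12345, "Hello World")
flat = eml_path("emails", "INBOX", 12345, "Hello World", single_folder=True)
```

Prefixing the UID keeps filenames unique within a folder even when subjects repeat.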
The script is self-contained and relies on standard Python libraries.
For the best user experience, one optional package is recommended:
- tqdm: Provides a visual progress bar.
Install it via the included requirements.txt:
pip install -r requirements.txt
This project was originally created by Denis (BeforeMyCompileFails).
This fork has been reworked and is maintained by ConstantinHvber. The new version introduces a SQLite backend, high-performance batching, global duplicate detection, the option to download without converting, and other enhancements.