Batch PDF OCR Processor for Windows

Batch-process every PDF file in a folder with ocrmypdf and a short PowerShell script, adding an OCR text layer so the files become searchable. Processed files are saved in an output subfolder. Aimed at Windows users who need fast PDF text recovery.

Features

Processes all PDF files in the current folder
Runs OCR to make PDFs searchable (text layer added)
Outputs processed PDFs to an output subfolder

Prerequisites

Windows 10/11
PowerShell (already included in Windows)
Chocolatey package manager (for easy installation)
Python 3 (with pip)
Tesseract-OCR
Ghostscript
ocrmypdf

Optional but Recommended

pngquant (for better image compression)
jbig2 (for advanced PDF compression, but see important Windows note below)

Step-by-Step Installation (Stupid-Proof)

1. Install Chocolatey

Chocolatey lets you install Windows programs from the command line.

Open PowerShell as Administrator (right-click PowerShell > "Run as Administrator").

Paste this command and press Enter:

Set-ExecutionPolicy Bypass -Scope Process -Force; `
  [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; `
  iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

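Close and reopen PowerShell. Keep using an Administrator window for the choco install commands in the next steps, since Chocolatey normally needs elevated rights to install packages. The pip command and the script itself run fine as a normal user.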

2. Install Python and Pip

Using Chocolatey (in PowerShell):

choco install python -y

This will install Python and pip.
Close and reopen PowerShell after installation.
Test with:
```
python --version
pip --version
```

3. Install Required Packages (ocrmypdf, tesseract, ghostscript)

Install Tesseract and Ghostscript using Chocolatey:

choco install tesseract -y
choco install ghostscript -y

Install ocrmypdf (using pip):

pip install ocrmypdf
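
Before moving on, it can help to confirm that the tools installed so far are reachable from PowerShell. The following is a minimal sketch, not part of this repository: the function name Get-MissingTool is hypothetical, and gswin64c is assumed to be the executable name of the 64-bit Ghostscript build. A tool reported here may still be installed but simply not on PATH.

```powershell
# Hypothetical helper: return the command names that PowerShell cannot resolve.
function Get-MissingTool {
    param([string[]]$Names)
    $Names | Where-Object { -not (Get-Command $_ -ErrorAction SilentlyContinue) }
}

# Assumed executable names; gswin64c is the 64-bit Ghostscript console binary.
$missing = Get-MissingTool -Names 'python', 'pip', 'tesseract', 'gswin64c', 'ocrmypdf'

if ($missing) {
    Write-Host "Not found on PATH: $($missing -join ', ')"
} else {
    Write-Host 'All required tools were found.'
}
```

If something is listed as missing right after installation, closing and reopening PowerShell so that PATH is refreshed is a reasonable first thing to try.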

4. (Optional) Install Additional Recommended Packages

pngquant

For better image compression, install:

choco install pngquant -y

jbig2 (Advanced, Optional, Not Directly Supported on Windows)

jbig2 (the jbig2enc encoder) is an optional dependency that ocrmypdf can use to improve PDF compression.

Important: There is no official Windows binary and it is not available via Chocolatey.
If you require jbig2, you will need to manually compile it from source or find a trusted third-party binary for Windows. For most users, this step can be skipped.

5. Enable PowerShell Script Execution

IMPORTANT:
By default, Windows may block PowerShell scripts from running.
Before running the script, execute this in the same PowerShell window you will run it from:

Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass

This change is temporary: it applies only to the current PowerShell window and is discarded when you close it.
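
To see that the setting really is limited to the current window, you can inspect the execution policy per scope. This is a small sketch using the built-in Get-ExecutionPolicy cmdlet; the variable name is only illustrative.

```powershell
# Show the policy configured at every scope (MachinePolicy, UserPolicy, Process, CurrentUser, LocalMachine).
Get-ExecutionPolicy -List

# Read just the Process scope, which is the one changed by the command above.
$processPolicy = Get-ExecutionPolicy -Scope Process
Write-Host "Process-scope policy: $processPolicy"
```

After running the Set-ExecutionPolicy command above, the Process scope should report Bypass in that window, while a newly opened window starts again with the Process scope undefined.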

Usage

Place ocr_batch.ps1 in the same folder as your PDFs.
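Open PowerShell in that folder (Shift + right-click an empty area in the folder > "Open PowerShell window here"). Because the execution policy from step 5 applies per window, run that Set-ExecutionPolicy command again in this window if it is a new one.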
Run the script:
```
.\ocr_batch.ps1
```
Processed PDFs will appear in the output subfolder.
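
For readers curious what a batch script of this kind does, here is a minimal sketch of the idea. It is not the repository's ocr_batch.ps1, whose contents are not reproduced in this README; the function names Get-OcrOutputPath and Invoke-BatchOcr are hypothetical, and the sketch assumes ocrmypdf is on PATH and is called in its basic form with an input file and an output file.

```powershell
# Hypothetical helper: build the output path for one PDF, keeping its file name.
function Get-OcrOutputPath {
    param([string]$InputPath, [string]$OutputDir)
    Join-Path $OutputDir (Split-Path $InputPath -Leaf)
}

# Hypothetical sketch of the batch loop; assumes ocrmypdf is on PATH.
function Invoke-BatchOcr {
    param([string]$Folder = '.')

    # Create the output subfolder if it does not exist yet.
    $outputDir = Join-Path $Folder 'output'
    New-Item -ItemType Directory -Path $outputDir -Force | Out-Null

    # Only PDFs directly in the folder are processed (no recursion),
    # so files already in the output subfolder are not picked up again.
    Get-ChildItem -Path $Folder -Filter *.pdf -File | ForEach-Object {
        $target = Get-OcrOutputPath -InputPath $_.FullName -OutputDir $outputDir
        ocrmypdf $_.FullName $target
        if ($LASTEXITCODE -ne 0) {
            Write-Warning "ocrmypdf exited with code $LASTEXITCODE for $($_.Name)"
        }
    }
}
```

With both functions loaded in the session, `Invoke-BatchOcr -Folder .` would process the current folder. A failure on one file only produces a warning, so the remaining PDFs are still attempted.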
