@krizic krizic commented Oct 10, 2025

PR Type

Enhancement, Documentation


Description

  • Added comprehensive CLI tool for document processing

  • Converted JavaScript codebase to TypeScript with type definitions

  • Enhanced image optimization with OCR language support

  • Improved local file validation and processing capabilities


Changes walkthrough 📝

Relevant files
Enhancement
42 files
schemaValidator.js
Converted to TypeScript with improved formatting                 
+53/-58 
autogenerateSchema.js
Refactored with TypeScript and better error handling         
+53/-81 
extract.js
Added TypeScript types and image optimization parameters 
+53/-61 
convertToZodSchema.js
Enhanced enum validation and TypeScript conversion             
+59/-71 
formatter.js
Improved PDF validation and TypeScript support                     
+31/-39 
google.js
Refactored with TypeScript and better error handling         
+29/-42 
templates.js
Converted to TypeScript with better type safety                   
+15/-20 
index.js
Added new local models and TypeScript types                           
+15/-20 
ollama.js
Code formatting and TypeScript conversion                               
+19/-24 
openAI.js
Code formatting and TypeScript conversion                               
+16/-20 
fileValidator.js
Added local file path validation support                                 
+4/-7     
extract.ts
New CLI command for document extraction                                   
+147/-0 
utils.ts
Added OCR language support and image optimization               
+41/-16 
extract.ts
New TypeScript extract service with enhanced options         
+110/-0 
autogenerateSchema.ts
TypeScript implementation of schema generation                     
+107/-0 
schemaValidator.ts
TypeScript schema validation with proper types                     
+70/-0   
convertToZodSchema.ts
TypeScript implementation with proper type definitions     
+86/-0   
formatter.ts
TypeScript formatter service with improved error handling
+60/-0   
google.ts
TypeScript Google extractor with proper typing                     
+53/-0   
templates.ts
TypeScript templates service with SchemaField types           
+42/-0   
index.ts
TypeScript extractors index with new model support             
+39/-0   
types.ts
Added new local models and configuration options                 
+6/-1     
ollama.ts
TypeScript Ollama extractor implementation                             
+35/-0   
openAI.ts
TypeScript OpenAI extractor implementation                             
+31/-0   
file-helper.ts
New file helper utilities for CLI                                               
+38/-0   
convertToText.ts
TypeScript implementation of text conversion                         
+21/-0   
convert.ts
New CLI command for document conversion                                   
+73/-0   
fileValidator.ts
TypeScript file validator with local path support               
+57/-0   
templates.ts
New CLI command for template management                                   
+39/-0   
converter.ts
New TypeScript document converter service                               
+37/-0   
pdfValidator.ts
TypeScript PDF validator implementation                                   
+24/-0   
logger.ts
New CLI logger utility with colored output                             
+23/-0   
index.ts
Added new configuration options and exports                           
+15/-6   
index.ts
Added TypeScript exports and type definitions                       
+4/-1     
index.ts
Updated import paths for TypeScript                                           
+5/-5     
base.ts
TypeScript implementation of base schema                                 
+15/-0   
secondary.ts
TypeScript implementation of secondary schema                       
+16/-0   
cleanSchemaFields.ts
TypeScript implementation of schema field cleaning             
+12/-0   
ollama.ts
Updated import paths for TypeScript                                           
+3/-3     
openAI.ts
Updated import paths for TypeScript                                           
+3/-3     
google.ts
Updated import paths for TypeScript                                           
+3/-3     
index.ts
New CLI entry point with command structure                             
+66/-0   
Formatting
6 files
convertToText.js
Code formatting improvements                                                         
+13/-16 
pdfValidator.js
Code formatting improvements                                                         
+4/-5     
generateMarkdown.js
Code formatting improvements                                                         
+10/-10 
secondary.js
Code formatting improvements                                                         
+6/-9     
base.js
Code formatting improvements                                                         
+4/-8     
cleanSchemaFields.js
Code formatting improvements                                                         
+8/-7     
Documentation
21 files
extract.d.ts
Type definitions for extract service                                         
+24/-0   
convertToZodSchema.d.ts
Type definitions for Zod schema conversion                             
+15/-0   
secondary.d.ts
Type definitions for secondary schema                                       
+63/-0   
autogenerateSchema.d.ts
Type definitions for auto schema generation                           
+8/-0     
index.d.ts
Type definitions for extractors                                                   
+13/-0   
ollama.d.ts
Type definitions for Ollama extractor                                       
+10/-0   
openAI.d.ts
Type definitions for OpenAI extractor                                       
+10/-0   
google.d.ts
Type definitions for Google extractor                                       
+10/-0   
formatter.d.ts
Type definitions for formatter service                                     
+13/-0   
templates.d.ts
Type definitions for templates service                                     
+21/-0   
schemaValidator.d.ts
Type definitions for schema validator                                       
+13/-0   
generateMarkdown.d.ts
Type definitions for markdown generation                                 
+8/-0     
cleanSchemaFields.d.ts
Type definitions for schema field cleaning                             
+3/-0     
base.d.ts
Type definitions for base schema                                                 
+9/-0     
fileValidator.d.ts
Type definitions for file validator                                           
+7/-0     
pdfValidator.d.ts
Type definitions for PDF validator                                             
+7/-0     
convertToText.d.ts
Type definitions for text conversion                                         
+2/-0     
extract.d.ts.map
Source map for extract service types                                         
+1/-0     
index.d.ts.map
Source map for extractors types                                                   
+1/-0     
convertToZodSchema.d.ts.map
Source map for Zod schema types                                                   
+1/-0     
README.md
Comprehensive CLI documentation and usage guide                   
+254/-0 
Configuration changes
3 files
package.json
Updated build scripts and TypeScript configuration             
+13/-3   
package.json
Added CLI workspace and updated build process                       
+13/-5   
package.json
New CLI package configuration                                                       
+37/-0   
Additional files
40 files
tsconfig.json +26/-0   
index.d.ts +0/-2     
index.js +0/-174 
openAI.d.ts +0/-2     
openAI.js +0/-75   
google.d.ts +0/-5     
google.js +0/-71   
index.d.ts +0/-5     
index.js +0/-22   
ollama.d.ts +0/-5     
ollama.js +0/-67   
openAI.d.ts +0/-5     
openAI.js +0/-67   
completion.d.ts +0/-4     
completion.js +0/-2     
types.d.ts +0/-59   
types.js +0/-23   
utils.d.ts +0/-27   
utils.js +0/-258 
package.json +3/-2     
completion.ts +1/-1     
tsconfig.json +6/-2     
autogenerateSchema.d.ts.map +1/-0     
cleanSchemaFields.d.ts.map +1/-0     
base.d.ts.map +1/-0     
secondary.d.ts.map +1/-0     
converter.js +0/-21   
google.d.ts.map +1/-0     
ollama.d.ts.map +1/-0     
openAI.d.ts.map +1/-0     
prompts.ts +3/-3     
formatter.d.ts.map +1/-0     
templates.d.ts.map +1/-0     
convertToText.d.ts.map +1/-0     
fileValidator.d.ts.map +1/-0     
generateMarkdown.d.ts.map +1/-0     
generateMarkdown.ts +18/-0   
pdfValidator.d.ts.map +1/-0     
schemaValidator.d.ts.map +1/-0     
tsconfig.json +27/-0   

Added TS support
Adds a command-line interface for document processing and
structured data extraction.

Includes commands for extracting data, converting documents to
markdown or plaintext, and listing available templates. Also sets
up new CLI options for using local LLMs and adds associated documentation.

These changes streamline the extraction and conversion processes and make Documind
functionality accessible from the command line.
Extends file validation to check local file paths,
verifying both extension and existence.

This change allows the application to process files
directly from the file system, in addition to URLs.
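
The check described above (extension first, then existence, with URLs passed through) could look roughly like the sketch below. It is not the code in fileValidator.ts: the names `validateFileInput`, `FileCheck`, and `ALLOWED_EXTENSIONS` are hypothetical, the extension list is assumed from the file formats named elsewhere in this PR, and the existence check is injected as a callback so the function stays testable without touching the disk.

```typescript
// Hypothetical sketch; names and the allowed-extension list are assumptions,
// not taken from the PR's fileValidator.ts.
const ALLOWED_EXTENSIONS = new Set(["pdf", "png", "jpg", "txt", "docx", "html"]);

type FileCheck =
  | { ok: true; kind: "url" | "local" }
  | { ok: false; reason: string };

// Treat anything starting with http:// or https:// as a remote document.
function isHttpUrl(input: string): boolean {
  return /^https?:\/\//i.test(input);
}

// Return the lowercase extension without the dot, or "" if there is none.
function extensionOf(path: string): string {
  const base = path.split(/[\\/]/).pop() ?? "";
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
}

// `exists` is injected; a real caller would likely pass fs.existsSync.
function validateFileInput(
  input: string,
  exists: (path: string) => boolean
): FileCheck {
  if (isHttpUrl(input)) return { ok: true, kind: "url" };

  const ext = extensionOf(input);
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `unsupported extension: "${ext}"` };
  }
  if (!exists(input)) {
    return { ok: false, reason: `file not found: ${input}` };
  }
  return { ok: true, kind: "local" };
}
```

Checking the extension before the file system keeps the cheap, deterministic test first and avoids a disk lookup for inputs that would be rejected anyway.
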
Introduces options to control image quality, max width,
and OCR language for document processing. These options improve performance
and accuracy and allow image size to be reduced for LLM vision.

Updates core library to optimize image conversion with
lower DPI and JPG format for better compression.

Extends CLI with options for image quality, max width, and OCR
language.
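
To make the effect of these options concrete, here is a minimal sketch of the arithmetic involved. The helper names (`clampQuality`, `fitToMaxWidth`, `pagePixels`) are hypothetical and the fallback quality of 85 is an assumption; the actual conversion in the core library may work differently.

```typescript
// Hypothetical helpers illustrating the image options; not the PR's utils.ts.

interface Size {
  width: number;
  height: number;
}

// Keep JPEG quality inside the 1-100 range; fall back when the input is not a number.
function clampQuality(value: number, fallback = 85): number {
  if (!Number.isFinite(value)) return fallback;
  return Math.min(100, Math.max(1, Math.round(value)));
}

// Scale an image down to maxWidth, preserving aspect ratio. Never scales up.
function fitToMaxWidth(size: Size, maxWidth?: number): Size {
  if (maxWidth === undefined || size.width <= maxWidth) return size;
  const scale = maxWidth / size.width;
  return { width: maxWidth, height: Math.round(size.height * scale) };
}

// Pixel dimensions of a page rendered at a given DPI.
function pagePixels(widthInches: number, heightInches: number, dpi: number): Size {
  return {
    width: Math.round(widthInches * dpi),
    height: Math.round(heightInches * dpi),
  };
}

// A US Letter page: 2550x3300 px at 300 DPI versus 612x792 px at 72 DPI.
const at300 = pagePixels(8.5, 11, 300);
const at72 = pagePixels(8.5, 11, 72);
```

Because pixel count grows with the square of the DPI, rendering at 72 instead of 300 DPI yields roughly 17 times fewer pixels per page before any JPEG compression is applied.
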

Adds new local models for extraction

krizic commented Oct 10, 2025

PR Description updated to latest commit (141fe50)


krizic commented Oct 10, 2025

PR Overview: CLI, TypeScript Migration, and Enhanced Document Processing

Core Changes

1. TypeScript Migration & Type Safety

  • Complete JavaScript → TypeScript conversion across the entire codebase
  • Added comprehensive type definitions with interfaces for:
    • SchemaField - Document schema structure
    • ExtractOptions/ExtractResult - Extraction parameters and results
    • ExtractorParams - LLM provider interface
    • ValidationResult - Schema validation output
  • Enhanced type safety with proper Zod schema integration

2. New CLI Implementation

  • Command-line interface built with Commander.js
  • Three main commands:
    • extract - Structured data extraction from documents
    • convert - Document format conversion (markdown/plaintext)
    • templates - Template management
  • Rich feature set:
    • Multiple LLM provider support (OpenAI, Google, Ollama)
    • Schema validation and auto-generation
    • File output capabilities
    • Progress indicators with Ora
    • Colored logging with Chalk

3. Enhanced Image Processing & OCR

  • Multi-language OCR support via Tesseract.js
    • Configurable language parameter (eng, deu, fra, etc.)
  • Image optimization improvements:
    • JPEG compression with quality control (1-100)
    • Maximum width constraints for resizing
    • Reduced DPI (300 → 72) for LLM compatibility
    • Automatic orientation correction
  • Better file validation supporting local paths and URLs

4. Architectural Improvements

  • Modular extractor system with provider abstraction
  • Enhanced schema validation with recursive field checking
  • Improved error handling with structured error messages
  • Template system for predefined extraction schemas
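
One way to picture a modular extractor system with provider abstraction is a small registry keyed by provider name. The sketch below is an assumption about the general shape only: `ExtractRequest`, `Extractor`, `registerExtractor`, and `resolveExtractor` are hypothetical names, and the registered functions are stubs rather than real LLM calls.

```typescript
// Hypothetical provider registry; not the PR's extractors/index.ts.

interface ExtractRequest {
  markdown: string;
  model: string;
  baseUrl?: string;
}

type Extractor = (req: ExtractRequest) => Promise<Record<string, unknown>>;

const registry = new Map<string, Extractor>();

function registerExtractor(name: string, extractor: Extractor): void {
  registry.set(name, extractor);
}

// Look up a provider and fail with a structured message listing known names.
function resolveExtractor(name: string): Extractor {
  const extractor = registry.get(name);
  if (!extractor) {
    const known = Array.from(registry.keys()).join(", ");
    throw new Error(`Unknown provider "${name}". Known providers: ${known}`);
  }
  return extractor;
}

// Stub providers standing in for real API clients.
registerExtractor("openai", async (req) => ({ provider: "openai", model: req.model }));
registerExtractor("ollama", async (req) => ({
  provider: "ollama",
  model: req.model,
  baseUrl: req.baseUrl,
}));
```

With this shape, supporting another provider means registering one more function; callers that go through `resolveExtractor` do not change.
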

Technical Specifications

New CLI Features

# Extraction with auto-schema
documind extract -f invoice.pdf --auto-schema

# Custom schema extraction  
documind extract -f doc.pdf -s schema.json -o output.json

# Local LLM integration
documind extract -f doc.pdf -m llama3.2-vision --base-url http://localhost:11434/v1

Enhanced Configuration Options

  • OCR Language Support: --language eng|deu|fra|...
  • Image Quality Control: --image-quality 85 (1-100)
  • Size Optimization: --max-image-width 2048
  • Local LLM Integration: --base-url for Ollama support

Type System

interface SchemaField {
  name: string;
  type: 'string' | 'number' | 'boolean' | 'enum' | 'object' | 'array';
  description?: string;
  values?: string[];
  children?: SchemaField[];
}
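
As an illustration of how recursive field checking over this interface might work, here is a minimal sketch. The rules it enforces (non-empty name, enum requires `values`, object and array require `children`) are assumptions, and `validateSchema` is a hypothetical name rather than the function in schemaValidator.ts.

```typescript
// The interface is repeated from the PR description so the block is self-contained.
interface SchemaField {
  name: string;
  type: 'string' | 'number' | 'boolean' | 'enum' | 'object' | 'array';
  description?: string;
  values?: string[];
  children?: SchemaField[];
}

// Hypothetical validator: walks the schema tree and collects error messages,
// each prefixed with the dotted path of the offending field.
function validateSchema(fields: SchemaField[], path = "root"): string[] {
  const errors: string[] = [];
  for (const field of fields) {
    const here = `${path}.${field.name || "<unnamed>"}`;
    if (!field.name) {
      errors.push(`${here}: name is required`);
    }
    if (field.type === "enum" && (!field.values || field.values.length === 0)) {
      errors.push(`${here}: enum needs at least one value`);
    }
    if (field.type === "object" || field.type === "array") {
      if (!field.children || field.children.length === 0) {
        errors.push(`${here}: ${field.type} needs children`);
      } else {
        // Recurse into nested fields, extending the path.
        errors.push(...validateSchema(field.children, here));
      }
    }
  }
  return errors;
}

const invoiceSchema: SchemaField[] = [
  { name: "invoiceNumber", type: "string" },
  {
    name: "items",
    type: "array",
    children: [
      { name: "amount", type: "number" },
      { name: "status", type: "enum" }, // missing `values` on purpose
    ],
  },
];

const problems = validateSchema(invoiceSchema);
// problems contains one entry: "root.items.status: enum needs at least one value"
```

Returning a list of messages instead of throwing on the first problem lets a CLI report every schema issue in one run.
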

Key Benefits

  1. Developer Experience: Full TypeScript support with comprehensive type definitions
  2. Operational Flexibility: CLI enables automation and scripting workflows
  3. Internationalization: Multi-language OCR expands global usability
  4. Performance: Image optimization reduces processing time and costs
  5. Extensibility: Modular architecture supports additional LLM providers

Integration Points

  • Environment Variables: OPENAI_API_KEY, GEMINI_API_KEY, BASE_URL
  • File Formats: PDF, PNG, JPG, TXT, DOCX, HTML
  • LLM Providers: OpenAI, Google Gemini, Ollama (local models)
  • Output Formats: JSON, Markdown, Plaintext
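
The environment variables listed above could be resolved per provider roughly as follows. This is a sketch under assumptions: the mapping of `BASE_URL` to Ollama, the precedence of a `--base-url` flag over the environment, and the names `resolveProviderConfig` and `ProviderConfig` are all hypothetical.

```typescript
// Hypothetical config resolution; the env-to-provider mapping is an assumption.

type Env = Record<string, string | undefined>;
type Provider = "openai" | "google" | "ollama";

interface ProviderConfig {
  provider: Provider;
  apiKey?: string;
  baseUrl?: string;
}

// `env` is passed in explicitly; a real caller would likely pass process.env.
function resolveProviderConfig(
  provider: Provider,
  env: Env,
  baseUrlFlag?: string
): ProviderConfig {
  switch (provider) {
    case "openai": {
      const apiKey = env["OPENAI_API_KEY"];
      if (!apiKey) throw new Error("OPENAI_API_KEY is not set");
      return { provider, apiKey };
    }
    case "google": {
      const apiKey = env["GEMINI_API_KEY"];
      if (!apiKey) throw new Error("GEMINI_API_KEY is not set");
      return { provider, apiKey };
    }
    case "ollama": {
      // Assumed precedence: explicit flag first, then the BASE_URL variable.
      const baseUrl = baseUrlFlag ?? env["BASE_URL"];
      if (!baseUrl) throw new Error("Set BASE_URL or pass --base-url for Ollama");
      return { provider, baseUrl };
    }
  }
}
```

Failing early with the name of the missing variable gives a clearer message than a rejected API request later in the run.
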

This PR moves the codebase from a library-only package to a toolchain: a typed TypeScript core, a CLI with extract, convert, and templates commands, and configurable image processing and OCR.
