Anthropic Export Extractor

A production-grade C tool for extracting and organizing conversations from Anthropic Claude JSON exports into human-readable markdown files with structured artifact management.

Author: Richard Tune (rich@quantumencoding.io)
Company: QUANTUM ENCODING LTD
License: MIT

Overview

The Anthropic Export Extractor transforms massive, monolithic conversations.json exports from Anthropic Claude into a clean, organized directory structure with:

Markdown files for each conversation (human-readable format)
Artifact extraction (embedded files, code snippets, attachments)
JSON manifests for each conversation (machine-readable metadata)
Timestamped output directories for versioning and audit trails

Features

✅ Production-Grade Architecture: Built with the library-first pattern (core logic in libjson_parser.a)
✅ RFC 8259 Compliant: Full JSON parsing with proper error handling
✅ Large File Support: Handles 300MB+ JSON exports efficiently
✅ Artifact Management: Extracts embedded content to organized artifacts/ directories
✅ Metadata Preservation: Captures UUIDs, timestamps, and conversation structure
✅ Cross-Platform: Builds on Linux, macOS, and Windows (with minor adjustments)

Build Instructions

Prerequisites

GCC or Clang compiler
GNU Make
Standard C11 support

Building

make all          # Build all binaries
make rebuild      # Clean and rebuild from scratch
make help         # Display usage information

This produces:

anthropic_export_extractor - The main extraction tool
json_parser - JSON validation and pretty-print utility
libjson_parser.a - The reusable JSON parser library

Usage

Getting Help

./anthropic_export_extractor --help    # Display full help menu
./anthropic_export_extractor -h        # Short version
make help                              # Via makefile

Basic Extraction

./anthropic_export_extractor conversations.json

Output Structure

The tool creates a timestamped directory with the following structure:

extracted_conversations_2025-09-30_17-08-37/
├── Conversation_Name_abc12345/
│   ├── Conversation_Name.md        # Human-readable markdown
│   ├── manifest.json               # Machine-readable metadata
│   └── artifacts/                  # Extracted files/attachments
│       ├── code_snippet.py
│       ├── diagram.svg
│       └── data.csv
├── Another_Conversation_def67890/
│   ├── Another_Conversation.md
│   ├── manifest.json
│   └── artifacts/
└── ...
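The root directory name in the tree above follows the pattern `extracted_conversations_YYYY-MM-DD_HH-MM-SS`. A minimal sketch of how such a name could be produced with the standard library is shown here; `format_output_dir` and `current_output_dir` are hypothetical helpers written for illustration, not functions taken from the tool's source, and the use of local time is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define OUTPUT_DIR_PREFIX "extracted_conversations_"

/* Hypothetical helper: writes "extracted_conversations_YYYY-MM-DD_HH-MM-SS"
 * into out. Returns 0 on success, -1 if the buffer is too small. */
int format_output_dir(char *out, size_t cap, const struct tm *when)
{
    const size_t prefix_len = strlen(OUTPUT_DIR_PREFIX);

    assert(out != NULL && when != NULL);
    if (cap <= prefix_len)
        return -1;
    memcpy(out, OUTPUT_DIR_PREFIX, prefix_len);

    /* strftime returns 0 when the result (including the NUL) does not fit. */
    if (strftime(out + prefix_len, cap - prefix_len,
                 "%Y-%m-%d_%H-%M-%S", when) == 0)
        return -1;
    return 0;
}

/* Hypothetical convenience wrapper using the current local time
 * (assumption: the tool may use local time or UTC; the README does not say). */
int current_output_dir(char *out, size_t cap)
{
    time_t now = time(NULL);
    struct tm *local = localtime(&now);

    if (local == NULL)
        return -1;
    return format_output_dir(out, cap, local);
}
```

A caller would pass a buffer of at least 44 bytes (43 characters plus the terminator) and then create the directory under that name. Because the fields run from year to second, names sort chronologically in a plain directory listing.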

Output Format

Each conversation directory contains:

Markdown File (*.md): Complete conversation history with:
- Conversation metadata (UUID, timestamps)
- Message-by-message transcripts
- Sender identification (human/assistant)
- Links to extracted artifacts

Manifest File (manifest.json): Structured metadata including:
- Conversation metadata
- List of artifacts and their types
- Statistics (message count, artifact count)
- External file references

Artifacts Directory: Extracted embedded content:
- Code files
- Attachments
- Images
- Data files
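The conversation directory names shown under Output Structure (for example `Conversation_Name_abc12345`) look like a sanitized title followed by a short identifier. The sketch below shows one way such a name could be derived. The rules here are assumptions inferred from the examples, not documented behavior: non-alphanumeric characters become underscores, runs of underscores collapse, and the suffix is the first 8 characters of the conversation UUID. `make_conversation_dirname` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define UUID_PREFIX_LEN 8 /* assumption: suffix is the first 8 UUID characters */

/* Hypothetical helper: builds "<Sanitized_Title>_<uuid prefix>" in out.
 * Returns 0 on success, -1 if the buffer is too small. */
int make_conversation_dirname(char *out, size_t cap,
                              const char *title, const char *uuid)
{
    static const char fallback[] = "untitled";
    size_t n = 0;
    size_t ulen;

    assert(out != NULL && title != NULL && uuid != NULL);

    for (const char *p = title; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        char mapped = isalnum(c) ? (char)c : '_';

        /* Skip leading underscores and collapse repeated ones. */
        if (mapped == '_' && (n == 0 || out[n - 1] == '_'))
            continue;
        if (n + 1 >= cap)
            return -1;
        out[n++] = mapped;
    }

    /* Drop a trailing underscore left by punctuation at the end. */
    while (n > 0 && out[n - 1] == '_')
        --n;

    /* Titles with no usable characters fall back to a fixed name. */
    if (n == 0) {
        if (cap < sizeof fallback)
            return -1;
        memcpy(out, fallback, sizeof fallback - 1);
        n = sizeof fallback - 1;
    }

    ulen = strlen(uuid);
    if (ulen > UUID_PREFIX_LEN)
        ulen = UUID_PREFIX_LEN;
    if (n + 1 + ulen + 1 > cap)
        return -1;

    out[n++] = '_';
    memcpy(out + n, uuid, ulen);
    out[n + ulen] = '\0';
    return 0;
}
```

In the default C locale, bytes of multi-byte UTF-8 characters are not alphanumeric, so non-ASCII titles reduce to underscores in this sketch. The UUID suffix keeps conversations with identical titles from colliding.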

Example Output

═══════════════════════════════════════════════════════
   ANTHROPIC EXPORT EXTRACTOR
═══════════════════════════════════════════════════════

Input: conversations.json (335544320 bytes)

Parsing JSON...
Found 615 conversations

Created root output directory: extracted_conversations_2025-09-30_17-08-37/

Extracting conversations:
───────────────────────────────────────────────────────
  [23] Build_quantum_library (msg:23 art:5 ext:0)
  [18] Implement_neural_network (msg:18 art:12 ext:2)
  [45] Debugging_session (msg:45 art:8 ext:1)
  ...
───────────────────────────────────────────────────────

✓ Extraction complete: 615/615 conversations processed
✓ Output directory: extracted_conversations_2025-09-30_17-08-37/

Architecture

This project follows the Library-First Pattern:

Core Library (libjson_parser.a): Self-contained, reusable JSON parsing engine
- Zero external dependencies
- Clean public API via json_parser.h
- Full memory management and error handling

Application Binary (anthropic_export_extractor): Thin wrapper that:
- Handles command-line arguments
- Manages file I/O and directory creation
- Links against the core library

This architecture enables:

Code reuse across multiple projects
Easy testing and validation
Clean separation of concerns
Maintainable, production-grade code
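As an illustration of the directory-creation work assigned to the application binary, the following sketch creates the `root/conversation/artifacts` layout with POSIX `mkdir`. It is an assumption about how this could be done, not the tool's actual code. `create_conversation_dirs` and `ensure_dir` are hypothetical names, and on Windows the `mkdir` call would need replacing (for example with `_mkdir`), in line with the "minor adjustments" noted under Features.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Creates one directory; an already existing directory counts as success. */
static int ensure_dir(const char *path)
{
    struct stat st;

    if (mkdir(path, 0755) == 0)
        return 0;
    if (errno == EEXIST && stat(path, &st) == 0 && S_ISDIR(st.st_mode))
        return 0;
    return -1;
}

/* Hypothetical helper: creates root/, root/name/ and root/name/artifacts/.
 * Returns 0 on success, -1 on failure (errno is set by mkdir or stat). */
int create_conversation_dirs(const char *root, const char *name)
{
    char path[4096];
    int n;

    assert(root != NULL && name != NULL);

    if (ensure_dir(root) != 0)
        return -1;

    n = snprintf(path, sizeof path, "%s/%s", root, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1; /* path would be truncated */
    if (ensure_dir(path) != 0)
        return -1;

    n = snprintf(path, sizeof path, "%s/%s/artifacts", root, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return ensure_dir(path);
}
```

Treating `EEXIST` as success makes the call idempotent, so rerunning it for the same conversation does not fail.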

Development

Using the JSON Parser Tool

The included json_parser utility can validate and format any JSON file:

./json_parser --help                  # Show help
./json_parser data.json               # Parse and display JSON
./json_parser --validate data.json    # Validate only
./json_parser --pretty data.json      # Pretty-print JSON
./json_parser --compact data.json     # Minify JSON

Cleaning Build Artifacts

make clean        # Remove all build artifacts

Code Structure

json_parser.c/h - Core JSON parsing library
json_extractor.c - Main extraction logic
main.c - JSON parser test suite
Makefile - Build system

Use Cases

Knowledge Management: Archive and organize AI conversation history
Documentation: Extract project documentation from development conversations
Research: Analyze conversation patterns and artifact usage
Backup: Create searchable, human-readable backups of Claude conversations
Migration: Transform Anthropic exports into other formats

Technical Specifications

Language: C11
Build System: GNU Make
Dependencies: Standard C library only
Memory Management: Manual allocation with comprehensive cleanup
Error Handling: POSIX errno + custom error reporting
File Format: UTF-8 text (Markdown, JSON)
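The "POSIX errno + custom error reporting" combination listed above can be sketched as follows: a failing library call leaves its reason in `errno`, and the caller turns it into a readable message. The function name `open_export` and the message format are hypothetical and chosen for illustration; the tool's real error messages may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens an export file for reading.
 * On failure returns NULL and writes a description into err. */
FILE *open_export(const char *path, char *err, size_t errcap)
{
    FILE *fp;

    assert(path != NULL && err != NULL && errcap > 0);

    errno = 0;
    fp = fopen(path, "rb");
    if (fp == NULL) {
        /* Save errno first: later library calls may overwrite it. */
        int saved = errno;
        snprintf(err, errcap, "cannot open '%s': %s", path, strerror(saved));
        return NULL;
    }

    err[0] = '\0';
    return fp;
}
```

A caller would typically print `err` to `stderr` and exit with a non-zero status when the function returns `NULL`.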

Performance

Tested with 300MB+ JSON exports
Handles 600+ conversations efficiently
Minimal memory footprint (streaming parser architecture)
Fast execution on commodity hardware
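To give a sense of what a bounded-memory, streaming pass over a large export can look like, the sketch below counts the objects directly inside a top-level JSON array while reading the file in fixed-size chunks. It is not the parser from libjson_parser.a and performs no validation. It assumes the export's top level is an array of conversation objects, which the README does not state explicitly. `count_top_level_objects` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts objects that are direct elements of the
 * top-level array. Memory use is fixed regardless of file size.
 * Returns the count, or -1 on a read error. */
long count_top_level_objects(FILE *fp)
{
    char buf[8192];
    size_t got;
    long count = 0;
    int depth = 0;     /* nesting depth of [] and {} outside strings */
    int in_string = 0; /* currently inside a JSON string */
    int escaped = 0;   /* previous character was a backslash */

    assert(fp != NULL);

    while ((got = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < got; ++i) {
            char c = buf[i];

            if (in_string) {
                /* Brackets and braces inside strings must not be counted. */
                if (escaped)
                    escaped = 0;
                else if (c == '\\')
                    escaped = 1;
                else if (c == '"')
                    in_string = 0;
                continue;
            }

            switch (c) {
            case '"':
                in_string = 1;
                break;
            case '{':
                if (depth == 1)
                    ++count; /* object directly inside the top-level array */
                ++depth;
                break;
            case '[':
                ++depth;
                break;
            case '}':
            case ']':
                --depth;
                break;
            default:
                break;
            }
        }
    }

    return ferror(fp) ? -1 : count;
}
```

The scanner state (depth, string flag, escape flag) carries across chunk boundaries, so a string or object split between two reads is handled the same way as one that fits in a single chunk.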

Contributing

This is a production tool from QUANTUM ENCODING LTD. For inquiries, contact:

Richard Tune
Email: rich@quantumencoding.io
Company: QUANTUM ENCODING LTD

License

MIT License - See LICENSE file for details

Built with precision. Built for production. Built by QUANTUM ENCODING LTD.

