mqxym/scanc

scanc

scanc = scan c(ode)
A fast, pure‑Python project code‑scanner that outputs clean, AI‑ready Markdown or XML.

scanc helps you dump an entire codebase into an LLM prompt (or a file) in seconds, while keeping noise low, controlling token budgets, and giving you full visibility.


Features

  • Blazing Fast, Pure‑Python: Zero native dependencies; easy to install and run anywhere.
  • Smart Default Ignores: Automatically skips node_modules, .venv, .git, and more.
  • Flexible Filters: Include/exclude by extension, filename, or regex patterns.
  • Optional Directory Tree: Prepend a fenced tree diagram of your project structure.
  • Token Counter: Estimate LLM token costs with tiktoken before you paste.
  • Cross‑Platform CLI: Works on macOS, Linux, and Windows out of the box.
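
To make the ignore and extension-filter behaviour concrete, here is a minimal standard-library sketch of the general technique. It is not scanc's actual implementation: the function name iter_source_files and the DEFAULT_IGNORES set are hypothetical, and the set lists only the three directories named above.

```python
import os
from pathlib import Path

# Assumed ignore list for illustration only; scanc's real defaults may differ.
DEFAULT_IGNORES = {"node_modules", ".venv", ".git"}


def iter_source_files(root, extensions, ignores=DEFAULT_IGNORES):
    """Yield files under `root` whose extension is listed in `extensions`."""
    # Normalise "py" and ".py" to the same form: ".py"
    wanted = {"." + ext.lstrip(".").lower() for ext in extensions}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in ignores)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix.lower() in wanted:
                yield path


if __name__ == "__main__":
    for found in iter_source_files(".", ["py", "js"]):
        print(found)
```

Pruning dirnames in place is what keeps large folders such as node_modules cheap to skip: the walk never enters them, rather than visiting every file and discarding it afterwards.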

Installation

# Optional: use a virtual environment
python3 -m venv --prompt scanc-env .venv
source .venv/bin/activate

pip install "scanc[tiktoken]"  # installs optional token‑counter support (quotes keep zsh from globbing the brackets)

Quickstart

Scan a directory and emit Markdown:

scanc .                         # scan current folder
scanc -e py,js --tree           # only .py and .js files + directory tree
scanc -f xml                    # emit XML instead of Markdown (new in v1.2.0)
scanc -e py -x "tests" | less   # only .py files, excluding paths that match "tests"
scanc --tokens gpt-4o           # print only the token count for gpt-4o
scanc -e py | pbcopy            # scan and copy (macOS copy command example)

Write output directly to a file:

scanc -e ts --tree -o scan.md src/
cat scan.md
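
If you drive scanc from a script instead of a shell, the invocations above can be assembled as an argument list. The following is a sketch that uses only flags listed in the CLI reference; the helper names build_scanc_command and run_scanc are hypothetical and not part of scanc itself.

```python
import subprocess


def build_scanc_command(paths=(".",), exts=None, tree=False, out=None, fmt=None):
    """Return an argv list for the scanc CLI (hypothetical helper)."""
    cmd = ["scanc"]
    if exts:
        cmd += ["-e", ",".join(exts)]  # comma-separated extensions
    if tree:
        cmd.append("--tree")           # prepend the directory tree
    if fmt:
        cmd += ["-f", fmt]             # output format, e.g. "xml"
    if out:
        cmd += ["-o", out]             # write to a file instead of stdout
    cmd += list(paths)
    return cmd


def run_scanc(**options):
    """Run scanc with the given options; raises if scanc exits non-zero."""
    return subprocess.run(build_scanc_command(**options), check=True)


if __name__ == "__main__":
    # Mirrors: scanc -e ts --tree -o scan.md src/
    print(" ".join(build_scanc_command(paths=["src/"], exts=["ts"], tree=True, out="scan.md")))
```

Passing a list to subprocess.run avoids shell quoting issues with paths that contain spaces.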

CLI Reference

scanc [OPTIONS] [PATHS...]
  • -e, --ext EXTS: Comma‑separated extensions to include (e.g. py,js).
  • -i, --include-regex: Regex patterns to include (full path match).
  • -x, --exclude-regex: Regex patterns to exclude (full path match).
  • --no-default-excludes: Disable built‑in ignore list.
  • -t, --tree: Prepend directory tree (fenced code block).
  • -T, --tokens MODEL: Output only token count for given LLM model.
  • --max-size BYTES: Skip files larger than BYTES (default 1 MiB).
  • --follow-symlinks: Traverse symlinks when scanning.
  • -o, --out OUTFILE: Write result to OUTFILE instead of stdout.
  • -f, --format FORMAT: Output format (default: markdown).
  • -V, --version: Show version and exit.
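
The way the size limit and the include/exclude regexes might combine can be sketched as a single predicate. This is an illustration under stated assumptions, not scanc's code: the function should_include is hypothetical, patterns are tested with re.search against the path string, and an exclude match is assumed to win over an include match.

```python
import re

ONE_MIB = 1024 * 1024  # the documented default for --max-size


def should_include(path, size, include=(), exclude=(), max_size=ONE_MIB):
    """Return True if a file passes the size and regex filters (assumed semantics)."""
    if size > max_size:
        return False  # too large, like --max-size
    if any(re.search(pattern, path) for pattern in exclude):
        return False  # assumed: excludes take precedence
    if include and not any(re.search(pattern, path) for pattern in include):
        return False  # include patterns given, but none matched
    return True


if __name__ == "__main__":
    print(should_include("src/app.py", 2048))
    print(should_include("tests/test_app.py", 2048, exclude=["tests"]))
```

Checking the size first keeps the cheap test ahead of the regex work, which matters when a tree contains many large binary files.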

Integration & Extensibility

  • Formatter Hook: Customize output by registering your own formatter via entry points.
  • Extras: Use scanc[tiktoken] to enable token counting; more extras may follow.
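
For context on what the tiktoken extra provides, token counting with tiktoken can look roughly like the sketch below. The function count_tokens is hypothetical and is not scanc's API. The fallback of about four characters per token is a crude assumption, used only when tiktoken is missing or cannot load its encoding data.

```python
def count_tokens(text, model="gpt-4o"):
    """Return (token_count, method) for `text`; method is "tiktoken" or "estimate"."""
    try:
        import tiktoken  # optional dependency

        try:
            encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            # Model name unknown to this tiktoken version: use a general encoding.
            encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text)), "tiktoken"
    except Exception:
        # tiktoken is not installed or its encoding data is unavailable.
        # Rough heuristic (assumption): about four characters per token.
        return (len(text) + 3) // 4, "estimate"


if __name__ == "__main__":
    count, method = count_tokens("def main():\n    return 0\n")
    print(f"{count} tokens ({method})")
```

The estimate branch is only a stand-in so the sketch runs anywhere; for real budgeting, use the tiktoken count.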

Docker usage

A ready-to-run container is published to GitHub Container Registry (GHCR). It runs as non-root and scans the mounted host directory by default.

Pull

docker pull ghcr.io/mqxym/scanc-cli:latest

Scan the current project (read-only mount)

# Linux/macOS (Bash/Zsh)
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest

# Windows PowerShell
docker run --rm -v "${PWD}:/work:ro" ghcr.io/mqxym/scanc-cli:latest

Because the container's WORKDIR is /work and its ENTRYPOINT is scanc, scanning . inside the container scans your host's current folder, which is mounted at /work.

Write output to a file

Either redirect on the host:

docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest -e py --tree > scan.md

...or mount as writable and write into /work:

docker run --rm -v "$PWD":/work ghcr.io/mqxym/scanc-cli:latest -e py --tree -o /work/scan.md

Tip (Linux/macOS): preserve file ownership when writing by mapping your UID/GID

docker run --rm \
  --user "$(id -u)":"$(id -g)" \
  -v "$PWD":/work ghcr.io/mqxym/scanc-cli:latest -o /work/scan.md 

Examples

# Only Python & JS files, include directory tree
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest -e py,js --tree

# Token count only (uses the optional 'tiktoken' extra, which is baked into the image)
docker run --rm -v "$PWD":/work:ro ghcr.io/mqxym/scanc-cli:latest --tokens gpt-4o

Licence

Released under the MIT Licence. See LICENCE for details.
