bandit_dspy

A production-ready library that integrates Bandit static analysis with DSPy for security-aware LLM development and optimization.

🚀 FEATURES

Core Security Integration

  • Security-Aware LLM Development: Integrate static analysis directly into your DSPy optimization loops
  • Advanced Bandit Integration: Leverage Bandit to identify Python security vulnerabilities in generated code with caching and performance optimizations
  • Configurable Security Metrics: Customizable security scoring with severity/confidence weighting and penalty thresholds
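
As a rough illustration of severity/confidence weighting with a penalty threshold, the sketch below turns a list of findings into a score between 0 and 1. The `Issue` class, the weight tables, and `security_score` are hypothetical names chosen for this example and are not taken from the bandit_dspy API. The only thing it borrows from Bandit is the LOW/MEDIUM/HIGH levels it reports for severity and confidence.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Hypothetical weights, picked for illustration only.
SEVERITY_WEIGHT: Dict[str, float] = {"LOW": 1.0, "MEDIUM": 3.0, "HIGH": 5.0}
CONFIDENCE_WEIGHT: Dict[str, float] = {"LOW": 0.5, "MEDIUM": 0.75, "HIGH": 1.0}


@dataclass(frozen=True)
class Issue:
    """A single finding, reduced to the two fields used for scoring."""
    severity: str
    confidence: str


def security_score(issues: Iterable[Issue], penalty_threshold: float = 10.0) -> float:
    """Return a score in [0, 1]; 1.0 means no weighted findings.

    Each finding contributes severity weight * confidence weight to a
    penalty. The score falls linearly and reaches 0.0 once the penalty
    meets or exceeds penalty_threshold.
    """
    penalty = sum(
        SEVERITY_WEIGHT[issue.severity] * CONFIDENCE_WEIGHT[issue.confidence]
        for issue in issues
    )
    return max(0.0, 1.0 - penalty / penalty_threshold)
```

A lower `penalty_threshold` makes the score stricter: with the weights above, two HIGH/HIGH findings are enough to drive it to zero at the default threshold.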

Optimization Algorithms

  • Multi-Algorithm Support: Choose from random search, genetic algorithms, Bayesian optimization, or MIPRO
  • GEPA Optimizer: State-of-the-art Reflective Prompt Evolution using LLM feedback for iterative security improvements
  • Multi-Objective Optimization: Balance security, performance, and functionality using Pareto frontiers
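
To make the Pareto-frontier idea concrete, here is a minimal sketch that keeps only the candidates no other candidate beats on every objective. It assumes each objective is a number where higher is better. The names `dominates` and `pareto_front` are illustrative and are not part of the library.

```python
from typing import List, Sequence, Tuple

# A candidate is a label plus one score per objective (higher is better).
Candidate = Tuple[str, Sequence[float]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [
        cand
        for cand in candidates
        if not any(dominates(other[1], cand[1]) for other in candidates)
    ]


# Scores are (security, functionality); the values are made up.
prompts = [
    ("prompt_a", (0.9, 0.5)),
    ("prompt_b", (0.6, 0.8)),
    ("prompt_c", (0.5, 0.4)),  # worse than prompt_a on both objectives
]
front = pareto_front(prompts)
```

The pairwise check is quadratic in the number of candidates, which is usually acceptable for the small candidate pools typical of prompt optimization.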

Production Features

  • Performance Optimizations: Bounded LRU caching for faster repeated analysis
  • Robust Error Handling: Graceful fallbacks and a comprehensive test suite
  • Type Hints: Typed core APIs and mypy-friendly usage
  • Configurable Pipeline: Flexible train/validation splits and optimization parameters
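
The bounded LRU caching mentioned above can be pictured with the following sketch, which memoizes an analysis function by the hash of the source text and evicts the least recently used entry once a size limit is reached. `BoundedAnalysisCache` is a hypothetical class written for this example; it is not how `BanditRunner` is necessarily implemented.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class BoundedAnalysisCache(Generic[T]):
    """Memoize analyze(code) with least-recently-used eviction."""

    def __init__(self, analyze: Callable[[str], T], maxsize: int = 128) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self._analyze = analyze
        self._maxsize = maxsize
        self._entries: "OrderedDict[str, T]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, code: str) -> T:
        # Key on a digest so large snippets are not stored twice as keys.
        key = hashlib.sha256(code.encode("utf-8")).hexdigest()
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        result = self._analyze(code)
        self._entries[key] = result
        if len(self._entries) > self._maxsize:
            self._entries.popitem(last=False)  # drop the least recently used
        return result
```

Tracking `hits` and `misses` makes it straightforward to check whether a given `maxsize` suits a workload before tuning it.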

🔧 PREREQUISITES

  • Python 3.8+
  • A configured DSPy Language Model (e.g., dspy.OllamaLocal, dspy.OpenAI)

📦 INSTALLATION

From Source

git clone https://github.com/evalops/bandit_dspy.git
cd bandit_dspy
pip install -e .

Development Installation

pip install -e .[dev]  # Includes pytest, mypy, ruff, etc.

🚦 QUICK START

Basic Security Optimization

import dspy
from bandit_dspy import BanditTeleprompter, create_bandit_metric

class CodeGen(dspy.Signature):
    """Generate secure Python code for the given task."""
    description = dspy.InputField()
    code = dspy.OutputField()

class SecureCodeGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predictor = dspy.Predict(CodeGen)
    def forward(self, description):
        return self.predictor(description=description)

dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))

trainset = [
    dspy.Example(
        description="hash a password securely",
        code="import bcrypt\ndef hash_password(password): return bcrypt.hashpw(password.encode(), bcrypt.gensalt())",
    ).with_inputs("description"),
    dspy.Example(
        description="validate user input",
        code="import re\ndef validate_email(email): return bool(re.match(r'^[^@]+@[^@]+\\.[^@]+$', email))",
    ).with_inputs("description"),
]

student = SecureCodeGenerator()
teleprompter = BanditTeleprompter(metric=create_bandit_metric(), optimization_method="genetic", k=3, num_candidates=10)
compiled = teleprompter.compile(student, trainset=trainset)
res = compiled(description="create a secure file upload function")
print(res.code)

GEPA Optimization

from bandit_dspy import GEPATeleprompter, create_bandit_metric

gepa = GEPATeleprompter(metric=create_bandit_metric(), max_iterations=4, population_size=3)
compiled = gepa.compile(student, trainset=trainset)

Performance Monitoring

import time
from bandit_dspy import BanditRunner

runner = BanditRunner(cache_maxsize=2000)

# Measure cache performance
code = "import os\npassword = 'hardcoded'"

# Cold cache
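start = time.perf_counter()  # perf_counter has finer resolution than time.time
result1 = runner.analyze_code(code)
cold_time = time.perf_counter() - start

# Warm cache
start = time.perf_counter()
result2 = runner.analyze_code(code)
warm_time = time.perf_counter() - start

# Guard against a zero reading on very fast cache hits
print(f"Speedup: {cold_time / max(warm_time, 1e-9):.1f}x")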

🧪 TESTING

Run All Tests

export DSPY_CACHEDIR="$(pwd)/.dspy_cache"  # ensure writable cache
pytest tests/ -v

Run with Coverage

pytest tests/ --cov=bandit_dspy --cov-report=html

Run Performance Tests

pytest tests/test_performance.py -v

Run GEPA Tests (requires dependencies)

pytest tests/test_gepa_optimizer.py -v

🐛 TROUBLESHOOTING

Common Issues

Import Errors:

# Install all dependencies
pip install -e .[dev]

# Or install individual packages
pip install dspy-ai bandit

sqlite3 OperationalError: attempt to write a readonly database

# Set DSPy cache directory to a writable location
export DSPY_CACHEDIR="$(pwd)/.dspy_cache"
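
When the environment cannot be changed from the shell (for example in a notebook), the same setting can be applied from Python. The sketch below creates the directory, checks that it is writable, and sets `DSPY_CACHEDIR`. The helper name `ensure_writable_cache` is hypothetical, and the sketch assumes DSPy reads the variable when it is first imported, so it should run before `import dspy`.

```python
import os
from pathlib import Path


def ensure_writable_cache(base: Path) -> Path:
    """Create base/.dspy_cache, verify it is writable, and export DSPY_CACHEDIR."""
    cache_dir = base / ".dspy_cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    if not os.access(cache_dir, os.W_OK):
        raise PermissionError(f"cache directory is not writable: {cache_dir}")
    # Assumption: DSPy picks this up at import time, so set it first.
    os.environ["DSPY_CACHEDIR"] = str(cache_dir)
    return cache_dir
```

A typical call would be `ensure_writable_cache(Path.cwd())` at the top of the script, mirroring the `$(pwd)` used in the shell command.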

Memory Issues with Large Codebases:

# Reduce cache size
runner = BanditRunner(cache_maxsize=500)

# Use streaming evaluation: analyze one snippet at a time
# (large_codebase is any iterable of source strings)
for code_chunk in large_codebase:
    result = runner.analyze_code(code_chunk)

Slow Optimization:

# Reduce candidates for faster iteration
teleprompter = BanditTeleprompter(
    num_candidates=5,          # Reduce from default 10
    optimization_method="random"  # Fastest method
)

# Use smaller population for genetic algorithm
teleprompter = BanditTeleprompter(
    optimization_method="genetic",
    genetic_config={"population_size": 10, "generations": 5}
)

🔬 DEVELOPMENT

Project Structure

bandit_dspy/
├── src/bandit_dspy/
│   ├── __init__.py           # Package exports and compatibility shims
│   ├── core.py               # BanditRunner with caching and AST analysis
│   ├── metric.py             # SecurityMetric and create_bandit_metric factory
│   ├── teleprompter.py       # BanditTeleprompter with multiple algorithms
│   └── gepa_optimizer.py     # GEPA implementation for reflective optimization
├── tests/                    # Comprehensive test suite
│   ├── conftest.py           # Shared fixtures and test configuration
│   ├── test_*.py             # Individual test modules
│   └── test_gepa_*.py        # GEPA-specific tests
├── example.py                # Quick start example
├── pyproject.toml            # Project metadata and dependencies
└── README.md                 # This file

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Install dev dependencies: pip install -e .[dev]
  4. Make changes and add tests
  5. Run the test suite: export DSPY_CACHEDIR="$(pwd)/.dspy_cache" && pytest tests/ -v
  6. Run type checking: mypy src/bandit_dspy
  7. Run linting: ruff check src/ tests/
  8. Format code: black src/ tests/
  9. Commit with conventional messages: git commit -m "feat: add amazing feature"
  10. Push and create a pull request

Release Process

# Update version in pyproject.toml
# Create release notes
git tag v0.1.0
git push origin v0.1.0

# Build and publish
python -m build
twine upload dist/*

📄 LICENSE

This project is licensed under the MIT License. See the LICENSE file for details.

🔗 DEPENDENCIES

Required

  • dspy-ai: The DSPy framework for language model programming
  • bandit: Python security linter for static analysis

Optional Development

  • pytest: Testing framework with coverage and fixtures
  • mypy: Static type checking
  • ruff: Fast Python linter and formatter
  • black: Code formatting
  • pre-commit: Git hooks for code quality

📚 RESOURCES

🏆 ACKNOWLEDGMENTS

  • DSPy Team for the innovative language model programming framework
  • Bandit Contributors for the robust security analysis tool
  • GEPA Researchers for the reflective prompt evolution methodology

Built with ❤️ for secure AI development
