Description:
Currently, the ability to download the SQL database and launch a local instance (via Datasette) exists only as a Bash script (tests/run-integration-tests.sh) used for integration testing. This functionality is essential not just for testing but also for users working in environments where direct remote access is unreliable or blocked (e.g., due to filestream issues or institutional firewalls).
We should expose this functionality directly within the Python package to allow any user (including those installing via pip) to:
• Download the remote .db file from a known URL
• Save it to a local path
• Launch a local Datasette server
• Optionally return the local URL or manage .env updates
Why this matters:
• Reduces friction for users who want local access without manually running shell scripts
• Makes the workflow programmatically accessible from Python (e.g., Jupyter notebooks or CLI wrappers)
• Avoids stream redirection issues that arise from remote file access in constrained environments
• Enables broader support on cloud clusters (e.g., SLURM jobs or JupyterHub setups)
⸻
Proposal:
Add a function to the package, e.g., apprent.utils.download_and_launch_local_db() with the following behavior:
def download_and_launch_local_db(
    db_url: str = "https://apparent.topology.rocks/us_physician_referral_networks.db",
    db_path: str = "data/us_physician_referral_networks.db",
    port: int = 8001,
    update_env: bool = True,
) -> str:
    """
    Downloads the SQL database from a remote URL and launches a local Datasette instance.

    Returns:
        local_url (str): The URL of the locally running Datasette instance.
    """
• Use requests or urllib to download the .db file if it is not already present at db_path
• Use subprocess.Popen to launch Datasette
• Wait for readiness with backoff retry logic (as the curl check in the current script does)
• Optionally write the resulting local URL to a .env file
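The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from tests/run-integration-tests.sh: the helper names download_if_missing and wait_until_ready are hypothetical, the retry count and delays are placeholder values, the server is assumed to listen on 127.0.0.1, and the .env handling is left out here.

```python
import shutil
import subprocess
import time
import urllib.request
from pathlib import Path


def download_if_missing(db_url: str, db_path: str) -> Path:
    """Download db_url to db_path unless a file already exists there (hypothetical helper)."""
    path = Path(db_path)
    if path.exists():
        return path
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # is never mistaken for a complete database file.
    tmp = path.with_name(path.name + ".part")
    with urllib.request.urlopen(db_url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(path)
    return path


def wait_until_ready(url: str, retries: int = 6, initial_delay: float = 0.5) -> bool:
    """Poll url, doubling the delay after each failed attempt (hypothetical helper)."""
    delay = initial_delay
    for attempt in range(retries):
        try:
            # urlopen raises for connection errors and HTTP error statuses,
            # so reaching the body means the server answered successfully.
            with urllib.request.urlopen(url, timeout=5):
                return True
        except OSError:  # URLError and HTTPError are OSError subclasses
            pass
        if attempt < retries - 1:
            time.sleep(delay)
            delay *= 2
    return False


def download_and_launch_local_db(
    db_url: str = "https://apparent.topology.rocks/us_physician_referral_networks.db",
    db_path: str = "data/us_physician_referral_networks.db",
    port: int = 8001,
) -> str:
    """Sketch of the proposed function, without the update_env step."""
    path = download_if_missing(db_url, db_path)
    # Assumes the `datasette` executable is installed and on PATH.
    proc = subprocess.Popen(["datasette", "serve", str(path), "--port", str(port)])
    local_url = f"http://127.0.0.1:{port}"
    if not wait_until_ready(local_url):
        proc.terminate()
        raise RuntimeError(f"Datasette did not become ready at {local_url}")
    return local_url
```

One open design question with this shape is that the Popen handle is discarded once the URL is returned, so the caller has no direct way to stop the server; returning the process alongside the URL may be worth considering.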
⸻
Reference:
The current logic is implemented in:
tests/run-integration-tests.sh
⸻
Tasks:
• Implement download_and_launch_local_db() in apprent/utils.py
• Add .env update logic if update_env=True
• Add logging and error handling for Datasette readiness
• Write unit tests for download logic and mock Datasette launch
• Add usage example in the README or examples/ folder
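For the .env task, one possible shape is a small helper that replaces an existing KEY=value line or appends a new one. The name update_env_file is hypothetical, and since this issue does not say which variable the package reads, the key is left as a required argument rather than guessed.

```python
from pathlib import Path


def update_env_file(env_path: str, key: str, value: str) -> None:
    """Set key=value in a .env file, replacing an existing entry if present.

    Hypothetical helper: the variable name the package expects is not
    stated in this issue, so the caller must supply `key`.
    """
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        # Only exact "KEY=" prefixes are replaced; comments and
        # other variables are kept untouched.
        if line.startswith(f"{key}="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n")
```

Because the helper touches only the file system, it can be unit-tested against a temporary directory without mocking Datasette.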