Changes from 7 commits
7 changes: 6 additions & 1 deletion .github/workflows/deploy-docs.yml
@@ -1,9 +1,12 @@
name: Docs
# build the documentation whenever there are new commits on main
# build the documentation on PRs to main and deploy on pushes to main
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

# security: restrict permissions for CI jobs.
permissions:
@@ -30,7 +33,9 @@ jobs:

  # Deploy the artifact to GitHub pages.
  # This is a separate job so that only actions/deploy-pages has the necessary permissions.
  # Only deploy on pushes to main, not on pull requests.
  deploy:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    needs: build
    runs-on: ubuntu-latest
    permissions:
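
The `if:` guard on the deploy job reads as a plain predicate over the triggering event. As a minimal illustration only (GitHub Actions evaluates the expression itself; the function name `should_deploy` is hypothetical and not part of this PR), the same logic in Python:

```python
def should_deploy(event_name: str, ref: str) -> bool:
    """Mirror the workflow condition: deploy only on pushes to main."""
    return event_name == "push" and ref == "refs/heads/main"


if __name__ == "__main__":
    # A push to main deploys.
    print(should_deploy("push", "refs/heads/main"))
    # A pull request targeting main builds the docs but does not deploy.
    print(should_deploy("pull_request", "refs/pull/1/merge"))
```

Since the `push` trigger is already limited to `main`, the ref check is a second safeguard rather than the only one.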
66 changes: 66 additions & 0 deletions README.md
@@ -84,6 +84,7 @@ Most actions can completed using the `Apparent` object, including the following
4. _Compare Networks:_ Analyze pairwise distances between networks using metrics like Forman curvature and Ollivier-Ricci curvature.
5. _Embed Networks:_ Reduce dimensionality for visualization and machine learning.
6. _Cluster Networks:_ Group similar networks using clustering algorithms like `KMeans`, `DBSCAN`, and hierarchical clustering.
7. _Local Database:_ Download the SQL database and launch a local Datasette instance for environments with limited connectivity or firewall restrictions.

### Quick Example

@@ -126,6 +127,45 @@ A.embed()
A.cluster_networks()
```

### Working with a Local Database

If you are working behind a firewall, have unreliable connectivity, or need offline access, you can download the database and run a local Datasette instance:

```python
from apparent import Apparent
from apparent.utils import download_and_launch_local_datasette, stop_local_datasette

# Download the database and start a local Datasette server
local_url = download_and_launch_local_datasette(verbose=True)

print(f"Local Datasette server running at: {local_url}")
# Use Apparent as usual, passing the local URL as base_url
app = Apparent(base_url=local_url)

# Sample query (mirrors the test patterns):
# HSA, year, and coordinates for up to 10 rows from 2017
query = """
SELECT
    hospital_atlas_data.hsa,
    hospital_atlas_data.year,
    hospital_atlas_data.latitude,
    hospital_atlas_data.longitude
FROM
    hospital_atlas_data
WHERE
    hospital_atlas_data.year = 2017
LIMIT
    10;
"""

app.pull(query)
print(f"Retrieved {len(app.data)} networks")
print(app.data.head())

# Stop the local Datasette server when done
stop_local_datasette(port=8001)
```
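
To sanity-check a query's shape before pointing it at the full database, one option is to run it against a tiny in-memory SQLite table. The sketch below assumes only the four columns named in the query above; the rows are made-up placeholder values rather than real atlas data, and the actual `hospital_atlas_data` table may have more columns or different types. The helper name `run_on_toy_table` is hypothetical.

```python
import sqlite3

QUERY = """
SELECT hsa, year, latitude, longitude
FROM hospital_atlas_data
WHERE year = 2017
LIMIT 10;
"""


def run_on_toy_table(query: str) -> list:
    """Run `query` against a throwaway table with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: only the columns the sample query references.
        conn.execute(
            "CREATE TABLE hospital_atlas_data "
            "(hsa INTEGER, year INTEGER, latitude REAL, longitude REAL)"
        )
        # Placeholder rows, not real data.
        conn.executemany(
            "INSERT INTO hospital_atlas_data VALUES (?, ?, ?, ?)",
            [(1, 2016, 0.0, 0.0), (2, 2017, 1.0, 1.0), (3, 2017, 2.0, 2.0)],
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    rows = run_on_toy_table(QUERY)
    print(f"{len(rows)} rows matched the year filter")
```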

## 🤝 Contributing

Contributions are welcome! To contribute:
@@ -163,6 +203,10 @@ pytest -m unit

### Integration Tests

You can run integration tests in two ways:

#### Option 1: Using the helper script

A script is provided to simplify running the integration tests. This script handles:

1. Downloading the raw dataset (under `data/us_physician_referral_networks.db`).
@@ -177,6 +221,28 @@ To execute the script, run the following command from the root directory:
bash tests/run-integration-tests.sh
```

#### Option 2: Using the Python API

You can also use the new Python API to set up the local database and run tests:

```python
import subprocess
import sys

from apparent.utils import download_and_launch_local_datasette, stop_local_datasette

# Download the database and start a local Datasette server
local_url = download_and_launch_local_datasette(
    db_path="data/us_physician_referral_networks.db",
    port=8001,
    update_env=True,
)

# Run the integration tests with the current interpreter
subprocess.run([sys.executable, "-m", "pytest", "tests/", "-v", "-m", "integration"])

# Stop the local Datasette server when done
stop_local_datasette(port=8001)
```
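
If the test run raises or is interrupted, the final `stop_local_datasette` call never executes and the server keeps running. One way to guard against that is a small context manager that always stops the server on exit. This is a sketch, not part of the PR: `local_server` is a hypothetical helper that takes the start and stop functions as arguments, so it makes no assumptions about their signatures beyond what the example above shows.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def local_server(start: Callable[[], str], stop: Callable[[], None]) -> Iterator[str]:
    """Start a server, yield its URL, and always stop it afterwards."""
    url = start()
    try:
        yield url
    finally:
        # Runs on normal exit and when the body raises.
        stop()


# Hypothetical usage with the helpers shown above:
#
# with local_server(
#     lambda: download_and_launch_local_datasette(port=8001, update_env=True),
#     lambda: stop_local_datasette(port=8001),
# ) as local_url:
#     subprocess.run(["python", "-m", "pytest", "tests/", "-m", "integration"])
```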

## 📝 License

This project is licensed under the BSD-3 License. See the LICENSE file for details.
10 changes: 8 additions & 2 deletions apparent/apparent.py
@@ -1,5 +1,11 @@
"Query and interact with our US Physician Referral Network Datasette."

"""
Apparent: A Comprehensive Interface for US Physician Referral Network Analysis

The Apparent class provides a user-friendly interface to query, build, analyze,
and visualize physician referral networks from the US healthcare system. It integrates
various functionalities including data fetching, network construction, feature
computation, network comparison, clustering, and embedding.
"""
import pandas as pd
import urllib
import os