This project visualizes bus speeds across Portland, Oregon by calculating speed from breadcrumb location and timestamp data at bus stops. The data pipeline was automated with Google Cloud PubSub and processed into a PostgreSQL database, using Google Compute Engine with Linux Virtual Machines for continuous data flow and processing.
The pipeline takes real-time geolocation and timestamp data from Portland bus stops and computes speed metrics from it. The computed speeds are then rendered on a dynamic map that shows speed trends by location across the city.
- Python (`pandas`): Data cleaning and processing
- Google Cloud PubSub: Real-time data ingestion
- Google Compute Engine (Linux VMs): Pipeline automation
- PostgreSQL: Data storage and querying
- Mapbox GL: Geospatial visualization
- UNIX: Task automation and data handling
- Data Ingestion: Breadcrumb data (location and timestamp) is transmitted through Google Cloud PubSub to ensure continuous data flow.
- Processing: Data is processed in Python using `pandas` to compute speed metrics based on timestamp and geolocation.
- Storage: Processed data is stored in a PostgreSQL database for efficient retrieval and visualization.
- Visualization: Using Mapbox GL, speeds are displayed dynamically on a map, highlighting speed by location for easy analysis.
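
The processing step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`vehicle_id`, `timestamp`, `lat`, `lon`), the helper names, the sample rows, and the choice of the haversine formula for distance are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def add_speed(df):
    """Return a sorted copy of df with a speed_mps column (hypothetical helper).

    Speed is the distance to the previous breadcrumb of the same vehicle
    divided by the elapsed time between the two readings.
    """
    df = df.sort_values(["vehicle_id", "timestamp"]).reset_index(drop=True)
    # Previous reading of the same vehicle, aligned row by row.
    prev = df.groupby("vehicle_id")[["lat", "lon", "timestamp"]].shift(1)
    dist_m = haversine_m(prev["lat"], prev["lon"], df["lat"], df["lon"])
    dt_s = (df["timestamp"] - prev["timestamp"]).dt.total_seconds()
    # The first reading per vehicle has no predecessor, so its speed is NaN;
    # non-positive time gaps are masked to avoid dividing by zero.
    df["speed_mps"] = dist_m / dt_s.where(dt_s > 0)
    return df


# Made-up sample breadcrumbs (assumed schema), two vehicles.
breadcrumbs = pd.DataFrame({
    "vehicle_id": [1, 1, 1, 2, 2],
    "timestamp": pd.to_datetime([
        "2024-01-01 08:00:00", "2024-01-01 08:00:10", "2024-01-01 08:00:20",
        "2024-01-01 08:00:00", "2024-01-01 08:00:05",
    ]),
    "lat": [45.5200, 45.5209, 45.5218, 45.5300, 45.5300],
    "lon": [-122.6800, -122.6800, -122.6800, -122.6700, -122.6700],
})

result = add_speed(breadcrumbs)
print(result[["vehicle_id", "timestamp", "speed_mps"]])
```

In this sample, vehicle 1 moves about 100 m north every 10 seconds, so its computed speed comes out near 10 m/s, while vehicle 2 stays in place and gets a speed of 0. A real pipeline would likely also filter out GPS glitches and implausibly high speeds before writing rows to PostgreSQL.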
Here are sample visualizations showcasing speed data across Portland's bus stops:
[Figure: six map visualization examples]