A standalone, serverless analytics tracking system for AWS that supports multiple applications and fine-grained IAM controls.
Client/App → API Gateway → Lambda → S3 (bucket per app) → Athena → Visualization
Uses CDK for infrastructure.
Key Features:
- ✅ Multi-tenant support (one tracker, multiple apps/buckets)
- ✅ Fine-grained IAM controls
- ✅ Bucket name passed via request (no hard-coding)
- ✅ Privacy-focused (hashed IPs, no PII)
- ✅ Serverless and scalable
- ✅ SQL-queryable via Athena
- Installation
- Configuration
- Local Dev
- Deployment
- Usage
- IAM Permissions
- Querying Data
- Multi-Tenant Setup
- Security
- Cost Estimation
Configure the tracker for your use case:
import * as cdk from 'aws-cdk-lib';
import { AnalyticsTrackerStack } from '../lib/analytics-stack';
const app = new cdk.App();
new AnalyticsTrackerStack(
app,
'MyAnalyticsTracker',
{
// List of allowed S3 buckets (supports wildcards)
allowedBuckets: [
'app1-analytics-prod',
'app2-analytics-staging',
'*-analytics', // All buckets ending with -analytics
],
// CORS origin (use '*' for all, or specify domains)
corsOrigin: '*',
// Function and API naming
functionPrefix: 'mycompany',
apiName: 'mycompany-analytics-api',
// Optional: Additional configuration
enableMetrics: true,
lambdaTimeout: 10,
},
{
env: {
account: process.env.CDK_DEFAULT_ACCOUNT,
region: process.env.CDK_DEFAULT_REGION,
},
}
);
npm run offline
# Build TypeScript
npm run build
# Preview changes
npm run synth
# Deploy to AWS
npm run deploy
After deployment, note the API endpoint URL:
Outputs:
MyAnalyticsTracker.ApiEndpoint = https://abc123.execute-api.us-east-1.amazonaws.com/prod
Send analytics events via POST request:
curl -X POST https://your-api-url.execute-api.us-east-1.amazonaws.com/prod/track \
-H "Content-Type: application/json" \
-d '{
"bucket": "app1-analytics-prod",
"eventType": "page_view",
"timestamp": "2025-12-17T12:00:00.000Z",
"page": "/articles/my-post",
"userAgent": "Mozilla/5.0...",
"viewport": { "width": 1920, "height": 1080 },
"sessionId": "abc123",
"referrer": "https://google.com",
"fromWebsite": "LinkedIn"
}'
Required Fields:
- bucket (string): S3 bucket name to write to (must be in allowedBuckets)
- eventType (string): Type of event (e.g., "page_view", "scroll_complete")
- timestamp (string): ISO 8601 timestamp
Optional Fields:
- page (string): Page path
- userAgent (string): Browser user agent
- viewport (object): { width: number, height: number }
- sessionId (string): Session identifier
- referrer (string): Referrer URL
- metadata (object): Any additional custom data
- fromWebsite (string): Source of the visit, such as a promo code or campaign name, used to track where the interest comes from
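The field lists above can be mirrored in a small client-side check before a request is sent. The following is a minimal sketch, not the tracker's own validation: the `TrackEvent` type and `validateEvent` helper are hypothetical names introduced here, and the error strings simply imitate the 400 response format shown in this README.

```typescript
// Hypothetical type describing the request body; field names come from the
// lists above, the type name itself is an assumption.
type TrackEvent = {
  bucket: string;
  eventType: string;
  timestamp: string; // ISO 8601
  page?: string;
  userAgent?: string;
  viewport?: { width: number; height: number };
  sessionId?: string;
  referrer?: string;
  metadata?: Record<string, unknown>;
  fromWebsite?: string;
};

const REQUIRED_FIELDS = ['bucket', 'eventType', 'timestamp'] as const;

// Returns a list of problems; an empty list means the payload passed the checks.
function validateEvent(payload: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    const value = payload[field];
    if (typeof value !== 'string' || value.length === 0) {
      errors.push(`Missing required field: ${field}`);
    }
  }
  // Loose check only: Date.parse accepts more than strict ISO 8601.
  const ts = payload['timestamp'];
  if (typeof ts === 'string' && ts.length > 0 && Number.isNaN(Date.parse(ts))) {
    errors.push('Invalid timestamp: expected ISO 8601');
  }
  return errors;
}

// Example payload using only the required fields plus one optional field.
const sampleEvent: TrackEvent = {
  bucket: 'app1-analytics-prod',
  eventType: 'page_view',
  timestamp: '2025-12-17T12:00:00.000Z',
  page: '/articles/my-post',
};
const sampleErrors = validateEvent(sampleEvent);
```

Checking on the client avoids a round trip for obviously malformed events; the server-side checks still decide whether a request is accepted.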
Success (200):
{
"status": "ok",
"eventId": "1702742400-a1b2c3d4",
"bucket": "app1-analytics-prod",
"key": "analytics/year=2025/month=12/day=17/1702742400-a1b2c3d4.json"
}
Error (400):
{
"error": "Missing required field: bucket",
"message": "You must specify the S3 bucket name in the request body"
}
Forbidden (403):
{
"error": "Forbidden",
"message": "The specified bucket is not authorized for this analytics service"
}
The Lambda function is granted fine-grained S3 write permissions:
{
"Effect": "Allow",
"Action": [
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::app1-analytics-prod/analytics/*",
"arn:aws:s3:::app2-analytics-prod/analytics/*"
]
}
The Lambda validates incoming bucket names against the ALLOWED_BUCKETS environment variable:
// Exact match
allowedBuckets: ['app1-analytics', 'app2-analytics']
// Wildcard patterns
allowedBuckets: ['*-analytics'] // Matches: my-app-analytics, other-app-analytics
allowedBuckets: ['analytics-*'] // Matches: analytics-prod, analytics-staging
- Open sql/athena-setup.sql
- Replace YOUR-BUCKET-NAME with your bucket name
- Replace TABLE_NAME with a unique name (e.g., app1_events)
- Run in AWS Athena console
CREATE EXTERNAL TABLE analytics_db.app1_events (
-- schema...
)
PARTITIONED BY (year STRING, month STRING, day STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://app1-analytics-prod/analytics/';
-- Load partitions
MSCK REPAIR TABLE analytics_db.app1_events;
See sql/example-queries.sql for common queries:
-- Daily page views
SELECT year, month, day, COUNT(*) as views
FROM analytics_db.app1_events
WHERE eventType = 'page_view'
GROUP BY year, month, day
ORDER BY year DESC, month DESC, day DESC;
-- Device breakdown
SELECT device.device_type, device.browser, COUNT(*) as count
FROM analytics_db.app1_events
WHERE eventType = 'page_view'
GROUP BY device.device_type, device.browser;
Option A: AWS QuickSight
- Connect to Athena
- Create dashboards
- ~$9/month per author
Option B: Export to CSV
- Run queries in Athena
- Download results
- Import to Excel/Sheets/Tableau
Option C: Programmatic Access
import { AthenaClient, StartQueryExecutionCommand } from '@aws-sdk/client-athena';
const athena = new AthenaClient({ region: 'us-east-1' });
const result = await athena.send(new StartQueryExecutionCommand({
QueryString: 'SELECT * FROM analytics_db.app1_events LIMIT 100',
ResultConfiguration: { OutputLocation: 's3://query-results-bucket/' }
}));
Setup:
// Deploy ONE tracker
new AnalyticsTrackerStack(app, 'SharedTracker', {
allowedBuckets: [
'website-analytics',
'mobile-app-analytics',
'api-analytics',
],
});
Usage:
# Website sends to website-analytics bucket
curl -X POST https://tracker.../track -d '{"bucket": "website-analytics", ...}'
# Mobile app sends to mobile-app-analytics bucket
curl -X POST https://tracker.../track -d '{"bucket": "mobile-app-analytics", ...}'
Athena Tables:
-- One table per app
CREATE EXTERNAL TABLE analytics_db.website_events (...) LOCATION 's3://website-analytics/analytics/';
CREATE EXTERNAL TABLE analytics_db.mobile_events (...) LOCATION 's3://mobile-app-analytics/analytics/';
CREATE EXTERNAL TABLE analytics_db.api_events (...) LOCATION 's3://api-analytics/analytics/';
Query All Apps:
SELECT 'Website' as app, COUNT(*) as views FROM analytics_db.website_events WHERE eventType = 'page_view'
UNION ALL
SELECT 'Mobile' as app, COUNT(*) as views FROM analytics_db.mobile_events WHERE eventType = 'page_view'
UNION ALL
SELECT 'API' as app, COUNT(*) as views FROM analytics_db.api_events WHERE eventType = 'page_view';
- IPs are hashed using SHA-256
- Only first 16 characters stored
- Cannot reverse-engineer original IP
- Requests with unauthorized buckets return 403
- Lambda validates bucket against whitelist
- Supports wildcard patterns for flexibility
- Configurable per deployment
- Use '*' for public analytics
- Specify domain for backend-only
- S3 server-side encryption (SSE-S3) recommended
- Support for KMS via additionalPolicies
- HTTPS enforced via API Gateway
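Two of the measures listed above, IP hashing and bucket allow-list validation, are easy to illustrate. The following is a minimal sketch under stated assumptions rather than the tracker's actual Lambda code: `hashIp`, `patternToRegExp`, and `isBucketAllowed` are hypothetical helper names, and the sketch hashes the raw IP without a salt because this README does not say whether one is used.

```typescript
import { createHash } from 'crypto';

// Hash an IP with SHA-256 and keep only the first 16 hex characters,
// matching the "Hashed IP (16 chars)" description. No salt is applied here
// (assumption: the README does not mention one).
function hashIp(ip: string): string {
  return createHash('sha256').update(ip).digest('hex').slice(0, 16);
}

// Turn an allow-list entry such as '*-analytics' into an anchored RegExp.
// Every '*' becomes '.*'; all other regex metacharacters are escaped.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// True if the bucket matches at least one exact name or wildcard pattern.
function isBucketAllowed(bucket: string, allowed: string[]): boolean {
  return allowed.some((pattern) => patternToRegExp(pattern).test(bucket));
}

// Example: a request for an unlisted bucket would be rejected (403 in the API).
const allowList = ['app1-analytics-prod', '*-analytics'];
const accepted = isBucketAllowed('my-app-analytics', allowList);
const rejected = isBucketAllowed('analytics-prod', allowList);
```

Anchoring the pattern with `^` and `$` matters: without it, an entry like `app1-analytics` would also match `app1-analytics-other`. Note that a bare `*` entry would allow every bucket name, so wildcard entries are best kept as narrow as possible.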
Events are stored as JSON with this structure:
{
eventId: string; // Unique event ID
eventType: string; // Event type
timestamp: string; // Client timestamp (ISO 8601)
serverTimestamp: string; // Server timestamp (ISO 8601)
page: string; // Page path
sessionId: string; // Session ID
ip: string; // Hashed IP (16 chars)
location: {
country: string; // Country code
city: string; // City name
region: string; // Region/state
};
device: {
device_type: string; // 'mobile' | 'tablet' | 'desktop'
browser: string; // Browser name
};
viewport: {
width: number; // Viewport width
height: number; // Viewport height
};
referrer: string; // Referrer URL
userAgent: string; // User agent string
fromWebsite: string;
// ... any custom metadata
}
Check Lambda environment variable ALLOWED_BUCKETS:
aws lambda get-function-configuration --function-name mycompany-analytics-tracker \
--query 'Environment.Variables.ALLOWED_BUCKETS'
- Check S3 bucket for files:
aws s3 ls s3://your-bucket/analytics/ --recursive
- Run partition repair:
MSCK REPAIR TABLE analytics_db.your_table;
- Verify table location matches bucket
- Enable S3 Intelligent-Tiering (objects not accessed for 30 days move to the Infrequent Access tier, and after 90 days to the Archive Instant Access tier)
- Use partition projection to avoid MSCK REPAIR queries
- Limit Athena query scope with WHERE clauses on partitions
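The last tip depends on the Hive-style key layout (year=/month=/day=) seen in the success response example. The sketch below shows how such a key and a matching partition filter could be derived from a date. Both helper names are hypothetical, and the use of UTC and zero-padded month and day values is an assumption, since the README only shows two-digit examples.

```typescript
// Zero-pad a number to two digits, e.g. 3 -> '03'.
function pad2(n: number): string {
  return String(n).padStart(2, '0');
}

// Build an object key in the layout shown in the success response:
// analytics/year=YYYY/month=MM/day=DD/<eventId>.json (UTC is an assumption).
function buildObjectKey(eventId: string, when: Date): string {
  const year = when.getUTCFullYear();
  const month = pad2(when.getUTCMonth() + 1);
  const day = pad2(when.getUTCDate());
  return `analytics/year=${year}/month=${month}/day=${day}/${eventId}.json`;
}

// Build a WHERE fragment that restricts an Athena query to one day partition.
// Values are quoted because the table declares the partition columns as STRING.
function partitionFilter(when: Date): string {
  const year = when.getUTCFullYear();
  const month = pad2(when.getUTCMonth() + 1);
  const day = pad2(when.getUTCDate());
  return `year = '${year}' AND month = '${month}' AND day = '${day}'`;
}

const exampleDate = new Date('2025-12-17T12:00:00.000Z');
const exampleKey = buildObjectKey('1702742400-a1b2c3d4', exampleDate);
const exampleFilter = partitionFilter(exampleDate);
```

Appending a filter like `exampleFilter` to the WHERE clause of the example queries lets Athena read only the objects under one day's prefix instead of scanning the whole table.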
This is a standalone package. Contributions welcome!
MIT