202 changes: 202 additions & 0 deletions python/cloudfront-v2-logging/README.md
@@ -0,0 +1,202 @@
# CloudFront V2 Logging with AWS CDK (Python)

This project demonstrates how to set up Amazon CloudFront with the new CloudFront Standard Logging V2 feature using AWS CDK in Python. The example shows how to configure multiple logging destinations for CloudFront access logs, including:

1. Amazon CloudWatch Logs
2. Amazon S3 (with Parquet format)
3. Amazon Kinesis Data Firehose (with JSON format)

## Architecture

![CloudFront V2 Logging Architecture](./architecture.drawio.png)

The project deploys the following resources:

- An S3 bucket to host a simple static website
- A CloudFront distribution with Origin Access Control (OAC) to serve the website
- A logging S3 bucket with appropriate lifecycle policies
- CloudFront Standard Logging V2 configuration with multiple delivery destinations
- Kinesis Data Firehose delivery stream
- CloudWatch Logs group
- Necessary IAM roles and permissions

## Prerequisites

- [AWS CLI](https://aws.amazon.com/cli/) configured with appropriate credentials
- [AWS CDK](https://aws.amazon.com/cdk/) installed (v2.x)
- Python 3.8 or later (CDK v2 does not support Python 3.6)
- Node.js 14.x or later (for CDK)

## Setup

1. Create and activate a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate.bat
```

2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

3. Synthesize the CloudFormation template:

```bash
cdk synth
```

4. Deploy the stack:

```bash
cdk deploy
```

You can customize the log retention periods by providing parameters:

```bash
cdk deploy --parameters LogRetentionDays=90 --parameters CloudWatchLogRetentionDays=60
```

5. After deployment, the CloudFront distribution domain name will be displayed in the outputs. You can access your website using this domain.
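When deployments are scripted, the `--parameters` flags shown in step 4 can be generated from a dictionary. The following is a minimal sketch under that assumption; `deploy_command` is a hypothetical helper and is not part of this project. It only builds the argument list and does not run `cdk`.

```python
import shlex


def deploy_command(parameters: dict) -> list:
    """Build a `cdk deploy` argument list with one --parameters flag per entry."""
    command = ["cdk", "deploy"]
    for name, value in parameters.items():
        command += ["--parameters", f"{name}={value}"]
    return command


cmd = deploy_command({"LogRetentionDays": 90, "CloudWatchLogRetentionDays": 60})
# Print the command as a single shell-safe string
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` without a shell, which avoids quoting problems in parameter values.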

## How It Works

This example demonstrates CloudFront Standard Logging V2, which provides more flexibility in how you collect and analyze CloudFront access logs:

- **CloudWatch Logs**: Logs are delivered in JSON format for real-time monitoring and analysis
- **S3 (Parquet)**: Logs are delivered in Parquet format with Hive-compatible paths for efficient querying with services like Amazon Athena
- **Kinesis Data Firehose**: Logs are streamed in JSON format, allowing for real-time processing and transformation

The CDK stack creates all necessary resources and configures the appropriate permissions for log delivery.

## Example Log Outputs

### CloudWatch Logs (JSON format)
```json
{
  "timestamp": "2023-03-15T20:12:34Z",
  "c-ip": "192.0.2.100",
  "time-to-first-byte": 0.002,
  "sc-status": 200,
  "sc-bytes": 2326,
  "cs-method": "GET",
  "cs-uri-stem": "/index.html",
  "cs-protocol": "https",
  "cs-host": "d111111abcdef8.cloudfront.net",
  "cs-user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
  "cs-referer": "https://www.example.com/",
  "x-edge-location": "IAD79-C2",
  "x-edge-request-id": "tLAGM_r7TyiRgwgk_4U5Xb-vv4JHOjzGCh61ER9nM_2UFY8hTKdEoQ=="
}
```
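A record like the one above can be inspected with a few lines of Python. This is a minimal sketch, assuming each log event message is a single JSON object that uses the field names from the example; `summarize_record` is a hypothetical helper, not something this project ships.

```python
import json


def summarize_record(message: str) -> dict:
    """Extract a few fields from one JSON-formatted access log record."""
    record = json.loads(message)
    status = int(record["sc-status"])
    return {
        "path": record["cs-uri-stem"],
        "status": status,
        "is_error": status >= 400,
        # time-to-first-byte is reported in seconds; convert to milliseconds
        "ttfb_ms": round(float(record["time-to-first-byte"]) * 1000, 3),
    }


sample = '{"sc-status": 200, "cs-uri-stem": "/index.html", "time-to-first-byte": 0.002}'
summary = summarize_record(sample)
print(summary)
```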

### S3 Parquet Format
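Parquet is a columnar storage format with efficient compression and encoding, which keeps query scans small. Log objects are written under a prefix that identifies the distribution and the delivery hour. The example below shows plain path segments; when Hive-compatible paths are enabled, each segment carries a `key=value` name instead (for example `year=2023`):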

```
s3://your-logging-bucket/s3_delivery/EDFDVBD6EXAMPLE/2023/03/15/20/
```
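The prefix above can be split back into a distribution ID and a delivery hour. This is a minimal sketch, assuming the plain `<prefix>/<distribution-id>/<yyyy>/<MM>/<dd>/<HH>/` layout shown in the example and assuming the hour is expressed in UTC; `parse_delivery_prefix` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_delivery_prefix(key: str) -> Tuple[str, datetime]:
    """Split '<prefix>/<distribution-id>/<yyyy>/<MM>/<dd>/<HH>/...' into its parts."""
    parts = key.strip("/").split("/")
    # parts[0] is the delivery prefix (s3_delivery in the example above)
    distribution_id = parts[1]
    year, month, day, hour = (int(p) for p in parts[2:6])
    # Assumption: the path components are in UTC
    return distribution_id, datetime(year, month, day, hour, tzinfo=timezone.utc)


dist, hour_start = parse_delivery_prefix("s3_delivery/EDFDVBD6EXAMPLE/2023/03/15/20/")
print(dist, hour_start.isoformat())
```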

### Kinesis Data Firehose (JSON format)
Firehose delivers logs in JSON format with a timestamp-based prefix:

```
s3://your-logging-bucket/firehose_delivery/year=2023/month=03/day=15/delivery-stream-1-2023-03-15-20-12-34-a1b2c3d4.json.gz
```
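Once a `.json.gz` object like the one above has been downloaded, its records can be read with the standard library. This is a minimal sketch, assuming the object holds one JSON record per line (whether records are newline-delimited depends on how the delivery stream is configured); `iter_records` is a hypothetical helper, and the payload below is an in-memory stand-in for a real object.

```python
import gzip
import io
import json
from typing import Iterator


def iter_records(gz_bytes: bytes) -> Iterator[dict]:
    """Yield one dict per non-empty line of a gzip-compressed JSON Lines payload."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:
                yield json.loads(line)


# In-memory stand-in for the bytes of a downloaded Firehose object
payload = gzip.compress(b'{"sc-status": 200}\n{"sc-status": 404}\n')
statuses = [record["sc-status"] for record in iter_records(payload)]
print(statuses)
```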

## Querying Logs with Athena

You can use Amazon Athena to query the Parquet logs stored in S3. Here's an example query to get started:

```sql
-- Column names use underscores so that the DDL matches the query below;
-- adjust them if the field names in your Parquet files differ.
CREATE EXTERNAL TABLE IF NOT EXISTS cloudfront_logs (
  `timestamp` string,
  `c_ip` string,
  `time_to_first_byte` float,
  `sc_status` int,
  `sc_bytes` bigint,
  `cs_method` string,
  `cs_uri_stem` string,
  `cs_protocol` string,
  `cs_host` string,
  `cs_user_agent` string,
  `cs_referer` string,
  `x_edge_location` string,
  `x_edge_request_id` string
)
PARTITIONED BY (
  `distributionid` string,
  `year` string,
  `month` string,
  `day` string,
  `hour` string
)
STORED AS PARQUET
LOCATION 's3://your-logging-bucket/s3_delivery/';

-- Load partitions. MSCK REPAIR TABLE only discovers Hive-style key=value
-- prefixes (for example year=2023/month=03/); other layouts need
-- ALTER TABLE ... ADD PARTITION.
MSCK REPAIR TABLE cloudfront_logs;

-- Example query to find the top requested URLs
SELECT cs_uri_stem, COUNT(*) AS request_count
FROM cloudfront_logs
WHERE year = '2023' AND month = '03' AND day = '15'
GROUP BY cs_uri_stem
ORDER BY request_count DESC
LIMIT 10;
```
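`MSCK REPAIR TABLE` finds partitions only under Hive-style `key=value` prefixes. If the delivered objects sit under plain path segments, like the `s3_delivery/EDFDVBD6EXAMPLE/2023/03/15/20/` example earlier, each partition can be registered explicitly instead. The following is a minimal sketch that only builds the statement text; `add_partition_sql` is a hypothetical helper, and the bucket name is the placeholder used throughout this README.

```python
def add_partition_sql(table: str, bucket: str, distribution_id: str,
                      year: str, month: str, day: str, hour: str) -> str:
    """Build an Athena ALTER TABLE statement that registers one hourly partition."""
    location = f"s3://{bucket}/s3_delivery/{distribution_id}/{year}/{month}/{day}/{hour}/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (distributionid='{distribution_id}', year='{year}', "
        f"month='{month}', day='{day}', hour='{hour}') "
        f"LOCATION '{location}';"
    )


statement = add_partition_sql(
    "cloudfront_logs", "your-logging-bucket", "EDFDVBD6EXAMPLE", "2023", "03", "15", "20"
)
print(statement)
```

The partition values are kept as strings so that they match the `string` partition columns in the table definition and preserve leading zeros.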

## Troubleshooting

### Common Issues

1. **Logs not appearing in CloudWatch**
   - Check that the CloudFront distribution is receiving traffic
   - Verify the IAM permissions for the log delivery service
   - Check CloudWatch service quotas if you have high traffic volumes

2. **Parquet files not appearing in S3**
   - Verify bucket permissions allow the log delivery service to write
   - Check for any errors in CloudTrail related to log delivery

3. **Firehose delivery errors**
   - Check the Firehose error prefix in S3 for error logs
   - Verify IAM role permissions for Firehose
   - Monitor Firehose metrics in CloudWatch

### Useful Commands

- Check CloudFront distribution status:
```bash
aws cloudfront get-distribution --id <distribution-id>
```

- List log files in S3:
```bash
aws s3 ls s3://your-logging-bucket/s3_delivery/ --recursive
```

- View CloudWatch logs:
```bash
aws logs get-log-events --log-group-name <log-group-name> --log-stream-name <log-stream-name>
```

## Cleanup

To avoid incurring charges, delete the deployed resources when you're done:

```bash
cdk destroy
```

## Security Considerations

This example includes several security best practices:

- S3 buckets are configured with encryption, SSL enforcement, and public access blocking
- CloudFront uses Origin Access Control (OAC) to secure S3 content
- IAM permissions follow the principle of least privilege
- Logging bucket has appropriate lifecycle policies to manage log retention
54 changes: 54 additions & 0 deletions python/cloudfront-v2-logging/app.py
@@ -0,0 +1,54 @@
#!/usr/bin/env python3
import os
import aws_cdk as cdk
from aws_cdk import Aspects
from cdk_nag import AwsSolutionsChecks, NagSuppressions

from cloudfront_v2_logging.cloudfront_v2_logging_stack import CloudfrontV2LoggingStack

app = cdk.App()
stack = CloudfrontV2LoggingStack(app, "CloudfrontV2LoggingStack")

# Add CDK-NAG to check for best practices
Aspects.of(app).add(AwsSolutionsChecks())

# Add suppressions at the stack level
NagSuppressions.add_stack_suppressions(
    stack,
    [
        {
            "id": "AwsSolutions-IAM4",
            "reason": "Suppressing managed policy warning as permissions are appropriate"
        },
        {
            "id": "AwsSolutions-L1",
            "reason": "Lambda runtime is 3.11 and managed by CDK BucketDeployment construct, and so out of scope for this project"
        },
        {
            "id": "AwsSolutions-CFR1",
            "reason": "Geo restrictions not required for this demo"
        },
        {
            "id": "AwsSolutions-CFR2",
            "reason": "WAF integration not required for this demo"
        },
        {
            "id": "AwsSolutions-CFR3",
            "reason": "Using CloudFront V2 logging instead of traditional access logging"
        },
        {
            "id": "AwsSolutions-S1",
            "reason": "S3 access logging not required for this demo as we're demonstrating CloudFront V2 logging"
        },
        {
            "id": "AwsSolutions-IAM5",
            "reason": "Wildcard permissions are required for PUT actions for the CDK BucketDeployment construct and Firehose role"
        },
        {
            "id": "AwsSolutions-CFR4",
            "reason": "We're making use of the highest currently available viewer certificate. This flag is due to our use of the default viewer certificate which is not an issue in this demonstration case."
        }
    ]
)

app.synth()
86 changes: 86 additions & 0 deletions python/cloudfront-v2-logging/cdk.json
@@ -0,0 +1,86 @@
{
  "app": "python3 app.py",
  "watch": {
    "include": [
      "**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "requirements*.txt",
      "source.bat",
      "**/__init__.py",
      "**/__pycache__",
      "tests"
    ]
  },
  "context": {
    "@aws-cdk/aws-lambda:recognizeLayerVersion": true,
    "@aws-cdk/core:checkSecretUsage": true,
    "@aws-cdk/core:target-partitions": [
      "aws",
      "aws-cn"
    ],
    "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
    "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
    "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
    "@aws-cdk/aws-iam:minimizePolicies": true,
    "@aws-cdk/core:validateSnapshotRemovalPolicy": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
    "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
    "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
    "@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
    "@aws-cdk/core:enablePartitionLiterals": true,
    "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
    "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
    "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
    "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
    "@aws-cdk/aws-route53-patters:useCertificate": true,
    "@aws-cdk/customresources:installLatestAwsSdkDefault": false,
    "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
    "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
    "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
    "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
    "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
    "@aws-cdk/aws-redshift:columnId": true,
    "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
    "@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
    "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
    "@aws-cdk/aws-kms:aliasNameRef": true,
    "@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true,
    "@aws-cdk/core:includePrefixInUniqueNameGeneration": true,
    "@aws-cdk/aws-efs:denyAnonymousAccess": true,
    "@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true,
    "@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true,
    "@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true,
    "@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true,
    "@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true,
    "@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true,
    "@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true,
    "@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
    "@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
    "@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
    "@aws-cdk/aws-eks:nodegroupNameAttribute": true,
    "@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true,
    "@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true,
    "@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false,
    "@aws-cdk/aws-s3:keepNotificationInImportedBucket": false,
    "@aws-cdk/aws-ecs:enableImdsBlockingDeprecatedFeature": false,
    "@aws-cdk/aws-ecs:disableEcsImdsBlocking": true,
    "@aws-cdk/aws-ecs:reduceEc2FargateCloudWatchPermissions": true,
    "@aws-cdk/aws-dynamodb:resourcePolicyPerReplica": true,
    "@aws-cdk/aws-ec2:ec2SumTImeoutEnabled": true,
    "@aws-cdk/aws-appsync:appSyncGraphQLAPIScopeLambdaPermission": true,
    "@aws-cdk/aws-rds:setCorrectValueForDatabaseInstanceReadReplicaInstanceResourceId": true,
    "@aws-cdk/core:cfnIncludeRejectComplexResourceUpdateCreatePolicyIntrinsics": true,
    "@aws-cdk/aws-lambda-nodejs:sdkV3ExcludeSmithyPackages": true,
    "@aws-cdk/aws-stepfunctions-tasks:fixRunEcsTaskPolicy": true,
    "@aws-cdk/aws-ec2:bastionHostUseAmazonLinux2023ByDefault": true,
    "@aws-cdk/aws-route53-targets:userPoolDomainNameMethodWithoutCustomResource": true,
    "@aws-cdk/aws-elasticloadbalancingV2:albDualstackWithoutPublicIpv4SecurityGroupRulesDefault": true,
    "@aws-cdk/aws-iam:oidcRejectUnauthorizedConnections": true,
    "@aws-cdk/core:enableAdditionalMetadataCollection": true,
    "@aws-cdk/aws-lambda:createNewPoliciesWithAddToRolePolicy": true
  }
}
Empty file.