In this project, I set up and build a big data processing pipeline using Apache Spark, integrated with several AWS services (S3, EMR, EC2, VPC, IAM, and Redshift), and use Terraform to provision the infrastructure.
S3, EMR, EC2, Airflow, Redshift, Terraform, Spark, VPC, IAM
