This project is a comprehensive guide to building an end-to-end data engineering pipeline with TCP/IP sockets, Apache Spark, an OpenAI LLM, Kafka, and Elasticsearch. It covers every stage: data acquisition, processing, sentiment analysis with ChatGPT, production to a Kafka topic, and indexing into Elasticsearch.
The project is designed with the following components:
- Data Source: We use the Yelp dataset (from yelp.com) for our pipeline.
- TCP/IP Socket: Streams the data over the network in chunks (see the socket sketch after this list).
- Apache Spark: Processes the data with its master and worker nodes (a streaming-job sketch follows).
- Confluent Kafka: Our Kafka cluster in the cloud.
- Control Center and Schema Registry: Monitoring and schema management for the Kafka streams.
- Kafka Connect: Connects the Kafka topic to Elasticsearch (a connector-registration sketch follows).
- Elasticsearch: Indexes and queries the processed data.
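For context, here is a minimal sketch of what the TCP/IP streaming side can look like: a socket server that reads the Yelp reviews file and sends newline-delimited JSON records over the network in small chunks. The host, port, file path, and chunk size are illustrative assumptions, not values taken from this repository.

```python
# Hedged sketch: stream Yelp review records over a TCP socket in chunks.
import json
import socket

def stream_yelp_reviews(host="localhost", port=9999,
                        path="data/yelp_academic_dataset_review.json",  # assumed path
                        chunk_size=2):
    """Accept one client and send `chunk_size` JSON records at a time."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        conn, _ = server.accept()
        with conn, open(path, encoding="utf-8") as f:
            batch = []
            for line in f:
                batch.append(json.loads(line))
                if len(batch) == chunk_size:
                    for record in batch:
                        # newline-delimited JSON, so Spark's socket source
                        # can split records on '\n'
                        conn.sendall((json.dumps(record) + "\n").encode("utf-8"))
                    batch.clear()

if __name__ == "__main__":
    stream_yelp_reviews()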
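The Spark stage can then consume that socket, classify each review's sentiment with an OpenAI chat model, and produce the result to Kafka. The sketch below assumes a `yelp_reviews` topic, a two-field schema, the `gpt-3.5-turbo` model, and a local broker; none of these names are taken from the repo.

```python
# Hedged sketch: socket -> sentiment via OpenAI -> Kafka.
# Requires the spark-sql-kafka-0-10 package (e.g. spark-submit --packages ...).
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col, udf, to_json, struct
from pyspark.sql.types import StructType, StructField, StringType

def sentiment(text):
    """Call OpenAI once per review; returns POSITIVE/NEGATIVE/NEUTRAL."""
    from openai import OpenAI  # imported inside the UDF so executors pick it up
    client = OpenAI()          # reads OPENAI_API_KEY from the environment
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model
        messages=[{
            "role": "user",
            "content": ("Classify the sentiment of this review as "
                        "POSITIVE, NEGATIVE or NEUTRAL. Reply with one word.\n"
                        f"{text}"),
        }],
    )
    return reply.choices[0].message.content.strip()

if __name__ == "__main__":
    spark = SparkSession.builder.appName("YelpSentiment").getOrCreate()

    # Assumed two-field review schema for illustration.
    schema = StructType([
        StructField("review_id", StringType()),
        StructField("text", StringType()),
    ])
    sentiment_udf = udf(sentiment, StringType())

    reviews = (spark.readStream.format("socket")
               .option("host", "localhost").option("port", 9999).load()
               .select(from_json(col("value"), schema).alias("r"))
               .select("r.*")
               .withColumn("sentiment", sentiment_udf(col("text"))))

    query = (reviews
             .select(col("review_id").alias("key"),
                     to_json(struct("*")).alias("value"))
             .writeStream.format("kafka")
             .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
             .option("topic", "yelp_reviews")                      # assumed topic
             .option("checkpointLocation", "/tmp/checkpoint")
             .start())
    query.awaitTermination()
```

Note that a UDF like this makes one API call per row, which is slow and costly at scale; it is shown only to make the data flow concrete.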
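On the Kafka Connect side, the Elasticsearch sink is configured by POSTing a connector definition to Connect's REST API. Here is a hedged example using Confluent's Elasticsearch sink connector; the hosts, ports, and topic name are assumptions to adjust for your cluster.

```python
# Hedged sketch: register an Elasticsearch sink connector with Kafka Connect.
import json
import requests

connector = {
    "name": "elasticsearch-sink",  # assumed connector name
    "config": {
        # Confluent's Elasticsearch sink connector
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "yelp_reviews",                    # topic produced by the Spark job
        "connection.url": "http://elasticsearch:9200",  # assumed ES address
        "key.ignore": "true",                        # let ES derive document ids
        "schema.ignore": "true",                     # index plain JSON values
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "false",
    },
}

resp = requests.post("http://localhost:8083/connectors",  # assumed Connect address
                     headers={"Content-Type": "application/json"},
                     data=json.dumps(connector))
resp.raise_for_status()
print(resp.json())
```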
The project uses the following technologies:

- Python
- TCP/IP
- Confluent Kafka
- Apache Spark
- Docker
- Elasticsearch
- Clone the repository:
  `git clone https://github.com/FroCode/Real_Streaming_Kafka.git`
- Navigate to the project directory:
  `cd Real_Streaming_Kafka`
- Run Docker Compose to spin up the Spark cluster:
  `docker-compose up`
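Once the containers are up, a quick way to verify the pipeline end to end is to search the Elasticsearch index the sink connector writes to (by default the Confluent sink names the index after the Kafka topic). The index name and port below are assumptions matching the earlier sketches.

```python
# Hedged check: fetch a few positively classified reviews from Elasticsearch.
import requests

resp = requests.get(
    "http://localhost:9200/yelp_reviews/_search",  # assumed index and port
    json={"query": {"match": {"sentiment": "POSITIVE"}}, "size": 3},
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"])
```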
