feat: Add continuous monitoring with user prompts and real-time healt… #11

Open: wants to merge 1 commit into base: master

691 changes: 691 additions & 0 deletions kafka-analysis/kafka-analysis-1752114836141.json

Large diffs are not rendered by default.

16 changes: 16 additions & 0 deletions kafka-analysis/kafka-health-checks-1752114836141.csv
@@ -0,0 +1,16 @@
"Health Check Results"
"Check Name","Status","Message","Description","Recommendation"
"Replication Factor vs Broker Count","pass","All topics have appropriate replication factor (≤ 1 brokers)","Checks if any topic has a replication factor greater than the number of brokers. Healthy: All topics have RF ≤ broker count. Failed: Any topic has RF > broker count.",""
"Topic Partition Distribution","pass","Good partition distribution: avg=1.0, min=1, max=1","Checks if user topics have a balanced number of partitions. Healthy: Partition counts are similar. Warning: Large difference between min and max partitions.",""
"Consumer Group Health","info","No consumer groups found","Checks if all consumer groups have active members. Healthy: All groups have members. Warning: Some groups have no active members.",""
"Internal Topics Health","pass","All 1 internal topics are healthy","Checks if all internal topics (names starting with __) have partitions > 0. Healthy: All internal topics have partitions. Failed: Any internal topic has 0 or missing partitions.",""
"Under-Replicated Partitions","pass","All topics have the expected number of in-sync replicas","",""
"Min In-Sync Replicas Configuration","pass","All topics have appropriate min.insync.replicas configuration","",""
"Rack Awareness","warning","Rack awareness is not configured - no brokers have rack information","Checks if rack awareness is configured in the cluster. Healthy: Rack awareness is configured. Warning: Rack awareness is not configured.","Consider enabling rack awareness for better availability and fault tolerance"
"Replica Distribution","pass","Perfect replica balance: Each broker carries 51.0 replicas on average (range: 51-51)","Checks if data replicas are evenly distributed across all brokers. Healthy: Each broker carries a similar number of replicas. Warning/Failed: Some brokers carry significantly more replicas than others, which can cause performance issues.",""
"Metrics Configuration","warning","No JMX metrics configuration detected on any brokers","Checks if monitoring metrics are properly configured. For AWS MSK: Checks Open Monitoring with Prometheus JMX exporter. For others: Checks JMX metrics configuration. Healthy: Metrics are enabled and accessible. Warning: Metrics are not configured or partially configured.","Enable JMX metrics on brokers for better monitoring, alerting, and performance analysis"
"Generic Kafka Logging Configuration","info","Generic Kafka logging configuration check","Checks if logging configuration is properly configured. For AWS MSK: Checks LoggingInfo configuration and CloudTrail. For Confluent Cloud/Aiven: Built-in logging is available. For others: Checks log4j configuration. Healthy: Logging is enabled and configured. Warning: Logging is not configured or partially configured.","Verify log4j configuration and log directory permissions in server.properties"
"Generic Kafka Authentication Configuration","fail","Unauthenticated access is enabled - this is a security risk","Checks if unauthenticated access is enabled. For AWS MSK: Checks if SASL or SSL is configured. For Confluent Cloud/Aiven: Built-in authentication prevents unauthenticated access. For others: Checks if SASL or SSL is configured. Healthy: Authentication is enabled (no unauthenticated access). Failed: Unauthenticated access is enabled (security risk).","Enable SASL or SSL authentication in server.properties for better security"
"Generic Kafka Quotas Configuration","warning","No quota configuration detected in Kafka cluster","Checks if Kafka quotas are configured and being used. For AWS MSK: Checks quota configuration via AWS console/CLI. For Confluent Cloud/Aiven: Built-in quota management is available. For others: Checks server.properties and kafka-configs.sh for quota settings. Healthy: Quotas are configured and managed. Info: Quotas configuration check available.","Configure quotas in server.properties or use kafka-configs.sh to set client quotas for better resource management"
"Payload Compression","warning","No compression detected on any of the 1 user topics (0%)","Checks if payload compression is enabled on user topics. Analyzes compression.type, compression, and producer.compression.type configurations. Healthy: All user topics have compression enabled (100%). Warning: Some or no topics have compression enabled (<100%). Info: No user topics to analyze.","Enable compression on topics to reduce storage usage and improve network performance"
"Infinite Retention Policy","pass","No topics have infinite retention policy enabled","Checks if any topics have infinite retention policy enabled (retention.ms = infinite). Healthy: No topics have infinite retention. Warning: Some topics have infinite retention policy (bad practice). Info: Unable to verify retention policy.",""
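For readers who want to consume a report like `kafka-health-checks-1752114836141.csv` programmatically, the following is a minimal sketch, not code from this PR. It assumes the layout shown in the diff: a one-cell title row, then a header row, then one row per check. The helper names `load_checks`, `summarize`, and `failing` are hypothetical, and the sample embeds three rows from the report with the long Description cell blanked for brevity.

```python
import csv
import io
from collections import Counter

# Excerpt of the CSV report shown in the diff (Description cells left empty
# here to keep the sample short; the real file carries longer text).
SAMPLE = '''"Health Check Results"
"Check Name","Status","Message","Description","Recommendation"
"Under-Replicated Partitions","pass","All topics have the expected number of in-sync replicas","",""
"Min In-Sync Replicas Configuration","pass","All topics have appropriate min.insync.replicas configuration","",""
"Generic Kafka Authentication Configuration","fail","Unauthenticated access is enabled - this is a security risk","","Enable SASL or SSL authentication in server.properties for better security"
'''


def load_checks(text):
    """Parse the report into a list of dicts keyed by the header row."""
    buffer = io.StringIO(text)
    buffer.readline()  # skip the "Health Check Results" title row
    return list(csv.DictReader(buffer))


def summarize(rows):
    """Count checks per status value (pass, warning, info, fail)."""
    return Counter(row["Status"] for row in rows)


def failing(rows):
    """Return the names of checks whose status is 'fail'."""
    return [row["Check Name"] for row in rows if row["Status"] == "fail"]


if __name__ == "__main__":
    checks = load_checks(SAMPLE)
    print(dict(summarize(checks)))
    for name in failing(checks):
        print("FAILED:", name)
```

Reading from the real file would only require passing its contents to `load_checks`; whether the title row is always present in generated reports is an assumption based on this single example.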