Commit e675f21

Merge pull request #4639 from amolsr/main
Enhancement: expand metrics section in kafka-clickhouse-connect-sink.md
2 parents c28761a + ec3e046 commit e675f21

2 files changed: +80 -6 lines changed

docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md

Lines changed: 79 additions & 6 deletions
@@ -310,20 +310,93 @@ For additional details check out the official [tutorial](https://docs.confluent.
310310

311311
ClickHouse Kafka Connect reports runtime metrics via [Java Management Extensions (JMX)](https://www.oracle.com/technical-resources/articles/javase/jmx.html). JMX is enabled in Kafka Connector by default.
312312

313-
ClickHouse Connect `MBeanName`:
313+
#### ClickHouse-Specific Metrics {#clickhouse-specific-metrics}
314+
315+
The connector exposes custom metrics via the following MBean name:
314316

315317
```java
316318
com.clickhouse:type=ClickHouseKafkaConnector,name=SinkTask{id}
317319
```
318320

319-
ClickHouse Kafka Connect reports the following metrics:
320-
321-
| Name | Type | Description |
322-
|----------------------|------|-----------------------------------------------------------------------------------------|
323-
| `receivedRecords` | long | The total number of records received. |
321+
| Metric Name | Type | Description |
322+
|-----------------------|------|-----------------------------------------------------------------------------------------|
323+
| `receivedRecords` | long | The total number of records received. |
324324
| `recordProcessingTime` | long | Total time in nanoseconds spent grouping and converting records to a unified structure. |
325325
| `taskProcessingTime` | long | Total time in nanoseconds spent processing and inserting data into ClickHouse. |
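
The three counters in the table are cumulative, so a per-record average has to be derived from them. The following is a minimal sketch of one way to do that over JMX, not the connector's own tooling. It assumes the code runs inside the Connect worker JVM (a remote client would obtain the `MBeanServerConnection` through a JMX connector instead), and it matches attribute names case-insensitively because their exact capitalization on the MBean is an assumption. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class SinkTaskMetricsSketch {

    // Average nanoseconds per record; returns 0 when no records have been received yet.
    static double avgNanosPerRecord(long totalNanos, long receivedRecords) {
        return receivedRecords == 0 ? 0.0 : (double) totalNanos / receivedRecords;
    }

    // Reads a numeric attribute by name. The lookup ignores case because the exact
    // capitalization of the JMX attribute names is an assumption in this sketch.
    static long readLong(MBeanServerConnection conn, ObjectName bean, String metric) throws Exception {
        for (MBeanAttributeInfo attr : conn.getMBeanInfo(bean).getAttributes()) {
            if (attr.getName().equalsIgnoreCase(metric)) {
                return ((Number) conn.getAttribute(bean, attr.getName())).longValue();
            }
        }
        throw new IllegalArgumentException("No attribute named " + metric + " on " + bean);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: running in the same JVM as the Connect worker.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        // The trailing wildcard matches every SinkTask{id} bean.
        ObjectName pattern =
                new ObjectName("com.clickhouse:type=ClickHouseKafkaConnector,name=SinkTask*");
        for (ObjectName bean : conn.queryNames(pattern, null)) {
            long received = readLong(conn, bean, "receivedRecords");
            long taskNanos = readLong(conn, bean, "taskProcessingTime");
            System.out.printf("%s: %.0f ns/record%n", bean, avgNanosPerRecord(taskNanos, received));
        }
    }
}
```

When no matching MBean is registered, the loop body never runs and the program prints nothing.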
326326

327+
#### Kafka Producer/Consumer Metrics {#kafka-producer-consumer-metrics}
328+
329+
The connector exposes standard Kafka producer and consumer metrics that provide insights into data flow, throughput, and performance.
330+
331+
**Topic-Level Metrics:**
332+
- `records-sent-total`: Total number of records sent to the topic
333+
- `bytes-sent-total`: Total bytes sent to the topic
334+
- `record-send-rate`: Average rate of records sent per second
335+
- `byte-rate`: Average bytes sent per second
336+
- `compression-rate`: Compression ratio achieved
337+
338+
**Partition-Level Metrics:**
339+
- `records-sent-total`: Total records sent to the partition
340+
- `bytes-sent-total`: Total bytes sent to the partition
341+
- `records-lag`: Current lag in the partition
342+
- `records-lead`: Current lead in the partition
343+
- `replica-fetch-lag`: Lag information for replicas
344+
345+
**Node-Level Connection Metrics:**
346+
- `connection-creation-total`: Total connections created to the Kafka node
347+
- `connection-close-total`: Total connections closed
348+
- `request-total`: Total requests sent to the node
349+
- `response-total`: Total responses received from the node
350+
- `request-rate`: Average request rate per second
351+
- `response-rate`: Average response rate per second
352+
353+
These metrics help monitor:
354+
- **Throughput**: Track data ingestion rates
355+
- **Lag**: Identify bottlenecks and processing delays
356+
- **Compression**: Measure data compression efficiency
357+
- **Connection Health**: Monitor network connectivity and stability
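
Cumulative counters such as `records-sent-total` can also be turned into a rate over a window of your own choosing by sampling them twice. The sketch below illustrates that arithmetic only. The class name is hypothetical, and treating a decrease as a counter reset (for example after a task restart) is an assumption about how you want to handle restarts.

```java
final class CounterRateSketch {

    // Converts two samples of a cumulative counter into a per-second rate.
    static double ratePerSecond(double earlierTotal, double laterTotal, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        double delta = laterTotal - earlierTotal;
        if (delta < 0) {
            // Assumption: the counter restarted from zero, so count only the new total.
            delta = laterTotal;
        }
        return delta * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // 3000 records over 60 seconds.
        System.out.println(ratePerSecond(1000, 4000, 60_000));
    }
}
```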
358+
359+
#### Kafka Connect Framework Metrics {#kafka-connect-framework-metrics}
360+
361+
The connector integrates with the Kafka Connect framework and exposes metrics for task lifecycle and error tracking.
362+
363+
**Task Status Metrics:**
364+
- `task-count`: Total number of tasks in the connector
365+
- `running-task-count`: Number of tasks currently running
366+
- `paused-task-count`: Number of tasks currently paused
367+
- `failed-task-count`: Number of tasks that have failed
368+
- `destroyed-task-count`: Number of destroyed tasks
369+
- `unassigned-task-count`: Number of unassigned tasks
370+
371+
Task status values include `running`, `paused`, `failed`, `destroyed`, and `unassigned`.
372+
373+
**Error Metrics:**
374+
- `deadletterqueue-produce-failures`: Number of failed DLQ writes
375+
- `deadletterqueue-produce-requests`: Total DLQ write attempts
376+
- `last-error-timestamp`: Timestamp of the last error
377+
- `records-skip-total`: Total number of records skipped due to errors
378+
- `records-retry-total`: Total number of records that were retried
379+
- `errors-total`: Total number of errors encountered
380+
381+
**Performance Metrics:**
382+
- `offset-commit-failures`: Number of failed offset commits
383+
- `offset-commit-avg-time-ms`: Average time for offset commits
384+
- `offset-commit-max-time-ms`: Maximum time for offset commits
385+
- `put-batch-avg-time-ms`: Average time to process a batch
386+
- `put-batch-max-time-ms`: Maximum time to process a batch
387+
- `source-record-poll-total`: Total records polled
388+
389+
#### Monitoring Best Practices {#monitoring-best-practices}
390+
391+
1. **Monitor Consumer Lag**: Track `records-lag` per partition to identify processing bottlenecks
392+
2. **Track Error Rates**: Watch `errors-total` and `records-skip-total` to detect data quality issues
393+
3. **Observe Task Health**: Monitor task status metrics to ensure tasks are running properly
394+
4. **Measure Throughput**: Use `record-send-rate` and `byte-rate` to track ingestion performance
395+
5. **Monitor Connection Health**: Check node-level connection metrics for network issues
396+
6. **Track Compression Efficiency**: Use `compression-rate` to optimize data transfer
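
Several of these practices can be combined into one periodic check. The following is a rough sketch that assumes the metric values have already been collected into a map keyed by the metric names used in this section. The collection step, the class name, and the lag threshold are all hypothetical. Because the error totals are cumulative, a production check would more likely compare deltas between samples than test for a non-zero value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ConnectorHealthSketch {

    // Returns human-readable warnings for a snapshot of metric values.
    // Metrics missing from the snapshot are treated as 0.
    static List<String> evaluate(Map<String, Double> snapshot, double maxLag) {
        List<String> warnings = new ArrayList<>();
        if (snapshot.getOrDefault("records-lag", 0.0) > maxLag) {
            warnings.add("consumer lag above threshold");
        }
        if (snapshot.getOrDefault("failed-task-count", 0.0) > 0) {
            warnings.add("one or more tasks have failed");
        }
        if (snapshot.getOrDefault("errors-total", 0.0) > 0
                || snapshot.getOrDefault("records-skip-total", 0.0) > 0) {
            warnings.add("errors or skipped records reported");
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Hypothetical snapshot: high lag and one failed task.
        Map<String, Double> snapshot = Map.of("records-lag", 50_000.0, "failed-task-count", 1.0);
        evaluate(snapshot, 10_000.0).forEach(System.out::println);
    }
}
```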
397+
398+
For detailed JMX metric definitions and Prometheus integration, see the [jmx-export-connector.yml](https://github.com/ClickHouse/clickhouse-kafka-connect/blob/main/jmx-export-connector.yml) configuration file.
399+
327400
### Limitations {#limitations}
328401

329402
- Deletes are not supported.

scripts/aspell-ignore/en/aspell-dict.txt

Lines changed: 1 addition & 0 deletions
@@ -726,6 +726,7 @@ Lyft
726726
MACNumToString
727727
MACStringToNum
728728
MACStringToOUI
729+
MBean
729730
MCPHost
730731
MEDIUMINT
731732
MEMTABLE
