
Commit 09385ca

formatting and docs
1 parent 0605218 commit 09385ca

8 files changed, +184 -127 lines changed

.github/FUNDING.yml

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-custom: ["https://www.paypal.me/coffeeDarioBalinzo"]
+custom: [ "https://www.paypal.me/coffeeDarioBalinzo" ]

README.md

Lines changed: 89 additions & 60 deletions
@@ -1,39 +1,48 @@
 # Kafka-connect-elasticsearch-source
-[![YourActionName Actions Status](https://github.com/DarioBalinzo/kafka-connect-elasticsearch-source/workflows/Java%20CI%20with%20Maven/badge.svg)](https://github.com/DarioBalinzo/kafka-connect-elasticsearch-source/actions)

+[![YourActionName Actions Status](https://github.com/DarioBalinzo/kafka-connect-elasticsearch-source/workflows/Java%20CI%20with%20Maven/badge.svg)](https://github.com/DarioBalinzo/kafka-connect-elasticsearch-source/actions)

-Kafka Connect Elasticsearch Source: fetch data from elastic-search and sends it to kafka. The connector fetches only new data using a strictly incremental / temporal field (like a timestamp or an incrementing id).
-It supports dynamic schema and nested objects/ arrays.
+Kafka Connect Elasticsearch Source: fetch data from elastic-search and sends it to kafka. The connector fetches only new
+data using a strictly incremental / temporal field (like a timestamp or an incrementing id). It supports dynamic schema
+and nested objects/ arrays.

 ## Requirements:
+
 - Elasticsearch 6.x and 7.x
 - Java >= 8
 - Maven

 ## Output data serialization format:
-The connector uses kafka-connect schema and structs, that are agnostic regarding
-the user serialization method (e.g. it might be Avro or json, etc...).
+
+The connector uses kafka-connect schema and structs, that are agnostic regarding the user serialization method (e.g. it
+might be Avro or json, etc...).

 ## Bugs or new Ideas?
+
 - Issues tracker: https://github.com/DarioBalinzo/kafka-connect-elasticsearch-source/issues
 - Feel free to open an issue to discuss new ideas (or propose new solutions with a PR).

 ## Installation:
+
 Compile the project with:
+
 ```bash
 mvn clean package -DskipTests
 ```

 You can also compile and running both unit and integration tests (docker is mandatory) with:
+
 ```bash
 mvn clean package
 ```

-Copy the jar with dependencies from the target folder into connect classpath (e.g ``/usr/share/java/kafka-connect-elasticsearch`` ) or set ``plugin.path`` parameter appropriately.
+Copy the jar with dependencies from the target folder into connect classpath (
+e.g ``/usr/share/java/kafka-connect-elasticsearch`` ) or set ``plugin.path`` parameter appropriately.
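The copy step above, written out as commands; a minimal sketch, assuming the build produced a ``*-jar-with-dependencies.jar`` and that the worker's ``plugin.path`` covers ``/usr/share/java`` (neither is stated in the commit):

```bash
# Sketch under assumptions: artifact name matched by wildcard, and the Connect
# worker configured with plugin.path=/usr/share/java (not from the commit).
mkdir -p /usr/share/java/kafka-connect-elasticsearch
cp target/*-jar-with-dependencies.jar /usr/share/java/kafka-connect-elasticsearch/
# Restart the Connect worker so it rescans the plugin path.
```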

 ## Example
-Using kafka connect in distributed way, a sample config file to fetch ``my_awesome_index*`` indices and to produce output topics with ``es_`` prefix:

+Using kafka connect in distributed way, a sample config file to fetch ``my_awesome_index*`` indices and to produce
+output topics with ``es_`` prefix:

 ```json
 {
@@ -49,40 +58,43 @@ Using kafka connect in distributed way, a sample config file to fetch ``my_aweso
 }
 }
 ```
+
 To start the connector with curl:
+
 ```bash
 curl -X POST -H "Content-Type: application/json" --data @config.json http://localhost:8083/connectors | jq
 ```

 To check the status:
+
 ```bash
 curl localhost:8083/connectors/elastic-source/status | jq
 ```

 To stop the connector:
+
 ```bash
 curl -X DELETE localhost:8083/connectors/elastic-source | jq
 ```

-
 ## Documentation

 ### Elasticsearch Configuration

 ``es.host``
-ElasticSearch host. Optionally it is possible to specify many hosts using ``;`` as separator (``host1;host2;host3``)
+ElasticSearch host. Optionally it is possible to specify many hosts using ``;`` as separator (``host1;host2;host3``)

-* Type: string
-* Importance: high
-* Dependents: ``index.prefix``
+* Type: string
+* Importance: high
+* Dependents: ``index.prefix``

 ``es.port``
-ElasticSearch port
+ElasticSearch port
+
+* Type: string
+* Importance: high
+* Dependents: ``index.prefix``

-* Type: string
-* Importance: high
-* Dependents: ``index.prefix``
-
 ``es.scheme``
 ElasticSearch scheme (http/https)

@@ -91,84 +103,101 @@ ElasticSearch scheme (http/https)
 * Default: ``http``

 ``es.user``
-Elasticsearch username
+Elasticsearch username

-* Type: string
-* Default: null
-* Importance: high
+* Type: string
+* Default: null
+* Importance: high

 ``es.password``
-Elasticsearch password
+Elasticsearch password
+
+* Type: password
+* Default: null
+* Importance: high
+
+
+``incrementing.field.name``
+The name of the strictly incrementing field to use to detect new records.

-* Type: password
-* Default: null
-* Importance: high
+* Type: any
+* Importance: high
+
+``incrementing.secondary.field.name``
+In case the main incrementing field may have duplicates,
+this secondary field is used as a secondary sort field in order
+to avoid data losses when paginating (available starting from versions >= 1.4).
+
+* Type: any
+* Importance: low
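The pagination caveat in the new ``incrementing.secondary.field.name`` docs is easier to see with a concrete pairing: a coarse timestamp that can collide, plus a unique tie-breaker. A sketch with assumed field names and connector class:

```bash
# Assumed names: "@timestamp" may repeat within a millisecond, so the unique
# "event.id" field breaks ties and paginated reads cannot skip documents.
curl -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connectors/elastic-source/config -d '{
    "connector.class": "com.github.dariobalinzo.ElasticSourceConnector",
    "es.host": "localhost",
    "es.port": "9200",
    "index.prefix": "my_awesome_index",
    "topic.prefix": "es_",
    "incrementing.field.name": "@timestamp",
    "incrementing.secondary.field.name": "event.id"
  }'
```

PUT on ``/connectors/<name>/config`` creates the connector if absent and updates it in place otherwise.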

 ``connection.attempts``
-Maximum number of attempts to retrieve a valid Elasticsearch connection.
+Maximum number of attempts to retrieve a valid Elasticsearch connection.

-* Type: int
-* Default: 3
-* Importance: low
+* Type: int
+* Default: 3
+* Importance: low

 ``connection.backoff.ms``
-Backoff time in milliseconds between connection attempts.
+Backoff time in milliseconds between connection attempts.

-* Type: long
-* Default: 10000
-* Importance: low
+* Type: long
+* Default: 10000
+* Importance: low

 ``index.prefix``
-Indices prefix to include in copying.
-
-* Type: string
-* Default: ""
-* Importance: medium
+Indices prefix to include in copying.

+* Type: string
+* Default: ""
+* Importance: medium

 ### Connector Configuration

 ``poll.interval.ms``
-Frequency in ms to poll for new data in each index.
+Frequency in ms to poll for new data in each index.

-* Type: int
-* Default: 5000
-* Importance: high
+* Type: int
+* Default: 5000
+* Importance: high

 ``batch.max.rows``
-Maximum number of documents to include in a single batch when polling for new data.
+Maximum number of documents to include in a single batch when polling for new data.

-* Type: int
-* Default: 10000
-* Importance: low
+* Type: int
+* Default: 10000
+* Importance: low

 ``topic.prefix``
-Prefix to prepend to index names to generate the name of the Kafka topic to publish data
+Prefix to prepend to index names to generate the name of the Kafka topic to publish data

-* Type: string
-* Importance: high
+* Type: string
+* Importance: high

 ``filters.whitelist``
-Whitelist filter for extracting a subset of fields from elastic-search json documents.
-The whitelist filter supports nested fields. To provide multiple fields use `;` as separator
+Whitelist filter for extracting a subset of fields from elastic-search json documents. The whitelist filter supports
+nested fields. To provide multiple fields use `;` as separator
 (e.g. `customer;order.qty;order.price`).
-* Type: string
-* Importance: medium
-* Default: null
+
+* Type: string
+* Importance: medium
+* Default: null

 ``filters.json_cast``
-This filter casts nested fields to json string, avoiding parsing recursively as kafka connect-schema.
-The json-cast filter supports nested fields. To provide multiple fields use `;` as separator
+This filter casts nested fields to json string, avoiding parsing recursively as kafka connect-schema. The json-cast
+filter supports nested fields. To provide multiple fields use `;` as separator
 (e.g. `customer;order.qty;order.price`).
+
 * Type: string
 * Importance: medium
 * Default: null

 ``fieldname_converter``
-Configuring which field name converter should be used (allowed values: `avro` or `nop`).
-By default, the avro field name converter renames the json fields non respecting the avro specifications (https://avro.apache.org/docs/current/spec.html#names)
-in order to be serialized correctly.
-To disable the field name conversion set this parameter to `nop`.
+Configuring which field name converter should be used (allowed values: `avro` or `nop`). By default, the avro field name
+converter renames the json fields non respecting the avro
+specifications (https://avro.apache.org/docs/current/spec.html#names)
+in order to be serialized correctly. To disable the field name conversion set this parameter to `nop`.
+
 * Type: string
 * Importance: medium
 * Default: avro
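To ground the filter and converter options above in one place, a hedged walk-through with hypothetical field names (the document shape and outcomes are illustrative, not from the commit):

```bash
# Given an indexed document such as:
#   {"customer": "c-42", "order": {"qty": 2, "price": 9.99}, "notes": "text"}
# filters.whitelist=customer;order.qty forwards only those two paths;
# filters.json_cast=order would instead emit "order" as a single JSON string
# rather than a nested Connect struct; fieldname_converter=nop leaves field
# names untouched (the default avro converter renames fields that violate
# Avro's naming rules, e.g. a hypothetical "user-name").
curl -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connectors/elastic-source/config -d '{
    "connector.class": "com.github.dariobalinzo.ElasticSourceConnector",
    "es.host": "localhost",
    "es.port": "9200",
    "index.prefix": "my_awesome_index",
    "topic.prefix": "es_",
    "incrementing.field.name": "@timestamp",
    "filters.whitelist": "customer;order.qty",
    "fieldname_converter": "nop"
  }'
```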
