Commit 635022f

update readme
1 parent b9cb7e2 commit 635022f

1 file changed: +11 -8 lines changed

README.md

Lines changed: 11 additions & 8 deletions
@@ -1,20 +1,24 @@
-# Hdfs output plugin for Embulk
+# Hdfs file output plugin for Embulk
 
 A File Output Plugin for Embulk to write HDFS.
 
 ## Overview
 
 * **Plugin type**: file output
-* **Load all or nothing**: no
+* **Load all or nothing**: yes
 * **Resume supported**: no
 * **Cleanup supported**: no
 
 ## Configuration
 
 - **config_files** list of paths to Hadoop's configuration files (array of strings, default: `[]`)
 - **config** overwrites configuration parameters (hash, default: `{}`)
-- **output_path** the path finally stored files. (string, default: `"/tmp/embulk.output.hdfs_output.%Y%m%d_%s"`)
-- **working_path** the path temporary stored files. (string, default: `"/tmp/embulk.working.hdfs_output.%Y%m%d_%s"`)
+- **path_prefix** prefix of target files (string, required)
+- **file_ext** suffix of target files (string, required)
+- **sequence_format** format for sequence part of target files (string, default: `'.%03d.%02d'`)
+- **rewind_seconds** When you use Date format in path_prefix property(like `/tmp/embulk/%Y-%m-%d/out`), the format is interpreted by using the time which is Now minus this property. (int, default: `0`)
+- **overwrite** overwrite files when the same filenames already exists (boolean, default: `false`)
+  - *caution*: even if this property is `true`, this does not mean ensuring the idempotence. if you want to ensure the idempotence, you need the procedures to remove output files after or before running.
 
 ## Example
 
@@ -24,14 +28,13 @@ out:
   config_files:
     - /etc/hadoop/conf/core-site.xml
     - /etc/hadoop/conf/hdfs-site.xml
-    - /etc/hadoop/conf/mapred-site.xml
-    - /etc/hadoop/conf/yarn-site.xml
   config:
     fs.defaultFS: 'hdfs://hdp-nn1:8020'
-    dfs.replication: 1
-    mapreduce.client.submit.file.replication: 1
     fs.hdfs.impl: 'org.apache.hadoop.hdfs.DistributedFileSystem'
     fs.file.impl: 'org.apache.hadoop.fs.LocalFileSystem'
+  path_prefix: '/tmp/embulk/hdfs_output/%Y-%m-%d/out'
+  file_ext: 'txt'
+  overwrite: true
   formatter:
     type: csv
     encoding: UTF-8
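
The new `rewind_seconds` and `sequence_format` options are easier to follow with concrete values. The snippet below is a minimal sketch in Python, not the plugin's own code: the helper names `expand_path_prefix` and `format_sequence` are hypothetical, Python's `strftime` stands in for whatever date formatter the plugin uses, and treating the two integers in `sequence_format` as a task index and a file index is an assumption the README does not state. How the plugin joins the prefix, sequence part and `file_ext` into the final file name is not shown, since the README does not describe it.

```python
from datetime import datetime, timedelta


def expand_path_prefix(path_prefix: str, rewind_seconds: int, now: datetime) -> str:
    """Expand date directives in path_prefix using (now - rewind_seconds).

    Hypothetical helper: mirrors the README's description of rewind_seconds.
    """
    reference_time = now - timedelta(seconds=rewind_seconds)
    return reference_time.strftime(path_prefix)


def format_sequence(sequence_format: str, task_index: int, file_index: int) -> str:
    """Render the sequence part from a printf-style format.

    Assumption: the two integers are a task index and a file index.
    """
    return sequence_format % (task_index, file_index)


# A run that starts shortly after midnight.
now = datetime(2015, 4, 1, 0, 30, 0)
prefix = '/tmp/embulk/hdfs_output/%Y-%m-%d/out'

print(expand_path_prefix(prefix, 0, now))      # dated with the current day
print(expand_path_prefix(prefix, 3600, now))   # dated one hour earlier, i.e. the previous day
print(format_sequence('.%03d.%02d', 0, 0))     # sequence part for the first task and file
```

Under these assumptions, a job that runs just after midnight with `rewind_seconds: 3600` writes into the previous day's directory, which is the typical reason to set this option.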