Commit 0dcae14

Merge pull request #8 from civitaspo/v0.2.0
V0.2.0
2 parents 7e3e716 + 635022f commit 0dcae14

8 files changed: 223 additions, 238 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -6,3 +6,6 @@
 /classpath/
 build/
 .idea
+*.iml
+.ruby-version
+

README.md

Lines changed: 11 additions & 8 deletions
@@ -1,20 +1,24 @@
-# Hdfs output plugin for Embulk
+# Hdfs file output plugin for Embulk
 
 A File Output Plugin for Embulk to write HDFS.
 
 ## Overview
 
 * **Plugin type**: file output
-* **Load all or nothing**: no
+* **Load all or nothing**: yes
 * **Resume supported**: no
 * **Cleanup supported**: no
 
 ## Configuration
 
 - **config_files** list of paths to Hadoop's configuration files (array of strings, default: `[]`)
 - **config** overwrites configuration parameters (hash, default: `{}`)
-- **output_path** the path finally stored files. (string, default: `"/tmp/embulk.output.hdfs_output.%Y%m%d_%s"`)
-- **working_path** the path temporary stored files. (string, default: `"/tmp/embulk.working.hdfs_output.%Y%m%d_%s"`)
+- **path_prefix** prefix of target files (string, required)
+- **file_ext** suffix of target files (string, required)
+- **sequence_format** format for sequence part of target files (string, default: `'.%03d.%02d'`)
+- **rewind_seconds** When you use Date format in path_prefix property(like `/tmp/embulk/%Y-%m-%d/out`), the format is interpreted by using the time which is Now minus this property. (int, default: `0`)
+- **overwrite** overwrite files when the same filenames already exists (boolean, default: `false`)
+  - *caution*: even if this property is `true`, this does not mean ensuring the idempotence. if you want to ensure the idempotence, you need the procedures to remove output files after or before running.
 
 ## Example
 
@@ -24,14 +28,13 @@ out:
   config_files:
     - /etc/hadoop/conf/core-site.xml
     - /etc/hadoop/conf/hdfs-site.xml
-    - /etc/hadoop/conf/mapred-site.xml
-    - /etc/hadoop/conf/yarn-site.xml
   config:
     fs.defaultFS: 'hdfs://hdp-nn1:8020'
-    dfs.replication: 1
-    mapreduce.client.submit.file.replication: 1
     fs.hdfs.impl: 'org.apache.hadoop.hdfs.DistributedFileSystem'
     fs.file.impl: 'org.apache.hadoop.fs.LocalFileSystem'
+  path_prefix: '/tmp/embulk/hdfs_output/%Y-%m-%d/out'
+  file_ext: 'txt'
+  overwrite: true
   formatter:
     type: csv
     encoding: UTF-8
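
The new `path_prefix`, `sequence_format`, `file_ext` and `rewind_seconds` options in the README diff above together determine each output file name. The following is only a rough sketch of how they might combine, not the plugin's actual code (the new `HdfsFileOutputPlugin` source is not shown in this diff). It assumes plain concatenation of prefix, sequence part and extension, a UTC clock, and support for only the `%Y`, `%m` and `%d` directives. The class `TargetPathSketch` and its methods are hypothetical names introduced for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical sketch; names and behavior are assumptions, not the plugin's API.
public class TargetPathSketch {

    // Expands %Y, %m and %d in pathPrefix using (now - rewindSeconds), in UTC.
    // Assumption: the real plugin may support more directives and another time zone.
    public static String expandDate(String pathPrefix, Instant now, long rewindSeconds) {
        ZonedDateTime t = now.minusSeconds(rewindSeconds).atZone(ZoneOffset.UTC);
        return pathPrefix
                .replace("%Y", String.format("%04d", t.getYear()))
                .replace("%m", String.format("%02d", t.getMonthValue()))
                .replace("%d", String.format("%02d", t.getDayOfMonth()));
    }

    // Assumption: the path is prefix + formatted (taskIndex, fileIndex) + extension,
    // with no separator inserted automatically between the parts.
    public static String buildPath(String pathPrefix, String sequenceFormat, String fileExt,
                                   int taskIndex, int fileIndex,
                                   Instant now, long rewindSeconds) {
        return expandDate(pathPrefix, now, rewindSeconds)
                + String.format(sequenceFormat, taskIndex, fileIndex)
                + fileExt;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-07-01T00:30:00Z");
        // Rewinding one hour moves the date part back to the previous day.
        System.out.println(buildPath("/tmp/embulk/%Y-%m-%d/out", ".%03d.%02d", ".txt",
                0, 0, now, 3600));
        // prints "/tmp/embulk/2015-06-30/out.000.00.txt"
    }
}
```

Under these assumptions, a job that runs shortly after midnight can use `rewind_seconds` to keep writing into the previous day's directory.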

build.gradle

Lines changed: 5 additions & 5 deletions
@@ -12,7 +12,7 @@ configurations {
     provided
 }
 
-version = "0.1.2"
+version = "0.2.0"
 
 sourceCompatibility = 1.7
 targetCompatibility = 1.7
@@ -22,7 +22,7 @@ dependencies {
     provided "org.embulk:embulk-core:0.7.0"
     // compile "YOUR_JAR_DEPENDENCY_GROUP:YOUR_JAR_DEPENDENCY_MODULE:YOUR_JAR_DEPENDENCY_VERSION"
     compile 'org.apache.hadoop:hadoop-client:2.6.0'
-    compile 'com.google.guava:guava:14.0'
+    compile 'com.google.guava:guava:15.0'
     testCompile "junit:junit:4.+"
 }
 
@@ -57,9 +57,9 @@ task gemspec {
 Gem::Specification.new do |spec|
   spec.name = "${project.name}"
   spec.version = "${project.version}"
-  spec.authors = ["takahiro.nakayama"]
-  spec.summary = %[Hdfs output plugin for Embulk]
-  spec.description = %[Dumps records to Hdfs.]
+  spec.authors = ["Civitaspo"]
+  spec.summary = %[Hdfs file output plugin for Embulk]
+  spec.description = %[Stores files on Hdfs.]
   spec.email = ["civitaspo@gmail.com"]
   spec.licenses = ["MIT"]
   spec.homepage = "https://github.com/civitaspo/embulk-output-hdfs"

lib/embulk/output/hdfs.rb

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 Embulk::JavaPlugin.register_output(
-  "hdfs", "org.embulk.output.HdfsOutputPlugin",
+  "hdfs", "org.embulk.output.hdfs.HdfsFileOutputPlugin",
   File.expand_path('../../../../classpath', __FILE__))

src/main/java/org/embulk/output/HdfsOutputPlugin.java

Lines changed: 0 additions & 219 deletions
This file was deleted.
