
DGST Configuration

Guanghui.Zhu edited this page Jan 31, 2018 · 5 revisions

Configuration File

The default configuration file of DGST is /conf/conf.properties.

Core Parameters

The core parameters required by DGST are:

| Parameter Name | Default | Meaning |
| --- | --- | --- |
| alphabet.num | 128 | Size of the alphabet |
| div.start | 2 | Initial count window size in sub-tree partitioning |
| div.step | 4 | Count window step size in sub-tree partitioning |
| root.max.count | 2000000 | Maximum sub-tree size (i.e., maximum S-prefix frequency) |
| fs.extra.len | 1024 | Tail length of an input split |
| first.buffer | 10 | Number of symbols in the first element of the local LCP-Range array |
| lcp.range | 128 | Size of a range in the LCP-Range structure |
| grouping.method | bfhg | Sub-tree construction task allocation strategy |
| spark.partitions | 48 | Computation parallelism on Spark |
| input.dir | hdfs://master:9000/input | Input data path on HDFS or the local file system |
| output.location | hdfs://master:9000/output | Output data path on HDFS or the local file system |
| working.dir | hdfs://master:9000/tmp | Temporary data path on HDFS or the local file system |
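
To illustrate how these parameters combine with their defaults, the following is a minimal sketch in Python, not DGST's own loader. It assumes conf.properties uses plain `key=value` lines with `#` or `!` comments, and it does not handle other Java properties features such as `:` separators or line continuations. The helper names `parse_properties` and `load_config` and the override values in `SAMPLE` are hypothetical, chosen only for illustration.

```python
# Defaults copied from the Core Parameters table; values are kept as strings,
# as they would appear in a properties file.
DEFAULTS = {
    "alphabet.num": "128",
    "div.start": "2",
    "div.step": "4",
    "root.max.count": "2000000",
    "fs.extra.len": "1024",
    "first.buffer": "10",
    "lcp.range": "128",
    "grouping.method": "bfhg",
    "spark.partitions": "48",
    "input.dir": "hdfs://master:9000/input",
    "output.location": "hdfs://master:9000/output",
    "working.dir": "hdfs://master:9000/tmp",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' on this line; ignored in this sketch
        props[key.strip()] = value.strip()
    return props


def load_config(text):
    """Return the defaults with any values found in `text` applied on top."""
    config = dict(DEFAULTS)
    config.update(parse_properties(text))
    return config


# Hypothetical conf.properties content that overrides two parameters.
SAMPLE = """\
# Example overrides (illustrative values only)
spark.partitions=96
root.max.count=1000000
"""

config = load_config(SAMPLE)
for name in DEFAULTS:
    print(f"{name}={config[name]}")
```

Parameters that are absent from the file keep the defaults listed in the table, so a file only needs to list the values that differ.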