Description
Hi,
I collected all the prerequisites (fsimage, audit log) and prepared a local environment (an accompanying HDFS cluster and a separate YARN ResourceManager) according to the Dynamometer README, then tried to run the workload scripts. I tried Hadoop versions 2.7.4 and 2.8.4.
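For reference, the commands below assume shell variables along these lines; BASE_DIR and HADOOP_VERSION match my setup (the path appears in the logs further down), the rest are placeholders:
export HADOOP_VERSION=2.8.4
export BASE_DIR=/home/rscherba/ws/hadoop/dynamometer-test
export DYN_HOME=/path/to/dynamometer    # Dynamometer distribution root (placeholder)
export HDFS_PATH=/dyno                  # HDFS working directory (placeholder)
export RESULTS_DIR=/dyno/results        # workload output path (placeholder)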
${DYN_HOME}/bin/upload-fsimage.sh 0894 ${HDFS_PATH}/fsimage \
${BASE_DIR}/fsimage-${HADOOP_VERSION}
upload-fsimage - passed
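A quick way to double-check the upload; as far as I understand, the image, its md5, and the VERSION file should all end up under the target directory:
hdfs dfs -ls ${HDFS_PATH}/fsimage
# expecting fsimage_0000000000000000894, fsimage_0000000000000000894.md5, VERSION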
${DYN_HOME}/bin/generate-block-lists.sh \
-fsimage_input_path ${HDFS_PATH}/fsimage/fsimage_0000000000000000894.xml \
-block_image_output_dir ${HDFS_PATH}/blocks \
-num_reducers 10 -num_datanodes 3
generate-block-lists - passed
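Again, a quick check of the output; I assume one block listing file per simulated DataNode, three in this case:
hdfs dfs -ls ${HDFS_PATH}/blocks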
${DYN_HOME}/bin/start-dynamometer-cluster.sh "" \
-hadoop_binary_path file://${BASE_DIR}/hadoop-${HADOOP_VERSION}.tar.gz \
-conf_path file://${BASE_DIR}/conf.zip \
-fs_image_dir ${HDFS_PATH}/fsimage \
-block_list_path ${HDFS_PATH}/blocks
start-dynamometer-cluster - appears to be working, judging by the output:
...
19/10/18 03:56:56 INFO dynamometer.Client: NameNode has started!
19/10/18 03:56:56 INFO dynamometer.Client: Waiting for 2 DataNodes to register with the NameNode...
19/10/18 03:57:02 INFO dynamometer.Client: Number of live DataNodes = 2.00; above threshold of 2.00; done waiting after 6017 ms.
19/10/18 03:57:02 INFO dynamometer.Client: Waiting for MissingBlocks to fall below 0.010199999...
19/10/18 03:57:02 INFO dynamometer.Client: Number of missing blocks: 102.00
19/10/18 04:00:03 INFO dynamometer.Client: Number of missing blocks = 0.00; below threshold of 0.01; done waiting after 180082 ms.
19/10/18 04:00:03 INFO dynamometer.Client: Waiting for UnderReplicatedBlocks to fall below 1.02...
19/10/18 04:00:03 INFO dynamometer.Client: Number of under replicated blocks: 102.00
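While the client is waiting, the same metrics can also be polled from the Dyno-NameNode directly over JMX; a sketch, where the host and HTTP port are my assumption (the client output reports the actual NameNode address, 50070 is just the stock default):
curl -s "http://<nn_host>:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem" \
  | grep -E '"(MissingBlocks|UnderReplicatedBlocks)"'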
${DYN_HOME}/bin/start-workload.sh \
-Dauditreplay.log-start-time.ms=1000 \
-Dauditreplay.input-path=file://${BASE_DIR}/audit_logs-${HADOOP_VERSION} \
-Dauditreplay.output-path=${RESULTS_DIR} \
-Dauditreplay.num-threads=1 \
-nn_uri hdfs://$HOSTNAME:9000/ \
-start_time_offset 1m \
-mapper_class_name AuditReplayMapper
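For context, my understanding from the README is that auditreplay.log-start-time.ms should be the epoch timestamp, in milliseconds, at which the audit log begins; each entry is then replayed at roughly workload_start_time + (entry_time - log_start_time). The input is expected to be raw hdfs-audit.log lines, something like the following (the timestamp and fields here are purely illustrative):
2019-10-18 03:50:00,123 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/some/file dst=null perm=null proto=rpc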
start-workload - it started but never finished; over a couple of hours it just kept repeating 'map > map':
19/10/18 04:07:53 INFO workloadgenerator.WorkloadDriver: The workload will start at 1571396933516 ms (2019/10/18 04:08:53 PDT)
19/10/18 04:07:54 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
19/10/18 04:07:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
19/10/18 04:07:55 INFO input.FileInputFormat: Total input files to process : 1
19/10/18 04:07:55 INFO mapreduce.JobSubmitter: number of splits:1
19/10/18 04:07:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local579807884_0001
19/10/18 04:07:55 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
19/10/18 04:07:55 INFO mapreduce.Job: Running job: job_local579807884_0001
19/10/18 04:07:55 INFO mapred.LocalJobRunner: OutputCommitter set in config null
19/10/18 04:07:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/10/18 04:07:55 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
19/10/18 04:07:55 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
19/10/18 04:07:55 INFO mapred.LocalJobRunner: Waiting for map tasks
19/10/18 04:07:55 INFO mapred.LocalJobRunner: Starting task: attempt_local579807884_0001_m_000000_0
19/10/18 04:07:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/10/18 04:07:55 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
19/10/18 04:07:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
19/10/18 04:07:55 INFO mapred.MapTask: Processing split: file:/home/rscherba/ws/hadoop/dynamometer-test/audit_logs-2.8.4/hdfs-audit.log:0+251649
19/10/18 04:07:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
19/10/18 04:07:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
19/10/18 04:07:55 INFO mapred.MapTask: soft limit at 83886080
19/10/18 04:07:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
19/10/18 04:07:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
19/10/18 04:07:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
19/10/18 04:07:55 INFO audit.AuditReplayMapper: Starting 1 threads
19/10/18 04:07:55 INFO audit.AuditReplayThread: Start timestamp: 1571396933516
19/10/18 04:07:55 INFO audit.AuditReplayThread: Sleeping for 57526 ms
19/10/18 04:07:56 INFO mapreduce.Job: Job job_local579807884_0001 running in uber mode : false
19/10/18 04:07:56 INFO mapreduce.Job: map 0% reduce 0%
19/10/18 04:08:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:13:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:18:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:23:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:28:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:33:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:38:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:43:07 INFO mapred.LocalJobRunner: map > map
...
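One rough sanity check I can think of is comparing the configured auditreplay.log-start-time.ms against the first timestamp actually present in the audit log; the conversion below assumes GNU date, and the timestamp shown is only illustrative:
head -n 1 ${BASE_DIR}/audit_logs-${HADOOP_VERSION}/hdfs-audit.log
date -d '2019-10-18 03:50:00' +%s%3N   # leading timestamp of that line, as epoch ms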
How long is the Dynamometer workload supposed to run? How do the run-script arguments affect the test run? And how can I check the logs for signs that something is wrong in my configuration?