Executing Dynamometer workload #104

@novosibman

Hi,

I collected all the prerequisites (fsimage, audit log) and prepared a local environment (an accompanying HDFS and a separate YARN manager) according to the Dynamometer README, then tried to run the workload scripts. I tried Hadoop versions 2.7.4 and 2.8.4.

${DYN_HOME}/bin/upload-fsimage.sh 0894 ${HDFS_PATH}/fsimage \
    ${BASE_DIR}/fsimage-${HADOOP_VERSION} 

fsimage - passed

${DYN_HOME}/bin/generate-block-lists.sh \
    -fsimage_input_path ${HDFS_PATH}/fsimage/fsimage_0000000000000000894.xml \
    -block_image_output_dir ${HDFS_PATH}/blocks \
    -num_reducers 10 -num_datanodes 3 

generate-block-lists - passed

${DYN_HOME}/bin/start-dynamometer-cluster.sh "" \
    -hadoop_binary_path file://${BASE_DIR}/hadoop-${HADOOP_VERSION}.tar.gz \
    -conf_path file://${BASE_DIR}/conf.zip \
    -fs_image_dir ${HDFS_PATH}/fsimage \
    -block_list_path ${HDFS_PATH}/blocks

start-dynamometer-cluster: appears to be working, judging by the output:
...
19/10/18 03:56:56 INFO dynamometer.Client: NameNode has started!
19/10/18 03:56:56 INFO dynamometer.Client: Waiting for 2 DataNodes to register with the NameNode...
19/10/18 03:57:02 INFO dynamometer.Client: Number of live DataNodes = 2.00; above threshold of 2.00; done waiting after 6017 ms.
19/10/18 03:57:02 INFO dynamometer.Client: Waiting for MissingBlocks to fall below 0.010199999...
19/10/18 03:57:02 INFO dynamometer.Client: Number of missing blocks: 102.00
19/10/18 04:00:03 INFO dynamometer.Client: Number of missing blocks = 0.00; below threshold of 0.01; done waiting after 180082 ms.
19/10/18 04:00:03 INFO dynamometer.Client: Waiting for UnderReplicatedBlocks to fall below 1.02...
19/10/18 04:00:03 INFO dynamometer.Client: Number of under replicated blocks: 102.00

${DYN_HOME}/bin/start-workload.sh \
    -Dauditreplay.log-start-time.ms=1000 \
    -Dauditreplay.input-path=file://${BASE_DIR}/audit_logs-${HADOOP_VERSION} \
    -Dauditreplay.output-path=${RESULTS_DIR} \
    -Dauditreplay.num-threads=1 \
    -nn_uri hdfs://$HOSTNAME:9000/ \
    -start_time_offset 1m \
    -mapper_class_name AuditReplayMapper

start-workload - it started but did not finish after a couple of hours, repeatedly printing 'map > map':
19/10/18 04:07:53 INFO workloadgenerator.WorkloadDriver: The workload will start at 1571396933516 ms (2019/10/18 04:08:53 PDT)
19/10/18 04:07:54 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
19/10/18 04:07:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
19/10/18 04:07:55 INFO input.FileInputFormat: Total input files to process : 1
19/10/18 04:07:55 INFO mapreduce.JobSubmitter: number of splits:1
19/10/18 04:07:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local579807884_0001
19/10/18 04:07:55 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
19/10/18 04:07:55 INFO mapreduce.Job: Running job: job_local579807884_0001
19/10/18 04:07:55 INFO mapred.LocalJobRunner: OutputCommitter set in config null
19/10/18 04:07:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/10/18 04:07:55 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
19/10/18 04:07:55 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
19/10/18 04:07:55 INFO mapred.LocalJobRunner: Waiting for map tasks
19/10/18 04:07:55 INFO mapred.LocalJobRunner: Starting task: attempt_local579807884_0001_m_000000_0
19/10/18 04:07:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
19/10/18 04:07:55 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
19/10/18 04:07:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
19/10/18 04:07:55 INFO mapred.MapTask: Processing split: file:/home/rscherba/ws/hadoop/dynamometer-test/audit_logs-2.8.4/hdfs-audit.log:0+251649
19/10/18 04:07:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
19/10/18 04:07:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
19/10/18 04:07:55 INFO mapred.MapTask: soft limit at 83886080
19/10/18 04:07:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
19/10/18 04:07:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
19/10/18 04:07:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
19/10/18 04:07:55 INFO audit.AuditReplayMapper: Starting 1 threads
19/10/18 04:07:55 INFO audit.AuditReplayThread: Start timestamp: 1571396933516
19/10/18 04:07:55 INFO audit.AuditReplayThread: Sleeping for 57526 ms
19/10/18 04:07:56 INFO mapreduce.Job: Job job_local579807884_0001 running in uber mode : false
19/10/18 04:07:56 INFO mapreduce.Job: map 0% reduce 0%
19/10/18 04:08:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:13:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:18:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:23:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:28:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:33:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:38:07 INFO mapred.LocalJobRunner: map > map
19/10/18 04:43:07 INFO mapred.LocalJobRunner: map > map
...

How long should a Dynamometer workload run? How do the script arguments affect the test run? How can I check in the logs whether something is wrong with the configuration?
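
For reference, here is a rough way to quantify the stall from a saved copy of the start-workload output. This is only a sketch: `stall_seconds` is a hypothetical helper name (not part of Dynamometer), it relies on the `yy/MM/dd HH:mm:ss` timestamp prefix seen in the log above, and it assumes all matching lines fall on the same calendar day.

```shell
# Hypothetical helper (not part of Dynamometer): print the number of seconds
# between the first and last "map > map" status lines in a saved log file.
# Assumption: every matching line is from the same day, so only HH:mm:ss is used.
stall_seconds() {
  awk '/LocalJobRunner: map > map/ {
         split($2, t, ":")                     # $2 is the HH:mm:ss field
         s = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
         if (first == "") first = s            # remember the first status line
         last = s                              # keep updating the last one
       }
       END { if (first != "") print last - first; else print 0 }' "$1"
}
```

Usage, assuming the driver output was saved to a file named `workload.log` (a made-up name): `stall_seconds workload.log`. A value that keeps growing while no other log lines appear suggests the mapper is not making progress.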
