SLAC Workflow

This tutorial walks you through setting up your workspace at SLAC.

Installing ldmx-sw

Start by logging in to a CentOS machine at SLAC and navigating to your user directory. You'll need to enable the developer's toolset every time you log in.

ssh -XY <USER>@centos7.slac.stanford.edu
<ENTER PASSWORD>
scl enable devtoolset-8 bash
cd /nfs/slac/g/ldmx/users/<USER>

Make a directory to host your workspace and clone the LDMX-Software repository. For this tutorial, we'll install v3.0.0, the most stable version of the software for which we have plenty of samples. In principle, though, you can install any version by passing its tag or branch name to the -b flag. Also, don't forget the --recursive flag when cloning the repository, since it also fetches the repository's submodules!

mkdir ldmx-sw-v3.0.0
cd ldmx-sw-v3.0.0
git clone --recursive https://github.com/LDMX-Software/ldmx-sw.git -b v3.0.0

From the workspace directory, set up the environment with source ldmx-sw/scripts/ldmx-env.sh. You'll need to source this script every time you open a new terminal.
Then make a build directory and configure the build.

cd ldmx-sw
mkdir build
cd build
ldmx cmake ..

Build and install ldmx-sw with ldmx make install. Don't be surprised if you see a few warnings. As long as no fatal errors occur, everything should install correctly.

Installing LDMX-scripts and Dependencies

Start by installing numpy, matplotlib, and xgboost to your user directory with pip (for example, pip install --user numpy matplotlib xgboost).
Then navigate to your workspace and clone the LDMX-scripts repository.

cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0
git clone https://github.com/IncandelaLab/LDMX-scripts.git

In the past, our analysis was done through the kickstart repository ldmx-analysis. Frequent software changes made this difficult, so a Python-based analysis framework emerged within our group to be more immune to chaotic development periods. This part of the tutorial will walk you through analyzing a collection of ROOT files with pyEcalVeto.

NOTE: There are plans to refactor pyEcalVeto in the interest of being more user-friendly and enhancing readability. This page will be updated once this is done.

Processing ROOT Files Interactively

The ROOT files for this tutorial are provided under inputs in the TutorialFiles folder. Start by navigating to pyEcalVeto and examining treeMaker.py. This script processes each event and calculates a slew of kinematic variables, some of which we feed to a machine learning model called a boosted decision tree (BDT). For now, we'll use this script to analyze the tutorial files. Let's also make a directory to hold whatever gets output later.

cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0/LDMX-scripts/pyEcalVeto
ldmx python3 treeMaker.py --help
mkdir outputs

The second command should have brought up some useful information on how to use the script. A quick rundown: Tell the script to run in batch mode with the --batch flag, specify your inputs either as a list of files with the -i flag or as a list of directories with the --indirs flag, label each group of files with the -g flag, specify your outputs for each file group with the -o flag, and tell the script how many events to process for each file group with the -m flag.

Now we'll have the analysis script run over each file and output the results from the first 500 events of each file. We'll also label each output file by its process. The following command does all of this.

ldmx python3 treeMaker.py -i $PWD/../TutorialFiles/inputs/0.001_input.root $PWD/../TutorialFiles/inputs/0.01_input.root $PWD/../TutorialFiles/inputs/0.1_input.root $PWD/../TutorialFiles/inputs/1.0_input.root -g 0.001 0.01 0.1 1.0 -o $PWD/outputs $PWD/outputs $PWD/outputs $PWD/outputs -m 500
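
Because -i, -g, and -o each take one entry per file group, in the same order, the command above gets long quickly. As a rough sketch, it can be assembled from a list of labels instead. The variable names here (labels, indir, outdir, cmd) are made up for illustration, and the sketch assumes you are in the pyEcalVeto directory and that your paths contain no spaces.

```shell
# Sketch: assemble the treeMaker.py command from a list of labels.
# Assumes the current directory is pyEcalVeto and paths contain no spaces.

labels="0.001 0.01 0.1 1.0"            # one label per signal file
indir="$PWD/../TutorialFiles/inputs"   # tutorial input files
outdir="$PWD/outputs"                  # directory created with mkdir above

inputs=""
outputs=""
for label in $labels; do
    inputs="$inputs $indir/${label}_input.root"
    outputs="$outputs $outdir"         # same output directory for every group
done

# -i, -g and -o each list one entry per file group, in the same order.
cmd="ldmx python3 treeMaker.py -i$inputs -g $labels -o$outputs -m 500"

# Print the command so it can be checked before running it.
echo "$cmd"
```

Once the printed command looks right, it can be run with eval "$cmd" in a terminal where ldmx-env.sh has been sourced.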

Once the script finishes processing the files, you can go ahead and delete the newly created scratch directory. The script sometimes fails to remove it on its own, but this will hopefully be fixed in the future.

Navigate to the output directory and open up the 0.001 GeV signal file in ROOT. Let's examine the number of reconstructed hits read out from the ECal.

cd outputs
root 0.001_unsorted.root
new TBrowser()

Browse through the file and select the nReadoutHits leaf under the EcalVeto branch. It should be the first leaf. If all goes as expected, you should see the following histogram.

Submitting Batch Jobs

Oftentimes you'll need to run over a large number of files. This is where batch submission comes into play. Start by setting the environment variable LSB_JOB_REPORT_MAIL to Y (in bash, export LSB_JOB_REPORT_MAIL=Y) to receive email updates about your jobs' progress. You'll need to set this variable every time you open a new terminal if you want to receive updates.
Navigate to your workspace and submit some batch jobs with the bsub command. You can choose the queue with the -q flag (the available options are short, medium, and long), specify how long a job is expected to run, in minutes, with the -W flag, and set how many cores to use with the -n flag. Let's have treeMaker.py run over each file as before, this time processing all of the events. We'll submit the jobs to the short queue, running each on a single core with an expected run time of 5 minutes.

cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/0.001_input.root -g 0.001 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/0.01_input.root -g 0.01 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/0.1_input.root -g 0.1 -o $PWD/LDMX-scripts/pyEcalVeto/outputs
bsub -q short -W 5 -n 1 -R "select[centos7] span[hosts=1]" singularity run --home $PWD $PWD/ldmx_dev_latest.sif . python3 $PWD/LDMX-scripts/pyEcalVeto/treeMaker.py --batch -i $PWD/LDMX-scripts/TutorialFiles/inputs/1.0_input.root -g 1.0 -o $PWD/LDMX-scripts/pyEcalVeto/outputs

It's crucial that you run bsub from the directory where your singularity image file (.sif) is located. If your image has a different name than the one shown here, make sure to point the command to the correct file. Note the use of absolute file paths throughout. This is good practice when working inside the container, which can be finicky about file locations.
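
The four bsub commands above differ only in the file-group label, so they can be generated in a loop. The following is a minimal sketch rather than a tested submission script: the helper name make_bsub_cmd and the scripts variable are made up for illustration, and it only prints the commands so that you can inspect them first. It assumes the current directory holds both ldmx_dev_latest.sif and LDMX-scripts.

```shell
# Sketch: print one bsub command per signal file (dry run, nothing is submitted).
# Assumes the current directory holds ldmx_dev_latest.sif and LDMX-scripts.

scripts="$PWD/LDMX-scripts"

make_bsub_cmd() {
    # $1 is the file-group label, e.g. 0.001
    printf '%s %s %s %s %s\n' \
        "bsub -q short -W 5 -n 1 -R \"select[centos7] span[hosts=1]\"" \
        "singularity run --home $PWD $PWD/ldmx_dev_latest.sif ." \
        "python3 $scripts/pyEcalVeto/treeMaker.py --batch" \
        "-i $scripts/TutorialFiles/inputs/${1}_input.root" \
        "-g $1 -o $scripts/pyEcalVeto/outputs"
}

for label in 0.001 0.01 0.1 1.0; do
    make_bsub_cmd "$label"
done
```

If the printed lines match what you expect, you can paste them into the terminal or pipe the loop's output to sh to submit the jobs.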

Navigate to the output directory and open up the 0.001 GeV signal file again. This time, let's examine the transverse RMS deviation of ECal hits.

cd LDMX-scripts/pyEcalVeto/outputs
root 0.001_unsorted.root
new TBrowser()

Browse through the file and select the showerRMS leaf under the EcalVeto branch. It should be the fourth leaf down from nReadoutHits. If all goes as expected, you should see the following histogram.

Plotting and Visualization

For the last part of this tutorial, we'll go over how to generate plots that you can present during group meetings. The template configuration file for plotting is provided under plotting_template.conf in the TutorialFiles folder. Open it up and take a look around to get a feel for all the settings that are available, and feel free to modify it as you see fit! Let's navigate to pyEcalVeto and create a folder to hold the plots that we're about to generate.

cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0/LDMX-scripts/pyEcalVeto
mkdir plots
cd outputs
mv 0.001_unsorted.root 0.001_tree.root
mv 0.01_unsorted.root 0.01_tree.root
mv 0.1_unsorted.root 0.1_tree.root
mv 1.0_unsorted.root 1.0_tree.root

The last few commands above navigate to the folder with our processed ROOT files and change their names to ones that are recognized by the plotting script.
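
With more file groups, renaming each file by hand gets tedious. As a small sketch, the same renaming can be done with a loop. The function name rename_unsorted is made up for illustration, and it relies only on the _unsorted.root and _tree.root suffixes used above.

```shell
# Sketch: rename every *_unsorted.root file in a directory to *_tree.root.
rename_unsorted() {
    # $1 is the directory holding the treeMaker.py outputs
    for f in "$1"/*_unsorted.root; do
        [ -e "$f" ] || continue    # nothing to do if the pattern matched no files
        mv "$f" "${f%_unsorted.root}_tree.root"
    done
}
```

From inside the outputs directory, calling rename_unsorted . would replace the four mv commands.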

Now we'll navigate to the plotting scripts folder and generate our plots with the plotVariables.py script. We have to point this script to the .conf file that specifies all of the settings for our plots.

cd /nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0/LDMX-scripts/plotting
ldmx python3 plotVariables.py $PWD/../TutorialFiles/plotting_template.conf

To view your plots locally, open up a new tab in the terminal and download the images like so.

scp -r <USER>@centos7.slac.stanford.edu:/nfs/slac/g/ldmx/users/<USER>/ldmx-sw-v3.0.0/LDMX-scripts/pyEcalVeto/plots <LOCAL_DESTINATION>
<ENTER PASSWORD>

If all goes as expected, you should see the following images downloaded to whatever local destination you specified. Be advised: They're not the prettiest plots in the world because of the low statistics!
