hhbbgg AwkwardAnalyzer

Repository for the analyzers built on awkward arrays, which take skimmer output or nanoAOD files as input.

Dependencies

The following packages are needed for the analyzer to work:

matplotlib
uproot
hist
numpy
mplhep
vector
root
awkward
pandas
pyarrow

A virtual environment with these packages can be created using the following command:

conda env create -f requirement.yaml

If mamba is available, use it instead, since it is much faster:

mamba env create -f requirement.yaml

To use the framework, the conda environment has to be activated in every new session:

conda activate hhbbgg-awk
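
After activating the environment, a quick check that the listed packages are importable can save a failed run later. The snippet below is a minimal sketch and not part of the repository; the helper name `missing_modules` is made up for illustration. It relies on the fact that the conda package `root` is imported in Python as `ROOT`.

```python
import importlib.util

# Import names for the packages listed under Dependencies. The conda package
# "root" is imported in Python as "ROOT"; the others import under their own names.
REQUIRED_MODULES = [
    "matplotlib", "uproot", "hist", "numpy", "mplhep",
    "vector", "ROOT", "awkward", "pandas", "pyarrow",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as ROOT.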

For now, the analyzer is run directly with python.

With `.root` file

python hhbbgg_Analyzer.py -i <Input root file directory OR single root file>

This assumes that the input directory contains one root file for each background and is set through the variable inputfilesDir in hhbbgg_Analyzer.py. The analyzer saves a root file in outputfiles with one directory per sample name; all histograms for a sample are stored inside its directory.

With `.parquet` file

python hhbbgg_Analyzer_parquet.py -i <Input parquet file directory OR single parquet file>

For example, with all NMSSM_v2 files moved into a single directory:

python hhbbgg_Analyzer_parquet.py -i ../../output_root/v2_production_central/

To plot the histograms, run hhbbgg_Plotter.py:

python hhbbgg_Plotter.py

The plots are saved in the stack_plots directory.

To add a new variable, edit hhbbgg_Analyzer.py, binning.py and variables.py.

To plot the histogram of that variable, add it to the histogram_names list and the xtitle_dict dictionary in hhbbgg_Plotter.py.
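
As an illustration of the plotter step, the sketch below shows how a new entry could be kept consistent between the two containers. Only the names `histogram_names` and `xtitle_dict` come from this README; the example variables, the axis titles and the helper `register_variable` are hypothetical, and the real contents of hhbbgg_Plotter.py may look different.

```python
# Hypothetical excerpt: the real contents of hhbbgg_Plotter.py may differ.
histogram_names = ["dibjet_mass", "diphoton_mass"]
xtitle_dict = {
    "dibjet_mass": r"$m_{b\bar{b}}$ [GeV]",
    "diphoton_mass": r"$m_{\gamma\gamma}$ [GeV]",
}


def register_variable(name, xtitle):
    """Add a histogram name and its x-axis title, keeping both in sync."""
    if name not in histogram_names:
        histogram_names.append(name)
    xtitle_dict[name] = xtitle


# Example: register one more (hypothetical) variable for plotting.
register_variable("bbgg_mass", r"$m_{b\bar{b}\gamma\gamma}$ [GeV]")

# Every histogram to be plotted needs an x-axis title.
missing_titles = [h for h in histogram_names if h not in xtitle_dict]
assert not missing_titles, f"No x-axis title for: {missing_titles}"
```

The final check catches the most likely mistake, which is adding a name to the list but forgetting its title.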

Fixing segmentation faults on lxplus

If the analyzer crashes with a segmentation fault on lxplus, use hhbbgg_analyzer_lxplus_par.py instead, which fixes the issue:

python hhbbgg_analyzer_lxplus_par.py -i ~/public/samples/VBFHToGG.parquet

Quickstart

# 1. Clone the repository
git clone https://github.com/raj2022/hhbbgg_AwkwardAnalyzer.git
cd hhbbgg_AwkwardAnalyzer

# 2. Install micromamba (lightweight, recommended)
curl -Ls https://micro.mamba.pm/install.sh | bash
export PATH="$HOME/.local/bin:$PATH"

# 3. Create the environment
micromamba create -f requirement.yaml

# 4. Activate the environment
micromamba activate hhbbgg-awk

# 5. Run the analyzer (example with .root file)
python hhbbgg_Analyzer.py -i <input_root_file_or_dir>

Changes according to `Era`

Single era/year (use config)

python hhbbgg_analyzer_lxplus_par.py --year 2022 --era PostEE

This will read Parquet files from the path defined in datasets.yaml and write outputs to:

outputfiles/2022/PostEE/
  ├─ hhbbgg_analyzer-v2-histograms.root
  └─ hhbbgg_analyzer-v2-trees.root
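
The layout above follows the pattern `outputfiles/<year>/<era>/`. A minimal sketch of how such paths can be assembled, for example in a downstream script that needs to locate the outputs, is given below. The function `output_dir` is hypothetical and not taken from the analyzer code.

```python
from pathlib import Path


def output_dir(year, era, base="outputfiles"):
    """Return the per-run output directory, e.g. outputfiles/2022/PostEE."""
    return Path(base) / str(year) / era


out = output_dir(2022, "PostEE")
histograms = out / "hhbbgg_analyzer-v2-histograms.root"
trees = out / "hhbbgg_analyzer-v2-trees.root"

print(histograms.as_posix())
print(trees.as_posix())
```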

Override input path manually

python hhbbgg_analyzer_lxplus_par.py --year 2022 --era PostEE \
  -i /afs/cern.ch/user/s/sraj/public/samples

Combine everything (2022 + 2023, all eras)

Provide -i multiple times:

python hhbbgg_analyzer_lxplus_par.py --year 2023 --era All \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preEE \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postEE \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preBPix \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postBPix \
  --tag CombinedAll
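
A repeatable flag such as `-i` is commonly implemented with argparse's `action="append"`. The sketch below illustrates that pattern under the assumption that the analyzer does something similar; the actual parser in hhbbgg_analyzer_lxplus_par.py is not shown in this README and may differ. The function `build_parser` and the shortened input paths are made up for the example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Sketch of the analyzer CLI")
    parser.add_argument("--year", required=True)
    parser.add_argument("--era", required=True)
    # action="append" collects every -i occurrence into one list.
    parser.add_argument("-i", dest="inputs", action="append", default=None)
    parser.add_argument("--tag", default=None)
    return parser


args = build_parser().parse_args([
    "--year", "2023", "--era", "All",
    "-i", "samples/preEE",
    "-i", "samples/postEE",
    "--tag", "CombinedAll",
])
print(args.inputs)  # ['samples/preEE', 'samples/postEE']
```

With this pattern, `args.inputs` stays `None` when `-i` is not given at all, which makes it easy to fall back to a configured default.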

For individual eras

2022 only

# 2022 PreEE (C+D)
python hhbbgg_analyzer_lxplus_par.py \
  --year 2022 --era PreEE \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preEE \
  --tag Y2022_PreEE

# 2022 PostEE (E+F+G)
python hhbbgg_analyzer_lxplus_par.py \
  --year 2022 --era PostEE \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postEE \
  --tag Y2022_PostEE

2023 only

# 2023 preBPix (Era C)
python hhbbgg_analyzer_lxplus_par.py \
  --year 2023 --era preBPix \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preBPix \
  --tag Y2023_preBPix

# 2023 postBPix (Era D)
python hhbbgg_analyzer_lxplus_par.py \
  --year 2023 --era postBPix \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postBPix \
  --tag Y2023_postBPix
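
The four per-era commands differ only in year, era, input subdirectory and tag, so they can also be generated in a loop. The following is a sketch of that idea rather than a script shipped with the repository; `era_command` is a hypothetical helper, while the paths, flags and tags are copied from the commands above.

```python
import shlex

SAMPLES = "/afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples"

# (year, value passed to --era, input subdirectory), taken from the commands above.
ERAS = [
    ("2022", "PreEE", "preEE"),
    ("2022", "PostEE", "postEE"),
    ("2023", "preBPix", "preBPix"),
    ("2023", "postBPix", "postBPix"),
]


def era_command(year, era, subdir, samples=SAMPLES):
    """Build the argument list for one per-era analyzer run."""
    return [
        "python", "hhbbgg_analyzer_lxplus_par.py",
        "--year", year, "--era", era,
        "-i", f"{samples}/{subdir}",
        "--tag", f"Y{year}_{era}",
    ]


for year, era, subdir in ERAS:
    # Print each command; pass the list to subprocess.run to execute it instead.
    print(shlex.join(era_command(year, era, subdir)))
```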

Drive from `datasets.yaml` (no `-i`)

If you wired RunConfig to use cfg.raw_paths when -i isn’t given, you can run:

# From YAML: 2022 (PreEE+PostEE)
python hhbbgg_analyzer_lxplus_par.py --year 2022 --era All --tag Combined2022

# From YAML: 2023 (preBPix+postBPix)
python hhbbgg_analyzer_lxplus_par.py --year 2023 --era All --tag Combined2023
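
The fallback described here (use `-i` when given, otherwise take the paths from the configuration) can be sketched as follows. This is an assumption-laden illustration: the dictionary stands in for the parsed datasets.yaml, whose real layout and key names are not shown in this README, and `resolve_inputs` is a hypothetical function, not the RunConfig implementation.

```python
# Hypothetical stand-in for the parsed contents of datasets.yaml;
# the real file layout and key names may differ.
DATASETS = {
    "2022": {"PreEE": ["samples/preEE"], "PostEE": ["samples/postEE"]},
    "2023": {"preBPix": ["samples/preBPix"], "postBPix": ["samples/postBPix"]},
}


def resolve_inputs(year, era, cli_inputs=None, datasets=DATASETS):
    """Prefer paths given with -i; otherwise look them up by year and era."""
    if cli_inputs:
        return list(cli_inputs)
    eras = datasets[str(year)]
    if era == "All":
        # Combine every era configured for this year.
        return [path for paths in eras.values() for path in paths]
    return list(eras[era])


print(resolve_inputs(2022, "All"))
print(resolve_inputs(2023, "postBPix"))
print(resolve_inputs(2022, "PreEE", cli_inputs=["/some/override"]))
```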

With the data-driven (DD) sample:

Combine DD (2022 + 2023, all eras)

With a single file per era

python hhbbgg_analyzer_lxplus_par.py --year 2023 --era All \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preEE/DDQCDGJET_Rescaled.parquet \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postEE/DDQCDGJET_Rescaled.parquet \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preBPix/DDQCDGJET_Rescaled.parquet \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postBPix/DDQCDGJET_Rescaled.parquet \
  --tag DD_CombinedAll

With the whole folder

python hhbbgg_analyzer_lxplus_par.py --year 2023 --era All \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preEE/ \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postEE/ \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preBPix/ \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postBPix/ \
  --tag DD_CombinedAll

For individual eras

2022 only

# 2022 PreEE
python hhbbgg_analyzer_lxplus_par.py \
  --year 2022 --era PreEE \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preEE/DDQCDGJET_Rescaled.parquet \
  --tag DD_Y2022_PreEE

# 2022 PostEE
python hhbbgg_analyzer_lxplus_par.py \
  --year 2022 --era PostEE \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postEE/DDQCDGJET_Rescaled.parquet \
  --tag DD_Y2022_PostEE

2023 only

# 2023 preBPix
python hhbbgg_analyzer_lxplus_par.py \
  --year 2023 --era preBPix \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/preBPix/DDQCDGJET_Rescaled.parquet \
  --tag DD_Y2023_preBPix

# 2023 postBPix
python hhbbgg_analyzer_lxplus_par.py \
  --year 2023 --era postBPix \
  -i /afs/cern.ch/user/s/sraj/Analysis/output_root/v3_production/samples/postBPix/DDQCDGJET_Rescaled.parquet \
  --tag DD_Y2023_postBPix
