Sequential Conformal Admissibility Control for Generative Models (SCOPE-Gen)

Official code for the ICLR 2025 paper "Conformal Generative Modeling With Improved Sample Efficiency Through Sequential Greedy Filtering" by Klaus-Rudolf Kladny, Bernhard Schölkopf and Michael Muehlebach.

Code Organization

.
├── scope_gen
│     ├── algorithms
│     │     └── base.py          # the "heart" of the algorithm: Prediction pipeline generation
│     ├── models                 # prediction pipeline classes
│     ├── calibrate              # calibration functions
│     ├── data                   # basic data structures
│     ├── scripts                # scripts to process data and create tables and figures
│     ├── baselines
│     │     └── clm              # adaptation of the CLM algorithm
│     ├── mimic_cxr
│     │     ├── data             # generative model outputs
│     │     ├── scripts          # evaluation scripts
│     │     ├── configs          # config files for the evaluation scripts
│     │     └── paths.py        
│     ├── ...                    # all other experiments are identical in structure to mimic_cxr
│     ├── nc_scores.py           # non-conformity score functions
│     ├── order_funcs.py         # order functions: determine order according to sub-sample function
│     ├── admissions.py          # admission functions
│     └── distances.py           # distance functions
└── ...

General

Set up a conda environment

We recommend using a conda environment. To install all packages into such an environment, run

conda env create -f environment.yml -n scope_gen

Then, activate the environment

conda activate scope_gen

All required packages are drawn from the community-led conda-forge channel.

Run the code

In the current implementation, SCOPE-Gen requires three .npy files:

scores.npy
labels.npy
diversity.npy

The scores.npy array contains the quality estimates for each sample. The labels.npy array contains the admissibility labels (0 means "inadmissible", 1 means "admissible"). The diversity.npy array contains pairwise similarities (not distances) between samples. scores.npy and labels.npy must be numpy arrays of shape (n, max), where n is the number of calibration points and max is the sample limit. diversity.npy must be of shape (n, max, max).
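Before formatting the data, it can help to confirm that the three arrays agree with the shapes described above. The following is a minimal sketch, not part of the repository: the helper name check_arrays is hypothetical, and the toy arrays are random placeholders rather than real model outputs.

```python
import numpy as np


def check_arrays(scores, labels, diversity):
    """Check the shapes described above; raise ValueError on a mismatch.

    Hypothetical helper for illustration, not part of scope_gen.
    """
    if scores.ndim != 2:
        raise ValueError(f"scores must have shape (n, max), got {scores.shape}")
    n, max_samples = scores.shape
    if labels.shape != (n, max_samples):
        raise ValueError(f"labels must have shape {(n, max_samples)}, got {labels.shape}")
    if diversity.shape != (n, max_samples, max_samples):
        raise ValueError(
            f"diversity must have shape {(n, max_samples, max_samples)}, got {diversity.shape}"
        )
    # Labels are binary: 0 = inadmissible, 1 = admissible.
    if not np.isin(labels, (0, 1)).all():
        raise ValueError("labels must contain only 0 and 1")
    return n, max_samples


# Toy data (random placeholders, for illustration only).
rng = np.random.default_rng(0)
n, max_samples = 5, 4
scores = rng.random((n, max_samples))                  # quality estimates
labels = rng.integers(0, 2, size=(n, max_samples))     # admissibility labels
diversity = rng.random((n, max_samples, max_samples))  # pairwise similarities

print(check_arrays(scores, labels, diversity))
```

With real data, the arrays would be loaded via np.load("scores.npy") and so on before being passed to the check.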

If you want to run the MIMIC-CXR experiment, these files must be moved into

scope_gen/mimic_cxr/data/outputs
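Placing the files can also be scripted. The sketch below is an assumption-laden illustration, not repository code: install_arrays is a hypothetical helper, it copies rather than moves the files, and the demo runs on empty placeholder files in temporary directories instead of the real outputs folder.

```python
import shutil
import tempfile
from pathlib import Path

REQUIRED = ("scores.npy", "labels.npy", "diversity.npy")


def install_arrays(source_dir, target_dir):
    """Copy the three required files from source_dir into target_dir.

    Hypothetical helper for illustration, not part of scope_gen.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    missing = [name for name in REQUIRED if not (source_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {source_dir}: {missing}")
    target_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(source_dir / name, target_dir / name)) for name in REQUIRED]


# Demo with placeholder files in temporary directories.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "my_outputs"  # hypothetical source folder
    dst = Path(tmp) / "scope_gen" / "mimic_cxr" / "data" / "outputs"
    src.mkdir()
    for name in REQUIRED:
        (src / name).touch()
    copied = install_arrays(src, dst)
    print(sorted(p.name for p in copied))
```

In practice, the target would be scope_gen/mimic_cxr/data/outputs relative to the repository root.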

After specifying these arrays, you are ready to get started! First, you must format the data. For instance, if you want to reproduce our MIMIC-CXR results, run

python -m scope_gen.mimic_cxr.scripts.format_data

Then, you can reproduce our quantitative evaluation results (including all baselines) via

python -m scope_gen.mimic_cxr.scripts.eval_all

For the qualitative evaluation results, run the jupyter notebook

jupyter notebook scope_gen/mimic_cxr/scripts/qualitative_comparison.ipynb

If you want to reproduce the other experiments, simply replace mimic_cxr by any of the other project directories cnn_dm, triviaqa or molecules. The folder structures are identical.

If you would like to reproduce the table in Appendix H, run

python -m scope_gen.mimic_cxr.scripts.eval --custom_path "single_run_results" --config "./scope_gen/mimic_cxr/scripts/configs/single_runs.json" --name "ourmethod{}" --return_std_coverages True --score "sum"

and finally,

python -m scope_gen.mimic_cxr.scripts.single_runs_assessment

Natural Language Generation Tasks

To generate the numpy files for the tasks mimic_cxr, cnn_dm and triviaqa, follow the instructions of the CLM auxiliary repository.

Molecular Scaffold Extension

Install the conda environment (assuming you would like to call it scope_gen_mol):

conda env create -n scope_gen_mol -f scope_gen/molecular_extensions/environment.yml

Install DiGress

Clone this specific fork of DiGress (not the original repository):

git clone https://github.com/rudolfwilliam/DiGress.git

Then, cd into the main directory and install it via

pip install .

Install Moses

Clone the Moses repository

git clone https://github.com/molecularsets/moses.git

and install it the same way as DiGress: cd into its main directory and run pip install .

Obtain the Model Weights

Either obtain model weights by training a DiGress model on the MOSES data train split or simply download the MOSES model checkpoint from the original repository. Place the model .ckpt file into

scope_gen/molecules/models/checkpoints

Generate Data

Finally, generate the model predictions via

python -m scope_gen.molecules.scripts.generate_data

Then, you should find the three numpy arrays in the outputs directory.

Issues?

If you encounter any issues in running the code, please contact me at kkladny [at] tuebingen [dot] mpg [dot] de.
