 
 - **Experiment tracking**: Integrated <img src="../doc/_static/img/logos/wandb.svg" width="20px" align="center" /> [Weights & Biases](https://wandb.ai/) logging for monitoring and reproducibility
 
-- [Quick Start](#quick-start)
-- [Installation](#installation)
-- [Configuration](#configuration)
-- [Running Experiments](#running-experiments)
-- [Custom configurations](#custom-configurations)
-- [Output Structure](#output-structure)
-- [Configuration Details](#configuration-details)
-- [Configuration Structure](#configuration-structure)
-- [Dataset Configuration](#dataset-configuration-datasetyaml)
-- [Model Configuration](#model-configuration-modelyaml)
-- [Implementation](#implementation)
-- [Implementing Your Own Model](#implementing-your-own-model)
-- [Implementing Your Own Dataset](#implementing-your-own-dataset)
-- [Contributing](#contributing)
-- [Cite this library](#cite-this-library)
+📚 **Full Documentation**: See the [comprehensive Conceptarium guide](../doc/guides/using_conceptarium.rst) for detailed documentation on:
+- Configuration system and hierarchy
+- Dataset and model configuration
+- Custom losses and metrics
+- Advanced usage patterns
+- Troubleshooting
 
 ---
 
@@ -63,20 +54,21 @@ hydra:
   name: my_experiment
   sweeper:
     params:
-      model: cbm            # One or more models (blackbox, cbm, cem, cgm, c2bm, etc.)
-      dataset: celeba, cub  # One or more datasets (celeba, cub, MNIST, alarm, etc.)
-      seed: 1,2,3,4,5       # sweep over multiple seeds for robustness
+      seed: 1,2,3,4,5       # Sweep over multiple seeds for robustness
+      dataset: cub,celeba   # One or more datasets
+      model: cbm_joint      # One or more models (blackbox, cbm_joint)
 
 model:
   optim_kwargs:
-    lr: 0.001
+    lr: 0.01
+
+  metrics:
     summary_metrics: true
-    perconcept_metrics: false
+    perconcept_metrics: true
 
 trainer:
-  max_epochs: 500
-  patience: 30
-  monitor: "val_loss"
+  max_epochs: 200
+  patience: 20
 ```
 
 ## Running Experiments
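Hydra's basic sweeper expands `sweeper.params` into the Cartesian product of the listed values, so the sweep above launches 5 seeds × 2 datasets × 1 model = 10 runs. A minimal sketch of that expansion in plain Python (independent of Hydra, for illustration only):

```python
from itertools import product

# Values taken from the sweep config above
params = {
    "seed": [1, 2, 3, 4, 5],
    "dataset": ["cub", "celeba"],
    "model": ["cbm_joint"],
}

# Hydra's basic sweeper enumerates the Cartesian product of all param lists
runs = [dict(zip(params, combo)) for combo in product(*params.values())]

print(len(runs))  # 10 runs in total
print(runs[0])    # {'seed': 1, 'dataset': 'cub', 'model': 'cbm_joint'}
```

Each resulting combination becomes one independent training run with its own output directory.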
@@ -97,13 +89,13 @@ python run_experiment.py --config-name your_sweep.yaml
 On top of this, you can also override configurations from the command line:
 ```bash
 # Change dataset
-python run_experiment.py dataset=alarm
+python run_experiment.py dataset=cub
 
 # Change learning rate
-python run_experiment.py model.optim_kwargs.lr=0.001
+python run_experiment.py model.optim_kwargs.lr=0.01
 
 # Change multiple configurations
-python run_experiment.py model=cbm dataset=asia,alarm seed=1,2,3
+python run_experiment.py model=cbm_joint dataset=cub,celeba seed=1,2,3
 ```
 
 ## Output Structure
@@ -133,57 +125,44 @@ Configuration files are organized in `conceptarium/conf/`:
 
 ```
 conf/
-├── _default.yaml            # Base configuration with defaults
-├── sweep.yaml               # Experiment sweep configuration
-├── dataset/                 # Dataset configurations
-│   ├── _commons.yaml        # Common dataset parameters
-│   ├── celeba.yaml
-│   ├── cub.yaml
-│   ├── sachs.yaml
-│   └── ...
-└── model/                   # Model architectures
-    ├── loss/                # Loss function configurations
-    │   ├── _default.yaml    # Type-aware losses (BCE, CE, MSE)
-    │   └── weighted.yaml    # Weighted type-aware losses
-    ├── metrics/             # Metric configurations
-    │   ├── _default.yaml    # Type-aware metrics (Accuracy, MAE, MSE)
-    │   └── ...
-    ├── _commons.yaml        # Common model parameters
-    ├── blackbox.yaml        # Black-box baseline
-    ├── cbm_joint.yaml       # Concept Bottleneck Model (Joint)
-    ├── cem.yaml             # Concept Embedding Model
-    ├── cgm.yaml             # Concept Graph Model
-    └── c2bm.yaml            # Causally Reliable CBM
-```
-    │   ├── default.yaml     # Type-aware metrics (Accuracy, MAE, MSE)
-    │   └── ...
-    ├── _commons.yaml        # Common model parameters
-    ├── blackbox.yaml        # Black-box baseline
-    ├── cbm.yaml             # Concept Bottleneck Model
-    ├── cem.yaml             # Concept Embedding Model
-    ├── cgm.yaml             # Concept Graph Model
-    └── c2bm.yaml            # Causally Reliable CBM
+├── _default.yaml            # Base configuration with defaults
+├── sweep.yaml               # Example sweep configuration
+├── dataset/                 # Dataset configurations
+│   ├── _commons.yaml        # Common dataset parameters
+│   ├── cub.yaml             # CUB-200-2011 birds dataset
+│   ├── celeba.yaml          # CelebA faces dataset
+│   └── ...                  # More datasets
+├── loss/                    # Loss function configurations
+│   ├── standard.yaml        # Standard type-aware losses
+│   └── weighted.yaml        # Weighted type-aware losses
+├── metrics/                 # Metric configurations
+│   └── standard.yaml        # Type-aware metrics (Accuracy)
+└── model/                   # Model architectures
+    ├── _commons.yaml        # Common model parameters
+    ├── blackbox.yaml        # Black-box baseline
+    ├── cbm.yaml             # Alias for CBM Joint
+    └── cbm_joint.yaml       # Concept Bottleneck Model (Joint)
 ```
 
 
 ## Dataset Configuration (`dataset/*.yaml`)
 
-Dataset configurations specify the dataset class to instantiate, all data-specific parameters, and all necessary preprocessing parameters. An example configuration for the CUB dataset is provided below:
+Dataset configurations specify the dataset class to instantiate, all data-specific parameters, and all necessary preprocessing parameters. An example configuration for the CUB-200-2011 birds dataset is provided below:
 
 ```yaml
 defaults:
   - _commons
   - _self_
 
-_target_: torch_concepts.data.datamodules.CUBDataModule  # the path to your datamodule class
+_target_: torch_concepts.data.datamodules.CUBDataModule
 
 name: cub
 
 backbone:
-  _target_: "path.to.your.backbone.ClassName"
-  # ... (backbone arguments)
+  _target_: torchvision.models.resnet18
+  pretrained: true
 
-precompute_embs: true  # precompute input to speed up training
+precompute_embs: true  # precompute embeddings to speed up training
 
 default_task_names: [bird_species]
 
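The `_target_` key is Hydra's instantiation hook: `hydra.utils.instantiate` imports the dotted path and calls it with the remaining keys as keyword arguments. A rough sketch of that mechanism using only the standard library (not Hydra's actual implementation, which also handles nesting, `_partial_`, and interpolation):

```python
import importlib

def instantiate(config: dict):
    """Rough sketch of what hydra.utils.instantiate does with a _target_ key."""
    cfg = dict(config)
    # Split "pkg.module.ClassName" into an importable module and an attribute
    module_path, _, attr_name = cfg.pop("_target_").rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    return target(**cfg)  # remaining keys become constructor kwargs

# Illustrative example with a stdlib class: builds collections.Counter(a=2, b=1)
counter = instantiate({"_target_": "collections.Counter", "a": 2, "b": 1})
print(counter["a"])  # 2
```

This is why each dataset YAML only needs a dotted class path plus that class's constructor arguments.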
@@ -197,7 +176,8 @@ label_descriptions:
 
 ### Common Parameters
 
-Default parameters, common to all dataset, are in `_commons.yaml`:
+Default parameters, common to all datasets, are in `_commons.yaml`:
+
 - **`batch_size`**: Training batch size (default: 256)
 - **`val_size`**: Validation set fraction (default: 0.15)
 - **`test_size`**: Test set fraction (default: 0.15)
@@ -212,38 +192,37 @@ Model configurations specify the architecture, loss, metrics, optimizer, and inf
 ```yaml
 defaults:
   - _commons
-  - loss: _default
-  - metrics: _default
   - _self_
 
-_target_: "torch_concepts.nn.ConceptBottleneckModel_Joint"
+_target_: torch_concepts.nn.ConceptBottleneckModel_Joint
 
 task_names: ${dataset.default_task_names}
 
 inference:
-  _target_: "torch_concepts.nn.DeterministicInference"
+  _target_: torch_concepts.nn.DeterministicInference
   _partial_: true
 
 summary_metrics: true     # enable/disable summary metrics over concepts
 perconcept_metrics: false # enable/disable per-concept metrics
 ```
 
-### Common Parameters
+### Model Common Parameters
 
 From `_commons.yaml`:
+
 - **`encoder_kwargs`**: Encoder architecture parameters
   - **`hidden_size`**: Hidden layer dimension in encoder
   - **`n_layers`**: Number of hidden layers in encoder
   - **`activation`**: Activation function (relu, tanh, etc.) in encoder
   - **`dropout`**: Dropout probability in encoder
-- **`variable_distributions`**: Probability distributions with which concepts are modeled:
+- **`variable_distributions`**: Probability distributions with which concepts are modeled
 - **`optim_class`**: Optimizer class
 - **`optim_kwargs`**:
   - **`lr`**: 0.00075
 
 and more...
 
-### Loss Configuration (`model/loss/_default.yaml`)
+### Loss Configuration (`loss/standard.yaml`)
 
 Type-aware losses automatically select appropriate loss functions based on variable types:
 
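The "type-aware" idea is that each variable's declared distribution determines its loss: roughly, a binary cross-entropy for binary concepts, cross-entropy for categoricals, and MSE for continuous values. A hypothetical sketch of such a dispatch (the names and mapping below are illustrative, not the library's API; the real mapping lives in the `fn_collection` YAML):

```python
# Hypothetical dispatch table: variable type -> loss name (illustrative only)
LOSS_BY_TYPE = {
    "binary": "BCEWithLogitsLoss",
    "categorical": "CrossEntropyLoss",
    "continuous": "MSELoss",
}

def select_losses(variable_types: dict) -> dict:
    """Assign each variable a loss according to its declared type."""
    return {name: LOSS_BY_TYPE[vtype] for name, vtype in variable_types.items()}

losses = select_losses({"has_wings": "binary", "bird_species": "categorical"})
print(losses["has_wings"])  # BCEWithLogitsLoss
```

The same pattern applies to the metrics configuration below, with a type-to-metric table instead of a type-to-loss table.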
@@ -264,7 +243,7 @@ fn_collection:
   # ... not supported yet
 ```
 
-### Metrics Configuration (`model/metrics/_default.yaml`)
+### Metrics Configuration (`metrics/standard.yaml`)
 
 Type-aware metrics automatically select appropriate metrics based on variable types:
 
@@ -306,32 +285,40 @@ This involves the following steps:
 - Run experiments using your model.
 
 If your model is compatible with the default configuration structure, you can run experiments directly as follows:
+
 ```bash
-python run_experiment.py model=your_model dataset=...
+python run_experiment.py model=your_model dataset=cub
 ```
-Alernatively, create your own sweep file `conf/your_sweep.yaml` containing your mdoel and run:
+
+Alternatively, create your own sweep file `conf/your_sweep.yaml` containing your model and run:
+
 ```bash
-python run_experiment.py --config-file your_sweep.yaml
+python run_experiment.py --config-name your_sweep
 ```
 
 ---
 
 ## Implementing Your Own Dataset
+
 Create your dataset in Conceptarium by following the guidelines given in [torch_concepts/examples/contributing/dataset.md](../examples/contributing/dataset.md).
 
 This involves the following steps:
+
 - Create the dataset (`your_dataset.py`).
 - Create the datamodule (`your_datamodule.py`) wrapping the dataset.
 - Create configuration file in `conceptarium/conf/dataset/your_dataset.yaml`, targeting the datamodule class.
-- Run experiments using your dataset. 
+- Run experiments using your dataset.
 
 If your dataset is compatible with the default configuration structure, you can run experiments directly as follows:
+
 ```bash
-python run_experiment.py dataset=your_dataset model=...
+python run_experiment.py dataset=your_dataset model=cbm_joint
 ```
+
 Alternatively, create your own sweep file `conf/your_sweep.yaml` containing your dataset and run:
+
 ```bash
-python run_experiment.py --config-name your_sweep.yaml
+python run_experiment.py --config-name your_sweep
 ```
 
 ---