Discrepancy between `n_mels` and mel bin size in MelSpectrogramLoss #102

In the DAC paper, the authors describe using varied mel bin sizes for the multi-scale spectrogram loss (pictured below). In the implementation, [`n_mels`](https://github.com/descriptinc/descript-audio-codec/blob/c7cfc5d2647e26471dc394f95846a0830e7bec34/dac/nn/loss.py#L237) is passed to `MelSpectrogramLoss`. The [config file](https://github.com/descriptinc/descript-audio-codec/blob/main/conf/final/44khz.yml#L63) then sets `n_mels` to the same numbers that the paper describes as the "mel bin sizes".

As far as I understand, the number of mel bins is not (necessarily) the same thing as the mel bin width. Is this an error in the paper's description or in the implementation? I also checked the EnCodec paper, which uses 64 as the number of bins, not the bin width. Thank you!

![Image](https://github.com/user-attachments/assets/a25fed99-611d-4681-ba22-82973511cce7)
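
To make the distinction concrete, here is a minimal sketch of how the bin count relates to the bin spacing. It assumes the HTK-style mel formula and a fixed 0 to 22050 Hz range; the helper names and the `n_mels` values in the loop are illustrative only and are not taken from the DAC code or config.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Hz -> mel using the HTK-style formula (one common convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bin_spacing(n_mels: int, f_min: float = 0.0, f_max: float = 22050.0) -> float:
    """Spacing in mels between adjacent filter centers.

    A bank of n_mels triangular filters uses n_mels + 2 equally spaced
    edge points on the mel axis, so there are n_mels + 1 intervals.
    """
    return (hz_to_mel(f_max) - hz_to_mel(f_min)) / (n_mels + 1)


# Illustrative bin counts (hypothetical values, not read from the config).
for n_mels in (5, 10, 20, 40, 80):
    spacing = mel_bin_spacing(n_mels)
    # Upper edge of the first filter in Hz: two spacings above f_min = 0 Hz.
    first_upper_hz = mel_to_hz(2 * spacing)
    print(f"n_mels={n_mels:3d}  spacing={spacing:7.1f} mel  "
          f"first filter spans 0 to {first_upper_hz:7.1f} Hz")
```

Under these assumptions, `n_mels` is a count: for a fixed frequency range, a larger `n_mels` gives a smaller spacing, so the same number cannot be both the number of bins and the bin width.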
