In the DAC paper, the authors discuss using varied mel bin sizes for the multi-scale spectrogram loss (pictured). In the implementation, n_mels is passed to the MelSpectrogramLoss function. The config file then takes n_mels, using the same numbers described in the paper (pictured below) as being the "mel bin sizes".
As far as I understand, the number of mel bins does not (necessarily) equal the mel bin width. Is this an error in the paper's description or in the implementation? I also checked the EnCodec paper, which uses 64 as the number of bins, not the bin width. Thank you!
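
To illustrate the distinction I mean, here is a minimal sketch (not taken from the DAC code). It assumes the HTK-style mel formula and a filterbank of `n_mels` triangular filters whose edges are equally spaced on the mel axis. The helper names, the 22050 Hz upper frequency, and the `n_mels` values in the loop are my own illustrative choices, not values from the paper or the config.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style Hz-to-mel conversion (an assumption; other mel variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_band_spacing(n_mels, f_min=0.0, f_max=22050.0):
    # n_mels triangular filters need n_mels + 2 equally spaced edge points
    # on the mel axis, i.e. n_mels + 1 intervals between f_min and f_max.
    # The spacing between adjacent filter centres is therefore:
    return (hz_to_mel(f_max) - hz_to_mel(f_min)) / (n_mels + 1)


# Illustrative bin counts only (hypothetical, not the values from the config).
for n_mels in (5, 10, 20, 40):
    print(n_mels, round(mel_band_spacing(n_mels), 1))
```

Under these assumptions, increasing the number of bins makes each band narrower, so a list of bin counts and a list of bin widths would run in opposite directions and cannot be used interchangeably.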