nnAudio2.features.cqt.iCQT

class nnAudio2.features.cqt.iCQT(sr=22050, hop_length=512, fmin=32.7, fmax=None, n_bins=84, bins_per_octave=12, filter_scale=1, norm=1, window='hann', center=True, pad_mode='reflect', n_iter=32, verbose=True)

Bases: Module

Inverse Constant-Q Transform via Landweber iteration.

Reconstructs a waveform from the complex CQT output of CQT1992v2() using iterative Landweber inversion. Reconstruction SNR exceeds 30 dB for signals whose frequency content is within the well-sampled range of the CQT (roughly f < sr / (2 * hop_length) per bin). High-frequency content in undercomplete configurations (hop_length > 2 * n_bins) cannot be recovered.

All hyper-parameters must match those of the CQT1992v2() instance that produced the CQT being inverted.
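The Landweber update named above can be illustrated on a small dense operator. This is a minimal NumPy sketch and not the library's implementation: the matrix A stands in for the CQT analysis operator, and the helper name landweber and the step-size rule (one over the squared spectral norm) are assumptions made for illustration.

```python
import numpy as np

def landweber(A, y, n_iter=32, step=None):
    """Approximately solve A x = y by Landweber iteration (hypothetical helper)."""
    A = np.asarray(A)
    y = np.asarray(y)
    if step is None:
        # The iteration converges for 0 < step < 2 / sigma_max(A)**2;
        # 1 / sigma_max(A)**2 is a conservative choice inside that range.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1], dtype=np.result_type(A, y))
    for _ in range(n_iter):
        residual = y - A @ x                      # mismatch in the transform domain
        x = x + step * (A.conj().T @ residual)    # correction through the adjoint
    return x

# Toy problem: a tall random complex operator and a real "signal".
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 8)) + 1j * rng.standard_normal((64, 8))
x_true = rng.standard_normal(8)
y = A @ x_true

x_hat = landweber(A, y, n_iter=100)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

Each pass applies the forward operator, compares against the target coefficients, and pushes the residual back through the adjoint. More iterations reduce the error along weakly represented directions, which is the role n_iter plays in the class signature.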

Parameters:
  • sr (int) – Sample rate. Default 22050.

  • hop_length (int) – Hop size used during analysis. Default 512.

  • fmin (float) – Lowest CQT frequency in Hz. Default 32.70 (C1).

  • fmax (float or None) – Highest CQT frequency in Hz. Default None (derived from n_bins).

  • n_bins (int) – Total number of CQT bins. Default 84.

  • bins_per_octave (int) – Bins per octave. Default 12.

  • filter_scale (float) – Filter scale factor. Default 1.

  • norm (int) – Kernel normalisation (1 = L1, 2 = L2). Default 1.

  • window (str) – Window function. Default 'hann'.

  • center (bool) – Whether the analysis used centre-padding. Must match the CQT1992v2 setting. Default True.

  • pad_mode (str) – Padding mode. Must match the CQT1992v2 setting. Default 'reflect'.

  • n_iter (int) – Number of Landweber iterations. Default 32.

  • verbose (bool) – Print kernel-creation progress. Default True.

Input:
  • CQT (torch.Tensor, shape (batch, n_bins, time_steps, 2)) – Complex CQT from CQT1992v2(output_format='Complex', normalization_type='librosa').

  • length (int or None) – Original waveform length in samples. Providing this trims or zero-pads the output to exactly the right number of samples.
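The trailing axis of size 2 in the input shape can be read as a stacked (real, imaginary) pair. The sketch below shows that layout with NumPy arrays; the ordering (real at index 0, imaginary at index 1) is an assumption not stated on this page, and stacked_to_complex is a hypothetical helper name.

```python
import numpy as np

def stacked_to_complex(cqt_stacked):
    """Map (batch, n_bins, time_steps, 2) real data to a complex array
    of shape (batch, n_bins, time_steps). Assumes last axis = (real, imag)."""
    cqt_stacked = np.asarray(cqt_stacked)
    if cqt_stacked.shape[-1] != 2:
        raise ValueError("last axis must have size 2 (real, imag)")
    return cqt_stacked[..., 0] + 1j * cqt_stacked[..., 1]

# Dummy data with the documented layout: 1 item, 84 bins, 44 frames.
X = np.zeros((1, 84, 44, 2), dtype=np.float32)
X[0, 0, 0] = (3.0, 4.0)          # real part 3, imaginary part 4
Z = stacked_to_complex(X)
magnitude = abs(Z[0, 0, 0])
```

On PyTorch tensors, torch.view_as_complex performs the same kind of conversion for a contiguous tensor whose last dimension has size 2.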

Returns:

waveform

Return type:

torch.Tensor, shape (batch, length)

Examples

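>>> import torch
>>> from nnAudio2.features.cqt import CQT1992v2, iCQT  # import path for CQT1992v2 assumed
>>> waveform = torch.randn(1, 22050)                   # (batch, samples)
>>> cqt  = CQT1992v2(sr=22050, hop_length=512, output_format='Complex')
>>> icqt = iCQT(sr=22050, hop_length=512)
>>> X    = cqt(waveform)                               # (batch, n_bins, time_steps, 2)
>>> waveform_hat = icqt(X, length=waveform.shape[-1])  # (batch, length)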
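Reconstruction quality of the kind quoted in the class description (SNR above 30 dB) can be checked by comparing the original and reconstructed waveforms. The helper below is a minimal sketch using one common definition of SNR, signal energy over error energy; this page does not state which definition the library uses, and snr_db is a hypothetical name.

```python
import numpy as np

def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB: 10 * log10(||ref||^2 / ||ref - est||^2)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    noise_power = np.sum((reference - estimate) ** 2)
    if noise_power == 0:
        return float("inf")       # perfect reconstruction
    return 10.0 * np.log10(signal_power / noise_power)

# Synthetic check: a 440 Hz tone plus an error term at 1% amplitude.
sr = 22050
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 440.0 * t)
degraded = clean + 0.01 * np.cos(2 * np.pi * 440.0 * t)
value = snr_db(clean, degraded)   # amplitude ratio 100:1, so about 40 dB
```

With torch tensors, the same function could be applied after converting with waveform.detach().cpu().numpy().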

Methods

__init__

Initialize internal Module state, shared by both nn.Module and ScriptModule.

extra_repr

Return the extra representation of the module.

forward

Reconstruct a waveform from a complex CQT spectrogram.

extra_repr() str

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(CQT, length=None)

Reconstruct a waveform from a complex CQT spectrogram.

Parameters:
  • CQT (torch.Tensor, shape (batch, n_bins, time_steps, 2)) – Complex CQT from CQT1992v2(output_format='Complex', normalization_type='librosa').

  • length (int, optional) – Original waveform length. Used to set the reconstruction target length; output is trimmed or zero-padded to match.

Returns:

waveform

Return type:

torch.Tensor, shape (batch, length)
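The effect of the length argument described above (trim or zero-pad to an exact sample count) can be sketched as follows. This is an illustration on NumPy arrays, not the library's code: fix_length is a hypothetical helper, and applying the trim or padding at the end of the signal is an assumption.

```python
import numpy as np

def fix_length(waveform, length=None):
    """Trim or zero-pad the last axis of `waveform` to `length` samples."""
    waveform = np.asarray(waveform)
    if length is None:
        return waveform                       # no target length: leave as-is
    current = waveform.shape[-1]
    if current >= length:
        return waveform[..., :length]         # drop surplus trailing samples
    # Pad only the last axis, on the right, with zeros.
    pad = [(0, 0)] * (waveform.ndim - 1) + [(0, length - current)]
    return np.pad(waveform, pad)

batch = np.ones((2, 1000))
trimmed = fix_length(batch, 900)    # shape (2, 900)
padded = fix_length(batch, 1024)    # shape (2, 1024), last 24 samples are zero
```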