On-the-fly spectrogram computation and data augmentation with TensorFlow

Oliver Kirsebom requested to merge augment-on-the-fly into dev-3.0

I have implemented a set of functions for computing spectrograms and for performing various data augmentation operations on waveforms and spectrograms. They are all built from TensorFlow operations, so they can run fast on GPU and can be called on-the-fly during model training.
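To illustrate the general idea, here is a minimal sketch of a magnitude spectrogram computed purely with TensorFlow ops. It is not the code in this merge request: the function name `magnitude_spectrogram` and its default parameters are hypothetical, and only `tf.signal.stft` and `tf.abs` are assumed from the TensorFlow API.

```python
import tensorflow as tf

def magnitude_spectrogram(waveform, frame_length=256, frame_step=128):
    """Hypothetical helper, not the ketos API.

    waveform: float tensor of shape (..., samples).
    Returns a tensor of shape (..., frames, fft_bins) with non-negative values.
    """
    # Short-time Fourier transform; the default window is a Hann window.
    stft = tf.signal.stft(waveform, frame_length=frame_length, frame_step=frame_step)
    # Keep only the magnitude of the complex STFT coefficients.
    return tf.abs(stft)

# Usage: a batch of two random waveforms with 1000 samples each.
batch = tf.random.normal((2, 1000))
spec = magnitude_spectrogram(batch)
```

Because every step is a TensorFlow op, a function like this can be mapped over a `tf.data` pipeline or placed at the front of a model, which is what makes on-the-fly computation during training possible.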

The new functions are kept in two new modules named ketos.neural_networks.spectrogram and ketos.neural_networks.augmentation.

Functions/layers still to be implemented:

  • Pre-processing

    • resample (waveform); check whether this can be done with TensorFlow, maybe using tfio?
    • bandpass (waveform); it doesn't look like TensorFlow can do this, so just use scipy.signal?
    • resize (spectrogram)
    • crop (spectrogram)
    • subtract_row_median (spectrogram)
    • subtract_column_median (spectrogram)
    • Mel spectrogram
    • CQT spectrogram; it doesn't look like TensorFlow can do this, so just keep the librosa implementation for now
  • Augmentation

    • convolve (waveform)
    • band_mask (spectrogram)
  • Misc
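As a rough sketch of how two of the spectrogram items above might look with plain TensorFlow ops: the names `subtract_median` and `band_mask`, their signatures, and the assumed (time, frequency) layout are hypothetical and only meant to start the discussion. TensorFlow core has no median op, so the median is taken here from a sorted copy of the tensor.

```python
import tensorflow as tf

def subtract_median(spec, axis):
    """Hypothetical sketch: subtract the median taken along `axis`.

    Assumes the size of `axis` is statically known.
    """
    n = spec.shape[axis]
    sorted_spec = tf.sort(spec, axis=axis)
    # For an even number of elements, average the two middle values.
    lower = tf.gather(sorted_spec, (n - 1) // 2, axis=axis)
    upper = tf.gather(sorted_spec, n // 2, axis=axis)
    median = 0.5 * (lower + upper)
    return spec - tf.expand_dims(median, axis)

def band_mask(spec, start, width, value=0.0):
    """Hypothetical sketch: overwrite `width` frequency bins from `start` with `value`.

    Assumes the last axis of `spec` is the frequency axis.
    """
    bins = tf.range(tf.shape(spec)[-1])
    mask = tf.logical_and(bins >= start, bins < start + width)
    # The 1-D mask broadcasts over the leading (time) axes.
    return tf.where(mask, tf.cast(value, spec.dtype), spec)

# Usage on a tiny 2 x 3 example and on an all-ones spectrogram.
small = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]])
centered = subtract_median(small, axis=1)
masked = band_mask(tf.ones((4, 8)), start=2, width=3)
```

For augmentation, `start` and `width` would presumably be drawn at random for each training example; they are fixed here to keep the sketch deterministic.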

@frazao @padovese, is there anything you'd like to add to the list?

Edited by Oliver Kirsebom