Draft: On-the-fly spectrogram computation and data augmentation

Oliver Kirsebom requested to merge augment-on-the-fly into development

I have implemented a set of functions for computing spectrograms and for performing various data augmentation operations on waveforms and spectrograms. They are all built from TensorFlow operations, so they can run efficiently on a GPU and can be called on the fly during model training.

The new functions are kept in two new modules named ketos.neural_networks.spectrogram and ketos.neural_networks.augmentation.

A fair bit of work remains to be done to:

  • complete the docstring of the augment_pipeline function
  • complete the implementation of the band_mask function
  • implement the spectrogram computation as a Keras layer (tf.keras.layers.Layer), in addition to the stand-alone Python function
  • consider abstracting the design of the augmentation module so that users can build their own custom augmentation pipelines:
    • each augmentation operation should be its own class, with a __call__ method with a specific signature
    • there should be an AugmentationPipeline class that can string together augmentation operations
    • there should be two "tiers" of augmentation operators: those acting on waveforms and those acting on spectrograms
    • ...
  • ensure consistency with the existing Ketos code base
  • add unit tests and regression tests
  • create examples / tutorials
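
To make the proposed pipeline abstraction more concrete, here is a minimal sketch of how it could look. Everything in it is hypothetical and is not part of this merge request: the names AugmentationOp, Gain, TimeMask and toy_frames are placeholders, plain Python lists stand in for tensors, and toy_frames is a trivial framing function rather than a real spectrogram computation. Real operations would use TensorFlow ops and would typically be randomized.

```python
from typing import Callable, List, Sequence

Waveform = List[float]
Spectrogram = List[List[float]]  # indexed as [time frame][bin]


class AugmentationOp:
    """Hypothetical base class: every operation shares one __call__ signature."""

    tier = ""  # "waveform" or "spectrogram"

    def __call__(self, x):
        raise NotImplementedError


class Gain(AugmentationOp):
    """Waveform-tier example: scale every sample by a constant factor."""

    tier = "waveform"

    def __init__(self, factor: float):
        self.factor = factor

    def __call__(self, x: Waveform) -> Waveform:
        return [self.factor * v for v in x]


class TimeMask(AugmentationOp):
    """Spectrogram-tier example: zero out `width` frames starting at `start`."""

    tier = "spectrogram"

    def __init__(self, start: int, width: int):
        self.start = start
        self.width = width

    def __call__(self, x: Spectrogram) -> Spectrogram:
        stop = self.start + self.width
        return [
            [0.0] * len(frame) if self.start <= i < stop else list(frame)
            for i, frame in enumerate(x)
        ]


class AugmentationPipeline:
    """Chains waveform ops, a spectrogram step, then spectrogram ops."""

    def __init__(
        self,
        waveform_ops: Sequence[AugmentationOp],
        spectrogram_ops: Sequence[AugmentationOp],
        to_spectrogram: Callable[[Waveform], Spectrogram],
    ):
        # Reject operations that were placed in the wrong tier.
        for op in waveform_ops:
            if op.tier != "waveform":
                raise ValueError(f"{type(op).__name__} is not a waveform op")
        for op in spectrogram_ops:
            if op.tier != "spectrogram":
                raise ValueError(f"{type(op).__name__} is not a spectrogram op")
        self.waveform_ops = list(waveform_ops)
        self.spectrogram_ops = list(spectrogram_ops)
        self.to_spectrogram = to_spectrogram

    def __call__(self, waveform: Waveform) -> Spectrogram:
        for op in self.waveform_ops:
            waveform = op(waveform)
        spec = self.to_spectrogram(waveform)
        for op in self.spectrogram_ops:
            spec = op(spec)
        return spec


def toy_frames(x: Waveform, frame_length: int = 2) -> Spectrogram:
    """Placeholder for the spectrogram step: non-overlapping frames, no FFT."""
    n_frames = len(x) // frame_length
    return [x[i * frame_length:(i + 1) * frame_length] for i in range(n_frames)]


pipeline = AugmentationPipeline(
    waveform_ops=[Gain(2.0)],
    spectrogram_ops=[TimeMask(start=1, width=1)],
    to_spectrogram=toy_frames,
)
augmented = pipeline([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

Tagging each operation with its tier lets the pipeline catch a spectrogram operation placed before the spectrogram step at construction time instead of failing somewhere in the middle of training.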

@padovese, is this something you would be interested in collaborating on with me at some point, if you have the bandwidth?

Edited by Oliver Kirsebom