On-the-fly spectrogram computation and data augmentation with TensorFlow
I have implemented a set of functions for computing spectrograms and for performing various data augmentation operations on waveforms and spectrograms. They are all built from TensorFlow operations, so they run fast on a GPU and can be called on-the-fly during model training.
The new functions are kept in two new modules named ketos.neural_networks.spectrogram and ketos.neural_networks.augmentation.
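For illustration, here is a minimal sketch of what on-the-fly computation can look like when everything is expressed as TensorFlow ops. It is not the code in the new modules: the names `add_gaussian_noise` and `magnitude_spectrogram`, the window parameters, and the random input are all placeholders chosen for this example.

```python
import tensorflow as tf


def add_gaussian_noise(waveform, stddev=0.01):
    """Hypothetical waveform augmentation: add white Gaussian noise."""
    noise = tf.random.normal(tf.shape(waveform), stddev=stddev)
    return waveform + noise


def magnitude_spectrogram(waveform, frame_length=256, frame_step=128):
    """Hypothetical helper: magnitude spectrogram of shape (frames, bins)."""
    stft = tf.signal.stft(
        waveform,
        frame_length=frame_length,
        frame_step=frame_step,
        window_fn=tf.signal.hann_window,
    )
    return tf.abs(stft)


# Eight fake waveforms of 4000 samples each stand in for real audio clips.
waveforms = tf.random.normal([8, 4000])

# The augmentation and the spectrogram are computed lazily, batch by batch,
# while the dataset is being iterated during training.
dataset = (
    tf.data.Dataset.from_tensor_slices(waveforms)
    .map(lambda w: magnitude_spectrogram(add_gaussian_noise(w)))
    .batch(4)
)

batch = next(iter(dataset))
print(batch.shape)  # (4, 30, 129): batch, frames, frequency bins
```

Because both steps are ordinary TensorFlow ops, the same functions could also be called from inside a Keras layer instead of the input pipeline.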
Functions/layers still to be implemented:
- Pre-processing
  - resample (waveform); check whether this can be done with TensorFlow, maybe using tfio?
  - bandpass (waveform); doesn't look like TensorFlow can do this, just use scipy.signal?
  - resize (spectrogram)
  - crop (spectrogram)
  - subtract_row_median (spectrogram)
  - subtract_column_median (spectrogram)
  - Mel spectrogram
  - CQT spectrogram; doesn't look like TensorFlow can do this, just keep the librosa implementation for now
- Augmentation
  - convolve (waveform)
  - band_mask (spectrogram)
- Misc
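For the bandpass item, a minimal sketch of the scipy.signal route could look like the following. The function name `bandpass`, its arguments, and the choice of a zero-phase Butterworth filter are assumptions made for this example, not a decision on how it should be done in ketos.

```python
import numpy as np
from scipy import signal


def bandpass(waveform, rate, freq_min, freq_max, order=4):
    """Hypothetical helper: zero-phase Butterworth band-pass filter.

    waveform: 1D array of samples
    rate: sampling rate in Hz
    freq_min, freq_max: pass-band edges in Hz
    """
    # Second-order sections are numerically more robust than (b, a) coefficients.
    sos = signal.butter(
        order, [freq_min, freq_max], btype="bandpass", fs=rate, output="sos"
    )
    # Forward-backward filtering avoids introducing a phase shift.
    return signal.sosfiltfilt(sos, waveform)


# Test signal: a 50 Hz tone plus a 400 Hz tone, sampled at 2 kHz for 1 s.
rate = 2000
t = np.arange(rate) / rate
low = np.sin(2 * np.pi * 50 * t)
high = np.sin(2 * np.pi * 400 * t)

# Keeping 300-500 Hz should remove the 50 Hz tone and retain the 400 Hz tone.
filtered = bandpass(low + high, rate, freq_min=300, freq_max=500)
```

If this needs to run inside a tf.data pipeline, wrapping it with `tf.numpy_function` is one option, with the caveat that the filter then runs on the CPU rather than the GPU.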
@frazao @padovese, is there anything you'd like to add to the list?
Edited by Oliver Kirsebom