GitHub Repository: farzanaanjum/Music-Genre-Classification-with-Python
Path: blob/master/Audio Analysis in Python.ipynb
Kernel: Python 3

Loading an audio file

import warnings
warnings.filterwarnings('ignore')

import librosa
audio_path = '../T08-violin.wav'
x, sr = librosa.load(audio_path)

Playing Audio

Use IPython.display.Audio to play the audio:

import IPython.display as ipd
ipd.Audio(audio_path)

The audio file in this example can also be in MP3 or WMA format.

Visualizing Audio

Waveform

We can plot the audio array using librosa.display.waveplot:

%matplotlib inline
import sklearn
import matplotlib.pyplot as plt
import librosa.display

# Note: librosa 0.10 and later replace waveplot with librosa.display.waveshow.
plt.figure(figsize=(14, 5))
librosa.display.waveplot(x, sr=sr)
<matplotlib.collections.PolyCollection at 0x1285c0668>
Image in a Jupyter notebook

Here we have plotted the amplitude envelope of the waveform.
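To make "amplitude envelope" concrete, here is a minimal NumPy sketch that takes the peak absolute amplitude of each frame. The function name, the frame sizes and the synthetic fading tone are assumptions chosen for illustration; this is not how the librosa plotting routine is implemented.

```python
import numpy as np

def amplitude_envelope(signal, frame_length=1024, hop_length=512):
    """Return the peak absolute amplitude of each frame of `signal`."""
    starts = range(0, len(signal), hop_length)
    return np.array([np.max(np.abs(signal[s:s + frame_length])) for s in starts])

# Synthetic stand-in for a recording: a 220 Hz tone that fades out over 1 second.
sr = 22050
t = np.arange(sr) / sr
signal = np.linspace(1.0, 0.0, sr) * np.sin(2 * np.pi * 220 * t)

envelope = amplitude_envelope(signal)
print(envelope.shape)  # one value per frame
```

Plotting `envelope` against the frame start times traces the outline that the waveform plot fills in.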

Spectrogram

We can also display a spectrogram using librosa.display.specshow.

X = librosa.stft(x)
Xdb = librosa.amplitude_to_db(abs(X))
plt.figure(figsize=(14, 5))
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x1236f2d30>
Image in a Jupyter notebook
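For intuition about what sits behind a spectrogram, the sketch below builds one by hand with NumPy: slice the signal into overlapping frames, apply a Hann window, take the FFT of each frame, and convert magnitudes to decibels. The function name and the synthetic 440 Hz tone are assumptions, and details such as frame centering, window type and dB clipping may differ from what librosa does.

```python
import numpy as np

def stft_magnitude_db(signal, n_fft=2048, hop_length=512):
    """Hann-windowed magnitude spectrogram in dB relative to amplitude 1.0."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq bins, frames)
    return 20.0 * np.log10(np.maximum(magnitude, 1e-5))  # floor avoids log(0)

# Synthetic test tone (an assumption, not the violin recording).
sr = 22050
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

spec_db = stft_magnitude_db(tone)
freqs = np.fft.rfftfreq(2048, d=1.0 / sr)   # centre frequency of each bin in Hz
peak_hz = freqs[np.argmax(spec_db[:, 0])]   # strongest bin in the first frame
print(spec_db.shape, peak_hz)
```

The strongest bin lands next to 440 Hz, limited by the bin spacing of sr / n_fft (about 10.8 Hz here).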

Log Frequency axis

librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='log')
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x123496208>
Image in a Jupyter notebook

Creating an audio signal

Let us now create an audio signal at 220 Hz. An audio signal is just a NumPy array, so we can create one ourselves and pass it to IPython.display.Audio.

import numpy as np
sr = 22050  # sample rate
T = 5.0     # seconds
t = np.linspace(0, T, int(T*sr), endpoint=False)  # time variable
x = 0.5*np.sin(2*np.pi*220*t)  # pure sine wave at 220 Hz
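As a quick sanity check, we can confirm that the generated array really contains a 220 Hz tone by looking for the strongest frequency in its FFT. This is a small sketch that rebuilds the same signal so it runs on its own; the names `spectrum`, `freqs` and `peak_hz` are just illustrative.

```python
import numpy as np

sr = 22050  # sample rate
T = 5.0     # seconds
t = np.linspace(0, T, int(T * sr), endpoint=False)
x = 0.5 * np.sin(2 * np.pi * 220 * t)

spectrum = np.abs(np.fft.rfft(x))            # magnitude of each frequency bin
freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)  # bin centre frequencies in Hz
peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)
```

With 5 seconds of audio the bins are 0.2 Hz apart, so the peak should fall on or right next to 220 Hz.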

Playing the sound

ipd.Audio(x, rate=sr) # load a NumPy array

Saving the signal

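# The tone created above is 220 Hz, so the file is named accordingly.
# Note: librosa.output was removed in librosa 0.8; newer versions need soundfile.write instead.
librosa.output.write_wav('../tone_220.wav', x, sr)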
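If librosa.output is not available in your installation (it was removed in librosa 0.8, where the documentation points to the soundfile package), the signal can also be saved with Python's built-in wave module. The sketch below is one possible approach under that assumption; the helper name `write_wav_pcm16` and the temporary output path are hypothetical.

```python
import os
import tempfile
import wave

import numpy as np

def write_wav_pcm16(path, signal, sr):
    """Write a mono float signal in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = (np.clip(signal, -1.0, 1.0) * 32767).astype("<i2")  # little-endian int16
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sr)
        wav_file.writeframes(pcm.tobytes())

sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
tone = 0.5 * np.sin(2 * np.pi * 220 * t)

# Hypothetical output location in the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "tone_220_sketch.wav")
write_wav_pcm16(out_path, tone, sr)

# Read the header back to confirm what was written.
with wave.open(out_path, "rb") as wav_file:
    n_frames, rate = wav_file.getnframes(), wav_file.getframerate()
print(n_frames, rate)
```

Converting to 16-bit integers loses a little precision compared with a float WAV, which is usually fine for listening tests.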

Feature Extraction

x, sr = librosa.load('../T08-violin.wav')
ipd.Audio(x, rate=sr)

# Plot the signal:
plt.figure(figsize=(14, 5))
librosa.display.waveplot(x, sr=sr)
<matplotlib.collections.PolyCollection at 0x137da34a8>
Image in a Jupyter notebook

1. Zero Crossing Rate

# Zooming in
n0 = 9000
n1 = 9100
plt.figure(figsize=(14, 5))
plt.plot(x[n0:n1])
plt.grid()
Image in a Jupyter notebook

I count 6 zero crossings. Let's compute the zero crossings using librosa.

zero_crossings = librosa.zero_crossings(x[n0:n1], pad=False)
zero_crossings.shape
(100,)
print(sum(zero_crossings))
6
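A zero crossing is simply a sign change between two neighbouring samples, so the count can be reproduced with plain NumPy. The sketch below uses a short made-up segment rather than the violin samples, and it may treat exact zeros differently from librosa.

```python
import numpy as np

def count_zero_crossings(signal):
    """Count sign changes between consecutive samples."""
    signs = np.signbit(signal)  # True where the sample is negative
    return int(np.sum(signs[1:] != signs[:-1]))

# Hypothetical segment: the sign flips after the 2nd, 4th and 6th samples.
segment = np.array([0.3, 0.1, -0.2, -0.4, 0.5, 0.2, -0.1])
crossings = count_zero_crossings(segment)
rate = crossings / len(segment)  # zero crossing rate per sample
print(crossings, rate)
```

Dividing the count by the number of samples gives the zero crossing rate that the section title refers to.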

2. Spectral Centroid

# Passing the signal as y= works on both older and newer librosa versions.
spectral_centroids = librosa.feature.spectral_centroid(y=x, sr=sr)[0]
spectral_centroids.shape
(775,)
import sklearn.preprocessing  # make sure the preprocessing submodule is loaded

# Computing the time variable for visualization
frames = range(len(spectral_centroids))
t = librosa.frames_to_time(frames, sr=sr)

# Normalising the spectral centroid for visualisation
def normalize(x, axis=0):
    return sklearn.preprocessing.minmax_scale(x, axis=axis)

# Plotting the spectral centroid along the waveform
librosa.display.waveplot(x, sr=sr, alpha=0.4)
plt.plot(t, normalize(spectral_centroids), color='r')
[<matplotlib.lines.Line2D at 0x12856ca58>]
Image in a Jupyter notebook
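The spectral centroid is the magnitude-weighted mean of the frequencies in a frame, which is why it is often described as the "centre of mass" of the spectrum. Here is a minimal single-frame sketch in NumPy; the function name and the two test tones are assumptions for illustration, and librosa's framing and windowing details may differ.

```python
import numpy as np

def spectral_centroid_frame(frame, sr):
    """Magnitude-weighted mean frequency (Hz) of one Hann-windowed frame."""
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

sr = 22050
n_fft = 2048
t = np.arange(n_fft) / sr

# Two pure tones: the centroid should sit close to each tone's frequency.
low = spectral_centroid_frame(np.sin(2 * np.pi * 440 * t), sr)
high = spectral_centroid_frame(np.sin(2 * np.pi * 4400 * t), sr)
print(low, high)
```

For real instruments the centroid rises when more energy sits in the upper harmonics, which is why it tracks how "bright" the sound is.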

3. Spectral Rolloff

# Passing the signal as y= works on both older and newer librosa versions.
spectral_rolloff = librosa.feature.spectral_rolloff(y=x+0.01, sr=sr)[0]
librosa.display.waveplot(x, sr=sr, alpha=0.4)
plt.plot(t, normalize(spectral_rolloff), color='r')
plt.grid()
Image in a Jupyter notebook
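Spectral rolloff is the frequency below which a chosen fraction of the total spectral energy lies. The sketch below computes it for a single frame from a cumulative sum; the function name and toy spectra are hypothetical, and the 0.85 fraction is, to my understanding, the librosa default rather than something stated in this notebook.

```python
import numpy as np

def spectral_rolloff_frame(magnitude, freqs, roll_percent=0.85):
    """Lowest frequency at which the cumulative magnitude reaches roll_percent of the total."""
    cumulative = np.cumsum(magnitude)
    threshold = roll_percent * cumulative[-1]
    index = np.searchsorted(cumulative, threshold)  # first bin reaching the threshold
    return float(freqs[index])

freqs = np.array([0.0, 100.0, 200.0, 300.0])

# Flat spectrum: 85% of the total is only reached in the last bin.
flat_rolloff = spectral_rolloff_frame(np.array([1.0, 1.0, 1.0, 1.0]), freqs)

# Energy concentrated at low frequencies: the threshold is reached much earlier.
low_rolloff = spectral_rolloff_frame(np.array([4.0, 1.0, 0.0, 0.0]), freqs)
print(flat_rolloff, low_rolloff)
```

A signal with most of its energy at low frequencies therefore has a low rolloff, while a noisy or bright signal pushes it upward.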

4. MFCC

# Store the sample rate in sr so the calls below use the rate of this file.
x, sr = librosa.load('../simple_loop.wav')
librosa.display.waveplot(x, sr=sr)
<matplotlib.collections.PolyCollection at 0x12ed3f358>
Image in a Jupyter notebook
# MFCC (the signal is passed as y=, which works on older and newer librosa versions)
mfccs = librosa.feature.mfcc(y=x, sr=sr)
print(mfccs.shape)
librosa.display.specshow(mfccs, sr=sr, x_axis='time')
(20, 97)
<matplotlib.axes._subplots.AxesSubplot at 0x135def400>
Image in a Jupyter notebook
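The shape (20, 97) reads as 20 coefficients (the default number of MFCCs in librosa) by 97 frames. With the default hop length of 512 and centered frames, the frame count follows from the number of samples. The sketch below shows that arithmetic; the sample count is a hypothetical value chosen to give 97 frames, since the real length of the file is not shown here.

```python
def n_frames_centered(n_samples, hop_length=512):
    """Number of frames when frames are centered (the signal is padded at both ends)."""
    return 1 + n_samples // hop_length

# Hypothetical length: any value from 49152 to 49663 samples yields 97 frames.
hypothetical_length = 49152
frames = n_frames_centered(hypothetical_length)
seconds = hypothetical_length / 22050  # duration at the default sample rate
print(frames, round(seconds, 2))
```

The same relationship explains the 775 spectral centroid values computed earlier for the longer violin recording.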

Feature Scaling

Let's scale the MFCCs such that each coefficient dimension has zero mean and unit variance:

mfccs = sklearn.preprocessing.scale(mfccs, axis=1)
print(mfccs.mean(axis=1))
print(mfccs.var(axis=1))
[ 3.52524424e-16 -5.37943115e-17  4.72130925e-17  4.76136885e-16
 -2.40357562e-17  7.43963882e-17  6.81013092e-17 -6.52399097e-17
  2.77555756e-17  6.40953499e-17  7.67570429e-17  1.50223476e-16
  6.75290293e-17  8.69865463e-17 -8.24083070e-17 -4.74992325e-17
  7.35379684e-17 -3.25913409e-16  2.80417156e-17 -4.14330655e-16]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
librosa.display.specshow(mfccs, sr=sr, x_axis='time')
<matplotlib.axes._subplots.AxesSubplot at 0x13613b748>
Image in a Jupyter notebook
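Scaling to zero mean and unit variance just means subtracting each coefficient's mean over time and dividing by its standard deviation. Here is a small NumPy sketch of the same idea on random stand-in data (the array is made up, not real MFCCs).

```python
import numpy as np

# Stand-in feature matrix with the same layout: 20 coefficients x 97 frames.
rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=3.0, size=(20, 97))

# Standardize each row (coefficient) across time.
mean = features.mean(axis=1, keepdims=True)
std = features.std(axis=1, keepdims=True)
scaled = (features - mean) / std

print(scaled.mean(axis=1).round(6))
print(scaled.var(axis=1).round(6))
```

The means come out as tiny floating-point values rather than exact zeros, the same effect visible in the printed output above.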

Chroma Frequencies

# Loading the file
x, sr = librosa.load('../simple_piano.wav')
ipd.Audio(x, rate=sr)

# The signal is passed as y=, which works on older and newer librosa versions.
hop_length = 512
chromagram = librosa.feature.chroma_stft(y=x, sr=sr, hop_length=hop_length)
plt.figure(figsize=(15, 5))
librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=hop_length, cmap='coolwarm')
<matplotlib.axes._subplots.AxesSubplot at 0x136d96320>
Image in a Jupyter notebook
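A chromagram folds the spectrum into the 12 pitch classes of the musical octave, so every frequency is assigned to a note name regardless of which octave it is in. The sketch below shows that mapping for single frequencies, assuming equal temperament with A4 = 440 Hz; the function name is hypothetical and this is not how librosa builds its chroma filter bank.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(frequency_hz, a4_hz=440.0):
    """Name of the nearest equal-tempered pitch class for a frequency in Hz."""
    semitones_from_a4 = round(12 * math.log2(frequency_hz / a4_hz))
    # A is index 9 in PITCH_CLASSES, so shift by 9 and wrap around the octave.
    return PITCH_CLASSES[(semitones_from_a4 + 9) % 12]

notes = [pitch_class(f) for f in (261.63, 329.63, 440.0, 880.0)]
print(notes)
```

Note that 440 Hz and 880 Hz land in the same bin, which is exactly the octave folding that makes chroma features useful for describing harmony.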