{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Plotting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using Parselmouth, it is possible to use the existing Python plotting libraries – such as [Matplotlib](https://matplotlib.org/) and [seaborn](https://seaborn.pydata.org/) – to make custom visualizations of the speech data and analysis results obtained by running Praat's algorithms." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following examples visualize an audio recording of someone saying *\"The north wind and the sun [...]\"*: [the_north_wind_and_the_sun.wav](audio/the_north_wind_and_the_sun.wav), extracted from a [Wikimedia Commons audio file](https://commons.wikimedia.org/wiki/File:Recording_of_speaker_of_British_English_%28Received_Pronunciation%29.ogg)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We start out by importing `parselmouth`, the common Python plotting libraries `matplotlib` and `seaborn`, and the `numpy` numeric library." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "import parselmouth\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sns.set() # Use seaborn's default style to make attractive graphs\n", "plt.rcParams['figure.dpi'] = 100 # Show nicely large images in this notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the necessary libraries imported, we open and read in the audio file and plot the raw waveform."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "snd = parselmouth.Sound(\"audio/the_north_wind_and_the_sun.wav\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`snd` is now a Parselmouth [Sound](../api_reference.rst#parselmouth.Sound) object, and we can access its values and other properties to plot them with the common `matplotlib` Python library:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure()\n", "plt.plot(snd.xs(), snd.values.T)\n", "plt.xlim([snd.xmin, snd.xmax])\n", "plt.xlabel(\"time [s]\")\n", "plt.ylabel(\"amplitude\")\n", "plt.show() # or plt.savefig(\"sound.png\"), or plt.savefig(\"sound.pdf\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to extract part of the speech fragment and plot it separately. For example, let's extract the word *\"sun\"* and plot its waveform with a finer line." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "snd_part = snd.extract_part(from_time=0.9, preserve_times=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure()\n", "plt.plot(snd_part.xs(), snd_part.values.T, linewidth=0.5)\n", "plt.xlim([snd_part.xmin, snd_part.xmax])\n", "plt.xlabel(\"time [s]\")\n", "plt.ylabel(\"amplitude\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we can write a couple of ordinary Python functions to plot a Parselmouth `Spectrogram` and `Intensity`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def draw_spectrogram(spectrogram, dynamic_range=70):\n", "    X, Y = spectrogram.x_grid(), spectrogram.y_grid()\n", "    sg_db = 10 * np.log10(spectrogram.values)\n", "    plt.pcolormesh(X, Y, sg_db, vmin=sg_db.max() - dynamic_range, cmap='afmhot')\n", "    plt.ylim([spectrogram.ymin, spectrogram.ymax])\n", "    plt.xlabel(\"time [s]\")\n", "    plt.ylabel(\"frequency [Hz]\")\n", "\n", "def draw_intensity(intensity):\n", "    plt.plot(intensity.xs(), intensity.values.T, linewidth=3, color='w')\n", "    plt.plot(intensity.xs(), intensity.values.T, linewidth=1)\n", "    plt.grid(False)\n", "    plt.ylim(0)\n", "    plt.ylabel(\"intensity [dB]\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After defining how to plot these, we use Praat (through Parselmouth) to calculate the spectrogram and intensity to actually plot the intensity curve overlaid on the spectrogram." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "intensity = snd.to_intensity()\n", "spectrogram = snd.to_spectrogram()\n", "plt.figure()\n", "draw_spectrogram(spectrogram)\n", "plt.twinx()\n", "draw_intensity(intensity)\n", "plt.xlim([snd.xmin, snd.xmax])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Parselmouth functions and methods have the same arguments as the Praat commands, so we can for example also change the window size of the spectrogram analysis to get a narrow-band spectrogram. Next to that, let's now have Praat calculate the pitch of the fragment, so we can plot it instead of the intensity."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def draw_pitch(pitch):\n", "    # Extract selected pitch contour, and\n", "    # replace unvoiced samples by NaN to not plot\n", "    pitch_values = pitch.selected_array['frequency']\n", "    pitch_values[pitch_values==0] = np.nan\n", "    plt.plot(pitch.xs(), pitch_values, 'o', markersize=5, color='w')\n", "    plt.plot(pitch.xs(), pitch_values, 'o', markersize=2)\n", "    plt.grid(False)\n", "    plt.ylim(0, pitch.ceiling)\n", "    plt.ylabel(\"fundamental frequency [Hz]\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pitch = snd.to_pitch()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If desired, pre-emphasize the sound fragment before calculating the spectrogram\n", "pre_emphasized_snd = snd.copy()\n", "pre_emphasized_snd.pre_emphasize()\n", "spectrogram = pre_emphasized_snd.to_spectrogram(window_length=0.03, maximum_frequency=8000)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure()\n", "draw_spectrogram(spectrogram)\n", "plt.twinx()\n", "draw_pitch(pitch)\n", "plt.xlim([snd.xmin, snd.xmax])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the [FacetGrid](https://seaborn.pydata.org/generated/seaborn.FacetGrid.html) functionality from `seaborn`, we can even plot a structured grid of multiple custom spectrograms. 
For example, we will read a CSV file (using the [pandas](https://pandas.pydata.org/) library) that contains the digit that was spoken, the ID of the speaker and the file name of the audio fragment: [digit_list.csv](other/digit_list.csv), [1_b.wav](audio/1_b.wav), [2_b.wav](audio/2_b.wav), [3_b.wav](audio/3_b.wav), [4_b.wav](audio/4_b.wav), [5_b.wav](audio/5_b.wav), [1_y.wav](audio/1_y.wav), [2_y.wav](audio/2_y.wav), [3_y.wav](audio/3_y.wav), [4_y.wav](audio/4_y.wav), [5_y.wav](audio/5_y.wav)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "def facet_util(data, **kwargs):\n", "    digit, speaker_id = data[['digit', 'speaker_id']].iloc[0]\n", "    sound = parselmouth.Sound(\"audio/{}_{}.wav\".format(digit, speaker_id))\n", "    draw_spectrogram(sound.to_spectrogram())\n", "    plt.twinx()\n", "    draw_pitch(sound.to_pitch())\n", "    # If not the rightmost column, then clear the right side axis\n", "    if digit != 5:\n", "        plt.ylabel(\"\")\n", "        plt.yticks([])\n", "\n", "results = pd.read_csv(\"other/digit_list.csv\")\n", "\n", "grid = sns.FacetGrid(results, row='speaker_id', col='digit')\n", "grid.map_dataframe(facet_util)\n", "grid.set_titles(col_template=\"{col_name}\", row_template=\"{row_name}\")\n", "grid.set_axis_labels(\"time [s]\", \"frequency [Hz]\")\n", "grid.set(facecolor='white', xlim=(0, None))\n", "plt.show()" ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.7" } }, "nbformat": 4, "nbformat_minor": 4 }