Reference

The functions documented here are implemented in Cython and interact directly with the underlying C++ code in Loris.

Info

See also loristrck.util for supporting utilities regarding transformation of partials, rendering, plotting, etc.

analyze

def analyze(samples: np.ndarray, 
            sr: float, 
            resolution: float, 
            windowsize: float = None,
            hoptime: float = None, 
            freqdrift: float = None, 
            sidelobe: float = None,
            ampfloor: float = -90, 
            croptime: float = None, 
            residuebw: float = None, 
            convergencebw: float = None,
            outfile: str = None
            ) -> list[np.ndarray]

Partial Tracking Analysis

Analyze the audio samples. Returns a list of 2D numpy arrays, where each array represents a partial with columns [time, freq, amplitude, phase, bandwidth]. If outfile is given, an SDIF file is saved with the results of the analysis.

There are three categories of analysis parameters:

- the resolution and the parameters related to it (frequency floor and drift)
- the window width and the parameters related to it (hop and crop times)
- independent parameters (bandwidth region width and amplitude floor)

Args

samples: numpy.ndarray An array representing a mono sndfile.
sr: int (Hz) The sampling rate
resolution: Hz Only one partial will be found within this distance. Usual values range from 30 Hz to 200 Hz. As a rule of thumb, when tracking a monophonic source, resolution ~= min(f0) * 0.9. So if the source is a male voice dropping down to 70 Hz, a resolution of about 60 Hz is appropriate
windowsize: Hz The main-lobe width (zero to zero) of the Kaiser analysis window. If not given, a default value is calculated. The size of the window in samples can be calculated with util.kaiserLength(windowsize, sr, sidelobe)
hoptime: sec The time to move the window after each analysis. Default: 1/windowsize. "hop time in secs is the inverse of the window width really. A good choice of hop is the window length divided by the main lobe width in freq. samples, which turns out to be just the inverse of the width." A lower hoptime can be used: for instance, a 2x overlap corresponds to hoptime=1/windowsize*0.5. NB: when using overlap, croptime should still be 1/windowsize
freqdrift: Hz The maximum variation of frequency between two breakpoints for them to be considered part of the same partial. A sensible value is between 1/2 and 3/4 of resolution: freqdrift=0.62*resolution
sidelobe: dB (default: 90 dB) A positive dB value, indicates the shape of the Kaiser window
ampfloor: dB A breakpoint with an amp < ampfloor can't be part of a partial
croptime: sec Max. time correction beyond which a reassigned breakpoint is considered unreliable, and not eligible. Default: the hop time.
residuebw: Hz (default = 2000 Hz) Construct Partial bandwidth env. by associating residual energy with the selected spectral peaks that are used to construct Partials. The bandwidth is the width (in Hz) of the association regions used. Defaults to 2 kHz, corresponding to 1 kHz region center spacing. NB: if residuebw is set, convergencebw must be left unset
convergencebw: range [0, 1] Construct Partial bandwidth env. by storing the mixed derivative of short-time phase, scaled and shifted. The value is the amount of range over which the mixed derivative indicator should be allowed to drift away from a pure sinusoid before saturating. This range is mapped to bandwidth values on the range [0, 1]. NB: one can set residuebw or convergencebw, but not both
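The rules of thumb given for resolution, freqdrift, hoptime and croptime can be collected in a small helper. This is only a sketch: `suggest_params`, `min_f0` and `overlap` are hypothetical names that are not part of loristrck, and `windowsize` is left to the caller because this page does not state how its default is calculated.

```python
def suggest_params(min_f0: float, windowsize: float, overlap: float = 1.0) -> dict:
    """Derive analysis parameters from the rules of thumb in this section.

    min_f0: lowest fundamental expected in a monophonic source (Hz)
    windowsize: main-lobe width of the Kaiser window (Hz), chosen by the caller
    overlap: 1 gives the default hop (1/windowsize), 2 halves it, and so on
    """
    resolution = min_f0 * 0.9              # resolution ~= min(f0) * 0.9
    return {
        "resolution": resolution,
        "windowsize": windowsize,
        "freqdrift": 0.62 * resolution,    # between 1/2 and 3/4 of resolution
        "hoptime": 1 / windowsize / overlap,
        "croptime": 1 / windowsize,        # stays at 1/windowsize when overlapping
    }


params = suggest_params(min_f0=70, windowsize=80, overlap=2)
```

Since the dictionary keys match the keyword arguments of analyze, the result could be passed on as `lt.analyze(samples, sr, **params)`.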

Returns

A list of partials, where each partial is a numpy array of shape (number of breakpoints, 5). The format for each partial is:

Col 0	Col 1	Col 2	Col 3	Col 4
time_0	freq_0	amp_0	phase_0	bw_0
time_1	freq_1	amp_1	phase_1	bw_1
...	...	...	...	...
time_n	freq_n	amp_n	phase_n	bw_n
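Because each partial is a plain 2D array, the columns can be sliced directly with numpy. The sketch below uses a made-up partial with three breakpoints in place of real analysis output; the values are illustrative only.

```python
import numpy as np

# A synthetic partial with 3 breakpoints, in the documented column order:
# time (s), freq (Hz), amplitude, phase, bandwidth
partial = np.array([
    [0.00, 440.0, 0.10, 0.0, 0.0],
    [0.01, 442.0, 0.20, 0.3, 0.0],
    [0.02, 441.0, 0.15, 0.6, 0.0],
])

# Each column is a 1D view into the array
times, freqs, amps = partial[:, 0], partial[:, 1], partial[:, 2]

duration = times[-1] - times[0]     # time span covered by the partial
peak_amp = amps.max()               # largest breakpoint amplitude
num_breakpoints = partial.shape[0]  # one row per breakpoint
```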

Example

import loristrck as lt
import numpy as np

# Read a soundfile as a numpy array
samples, sr = lt.sndreadmono("voice.wav")

# Analyze the soundfile with a frequency resolution of 30 Hz and 
# a window size of 40 Hz. A hoptime of 1/120 will result in 3x overlap
partials = lt.analyze(samples, sr, resolution=30, windowsize=40, 
                      hoptime=1/120)

# For each partial, calculate the amplitude-weighted mean frequency
def mean_weighted_freq(partial):
    return np.average(partial[:, 1], weights=partial[:, 2])

for i, partial in enumerate(partials):
    freq = mean_weighted_freq(partial)
    print(f"Partial #{i}, start time: {partial[0, 0]}, mean freq.: {freq}")

# Save the analysis as a .sdif file with RBEP format
lt.write_sdif(partials, "analysis.sdif")

read_sdif

Read an SDIF file (1TRC or RBEP)

def read_sdif(path: str
             ) -> tuple[list[np.ndarray], list[int]]

Args

path: The path of the .sdif file to read

Returns

A tuple (list of partials, labels), where a partial is a 2D numpy array with a shape (number of breakpoints, 5).
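One common use of the returned labels is to group partials that share a label. The sketch below assumes that labels holds one integer per partial, in the same order as the list of partials; the signature suggests this but does not state it. `group_by_label` is a hypothetical helper, and synthetic data stands in for the output of read_sdif.

```python
from collections import defaultdict

import numpy as np


def group_by_label(partials, labels):
    """Map each label to the list of partials that carry it."""
    groups = defaultdict(list)
    for partial, label in zip(partials, labels):
        groups[label].append(partial)
    return dict(groups)


# Synthetic data: three one-breakpoint partials, two of them labelled 1
partials = [np.array([[0.0, 100.0 * k, 0.1, 0.0, 0.0]]) for k in (1, 2, 3)]
labels = [1, 2, 1]
groups = group_by_label(partials, labels)
```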

write_sdif

def write_sdif(partials: list[np.ndarray], 
               outfile: str, 
               labels: list[int] | None = None, 
               rbep=True, 
               fadetime=0.
               ) -> None

Write a list of partials to an SDIF file

Args

partials: a seq. of 2D arrays with columns [time freq amp phase bw]
outfile: the path of the sdif file
labels: a seq. of integer labels, or None to skip saving labels
rbep: if True, use RBEP format, otherwise, 1TRC

Note

The 1TRC format forces resampling

read_aiff

Read a mono AIFF file (Loris does not read stereo files)

def read_aiff(path: str
             ) -> tuple[audiodata: np.ndarray, samplerate: int]

Args

path (str): The path to the soundfile (.aif or .aiff)

Returns

A tuple (audiodata: np.ndarray, samplerate: int)

Warning

This function will raise ValueError if the soundfile is not mono

synthesize

Synthesize a list of partials and return the generated audio samples as a 1D numpy array

def synthesize(partials: list[np.ndarray],
               samplerate: int,
               fadetime: float = None,
               start: float = None,
               end: float = None
               ) -> np.ndarray

Args

partials (list[np.ndarray]): a list of partials, where each partial is a 2D numpy array
samplerate (int): the samplerate of the synthesized samples (in Hz)
fadetime (float): to avoid clicks, partials not ending in 0 amp should be faded. If not given, a sensible default is used. A minimum fadetime is always applied, even if 0 is given.
start (float): start time of synthesis (in seconds).
end (float): end time of synthesis (in seconds).

Returns

The samples generated, as a 1D numpy array.

Example

Synthesize and play partials from a previously analyzed sound

import loristrck as lt
import sounddevice as sd
partials, labels = lt.read_sdif("analysis.sdif")
samples = lt.synthesize(partials, 44100)
sd.play(samples, 44100)
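To make the relation between a partial array and audio samples concrete, here is a deliberately naive single-oscillator rendering of one partial in plain numpy. It is not the method used by Loris or by synthesize: it ignores the bandwidth column, ignores the phase of every breakpoint after the first, applies no fades, and the use of a cosine oscillator is an assumption. `render_partial_naive` is a hypothetical name.

```python
import numpy as np


def render_partial_naive(partial: np.ndarray, sr: int) -> np.ndarray:
    """Render one partial with a single sinusoidal oscillator (illustrative only)."""
    times, freqs, amps = partial[:, 0], partial[:, 1], partial[:, 2]
    # Sample times covering the span of the partial
    n = int(round((times[-1] - times[0]) * sr))
    t = times[0] + np.arange(n) / sr
    # Linearly interpolate frequency and amplitude between breakpoints
    freq = np.interp(t, times, freqs)
    amp = np.interp(t, times, amps)
    # Integrate frequency to get phase; the exclusive cumulative sum makes
    # the first sample start exactly at the first breakpoint's phase
    phase = partial[0, 3] + 2 * np.pi * (np.cumsum(freq) - freq) / sr
    return amp * np.cos(phase)


# A steady 440 Hz partial lasting 0.1 s
partial = np.array([
    [0.0, 440.0, 0.5, 0.0, 0.0],
    [0.1, 440.0, 0.5, 0.0, 0.0],
])
out = render_partial_naive(partial, sr=8000)
```

For actual resynthesis, synthesize should be preferred, since it handles fades and the remaining columns.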

estimatef0

Estimate the fundamental of a previously analyzed sound

def estimatef0(partials: list[np.ndarray], 
               minfreq: float, 
               maxfreq: float,
               interval: float  
               ) -> tuple[freqs: np.ndarray, 
                          confidencies: np.ndarray, 
                          starttime: float, 
                          endtime: float]

Args

partials (list[np.ndarray]): the partials to analyze to determine the fundamental
minfreq (float): the min. frequency to consider as a fundamental
maxfreq (float): the max. frequency to consider as a fundamental
interval (float): the time resolution of the fundamental curve

Returns

A tuple (freqs, confidencies, starttime, endtime).

freqs (np.ndarray): an array with the frequencies representing the fundamental in time
confidencies (np.ndarray): for each frequency, a corresponding confidence value indicating how likely that frequency is to be the correct f0
starttime: the start time of the fundamental
endtime: the endtime of the fundamental.

To determine the time for each frequency measurement of the f0, do:

times = np.linspace(starttime, endtime, len(freqs))
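The time axis built this way can be combined with the confidence values to keep only the measurements that look reliable. In the sketch below the arrays are stand-ins with the shapes estimatef0 is documented to return, and `min_confidence` is a hypothetical threshold; this page does not specify the range of the confidence values, so a suitable threshold has to be found by inspection.

```python
import numpy as np

# Stand-in values with the shapes that estimatef0 is documented to return
freqs = np.array([220.0, 221.0, 0.0, 219.5, 220.5])
confidencies = np.array([0.95, 0.90, 0.10, 0.92, 0.97])
starttime, endtime = 0.5, 0.9

# One time value per frequency measurement
times = np.linspace(starttime, endtime, len(freqs))

# Keep only the measurements whose confidence reaches the threshold
min_confidence = 0.8  # hypothetical value, not a library default
reliable = confidencies >= min_confidence
f0_times = times[reliable]
f0_freqs = freqs[reliable]
```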

meancol

Calculates the mean over a given column of a 2D np.ndarray.

def meancol(X: np.ndarray, col: int) -> float

Args

X: a 2D numpy array
col: the index of the column to calculate the average for

Returns

The average over the given column

meancolw

Calculate the weighted mean over a given column, using another column as the weights

def meancolw(X: np.ndarray, col: int, colw: int
            ) -> float

Args

X: a 2D numpy array
col: the index of the column to calculate the average for
colw: the index of the column to use as weight

Returns

The weighted average over the given column

Example

Calculate the weighted average frequency of a given partial

import loristrck as lt
partials, labels = lt.read_sdif("analysis.sdif")
for i, partial in enumerate(partials):
    # average frequency using amplitude as weight
    freq = lt.meancolw(partial, 1, 2)
    print(f"Partial #{i}, avg. freq: {freq} Hz")
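For reference, the same quantities can be computed in plain numpy. This assumes that meancol is the arithmetic mean of a column and that meancolw is the standard weighted mean, sum(x * w) / sum(w), which is what the descriptions suggest; the partial below is synthetic.

```python
import numpy as np

# Synthetic partial: the louder breakpoint sits at 200 Hz
partial = np.array([
    [0.0, 100.0, 1.0, 0.0, 0.0],
    [0.1, 200.0, 3.0, 0.0, 0.0],
])

# Plain mean of the frequency column (what meancol(partial, 1) is assumed to do)
plain = partial[:, 1].mean()

# Amplitude-weighted mean (what meancolw(partial, 1, 2) is assumed to do)
weighted = np.average(partial[:, 1], weights=partial[:, 2])
```

The weighted value is pulled towards the frequency of the louder breakpoint, which is usually the more meaningful summary of a partial.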