Reference
The functions documented here are implemented in cython and interact directly with the underlying C++ code in Loris.
Info
See also loristrck.util for supporting utilities regarding transformation of partials, rendering, plotting, etc.
analyze
def analyze(samples: np.ndarray,
sr: float,
resolution: float,
windowsize: float = None,
hoptime: float = None,
freqdrift: float = None,
sidelobe: float = None,
ampfloor: float = -90,
croptime: float = None,
residuebw: float = None,
convergencebw: float = None,
outfile: str = None
) -> list[np.ndarray]
Partial Tracking Analysis
Analyze the audio samples. Returns a list of 2D numpy arrays, where each array
represent a partial with columns: [time, freq, amplitude, phase, bandwidth]
If outfile is given, a sdif file is saved with the results of the analysis
There are three categories of analysis parameters: - the resolution and params related to it (freq. floor and drift) - the window width and params related to it (hop and crop times) - independent parameters (bw region width and amp floor)
Args
- samples: numpy.ndarray An array representing a mono sndfile.
- sr: int (Hz) The sampling rate
- resolution: Hz Only one partial will be found within this distance. Usual values range from 30 Hz to 200 Hz. As a rule of thumb, when tracking a monophonic source, resolution ~= min(f0) * 0.9 So if the source is a male voice dropping down to 70 Hz, resolution=60 Hz
- windowsize: Hz. Is the main lobe width of the Kaiser analysis window in Hz (main-lobe, zero to zero) If not given, a default value is calculated. The size of the window in samples can be calculated by: util.kaiserLength(windowsize, sr, sidelobe)
- hoptime: sec The time to move the window after each analysis. Default: 1/windowsize. "hop time in secs is the inverse of the window width really. A good choice of hop is the window length divided by the main lobe width in freq. samples, which turns out to be just the inverse of the width." A lower hoptime can be used: for instance a 2x overlap would result in a hoptime of hoptime=1/windowsize*0.5 NB: when using overlap, croptime should still be 1/windowsize
- freqdrift: Hz The maximum variation of frecuency between two breakpoints to be considered to belong to the same partial. A sensible value is between 1/2 to 3/4 of resolution: freqdrift=0.62*resolution
- sidelobe: dB (default: 90 dB) A positive dB value, indicates the shape of the Kaiser window
- ampfloor: dB A breakpoint with an amp < ampfloor can't be part of a partial
- croptime: sec Max. time correction beyond which a reassigned bp is considered unreliable, and not eligible. Default: the hop time.
- residuebw: Hz (default = 2000 Hz) Construct Partial bandwidth env. by associating residual energy with the selected spectral peaks that are used to construct Partials. The bandwidth is the width (in Hz) association regions used. Defaults to 2 kHz, corresponding to 1 kHz region center spacing. NB: if residuebw is set, convergencebw must be left unset
- convergencebw: range
[0, 1]
Construct Partial bandwidth env. by storing the mixed derivative of short-time phase, scaled and shifted.
The value is the amount of range over which the mixed derivative indicator should be allowed to drift away from a pure sinusoid before saturating. This range is mapped to bandwidth values on the range[0,1]
.
NB: one can set residuebw or convergencebw, but not both
Returns
A list of partials, where each partial is a numpy array of shape (number of breakpoints, 5)
The format for each partial is:
Col 0 | Col 1 | Col 2 | Col 3 | Col 4 |
---|---|---|---|---|
time_0 | freq_0 | amp_0 | phase_0 | bw_0 |
time_1 | freq_1 | amp_1 | phase_1 | bw_1 |
... | ... | ... | ... | ... |
time_n | freq_n | amp_n | phase_n | bw_n |
Example
import loristrck as lt
import numpy as np
# Read a soundfile as a numpy array
samples, sr = lt.sndreadmono("voice.wav")
# Analyze the soundfile with a frequency resolution of 30 Hz and
# a window size of 40 Hz. A hoptime of 1/120 will result in 4x overlap
partials = lt.analyze(samples, sr, resolution=30, windowsize=40,
hoptime=1/120)
# for each partial, calculate the mean weighted frequency
def mean_weighted_freq(partial):
return np.mean(partial[:,1] * partail[:,2])
freqs = [mean_weighted_freq(partial) for partial in partials]
for i, partial in enumerate(partials):
freq = mean_weighted_freq(partial)
print(f"Partial #{i}, start time: {partial[0, 0]}, mean freq.: freq}")
# Save the analysis as a .sdif file with RBEP format
lt.write_sdif(partials, "analysis.sdif")
read_sdif
Read a SDIF
file (1TRC
or RBEP
)
def read_sdif(path: str
) -> tuple[list[np.ndarray], list[int]]
Args
- path: The path of the
.sdif
file to read
Returns
A tuple (list of partials, labels), where a partial is a 2D numpy array with a shape (number of breakpoints, 5).
write_sdif
def write_sdif(partials: list[np.ndarray],
outfile: str,
labels:list[int] | None,
rbep=True,
fadetime=0.
) -> None
Write a list of partials in the sdif
Args
- partials: a seq. of 2D arrays with columns [time freq amp phase bw]
- outfile: the path of the sdif file
- labels: a seq. of integer labels, or None to skip saving labels
- rbep: if True, use RBEP format, otherwise, 1TRC
Note
The 1TRC format forces resampling
read_aiff
Read a mono AIFF file (Loris does not read stereo files)
def read_aiff(path: str
) -> tuple[audiodata: np.ndarray, samplerate: int]
Args
- path (
str
): The path to the soundfile (.aif
or.aiff
)
Returns
A tuple (audiodata: np.ndarray, samplerate: int)
Warning
This function will raise ValueError
if the soundfile is not mono
synthesize
Synthesizes a list of partials, returns the generated audio samples as 1D numpy array
def synthesize(partials: list[np.ndarray],
samplerate: int,
fadetime: float = None,
start: float = None,
end: float = None
) -> np.ndarray
Args
- partials (
list[np.ndarray]
): a list of partials, where each partial is a 2D numpy array - samplerate (int): the samplerate of the synthesized samples (in Hz)
- fadetime (float): to avoid clicks, partials not ending in 0 amp should be faded. If not given a sensible default is used. A minimum fadetime is always applied, even if 0 is given.
- start (float): start time of synthesis (in seconds).
- end (float): end time of synthesis
Returns
The sampes generated, as a 1D numpy array.
Example
Synthesize and play partials from a previously analyzed sound
import loristrck as lt
import sounddevice as sd
partials, labels = lt.read_sdif("analysis.sdif")
samples = lt.synthesize(partials, 44100)
sd.play(samples, 44100)
estimatef0
Estimate the fundamental of a previously analyzed sound
def estimatef0(partials: list[np.ndarray,
minfreq: float,
maxfreq: float,
interval: float
) -> tuple[freqs: np.ndarray,
confidencies: np.ndarray,
starttime: float,
endtime: float]
Args
- partials (
list[np.ndarray]
): the partials to analyze to determine the fundamental - minfreq (
float
): the min. frequency to considere as a fundamental - maxfreq (
float
): the max. frequency to considere as a fundamental - interval (
float
): the time resolution of the fundamental curve
Returns
A tuple (freqs, confidencies, starttime, endtime).
- freqs (
np.ndarray
): an array with the frequencies representing the fundamental in time - confidencies (
np.ndarray
): for each frequency there is a corresponding confidency value determining the confidence on this value being the correct f0 - starttime: the start time of the fundamental
- endtime: the endtime of the fundamental.
To determine the time for each frequency measurement of the f0, do:
times = np.linspace(starttime, endtime, len(freqs))
meancol
Calculates the mean over a given column of a 2D np.ndarray.
def meancol(X: np.ndarray, col: int) -> float
Args
- X: a 2D numpy array
- col: the index of the column to calculate the average for
Returns
The average over the given column
meancolw
Calculate the weighted mean over a given column and using another column as the weights
def meancol(X: np.ndarray, col: int, colw: int
) -> float
Args
- X: a 2D numpy array
- col: the index of the column to calculate the average for
- colw: the index of the column to use as weight
Returns
The weighted average over the given column
Example
Calculate the weighted average frequency of a given partial
import loristrck as lt
partials, labels = lt.read_sdif("analysis.sdif")
for i, partial in enumerate(partials):
# average frequency using amplitude as weight
freq = lt.meancolw(partial, 1, 2)
print(f"Partial #{i}, avg. freq: {fre} Hz")