@@ -31,7 +31,18 @@ class TorchSquimObjectiveQualityMetricsProcessor(BaseProcessor):
     """This processor calculates Squim quality metrics for audio files.
 
     It uses a pre-trained Squim model to calculate audio quality metrics like PESQ, STOI
-    and SI-SDR for each audio segment in the manifest.
+    and SI-SDR for each audio segment in the manifest:
+
+    PESQ (Perceptual Evaluation of Speech Quality)
+        A measure of overall speech quality (originally designed to detect codec distortions, but highly correlated with all kinds of distortion).
+
+    STOI (Short-Time Objective Intelligibility)
+        A measure of speech intelligibility; it essentially measures the integrity of the speech envelope.
+        A STOI value of 1.0 means 100% of the speech being evaluated is intelligible on average.
+
+    SI-SDR (Scale-Invariant Signal-to-Distortion Ratio)
+        A measure of how strong the speech signal is relative to all the distortion present in the audio, in decibels.
+        0 dB means the speech and distortion energies are equal. Values of 15-20 dB are generally considered "clean enough" speech.
 
     Args:
         device (str, Optional): Device to run the model on. Defaults to "cuda".
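
Note on the SI-SDR description in this hunk: the sketch below may help make the "0 dB means equal energies" statement concrete. It is a minimal, reference-based illustration of the standard SI-SDR definition, not the processor's code. The Squim model, as far as I understand it, estimates the metric without a clean reference. The function name `si_sdr_db` and the toy signals are hypothetical and chosen only for this example.

```python
import math


def si_sdr_db(estimate, reference):
    """Reference-based SI-SDR in decibels (illustrative sketch only)."""
    # Energy of the clean reference; it must be non-zero to define a projection.
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference signal has zero energy")

    # Project the estimate onto the reference. The optimal scale factor
    # alpha is what makes the metric invariant to the estimate's gain.
    alpha = sum(e * r for e, r in zip(estimate, reference)) / ref_energy
    target = [alpha * r for r in reference]

    # Everything in the estimate that is not the scaled reference is distortion.
    distortion = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)
    if distortion_energy == 0:
        return math.inf  # a perfect (possibly rescaled) copy of the reference
    return 10 * math.log10(target_energy / distortion_energy)


# Toy "speech" plus a distortion that is orthogonal to it and has equal energy.
speech = [1.0, -1.0, 1.0, -1.0]
equal_energy_mix = [2.0, 0.0, 2.0, 0.0]      # speech + [1, 1, 1, 1]
quieter_noise_mix = [1.1, -0.9, 1.1, -0.9]   # speech + [0.1, 0.1, 0.1, 0.1]

print(si_sdr_db(equal_energy_mix, speech))
print(si_sdr_db(quieter_noise_mix, speech))
```

With these toy inputs the first mix has equal speech and distortion energy, and the second has distortion energy 100 times smaller than the speech energy, which corresponds to the upper end of the 15-20 dB range mentioned in the docstring. Rescaling either mix by a constant gain leaves the result unchanged, which is the "scale-invariant" part of the name.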