The official repo of "WhiStress: Enriching Transcriptions with Sentence Stress Detection" (Interspeech 2025). WhiStress extends OpenAI’s Whisper to provide not only accurate transcriptions of speech, but also token-level sentence stress annotations — allowing you to detect which words are emphasized in a spoken sentence.
The model is built on top of the whisper-small.en variant, and enhanced with a lightweight decoder-based classifier that predicts the stress label for each token.
🤗 WhiStress Model | 🤗 TinyStress-15K Dataset | 🤗 Demo | 🌐 Project | 📃 Paper
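To make the idea of token-level stress annotations concrete, here is a minimal sketch of how per-token labels could be folded into per-word emphasis flags. It is not taken from the WhiStress codebase: the function name `group_token_stress`, the binary 0/1 label encoding, and the convention that a leading space marks the start of a new word (as in Whisper-style tokenizers) are all assumptions made for illustration.

```python
def group_token_stress(tokens, labels):
    """Merge sub-word tokens into words and flag a word as stressed
    if any of its tokens is labeled 1.

    Assumptions (not confirmed by the repo): labels are binary (0/1),
    and a token that starts with a space begins a new word.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    words = []  # each entry is [word_text, is_stressed]
    for token, label in zip(tokens, labels):
        starts_new_word = token.startswith(" ") or not words
        if starts_new_word:
            words.append([token.strip(), bool(label)])
        else:
            # Continuation token: extend the current word.
            words[-1][0] += token
            words[-1][1] = words[-1][1] or bool(label)
    return [(text, stressed) for text, stressed in words]
```

For example, calling `group_token_stress([" I", " did", "n't", " take", " it"], [0, 0, 0, 1, 0])` would mark only "take" as emphasized.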
Clone the repository and install dependencies:
git clone https://github.com/slp-rl/WhiStress.git
cd WhiStress
pip install -r requirements.txt
Download the model weights from the WhiStress 🤗 Hugging Face repository:
https://huggingface.co/slprl/WhiStress/tree/main
and place them inside the whistress/weights directory.
Expected structure:
whistress/
├── weights/
│   ├── additional_decoder_block.pt
│   ├── classifier.pt
│   └── metadata.json
├── ...
README.md
download_weights.py
...
Alternatively, you can use the download_weights.py script located in the repository root.
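After downloading, it can help to confirm that the weights directory matches the expected structure before running inference. The sketch below is a hypothetical helper, not part of the repository: the name `missing_weight_files` and the default path are assumptions, while the three file names come from the layout listed above.

```python
from pathlib import Path

# File names taken from the expected structure in this README.
EXPECTED_WEIGHT_FILES = (
    "additional_decoder_block.pt",
    "classifier.pt",
    "metadata.json",
)


def missing_weight_files(weights_dir="whistress/weights"):
    """Return the expected weight files that are absent from weights_dir.

    Hypothetical helper: the default path assumes the script is run
    from the repository root.
    """
    root = Path(weights_dir)
    return [name for name in EXPECTED_WEIGHT_FILES if not (root / name).is_file()]
```

An empty list means all three files are in place; otherwise the returned names indicate which files still need to be downloaded.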
WhiStress was trained on the TinyStress-15K dataset. This dataset is based on TinyStories, adapted for sentence stress supervision.
source path/to/your/venv/bin/activate
To generate a transcription with stress predictions:
python inference_example.py
Run evaluation on a sample dataset:
python evaluation_example.py
You can check out our Demo on 🤗 Hugging Face.
Or, run the interface locally:
python app_ui.py
This will launch a browser-based UI for trying out the model interactively on your own audio.
To train WhiStress on a supported dataset (e.g., TinyStress-15K), use the following instructions.
First, activate your virtual environment and optionally export your WAND_API_KEY if you're using Weights & Biases for logging:
export WAND_API_KEY="your_wandb_key"
To launch training, run:
python -m whistress.training.train \
--dataset_path path_to_where_preprocessed_ds_will_be_stored \
--transcription_column_name transcription \
--dataset_train tinyStress-15K \
--dataset_eval tinyStress-15K \
--is_train true
This will:
- Preprocess the TinyStress-15K dataset and store it at the location specified by --dataset_path.
- Save the final training checkpoint to the training_results directory (created automatically if --output_path is not specified).
Expected structure:
whistress/
├── training/
│   ├── train.py
│   ├── training_results/
│   │   └── ...
│   └── ...
├── ...
README.md
...
Note: You can customize your training run using additional CLI arguments.
Check out train.py for the full list of available options and their descriptions.
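If you launch several training runs from a script, the command shown earlier can be assembled programmatically. The following is a minimal sketch, assuming only the flags that appear in this README (`--dataset_path`, `--transcription_column_name`, `--dataset_train`, `--dataset_eval`, `--is_train`, and `--output_path`); the helper name `build_train_command` is hypothetical, and any other options should be checked against train.py.

```python
import sys


def build_train_command(
    dataset_path,
    dataset_train="tinyStress-15K",
    dataset_eval="tinyStress-15K",
    transcription_column_name="transcription",
    output_path=None,
):
    """Build the argv list for `python -m whistress.training.train`.

    Hypothetical helper: it only mirrors the flags documented in this
    README and does not validate them against train.py.
    """
    cmd = [
        sys.executable, "-m", "whistress.training.train",
        "--dataset_path", str(dataset_path),
        "--transcription_column_name", transcription_column_name,
        "--dataset_train", dataset_train,
        "--dataset_eval", dataset_eval,
        "--is_train", "true",
    ]
    if output_path is not None:
        # Omitting this flag leaves the default training_results directory.
        cmd += ["--output_path", str(output_path)]
    return cmd
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the repository root should be equivalent to typing the command by hand.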
If you use WhiStress in your work, please cite our paper:
@misc{yosha2025whistress,
title={WHISTRESS: Enriching Transcriptions with Sentence Stress Detection},
author={Iddo Yosha and Dorin Shteyman and Yossi Adi},
year={2025},
eprint={2505.19103},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.19103},
}