3D-Corpus

3D-Corpus is a feature extraction and processing pipeline for audio datasets, designed for visualization in 3D space.

Main Features

Audio file loading and preprocessing
Onset detection-based audio segmentation
MFCC, Spectral Centroid, and Chroma feature extraction
GPU acceleration support (Apple Silicon MPS backend and CUDA)
Asynchronous processing and optimized batch processing

System Requirements

Python 3.10 or higher
Apple Silicon Mac (M1/M2/M3) or CUDA-compatible GPU (needed only for the GPU-accelerated extractor; the CPU version runs without one)
32GB RAM recommended

Installation

Clone the repository:

git clone https://github.com/unohee/3d-corpus.git
cd 3d-corpus

Create and activate a virtual environment:

python -m venv research_env
source research_env/bin/activate  # macOS/Linux

Install required packages:

pip install -r requirements.txt

Usage

Basic Feature Extraction (CPU Version)

python featureExtractor.py ./path/to/audio/folder

GPU-Accelerated Feature Extraction (MPS/CUDA Version)

python featureExtractor_torch.py ./path/to/audio/folder
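
The GPU script has to choose between the CUDA, MPS, and CPU backends. A minimal sketch of that choice, assuming PyTorch: `pick_device` is a hypothetical helper written for illustration, not a function taken from this repository. It falls back to the CPU when PyTorch is missing or no accelerator is found.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch can use."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU path is usable.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # Older PyTorch builds have no torch.backends.mps, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```

Running this before a long extraction job is a quick way to confirm which backend the machine would use.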

Interactive TUI

To select datasets through the text-based user interface, run:

python curses_interface.py

Command-line Options

You can also run the pipeline with various options:

python run.py ./path/to/audio/folder [options]

Available options:

--download-only: Only download the FSD50K dataset without extracting features
--no-onset: Disable onset detection and extract features for entire audio files
--save-splits: Save onset-split audio files to disk
--output-dir DIR: Specify the directory to save split audio files (default: splitted_files)
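
For readers who want to see how these flags fit together, here is a hypothetical reconstruction of the option parsing with `argparse`. It mirrors the flags listed above but is not copied from `run.py`; in particular, whether the folder argument is required together with `--download-only` is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented run.py options."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("audio_dir", help="folder containing audio files")
    parser.add_argument("--download-only", action="store_true",
                        help="only download FSD50K, skip feature extraction")
    parser.add_argument("--no-onset", action="store_true",
                        help="extract features for entire files, no onset splitting")
    parser.add_argument("--save-splits", action="store_true",
                        help="save onset-split audio files to disk")
    parser.add_argument("--output-dir", metavar="DIR", default="splitted_files",
                        help="directory for split audio files")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["./path/to/audio/folder", "--save-splits", "--output-dir", "splits"]
)
print(args.audio_dir, args.save_splits, args.output_dir, args.no_onset)
```

Flags that are not passed keep their defaults, so `--output-dir` only matters in combination with `--save-splits`.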

Implemented Feature Extraction

MFCC (Mel-Frequency Cepstral Coefficients)
- 13 MFCC coefficients
- 40 mel filter banks
- 256 frame size, 256 hop length
Spectral Centroid
- Center frequency of the spectrum
- 256 frame size, 256 hop length
Chroma Features
- 12 semitone bins
- 256 frame size, 256 hop length
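
To make the frame and hop parameters concrete, the sketch below computes a per-frame spectral centroid (the magnitude-weighted mean frequency of each frame) with NumPy only. It is an illustration of the definition, not the repository's code: the CPU script uses librosa, which also windows and pads the signal, so its values will differ slightly. The function name and the test tone are assumptions made for this example.

```python
import numpy as np


def spectral_centroid(y: np.ndarray, sr: int,
                      frame_size: int = 256, hop_length: int = 256) -> np.ndarray:
    """Per-frame spectral centroid in Hz (no windowing or padding, for brevity)."""
    n_frames = max(0, 1 + (len(y) - frame_size) // hop_length)
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sr)  # frequency of each FFT bin
    centroids = np.zeros(n_frames)
    for i in range(n_frames):
        start = i * hop_length
        mag = np.abs(np.fft.rfft(y[start:start + frame_size]))
        total = mag.sum()
        # Magnitude-weighted mean frequency; silent frames are reported as 0 Hz.
        centroids[i] = (freqs * mag).sum() / total if total > 0 else 0.0
    return centroids


sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)  # one second of a 1 kHz sine
c = spectral_centroid(tone, sr)
print(c.shape, round(float(c.mean()), 1))
```

With a frame size equal to the hop length, frames do not overlap, so a one-second signal at 16 kHz yields 62 complete frames.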

Performance Optimizations

Asynchronous I/O processing
Batch processing optimization
GPU memory management
Transformer caching
Vectorized operations
Multi-processing and multi-threading
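
As a rough illustration of batch processing combined with multi-threading, the sketch below walks a folder, groups files into batches, and runs an extraction callback on a thread pool. All names here (`list_audio`, `batched`, `process_folder`, the extension list, the default batch size and worker count) are hypothetical and chosen for the example; the repository's actual scheduling may be quite different.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed set of audio extensions, for illustration only.
AUDIO_EXTS = {".wav", ".flac", ".mp3", ".ogg"}


def list_audio(folder):
    """Return all audio files under folder, sorted for reproducible order."""
    return sorted(p for p in Path(folder).rglob("*")
                  if p.is_file() and p.suffix.lower() in AUDIO_EXTS)


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def process_folder(folder, extract, batch_size=32, workers=4):
    """Apply `extract(path)` to every audio file, one batch at a time."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(list_audio(folder), batch_size):
            # pool.map preserves input order, so paths and outputs line up.
            for path, feats in zip(batch, pool.map(extract, batch)):
                results[str(path)] = feats
    return results
```

Threads suit the I/O-bound part (reading files); CPU-heavy extraction would usually benefit more from processes or from the GPU path.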

Code Structure

The codebase is organized into several well-documented Python modules:

featureExtractor_torch.py: GPU-accelerated feature extraction with PyTorch
featureExtractor.py: CPU-based feature extraction with librosa
curses_interface.py: Text-based user interface for dataset selection
run.py: Command-line interface with FSD50K dataset download capabilities

All functions include detailed docstrings with parameter descriptions and return value information.

Dataset Structure

dataset/
├── [dataset_name].pkl          # Original audio buffers
└── [dataset_name]_features.pkl # Extracted features
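
Both files are pickles, so they can be read back with the standard library. The sketch below only builds the two paths from the layout above and loads a file; the helper names are hypothetical, and the structure of the unpickled objects is not documented here, so inspect it before relying on any particular keys.

```python
import pickle
from pathlib import Path


def dataset_paths(dataset_dir: str, name: str) -> tuple[Path, Path]:
    """Return (audio buffers path, features path) for a dataset name."""
    root = Path(dataset_dir)
    return root / f"{name}.pkl", root / f"{name}_features.pkl"


def load_pickle(path: Path):
    """Load one pickle file.

    Only unpickle files you created yourself: pickle can run arbitrary code.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


# "my_dataset" is a placeholder name, not a dataset shipped with the project.
buffers_path, features_path = dataset_paths("dataset", "my_dataset")
print(buffers_path, features_path)
```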

Feature Normalization

Each extracted feature is reduced to a fixed-size representation:

MFCC: multi-dimensional array reduced to a 1D array (dimension reduction)
Spectral Centroid: per-frame array reduced to a single scalar value
Chroma: multi-dimensional array reduced to a 1D array
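
The reductions above could be implemented in several ways; the source does not say which one is used. The sketch below assumes the simplest option, averaging over the time axis, and uses random arrays in place of real features. The function name `summarize` and the input shapes (coefficients by frames) are assumptions for illustration.

```python
import numpy as np


def summarize(mfcc, centroid, chroma):
    """Collapse the time axis by averaging (assumed reduction, see note above)."""
    return {
        "mfcc": np.asarray(mfcc).mean(axis=1),          # (13, T) -> (13,)
        "spectral_centroid": float(np.mean(centroid)),  # (1, T)  -> scalar
        "chroma": np.asarray(chroma).mean(axis=1),      # (12, T) -> (12,)
    }


# Stand-in data with 50 frames; real inputs would come from the extractors.
rng = np.random.default_rng(0)
out = summarize(
    rng.normal(size=(13, 50)),
    rng.uniform(100, 4000, size=(1, 50)),
    rng.uniform(size=(12, 50)),
)
print(out["mfcc"].shape, out["chroma"].shape, type(out["spectral_centroid"]).__name__)
```

Whatever reduction is used, the point is that every audio segment ends up with vectors of the same length, regardless of its duration.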

Reference Datasets

FSD50K dataset: https://zenodo.org/record/4060432

License

MIT License
