Problem
The provided code implements an advanced ensemble approach (LightGBM, XGBoost, CatBoost) for mouse behavior detection (MABe Challenge). The goal is to improve prediction accuracy, particularly the f1-score, by optimizing the ensemble, feature engineering, and post-processing steps.
Suggestions for Optimization
- Hyperparameter Tuning:
- Perform grid/random search or Bayesian optimization for LightGBM, XGBoost, CatBoost hyperparameters to maximize f1-score.
- Tune thresholds (action_thresholds) per action using cross-validation.
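A minimal sketch of per-action threshold tuning, assuming out-of-fold labels and probabilities are available as (n_frames, n_actions) arrays. The function name tune_thresholds, the array layout, and the threshold grid are illustrative assumptions and are not taken from the provided code; only the idea of an action_thresholds mapping comes from it.

```python
import numpy as np


def f1_binary(y_true, y_pred):
    """F1 for boolean arrays; returns 0.0 when there are no positives at all."""
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def tune_thresholds(y_true, y_prob, actions, grid=None):
    """Return {action: threshold} maximizing F1 on out-of-fold predictions.

    y_true: (n_frames, n_actions) binary labels (assumed layout).
    y_prob: (n_frames, n_actions) predicted probabilities (assumed layout).
    """
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)  # hypothetical search grid
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob, dtype=float)
    thresholds = {}
    for j, action in enumerate(actions):
        scores = [f1_binary(y_true[:, j], y_prob[:, j] >= t) for t in grid]
        # np.argmax keeps the lowest threshold among ties.
        thresholds[action] = float(grid[int(np.argmax(scores))])
    return thresholds
```

The returned dictionary could stand in for action_thresholds. Tuning on out-of-fold predictions rather than on training predictions keeps the thresholds from being fit to overconfident scores.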
- Advanced Feature Engineering:
- Incorporate domain-specific features (e.g., angles between mice, acceleration, jerk, time-to-collision).
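- Use unsupervised dimensionality reduction (e.g., PCA or UMAP, fit on training folds only) to compress correlated features. t-SNE is better suited to visual inspection of the feature space, since it cannot project unseen frames.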
- Add features representing behavioral context (e.g., time since last event, cumulative event counts).
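The kinematic and angular features mentioned above could be derived roughly as follows. This is a sketch under stated assumptions: tracking is assumed to give per-frame 2D coordinates, the keypoint names (nose, tail base, centroid) and the frame rate are hypothetical, and the function names are not from the provided code.

```python
import numpy as np


def kinematic_features(xy, fps=30.0):
    """Speed, acceleration and jerk magnitudes from (n_frames, 2) positions.

    fps is an assumed frame rate; derivatives use central differences.
    """
    dt = 1.0 / fps
    vel = np.gradient(xy, dt, axis=0)
    acc = np.gradient(vel, dt, axis=0)
    jerk = np.gradient(acc, dt, axis=0)
    return {
        "speed": np.linalg.norm(vel, axis=1),
        "accel": np.linalg.norm(acc, axis=1),
        "jerk": np.linalg.norm(jerk, axis=1),
    }


def relative_angle(nose_a, tail_a, centroid_b):
    """Angle in [0, pi] between mouse A's body axis and the direction to mouse B.

    All inputs are (n_frames, 2) arrays; keypoint names are assumptions.
    """
    heading = nose_a - tail_a
    to_b = centroid_b - nose_a
    cross = heading[:, 0] * to_b[:, 1] - heading[:, 1] * to_b[:, 0]
    dot = np.sum(heading * to_b, axis=1)
    # arctan2 of (cross, dot) is numerically safer than arccos of a ratio.
    return np.abs(np.arctan2(cross, dot))
```

Higher derivatives amplify tracking noise, so jerk in particular may need smoothed positions before it is useful as a feature.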
- Modeling Improvements:
- Add stacking/blending with meta-learners (e.g., logistic regression, neural nets) for final prediction.
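- Use grouped (and, where class balance allows, stratified) K-fold cross-validation, grouping by recording (e.g., by video), so that frames from the same recording never appear in both training and validation folds and leak information.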
- Experiment with more diverse base models (RandomForest, ExtraTrees, neural nets if time allows).
- Post-processing:
- Use more sophisticated temporal smoothing (e.g., median filter, Savitzky-Golay) for probability outputs.
- Apply non-max suppression or event merging to reduce duplicate/overlapping events.
- Remove predictions in frames unlikely to contain behavior (e.g., based on inactivity, context).
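The smoothing and event-merging steps could look roughly like this. It is a sketch, not the post-processing in the provided code: the kernel size, max_gap, and min_len values are placeholders to be tuned, and events are assumed to be (start, stop) frame ranges with stop exclusive.

```python
import numpy as np
from scipy.signal import medfilt


def smooth_probs(prob, kernel=5):
    """Median-filter a 1D probability trace; kernel must be odd.

    medfilt zero-pads the signal, so values near the edges are pulled toward 0.
    """
    return medfilt(np.asarray(prob, dtype=float), kernel_size=kernel)


def frames_to_events(mask, max_gap=2, min_len=3):
    """Turn a per-frame boolean mask into merged (start, stop) events.

    Events separated by at most max_gap frames are merged, then events
    shorter than min_len frames are dropped. stop is exclusive.
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.concatenate(([False], mask, [False]))
    # Indices where the mask flips; they alternate start, stop, start, stop...
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    merged = []
    for start, stop in edges.reshape(-1, 2):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = int(stop)
        else:
            merged.append([int(start), int(stop)])
    return [(s, e) for s, e in merged if e - s >= min_len]
```

A typical use would be to smooth each action's probabilities, threshold them, and pass the resulting mask to frames_to_events.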
- Error Analysis:
- Systematically analyze false positives/negatives per action and adjust feature set or model accordingly.
- Data Augmentation:
- Augment training data with synthetic noise, time shifts, or simulated occlusions.
- Ensemble Weighting:
- Learn optimal weights for each model in the ensemble based on validation performance per action.
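As a simple stand-in for learning ensemble weights, the sketch below grid-searches convex weights for one action on validation predictions. The function name best_weights, the fixed 0.5 threshold, and the 0.1 step are illustrative assumptions; a real setup might instead fit weights by optimizing log loss or by training a meta-learner.

```python
import itertools

import numpy as np


def f1_binary(y_true, y_pred):
    """F1 for boolean arrays; returns 0.0 when there are no positives at all."""
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def best_weights(y_true, model_probs, threshold=0.5, step=0.1):
    """Search non-negative weights summing to 1 that maximize validation F1.

    y_true: (n_frames,) binary labels for a single action.
    model_probs: (n_models, n_frames) validation probabilities, one row per
    model (e.g. LightGBM, XGBoost, CatBoost).
    """
    y_true = np.asarray(y_true).astype(bool)
    model_probs = np.asarray(model_probs, dtype=float)
    n_models = model_probs.shape[0]
    ticks = int(round(1.0 / step))
    best_w, best_f1 = None, -1.0
    for combo in itertools.product(range(ticks + 1), repeat=n_models):
        if sum(combo) != ticks:
            continue  # keep only points on the weight simplex
        w = np.array(combo) / ticks
        score = f1_binary(y_true, (w @ model_probs) >= threshold)
        if score > best_f1:
            best_w, best_f1 = w, score
    return best_w, best_f1
```

Running this once per action gives per-action weights. Because the weights are chosen on validation data, their benefit should be confirmed on a separate held-out split.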
Acceptance Criteria
- Clear documentation of which optimizations were tried and their impact.
- Demonstrable improvement in f1-score on validation/test sets.
- Code changes are modular and clearly commented.
References:
- See the provided code for current ensemble and feature strategy.
- Consult recent top solutions in the MABe Challenge for model/feature ideas.