Optimize Advanced Ensemble for Accuracy in MABe Challenge Code #1

@charriry

Problem

The provided code implements an ensemble of gradient-boosted models (LightGBM, XGBoost, CatBoost) for mouse behavior detection in the MABe Challenge. The goal is to improve prediction accuracy, particularly the F1 score, by optimizing the ensemble, the feature engineering, and the post-processing steps.
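
For orientation, a soft-voting ensemble of this kind typically combines the per-model probabilities with a weighted average. The sketch below is not taken from the provided code: `blend`, `search_weights`, and the 0.5 threshold are hypothetical names and values, and it assumes each model outputs one probability per frame for a single action. It also shows one simple way to choose the weights from validation F1, using a coarse grid over weights that sum to 1.

```python
import itertools

import numpy as np


def blend(prob_by_model, weights):
    """Weighted average of per-model probabilities (weights are normalized)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack([np.asarray(p, dtype=float) for p in prob_by_model])
    return np.tensordot(w, stacked, axes=1)  # shape: (n_frames,)


def f1_binary(y_true, y_pred):
    """Frame-level F1 for boolean arrays; returns 0.0 when nothing is positive."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def search_weights(prob_by_model, y_true, threshold=0.5, steps=10):
    """Coarse grid search over weights summing to 1, scored by validation F1."""
    n_models = len(prob_by_model)
    best_w, best_f1 = None, -1.0
    for combo in itertools.product(range(steps + 1), repeat=n_models):
        if sum(combo) != steps:
            continue  # keep only points on the weight simplex
        w = np.array(combo, dtype=float) / steps
        score = f1_binary(y_true, blend(prob_by_model, w) >= threshold)
        if score > best_f1:
            best_w, best_f1 = w, score
    return best_w, best_f1
```

Running `search_weights` once per action on held-out data gives per-action weights. The grid grows quickly with the number of models, so a continuous optimizer may be preferable beyond three or four models.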

Suggestions for Optimization

  1. Hyperparameter Tuning:
    • Run grid search, random search, or Bayesian optimization over the LightGBM, XGBoost, and CatBoost hyperparameters to maximize the F1 score.
    • Tune the decision threshold for each action (action_thresholds) using cross-validation.
  2. Advanced Feature Engineering:
    • Incorporate domain-specific features (e.g., angles between mice, acceleration, jerk, time-to-collision).
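    • Use unsupervised dimensionality reduction (PCA, UMAP) to compress the feature set and highlight relevant structure; t-SNE is better suited to visualization, since standard implementations cannot embed unseen samples.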
    • Add features representing behavioral context (e.g., time since last event, cumulative event counts).
  3. Modeling Improvements:
    • Add stacking/blending with meta-learners (e.g., logistic regression, neural nets) for final prediction.
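    • Use stratified or grouped K-fold cross-validation; grouping (e.g., by video or recording session) keeps frames from one recording out of both the training and validation folds, which avoids data leakage.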
    • Experiment with more diverse base models (RandomForest, ExtraTrees, neural nets if time allows).
  4. Post-processing:
    • Use more sophisticated temporal smoothing (e.g., median filter, Savitzky-Golay) for probability outputs.
    • Apply non-maximum suppression or event merging to reduce duplicate or overlapping events.
    • Remove predictions in frames unlikely to contain behavior (e.g., based on inactivity, context).
  5. Error Analysis:
    • Systematically analyze false positives/negatives per action and adjust feature set or model accordingly.
  6. Data Augmentation:
    • Augment training data with synthetic noise, time-shifts, or by simulating occlusions.
  7. Ensemble Weighting:
    • Learn optimal weights for each model in the ensemble based on validation performance per action.
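
Two of the suggestions above, per-action threshold tuning and temporal post-processing, can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the issue's actual pipeline: the function names, the window size, the threshold grid, and the `max_gap` / `min_len` defaults are all hypothetical, and the input is assumed to be a 1-D array of per-frame probabilities for one action. `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np


def median_smooth(probs, window=5):
    """Sliding median over a 1-D probability track (odd window, edge-padded)."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    probs = np.asarray(probs, dtype=float)
    padded = np.pad(probs, window // 2, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)


def f1_binary(y_true, y_pred):
    """Frame-level F1 for boolean arrays; returns 0.0 when nothing is positive."""
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def tune_threshold(y_true, probs, grid=None):
    """Pick the threshold on a fixed grid that maximizes validation F1."""
    y_true = np.asarray(y_true, dtype=bool)
    probs = np.asarray(probs, dtype=float)
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    scores = [f1_binary(y_true, probs >= t) for t in grid]
    best = int(np.argmax(scores))  # first maximum wins on ties
    return float(grid[best]), float(scores[best])


def frames_to_events(mask, max_gap=2, min_len=3):
    """Convert a boolean frame mask into (start, stop) events, stop exclusive.

    Runs separated by at most max_gap negative frames are merged, then
    events shorter than min_len frames are dropped.
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])  # rise, fall, rise, ...
    merged = []
    for start, stop in zip(edges[::2], edges[1::2]):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], stop)
        else:
            merged.append((start, stop))
    return [(int(s), int(e)) for s, e in merged if e - s >= min_len]
```

A plausible order is to smooth first, tune the threshold on the smoothed validation probabilities, and then convert the thresholded mask into events. The threshold should be tuned on folds that were not used to fit the models, otherwise the reported F1 will be optimistic.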

Acceptance Criteria

  • Clear documentation of which optimizations were tried and their impact.
  • Demonstrable improvement in F1 score on validation/test sets.
  • Code changes are modular and clearly commented.
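
One way to document the impact of each optimization is a per-action F1 comparison between a baseline run and a candidate run on the same validation frames. The sketch below is an assumption about how that could look, not part of the provided code: it uses a plain frame-level F1 (the challenge's official metric may be defined differently), and `compare_runs`, `format_report`, and the action names in the usage note are hypothetical.

```python
def f1_frames(truth, pred):
    """Frame-level F1 for two equal-length sequences of 0/1 labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def compare_runs(y_true, baseline, candidate):
    """Return (action, baseline_f1, candidate_f1, delta) rows, sorted by action.

    Each argument maps an action name to its per-frame 0/1 labels.
    """
    rows = []
    for action in sorted(y_true):
        base = f1_frames(y_true[action], baseline[action])
        cand = f1_frames(y_true[action], candidate[action])
        rows.append((action, base, cand, cand - base))
    return rows


def format_report(rows):
    """Render the comparison rows as aligned plain-text lines."""
    lines = [f"{'action':<12}{'baseline':>10}{'candidate':>11}{'delta':>9}"]
    for action, base, cand, delta in rows:
        lines.append(f"{action:<12}{base:>10.3f}{cand:>11.3f}{delta:>+9.3f}")
    return "\n".join(lines)
```

For example, calling `compare_runs` with dictionaries keyed by hypothetical actions such as `"attack"` and `"sniff"` and printing `format_report(rows)` gives one line per action, which makes regressions on individual behaviors visible even when the average F1 improves.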

References:

  • See the provided code for current ensemble and feature strategy.
  • Consult recent top solutions in the MABe Challenge for model/feature ideas.

Labels: enhancement (New feature or request)
