- [April 2026] StreamFP has been accepted to The Web Conference 2026 (WWW '26)!
StreamFP is a novel stream learning framework designed to handle non-stationary data streams with high efficiency and robustness against catastrophic forgetting. It introduces learnable fingerprints—compact parameter vectors that summarize the model state—to guide data selection processes.
Key challenges in Stream Learning (SL) addressed by StreamFP:
- Data Redundancy: Incoming streams often contain redundant data that wastes computation.
- Catastrophic Forgetting: Incremental updates can overwrite earlier knowledge.
- Efficiency: Traditional model-based selection is often too computationally expensive for real-time streams.
StreamFP achieves superior accuracy and efficiency compared to state-of-the-art methods (e.g., Camel, ER, GradMatch) across varying data arrival rates.
StreamFP consists of three key components driven by a shared set of learnable fingerprints:
- Fingerprint-based Coreset Selection (FCS): Selects informative samples from incoming batches based on fingerprint similarity, prioritizing data that balances novelty and familiarity.
- Fingerprint-based Buffer Update (FBU): Dynamically maintains the replay buffer by preserving representative historical samples and discarding redundant ones.
- Fingerprint Attunement (FA): A lightweight plugin that uses pre-trained ViT attention to calibrate fingerprints online with negligible overhead.
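The selection idea behind FCS can be sketched in a few lines. This is a minimal toy illustration, not the paper's algorithm: the array names, shapes, and the median-based "balance" heuristic below are assumptions standing in for the learned similarity criterion described above, and the fingerprints here are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these names come from the StreamFP code base.
NUM_FINGERPRINTS, DIM = 8, 32
fingerprints = rng.normal(size=(NUM_FINGERPRINTS, DIM))  # learnable in the real system

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def select_coreset(batch_feats, fingerprints, k):
    """Toy stand-in for FCS: score each sample by its similarity to the
    closest fingerprint, then keep samples near the middle of the score
    range -- neither fully novel (likely noise) nor fully familiar
    (likely redundant)."""
    sims = cosine_sim(batch_feats, fingerprints).max(axis=1)
    balance = -np.abs(sims - np.median(sims))  # prefer mid-range similarity
    return np.argsort(balance)[-k:]            # indices of the k selected samples

batch = rng.normal(size=(64, DIM))  # stand-in for a batch of feature vectors
idx = select_coreset(batch, fingerprints, k=16)
print(idx.shape)  # prints (16,)
```

FBU follows the same pattern in reverse: scoring buffered samples against the fingerprints and evicting the most redundant ones.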
- Linux or macOS
- Python 3.8+
- PyTorch 1.12+ and CUDA 11.3+
```bash
# Clone the repository
git clone https://github.com/CGCL-codes/StreamFP.git
cd StreamFP

# Create and activate conda environment
conda env create -f environment.yml
conda activate sl

# (Optional) Install FastMoE (main path: build without NCCL)
# NOTE: FastMoE builds a CUDA extension. If you see errors like "nccl.h: No such file or directory",
# you can build without NCCL by setting USE_NCCL=0 (recommended unless you need NCCL-based distributed comm).
conda install -y cmake ninja
git clone --recursive https://github.com/laekov/fastmoe.git
cd fastmoe

# Option 1: disable distributed features
USE_NCCL=0 python setup.py install
# Option 2: enable distributed features
python setup.py install

# Quick check
python -c "import fmoe, fmoe_cuda; print('FastMoE installed:', fmoe_cuda.__file__)"
cd ..
```

Create a `data/` directory in the project root.
- Clear10 / Clear100: Download from Clear Benchmark.
- Stream-51: Download from Stream-51 GitHub.
- CORe50: Run the provided script to download and set it up:
```bash
sh core50.sh
```

To run a standard experiment (e.g., on Clear10), use the scripts provided in `experiments/`:

```bash
# Run Clear10 experiment
sh experiments/clear10.sh

# Run Stream-51 experiment
sh experiments/stream51.sh
```

You can customize the training by modifying the arguments in `run.py`. Key arguments include:
- `--selection_method`: Strategy for coreset selection (e.g., `StreamFP`, `Camel`, `Random`).
- `--update_method`: Strategy for buffer update (e.g., `StreamFP`, `ER`, `GSS`).
- `--skip_batch`: Enable batch skipping for high-speed streams (default: `1`).
- `--traintime_limit`: Simulate real-time constraints.
Example command:
```bash
python -u run.py --config configs/clear10.yaml \
    --repeat 1 --overwrite 1 \
    --selection_method StreamFP --update_method StreamFP \
    --mem_size 102 --traintime_limit 10
```

StreamFP consistently outperforms baselines on both the Accuracy and Forgetting metrics. Below is a comparison on the Stream-51 and Clear10 datasets:
| Dataset | Method | Accuracy (%) | Forgetting (%) | Runtime (s) |
|---|---|---|---|---|
| Stream-51 | ER | 59.99 | 3.70 | 1883.75 |
| Stream-51 | StreamFP | 64.44 | 2.25 | 2049.52 |
| Clear10 | ER | 51.90 | 1.09 | 412.50 |
| Clear10 | StreamFP | 54.94 | 0.82 | 448.80 |
Detailed results can be found in the results_log/ directory after training.
If you find this work useful for your research, please cite our WWW '26 paper:
```bibtex
@inproceedings{li2026streamfp,
  title={StreamFP: Fingerprint-guided Data Selection for Efficient Stream Learning},
  author={Li, Changwu and Shi, Tongjun and Zhang, Shuhao and Chen, Binbin and He, Bingsheng and Liao, Xiaofei and Jin, Hai},
  booktitle={Proceedings of the ACM Web Conference 2026 (WWW '26)},
  year={2026},
  publisher={ACM},
  address={Dubai, United Arab Emirates},
  doi={10.1145/XXXXXXXXXXXX}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
This research is supported by Huazhong University of Science and Technology and Singapore University of Technology and Design. We thank the authors of Clear Benchmark, CORe50, and Stream-51 for their datasets.
