JAX-based diffusion policy RL algorithms.
- Agents: flowrl/agent/online/ (online) and flowrl/agent/offline/ (offline)
- Configs: flowrl/config/ (dataclasses), examples/**/config/ (YAML hyperparams)
- Entry points: examples/, organized by scenario (online/offline) and benchmark (DMControl, HumanoidBench, MuJoCo)
- Keep the code clean, readable, and consistent with the existing code.
- You can sacrifice grammar in your response for conciseness.
- Read the official implementation; identify all hyperparameters.
- Implement in flowrl/agent/. Match existing code style: reference SDAC (online) or DAC (offline).
- Add config dataclasses and YAML files consistent with the official implementation.
- Flag anything abnormal vs. standard RL practices.
- If an official implementation is provided, diff against it for logic and hyperparameter mismatches.
- README: Identify undocumented algorithms. Prompt me to add experiment links — only include algorithms with results. Do not overwrite existing README content.
- Dependencies: If changes are needed, ask me before updating.
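
The hyperparameter check against an official implementation, mentioned above, can be sketched roughly as follows. This is only an illustration: `ExampleAlgoConfig`, its fields, and `diff_hyperparams` are hypothetical names, not part of flowrl, and the "official" values are made up for the demo.

```python
from dataclasses import asdict, dataclass

MISSING = "<missing>"  # marker for a key present on only one side


@dataclass(frozen=True)
class ExampleAlgoConfig:
    """Hypothetical config dataclass; field names are illustrative only."""

    lr: float = 3e-4
    discount: float = 0.99
    batch_size: int = 256
    diffusion_steps: int = 20


def diff_hyperparams(ours: dict, official: dict) -> dict:
    """Return {name: (ours, official)} for every key that differs or is missing."""
    mismatches = {}
    for key in sorted(set(ours) | set(official)):
        a = ours.get(key, MISSING)
        b = official.get(key, MISSING)
        if a != b:
            mismatches[key] = (a, b)
    return mismatches


# Made-up "official" values, standing in for ones read from the reference code.
official = {"lr": 3e-4, "discount": 0.99, "batch_size": 512, "ema": 0.005}
mismatches = diff_hyperparams(asdict(ExampleAlgoConfig()), official)
for name, (ours_value, official_value) in mismatches.items():
    print(f"{name}: ours={ours_value} official={official_value}")
```

Reporting keys that exist on only one side helps surface hyperparameters that were never ported, not just ones with different values.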