<sub>RLinf: Reinforcement Learning Infrastructure for Agentic AI</sub>
</h1>
## What's NEW!

- [2025/09] <img src="https://github.githubassets.com/images/icons/emoji/unicode/1f525.png" width="18" /> The [Example Gallery](https://rlinf.readthedocs.io/en/latest/rst_source/examples/index.html) is updated; users can find various off-the-shelf examples!
- [2025/09] The paper [RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation](https://arxiv.org/abs/2509.15965) is released.
- [2025/09] The [report on RLinf by Machine Heart](https://mp.weixin.qq.com/s/Xtv4gDu3lhDDGadLrzt6Aw) is released.
- [2025/08] RLinf is open-sourced. The formal v0.1 will be released soon.

## Key Features
<div align="center">
<table>
<tr>
<th colspan="5" style="text-align:center;"><strong>OpenVLA and OpenVLA-OFT model results on ManiSkill3</strong></th>
</tr>
<tr>
<th>Model</th>
}
```
If you use RL+VLA in RLinf, you can also cite our technical report and empirical study paper:

```bibtex
@misc{zang2025rlinfvlaunifiedefficientframework,
      title={RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training},
      author={Hongzhi Zang and Mingjie Wei and Si Xu and Yongji Wu and Zhen Guo and Yuanqing Wang and Hao Lin and Liangzhi Shi and Yuqing Xie and Zhexuan Xu and Zhihao Liu and Kang Chen and Wenhao Tang and Quanlu Zhang and Weinan Zhang and Chao Yu and Yu Wang},
      year={2025},
      eprint={2510.06710},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2510.06710},
}
```

```bibtex
@misc{liu2025rlbringvlageneralization,
      title={What Can RL Bring to VLA Generalization? An Empirical Study},