Open-source reinforcement learning library from ByteDance used to reproduce DeepSeek R1
(github.com/volcengine)
verl is a flexible, efficient and production-ready RL training library for large language models (LLMs).