Preprint, 2025
SuperRL: Reinforcement Learning with Supervision to Boost Language Model Reasoning
Yihao Liu, Shuocheng Li, Lang Cao, Yuhang Xie, Mengyu Zhou, Haoyu Dong, Xiaojun Ma, Shi Han, Dongmei Zhang
bibtex / paper / code
[LLM Reasoning]
SuperRL is a unified training framework that adaptively combines supervised and reinforcement learning to boost language model reasoning, achieving greater stability and generalization, especially under sparse rewards.
|
Technical Report, 2023
DiagGPT: An LLM-based and Multi-agent Dialogue System with Automatic Topic Management for Flexible Task-Oriented Dialogue
Lang Cao
bibtex / paper / code
[LLM Agents]
DiagGPT extends large language models for task-oriented dialogue by enabling proactive question-asking and topic management, achieving strong performance in complex diagnostic interactions across domains such as medicine and law.
|
© Copyright Lang Cao (last updated May 22, 2025).