Research Projects

Project pages for selected publications, each with a method overview, results, and BibTeX

ICLR 2026

CESAR: Audio LLM Reasoning via Process Rewards

J. Fan*, R. Ren, J. Li, R. Pandey, P.G. Shivakumar, Y. Gu, A. Gandhe, G. Liu, I. Bulyko

🏆 SOTA MMAU · Beats Gemini 2.5 Pro & GPT-4o Audio

Resolves test-time inverse scaling in Audio LLMs by rewarding the reasoning process.

Project Page · OpenReview · arXiv

NeurIPS 2025

J. Fan*, T. Wei, C. Cheng, Y. Chen, G. Liu

🚀 2B SD3 surpasses 4.8B & 12B models

Sample-level adaptive KL: high-value samples explore freely, while poor samples stay constrained.

Project Page · OpenReview · arXiv

ICLR 2025

J. Fan*, S. Shen, C. Cheng, Y. Chen, C. Liang, G. Liu

✨ First online RLHF for flow matching models

No human data, no mode collapse. Wasserstein-2 (W2) regularization preserves generation diversity.

Project Page · OpenReview

Preprint 2025

J. Fan*, C. Cheng, S. Shen, X. Zhou, G. Liu

📝 Under Review

Intermediate feedback + dual-stability for robust flow matching fine-tuning on SD3.

Project Page · arXiv

ICLR 2023 ⭐ Oral · 5/4176

J. Fan*, Y. Zhuang, Y. Liu, J. Hao, B. Wang, J. Zhu, H. Wang, S.-T. Xia

🏅 10,077% human-normalized score (HNS) · 24 records · 500× more data-efficient

Learnable behavior control via hybrid policy mapping + bandit meta-controller.

Project Page · OpenReview

ICML 2022

J. Fan*, C. Xiao

📈 22 world records · 9,620% HNS · 500× vs Agent57

Unified RL framework showing data distribution is the key to superhuman efficiency.

Project Page · PMLR