โ† Jiajun Fan / Publications / Projects

Research Projects

Project pages for selected publications, with method overview, results, and BibTeX

2026

ICLR 2026

CESAR: Audio LLM Reasoning via Process Rewards

J. Fan*, R. Ren, J. Li, R. Pandey, P.G. Shivakumar, Y. Gu, A. Gandhe, G. Liu, I. Bulyko
๐Ÿ† SOTA MMAU ยท Beats Gemini 2.5 Pro & GPT-4o Audio

Resolves test-time inverse scaling in Audio LLMs by rewarding the reasoning process.

2025

NeurIPS 2025

ADRPO: Adaptive Divergence for Generative Models

J. Fan*, T. Wei, C. Cheng, Y. Chen, G. Liu
🚀 2B SD3 surpasses 4.8B & 12B models

Sample-level adaptive KL: high-value samples explore freely, poor samples stay constrained.
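To make the idea of a per-sample KL weight concrete, here is a minimal sketch. It is not the ADRPO implementation: the function names, the standardized-advantage rule, and the `base_beta`, `scale`, and `min_beta` parameters are all assumptions chosen for illustration. It only shows one way a KL coefficient could shrink for high-reward samples and grow for low-reward ones.

```python
from statistics import mean, pstdev


def adaptive_kl_coeffs(rewards, base_beta=1.0, scale=0.5, min_beta=0.0):
    """Hypothetical per-sample KL coefficients (illustrative rule, not from the paper).

    Samples with above-average reward get a smaller coefficient (freer to move
    away from the reference model); below-average samples get a larger one.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # fall back to 1.0 when all rewards are equal
    advantages = [(r - mu) / sigma for r in rewards]
    return [max(min_beta, base_beta - scale * a) for a in advantages]


def regularized_loss(rewards, kl_terms, **kwargs):
    """Mean of (-reward + beta_i * KL_i) over the batch, with per-sample beta_i."""
    betas = adaptive_kl_coeffs(rewards, **kwargs)
    per_sample = [-r + b * kl for r, b, kl in zip(rewards, betas, kl_terms)]
    return mean(per_sample)


if __name__ == "__main__":
    batch_rewards = [1.0, 2.0, 3.0]   # hypothetical reward-model scores
    batch_kl = [0.2, 0.2, 0.2]        # hypothetical per-sample KL estimates
    print(adaptive_kl_coeffs(batch_rewards))
    print(regularized_loss(batch_rewards, batch_kl))
```

With a fixed coefficient every sample would be pulled toward the reference model equally. In this sketch the pull is strongest on the lowest-reward sample and weakest on the highest-reward one.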

ICLR 2025

ORW-CFM-W2: Flow Matching Self-Evolution

J. Fan*, S. Shen, C. Cheng, Y. Chen, C. Liang, G. Liu
✨ First online RLHF for flow matching models

No human data, no mode collapse. W2 regularization preserves generation diversity.

Preprint 2025

AC-Flow: Actor-Critic for Flow Matching

J. Fan*, C. Cheng, S. Shen, X. Zhou, G. Liu
๐Ÿ“ Under Review

Intermediate feedback + dual-stability for robust flow matching fine-tuning on SD3.

2023

ICLR 2023 ⭐ Oral · 5/4176

LBC: Breaking 24 Atari World Records

J. Fan*, Y. Zhuang, Y. Liu, J. Hao, B. Wang, J. Zhu, H. Wang, S.-T. Xia
๐Ÿ… 10,077% HNS ยท 24 records ยท 500ร— data efficient

Learnable behavior control via hybrid policy mapping + bandit meta-controller.
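For readers unfamiliar with the HNS figure, the sketch below assumes it refers to the standard human normalized score used in Atari benchmarks: the agent's score rescaled so that a random policy maps to 0% and the human baseline to 100%, then averaged over games. The function names and all numbers in the example are hypothetical and are not taken from the paper.

```python
def human_normalized_score(agent, random, human):
    """Human normalized score in percent: 0 = random policy, 100 = human baseline."""
    return 100.0 * (agent - random) / (human - random)


def mean_hns(per_game_scores):
    """Average HNS over games; each entry is an (agent, random, human) triple."""
    values = [human_normalized_score(a, r, h) for a, r, h in per_game_scores]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical (agent, random, human) raw scores for three games.
    games = [(210.0, 10.0, 110.0), (60.0, 10.0, 110.0), (1010.0, 10.0, 110.0)]
    print(mean_hns(games))
```

Because the mean is unbounded above, a few games with very large scores can dominate it. This is why aggregate HNS values can reach thousands of percent.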

2022

ICML 2022

GDI: Generalized Data Distribution Iteration

J. Fan*, C. Xiao
📈 22 world records · 9,620% HNS · 500× vs Agent57

Unified RL framework showing data distribution is the key to superhuman efficiency.