Hailin Zhang (张海林)
Email: z.hl [AT] pku.edu.cn
Hailin Zhang currently works at Xiaomi MiMo, specializing in AI infrastructure. He is building efficient, scalable, and stable RL infrastructure for the MiMo series models. He earned his Ph.D. in Computer Science from Peking University in 2025, advised by Prof. Bin Cui, where his academic excellence was recognized with the Peking University Outstanding Doctoral Dissertation Award.
Hailin Zhang has published more than ten papers at top-tier conferences. His research interests lie in MLSys (Machine Learning Systems), with a focus on large-scale LLMs, DLRMs, Information Retrieval (IR), and general distributed computing. His first-author research has earned prestigious accolades, including the Best Scalable Data Science Paper Award at VLDB 2022 and an Honorable Mention for Best Artifact at SIGMOD 2024.
Hailin Zhang was the lead contributor to the Hetu distributed deep learning system in 2021, the same year the project was recognized with the Synced Machine Intelligence TOP-10 Open Source Awards. He created PQCache, which ranks as the top-performing LLM sparse decoding method on the SkyLight benchmark (a toy illustration of its product-quantization idea follows).
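As a rough illustration of the idea behind PQCache, the sketch below scores cached keys against a query through product-quantization codebooks and selects an approximate top-k of tokens whose exact KV pairs would then be fetched. The codebooks here are random and the sizes tiny; the actual system learns codebooks and manages KV placement as described in the SIGMOD 2025 paper, so treat this as a toy, not the implementation.

```python
import numpy as np

# Toy product-quantized key scoring for sparse decoding. Real PQCache learns
# codebooks (e.g. via k-means) and manages KV movement across memory tiers;
# random codebooks here only demonstrate the approximate top-k selection step.

d, sub, n_tokens, k = 64, 4, 512, 16           # head dim, subspaces, cached tokens, top-k
d_sub = d // sub
rng = np.random.default_rng(0)

keys = rng.normal(size=(n_tokens, d))          # cached keys of one attention head
codebooks = rng.normal(size=(sub, 256, d_sub)) # 256 centroids per subspace

# Encode each key subvector as the index of its nearest centroid.
codes = np.stack([
    np.argmin(((keys[:, s * d_sub:(s + 1) * d_sub][:, None, :]
                - codebooks[s][None, :, :]) ** 2).sum(-1), axis=1)
    for s in range(sub)
], axis=1)                                     # shape (n_tokens, sub)

def approx_topk(query: np.ndarray) -> np.ndarray:
    # Asymmetric distance computation: dot the query against centroids once,
    # then table lookups yield approximate attention scores for every token.
    tables = [codebooks[s] @ query[s * d_sub:(s + 1) * d_sub] for s in range(sub)]
    scores = sum(tables[s][codes[:, s]] for s in range(sub))
    return np.argsort(scores)[-k:]             # tokens whose exact KV pairs to fetch

print(approx_topk(rng.normal(size=d)))
```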
[Hiring] I am looking for highly motivated full-time engineers and research interns in AI/RL Infra to join us in building next-generation AGI. If interested, please reach out to me or send your resume to mimo@xiaomi.com.
Technical Reports
- MiMo-V2-Flash Technical Report. PDF & Models
- 🚀 RL Infra Highlights: Introduces R3 and a request-level cache for stable training; develops a Data Scheduler for seamless multi-source, fine-grained dynamic sampling (a toy scheduler sketch follows this list); provides a Toolbox and Tool Manager for scalable RL agent training with unified tool management.
- 🎯 RL Training Highlights: Supports both non-agentic and agentic RL training, boosting SWE-Verified from ~66 to ~74 and SWE-Multilingual from ~56 to ~74 with over 100K code agent environments; enables efficient on-policy distillation from multiple teachers.
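The Data Scheduler bullet above is easiest to picture as weighted sampling over sources plus filtering of prompts whose previous rollouts were uninformative. The sketch below is a minimal, assumption-laden version: the class and method names (MultiSourceScheduler, record, next_batch) are hypothetical, and the "all rewards identical" filter is one common dynamic-sampling criterion, not necessarily the one used in MiMo-V2-Flash.

```python
import random
from collections import defaultdict

class MultiSourceScheduler:
    """Toy multi-source scheduler with dynamic-sampling-style filtering."""

    def __init__(self, sources: dict, weights: dict):
        self.sources = sources             # source name -> list of prompts
        self.weights = weights             # source name -> sampling weight
        self.last_rewards = defaultdict(list)

    def record(self, prompt: str, rewards: list) -> None:
        # Called by the trainer once rollouts for this prompt are scored.
        self.last_rewards[prompt] = rewards

    def informative(self, prompt: str) -> bool:
        rewards = self.last_rewards.get(prompt)
        # Keep unseen prompts, or prompts whose rewards still vary
        # (identical rewards mean zero advantage, hence no gradient signal).
        return not rewards or len(set(rewards)) > 1

    def next_batch(self, batch_size: int) -> list:
        names = list(self.sources)
        probs = [self.weights[n] for n in names]
        batch, attempts = [], 0
        while len(batch) < batch_size and attempts < 100 * batch_size:
            attempts += 1
            src = random.choices(names, weights=probs, k=1)[0]
            prompt = random.choice(self.sources[src])
            if self.informative(prompt):
                batch.append(prompt)
        return batch

sched = MultiSourceScheduler(
    sources={"math": ["m1", "m2"], "code": ["c1", "c2", "c3"]},
    weights={"math": 0.5, "code": 0.5},
)
sched.record("m1", [1.0, 1.0, 1.0])        # m1 is now filtered out
print(sched.next_batch(4))
```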
- MiMo-Audio: Audio Language Models are Few-Shot Learners. PDF & Models
- MiMo-VL Technical Report. PDF & Models
- 🎯 RL Training Highlights: Supports mixed on-policy RL training across diverse tasks, including those with verifiable rewards and those with human feedback (see the reward-routing sketch below).
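A minimal way to picture "mixed" RL across verifiable and human-feedback tasks is a reward router: rule-based checking where a reference answer exists, a learned reward model otherwise. The function names and routing rule below are illustrative assumptions, not the MiMo-VL implementation.

```python
from typing import Callable, Optional

def verifiable_reward(answer: str, reference: str) -> float:
    # Rule-based check; real systems use task-specific verifiers (math, code, ...).
    return 1.0 if answer.strip() == reference.strip() else 0.0

def mixed_reward(task_type: str,
                 answer: str,
                 reference: Optional[str],
                 reward_model: Callable[[str], float]) -> float:
    # Route verifiable tasks to the checker, everything else to the reward model.
    if task_type == "verifiable" and reference is not None:
        return verifiable_reward(answer, reference)
    return reward_model(answer)  # human-feedback tasks: learned preference score

# Demo with a stub reward model.
print(mixed_reward("verifiable", "42", "42", lambda s: 0.0))      # -> 1.0
print(mixed_reward("open_ended", "a poem", None, lambda s: 0.7))  # -> 0.7
```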
- MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining. PDF & Models
- 🚀 RL Infra Highlights: Builds the Seamless Rollout Engine (continuous rollout, asynchronous reward computation, and early termination) for efficient dynamic-sampling-based RL (a minimal sketch follows this list).
- 🎯 RL Training Highlights: Enables on-policy RL with an extended generation budget on a 7B model, achieving performance parity with DeepSeek-R1 in mathematical reasoning.
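The sketch below illustrates the three ingredients named in the infra bullet above (continuous rollout, asynchronous reward computation, early termination) with asyncio stubs; every name and the termination rule are assumptions for illustration, not the actual Seamless Rollout Engine.

```python
import asyncio

async def generate(prompt: str) -> str:
    await asyncio.sleep(0.01)               # stand-in for LLM decoding
    return f"response:{prompt}"

async def score(response: str) -> float:
    await asyncio.sleep(0.01)               # stand-in for reward computation
    return float(len(response) % 2)         # dummy binary reward

async def rollout(prompts: list, needed: int) -> list:
    async def one(prompt: str):
        response = await generate(prompt)
        # Reward is awaited per sample, so scoring overlaps other generations.
        return response, await score(response)

    tasks = [asyncio.create_task(one(p)) for p in prompts]
    collected = []
    for finished in asyncio.as_completed(tasks):
        collected.append(await finished)
        if len(collected) >= needed:        # early termination: enough samples,
            for t in tasks:                 # cancel whatever is still in flight
                t.cancel()
            break
    return collected

print(asyncio.run(rollout([f"p{i}" for i in range(16)], needed=4)))
```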
Education
PhD, major in Computer Science
Peking University, 2020-2025
BS, major in Computer Science; BEc, double major in Economics
Peking University, 2016-2020
Publications
Publications are listed in reverse chronological order by acceptance date. * denotes co-first authors.
2025
Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers. PDF
Wenhan Ma, Hailin Zhang, Liang Zhao, Yifan Song, Yudong Wang, Zhifang Sui, Fuli Luo.
Preprint.
SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling. PDF
Xiaodong Ji, Hailin Zhang, Fangcheng Fu, Bin Cui.
Preprint.
Efficient and Scalable Huge Embedding Model Training via Distributed Cache Management. PDF
Xupeng Miao, Hailin Zhang, Yining Shi, Xiaonan Nie, Zhi Yang, Yangyu Tao, Jie Jiang, Bin Cui.
The International Journal on Very Large Data Bases.
VLDBJ 2025, CCF-A.
PQCache: Product Quantization-based KVCache for Long Context LLM Inference. PDF
Hailin Zhang, Xiaodong Ji, Yilin Chen, Fangcheng Fu, Xupeng Miao, Xiaonan Nie, Weipeng Chen, Bin Cui.
ACM SIGMOD International Conference on Management of Data.
SIGMOD 2025, CCF-A.
Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization. PDF
Haoyang Li, Fangcheng Fu, Hao Ge, Sheng Lin, Xuanyu Wang, Jiawen Niu, Yujie Wang, Hailin Zhang, Xiaonan Nie, Bin Cui.
ACM SIGMOD International Conference on Management of Data.
SIGMOD 2025, CCF-A.
CAFE+: Towards Compact, Adaptive, and Fast Embedding for Large-scale Online Recommendation Models. PDF
Zirui Liu*, Hailin Zhang*, Boxuan Chen*, Zihan Jiang, Yikai Zhao, Yangyu Tao, Tong Yang, Bin Cui.
Transactions on Information Systems.
TOIS 2025, CCF-A.
MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training. PDF
Pinxue Zhao, Hailin Zhang, Fangcheng Fu, Xiaonan Nie, Qibin Liu, Fang Yang, Yuanbo Peng, Dian Jiao, Shuaipeng Li, Jinbao Xue, Yangyu Tao, Bin Cui.
ACM SIGMOD International Conference on Management of Data.
SIGMOD 2025, CCF-A.
2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey. PDF
Penghao Zhao*, Hailin Zhang*, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, Jie Jiang, Bin Cui.
Preprint.
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling. PDF
Shuaipeng Li*, Penghao Zhao*, Hailin Zhang*, Xingwu Sun, Hao Wu, Dian Jiao, Weiyan Wang, Chengjun Liu, Zheng Fang, Jinbao Xue, Yangyu Tao, Bin Cui, Di Wang.
Conference on Neural Information Processing Systems.
NeurIPS 2024, CCF-A.
Enabling Parallelism Hot Switching for Efficient Training of Large Language Models. PDF
Hao Ge, Fangcheng Fu, Haoyang Li, Xuanyu Wang, Sheng Lin, Yujie Wang, Xiaonan Nie, Hailin Zhang, Xupeng Miao, Bin Cui.
Symposium on Operating Systems Principles.
SOSP 2024, CCF-A.
A Unified Framework for Mining Batch and Periodic Batch in Data Streams. PDF
Zirui Liu, Xiangyuan Wang, Yuhan Wu, Tong Yang, Kaicheng Yang, Hailin Zhang, Yaofeng Tu, Bin Cui.
IEEE Transactions on Knowledge and Data Engineering.
TKDE 2024, CCF-A.
CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models. PDF
Hailin Zhang*, Zirui Liu*, Boxuan Chen, Yikai Zhao, Tong Zhao, Tong Yang, Bin Cui.
ACM SIGMOD International Conference on Management of Data.
SIGMOD 2024, CCF-A, Honorable Mention for Best Artifact!
Experimental Analysis of Large-scale Learnable Vector Storage Compression. PDF
Hailin Zhang, Penghao Zhao, Xupeng Miao, Yingxia Shao, Zirui Liu, Tong Yang, Bin Cui.
International Conference on Very Large Data Bases.
VLDB 2024, CCF-A.
2023
Model-enhanced Vector Index. PDF
Hailin Zhang, Yujing Wang, Qi Chen, Ruiheng Chang, Ting Zhang, Ziming Miao, Yingyan Hou, Yang Ding, Xupeng Miao, Haonan Wang, Bochen Pang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Xing Xie, Mao Yang, Bin Cui.
Conference on Neural Information Processing Systems.
NeurIPS 2023, CCF-A.
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism. PDF
Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui.
International Conference on Very Large Data Bases.
VLDB 2023, CCF-A.
2022
Hetu: A Highly Efficient Automatic Parallel Distributed Deep Learning System. PDF
Xupeng Miao, Xiaonan Nie, Hailin Zhang, Tong Zhao, Bin Cui.
Science China Information Sciences.
SCIS 2022, CCF-A.
HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training. PDF
Xupeng Miao, Yining Shi, Hailin Zhang, Xin Zhang, Xiaonan Nie, Zhi Yang, Bin Cui.
ACM SIGMOD International Conference on Management of Data.
SIGMOD 2022, CCF-A.
HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework. PDF
Xupeng Miao*, Hailin Zhang*, Yining Shi, Xiaonan Nie, Zhi Yang, Yangyu Tao, Bin Cui.
International Conference on Very Large Data Bases.
VLDB 2022, CCF-A, Best Scalable Data Science Paper!
Systems
- Hetu
- 2021 Synced Machine Intelligence TOP-10 Open Source Awards.
- Pop SOTA! List for AI Developers 2021.
- Outstanding Award & Champion of 2021 CCF BDCI Contest.
