Hi~ I’m Xinyi, a Ph.D. candidate at University College London (UCL), fortunate to be supervised by Prof. Jing-Hao Xue. Before UCL, I received my master’s degree with honors from Xiamen University (XMU) under the wonderful guidance of Prof. Yang Lu.
I’m currently a visiting student at MBZUAI, fortunate to be supervised by Prof. Zhiqiang Shen, and I’ve truly been enjoying my time in Abu Dhabi. I’ve also been lucky to spend time as a research intern at Westlake University with Prof. Tao Lin, and at Wuhan University with Prof. Mang Ye. I’m especially grateful to my long-term collaborators, Peng Sun at Westlake University & Zhejiang University, and Jingyu Lin at Monash University, whose friendship and shared curiosity make research truly joyful.
Last but not least, a heartfelt thank-you to my three ragdoll cats, 🍉 (xixi), 🧃 (zhizhi), and 🥥 (yeye), for their unwavering company throughout my Ph.D. journey. 🐾
You can reach me at xinyi.shang.23@ucl.ac.uk, and find my publications on Google Scholar.
🔥 News
- 2026.04: 📄 We release a technical report for A Systematic and Comprehensive Analysis of Claude Code. Paper at: [Link]
- 2026.04: 🎉 One paper is accepted by ACL 2026 (main).
- 2026.02: 🎉 Two papers are accepted by CVPR 2026.
- 2025.09: Started as a visiting student at MBZUAI, supervised by Prof. Zhiqiang Shen.
- 2025.02: 🎉 One paper is accepted by CVPR 2025.
- 2025.01: 🎉 One paper is accepted by ICLR 2025.
- 2023.05: 🎉 Received the Outstanding Master's Thesis Award.
- 2023.03: 🎉 Received the PhD offer from University College London (UCL), supported by a full scholarship!
- 2022.09: Awarded the China National Scholarship.
- 2022.04: 🎉 One paper is accepted by IJCAI 2022.
- 2022.03: 🎉 One paper is accepted by ICME 2022 as oral.
🎯 Research Interests
My research centers on efficient and generalized deep learning. Numbers in brackets link to the corresponding entries in Publications below.
- Efficient Deep Learning. Compressing large datasets into compact, informative subsets, and identifying which data most effectively drives training efficiency [5,4,1].
- Federated Learning (FL). A privacy-preserving distributed paradigm enabling collaborative model training across devices or organizations without sharing raw data.
- Imperfect global data: addressing long-tailed class distributions [9,8] and limited labeled data [3,2] for improved robustness.
- Personalization: strengthening local-model personalization while preserving global performance [6].
- Generalization & optimization: improving global-model generalization through the lens of training dynamics [7].
📝 Publications
* denotes equal contribution; † denotes corresponding author.

[1] Dataset Distillation in the Era of Large-Scale Data: Methods, Analysis, and Future Directions
Xinyi Shang, Peng Sun, Zhiqiang Shen, Tao Lin, Jing-Hao Xue.
- We identify four significant shifts in the field of dataset distillation and provide the first comprehensive, stage-wise review through the dataset-distillation pipeline.

[2] Federated Semi-Supervised Learning with Annotation Heterogeneity
Xinyi Shang, Gang Huang, Yang Lu†, Jian Lou, Bo Han, Yiu-ming Cheung, Hanzi Wang.
- We formalize Federated Semi-Supervised Learning with annotation heterogeneity and propose a new framework with a mutual-learning strategy.

[3]
Yijie Liu, Xinyi Shang, Yiqun Zhang, Yang Lu†, Chen Gong, Jing-Hao Xue, Hanzi Wang.
- We show that (1) data heterogeneity intensifies pseudo-label mismatches, and (2) local- and global-model predictive tendencies diverge with heterogeneity. We propose a simple yet effective method to correct pseudo-labels by exploiting confidence discrepancies.

[4] GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost | [code]
Xinyi Shang*, Peng Sun*, Tao Lin†.
- Models trained on distilled datasets are highly sensitive to the soft-label loss. Building on this insight, we introduce a plug-and-play approach that efficiently leverages full label information at near-zero cost.

[5] Information Compensation: A Fix for Any-scale Dataset Distillation
Peng Sun, Bei Shi, Xinyi Shang, Tao Lin†.
- A near-lossless information-compression approach that preserves the key information of original datasets, surpassing existing methods in both efficiency and effectiveness across dataset scales.

[6] No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier | [code]
Zexi Li, Xinyi Shang, Rui He, Tao Lin†, Chao Wu†.
- A neural-collapse-inspired method that mitigates classifier biases in federated learning, achieving high global-model generalization together with strong local-model personalization.

[7] Understanding the Training Dynamics in Federated Deep Learning via Aggregation Weight Optimization | [code]
Zexi Li, Tao Lin†, Xinyi Shang, Chao Wu†.
- We analyze FL training dynamics through client coherence and global weight shrinking, and design an aggregation algorithm that measurably improves generalization.

[8] FEDIC: Federated Learning on Non-IID and Long-Tailed Data via Calibrated Distillation | [code]
Xinyi Shang, Yang Lu†, Yiu-ming Cheung, Hanzi Wang.
- A new distillation method with logit adjustment and a calibration gating network to address the joint problem of heterogeneous and long-tailed data.

[9] Federated Learning on Heterogeneous and Long-Tailed Data via Classifier Re-Training with Federated Features | [code]
Xinyi Shang, Yang Lu†, Gang Huang, Hanzi Wang.
- We first show that the biased classifier is the primary cause of the global model's poor performance, then propose CReFF, which optimizes a small set of learnable federated features for classifier re-training.
🎖 Honors and Awards
- 2023 Xiamen University Outstanding Master's Thesis Award.
- 2023 Xiamen University Outstanding Graduate.
- 2022 China National Scholarship (Top 0.2%, the highest-level scholarship established by the central government).
- 2022 Excellent Merit Student of Xiamen University (Top 2%).
- 2021 Merit Student of Xiamen University (Top 8%).
- 2020 China College Students Innovation and Entrepreneurship Competition (two provincial-level projects).
- 2019 Provincial Excellent Volunteer honor (500+ hours of volunteering).
- 2018 Star of Excellent Volunteers honor (awarded to only one student in the college each year).
💻 Research Experience
- 2022.11 - 2023.09, Research Intern, LINs Lab, Westlake University, supervised by Prof. Tao Lin. Decentralized deep learning.
- 2022.06 - 2022.09, Research Intern, MARS Lab, Wuhan University, supervised by Prof. Mang Ye. Federated learning.
🤝 Academic Service
- Conference Reviewer: NeurIPS 2025, ICCV 2025, ICLR 2025, CVPR 2025, WACV 2025, IJCAI 2024.
- Journal Reviewer: IEEE TNNLS, IEEE TCSVT, ACM CSUR, IEEE TC, IEEE TETCI.
🙌 Voluntary Activities
- 2019.03 - 2019.09, Director of Teach For China at Zhongnan University of Economics and Law.
- 2017.09 - 2019.06, Director of We-Bright, supporting 53 rural primary schools across Sichuan and Guangxi provinces.
🎨 Hobbies
- Cooking and baking; I hope to own a bakery of my own one day.
- Drawing.
- Photography and keeping journals.