
Yizhe Xiong (熊翊哲)

I am currently a Ph.D. candidate in the Multimedia Intelligence Group, School of Software, Tsinghua University. Before that, I received my bachelor's degree from the Department of Computer Science and Technology, Tsinghua University. I am advised by Prof. Guiguang Ding.

Email  /  Google Scholar  /  Linkedin  /  CV

News

  • 08/2025, Three papers were accepted by EMNLP 2025. Temporal Scaling Law was selected for an oral presentation.
  • 02/2025, Check out our latest work on post-training LLMs to reduce inference memory and latency costs.
  • 01/2025, Two papers were accepted to the main conference of NAACL 2025.
  • 12/2024, Our survey on the Next Token Prediction paradigm is released! Please visit the GitHub repo and arXiv page.
  • 12/2024, One paper was accepted by ICASSP 2025.
  • 12/2024, Scaffold-BPE was accepted by AAAI 2025.
  • 12/2024, One paper was accepted by COLING 2025.
  • 07/2024, Our paper on PEFT was accepted by ECCV 2024.
  • 04/2024, We have made some novel explorations in LLM pre-training. Check out Temporal Scaling Law and Scaffold-BPE on arXiv.
  • 03/2024, Check out our latest work on fine-tuning & task adaptation.
  • 07/2023, Our paper on domain adaptation was accepted by ICCV 2023.
Research

    I am interested in efficiently adapting foundation models to diverse downstream tasks through transfer learning. My research focuses on developing methods that enable large-scale models to generalize effectively across domains, even with limited data or computational resources, with the goal of making advanced AI models more accessible and applicable to real-world applications. For a full list of my publications, please visit my Google Scholar page.

    Selected Conference Articles:

    1. Temporal Scaling Law for Large Language Models
      Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Wei Huang, Jianwei Niu, Jungong Han, Guiguang Ding
      Keywords: Large Language Model (LLM), Language Modeling, Scaling Law
      EMNLP 2025 (Oral Presentation) | paper

    2. PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
      Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding
      Keywords: Transfer Learning, Parameter-Efficient Fine-Tuning (PEFT), Task Adaptation, Model Pruning, Token Pruning
      ECCV 2024 | paper  GitHub repo

    3. Confidence-based Visual Dispersal for Few-shot Unsupervised Domain Adaptation
      Yizhe Xiong, Hui Chen, Zijia Lin, Sicheng Zhao, Guiguang Ding
      Keywords: Domain Adaptation, Transfer Learning, Few-Shot (Low Shot), Few-Shot Unsupervised Domain Adaptation (FUDA)
      ICCV 2023 | paper  GitHub repo

    4. Fast Quiet-STaR: Thinking Without Thought Tokens
      Wei Huang*, Yizhe Xiong*, Xin Ye*, Zhijie Deng, Hui Chen, Zijia Lin, Guiguang Ding (* denotes equal contribution)
      Keywords: Large Language Model (LLM), Language Modeling, Reasoning, Reinforcement Learning
      EMNLP 2025 Findings | paper

    5. Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models
      Haoran Lian*, Junmin Chen*, Wei Huang*, Yizhe Xiong*, Wenping Hu*, Guiguang Ding, Hui Chen, Jianwei Niu, Zijia Lin, Fuzheng Zhang, Di Zhang (* denotes equal contribution)
      Keywords: Large Language Model (LLM), Language Modeling, Long Context Extrapolation
      COLING 2025 | paper

    6. LBPE: Long-token-first Tokenization to Improve Large Language Models
      Haoran Lian, Yizhe Xiong, Zijia Lin, Jianwei Niu, Shasha Mo, Hui Chen, Peng Liu, Guiguang Ding
      Keywords: Large Language Model (LLM), Language Modeling, Machine Translation, Byte-Pair Encoding (BPE)
      ICASSP 2025 | paper

    7. Scaffold-BPE: Enhancing Byte Pair Encoding for Large Language Models with Simple and Effective Scaffold Token Removal
      Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Peng Liu, Hui Chen, Guiguang Ding
      Keywords: Large Language Model (LLM), Language Modeling, Machine Translation, Byte-Pair Encoding (BPE)
      AAAI 2025 | paper

    Journal Articles:

    1. Cross-Modality Prompts: Few-shot Multi-label Recognition with Single-label Training
      Zixuan Ding, Zihan Zhou, Hui Chen, Tianxiang Hao, Yizhe Xiong, Sicheng Zhao, Qiang Zhang, Jungong Han
      Keywords: Multi-label Recognition, Transfer Learning, Few-Shot (Low Shot)
      IEEE Transactions on Multimedia | paper

    Other Selected Works:

    1. UniAttn: Reducing Inference Costs via Softmax Unification for Post-Training LLMs
      Yizhe Xiong, Wei Huang, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Jungong Han, Guiguang Ding
      Keywords: Large Language Model (LLM), Language Modeling, Model Compression, Post-Training, Supervised Fine-Tuning (SFT)
      Under review | paper

    2. Neutralizing Token Aggregation via Information Augmentation for Efficient Test-Time Adaptation
      Yizhe Xiong, Zihan Zhou, Yiwen Liang, Hui Chen, Zijia Lin, Tianxiang Hao, Fan Zhang, Jungong Han, Guiguang Ding
      Keywords: Test-Time Training, Vision Transformers, Model Compression
      Under review | paper

    Academic Services

  • Served as a conference reviewer for NeurIPS 2024 & 2025, IJCAI 2024, ACL Rolling Review 2024 & 2025, ICLR 2024, ICML 2025, ICCV 2025, ACM MM 2025, and AAAI 2026.

  • Served as a journal reviewer for IEEE Transactions on Image Processing and IEEE Transactions on Multimedia.

Awards

  • 2024, Academic Scholarship, School of Software, Tsinghua University.

  • 2024, First Place and Gold Prize, VISION'24 Data Challenge, ECCV 2024.

  • 2023, Academic Scholarship, School of Software, Tsinghua University.

  • 2022, Outstanding Graduate, Department of Computer Science and Technology, Tsinghua University.

Many thanks to Dr. Yunhe Wang, who shared the source code of his homepage.