Huanyu Zhang 张桓瑜

Ph.D. Student

University of Chinese Academy of Sciences
Institute of Automation, Chinese Academy of Sciences
Research Intern @ Microsoft Research Asia

About Me

I am a third-year Ph.D. student in artificial intelligence at the State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA), advised by Prof. Tieniu Tan. Before starting my Ph.D., I was an undergraduate student in Automation at Xi’an Jiaotong University.

I’m deeply passionate about advancing multimodal AI, general AI, and large foundation models, particularly through the integration of vision and text. My current research focuses on multimodal reasoning and on developing new methods to unlock the surprising potential of large models. I have also previously worked on time series analysis. I warmly welcome academic collaborations and discussions, so feel free to reach out!

Interests
  • Multimodal Reasoning
  • Large Multimodal Models
  • Time Series Analysis
Education
  • Ph.D. Student

    CASIA

  • BSc in Automation

    Xi'an Jiaotong University

Recent Publications
(2025). Imagine while Reasoning in Space: Multimodal Visualization-of-Thought. arXiv preprint arXiv:2501.07542.
(2025). MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? The Thirteenth International Conference on Learning Representations.
(2024). TimeRAF: Retrieval-Augmented Foundation model for Zero-shot Time Series Forecasting. arXiv preprint arXiv:2412.20810.
(2024). LogoRA: Local-Global Representation Alignment for Robust Time Series Classification. IEEE Transactions on Knowledge and Data Engineering (TKDE).