Shiyu Huang   黄世宇

Researcher, Zhipu AI

No.1 Zhongguancun East Road, Haidian District
Beijing, China, 100084.
Email: huangsy1314@163.com

[OpenRL]     [知乎]     [Google Scholar]     [TARTRL]     [GitHub]     [Linkedin]     [CV]


Biography

I am a researcher at Zhipu AI. Before that, I was a research scientist at 4Paradigm Inc. and the leader of OpenRL Lab. I received my B.E. and Ph.D. degrees (co-advised by Prof. Jun Zhu and Prof. Ting Chen) from the Department of Computer Science and Technology, Tsinghua University, in July 2017 and June 2022, respectively. My research focuses on deep reinforcement learning, multi-agent reinforcement learning, distributed reinforcement learning, RL for robotics, LLMs as agents, artificial general intelligence (AGI), and generative artificial intelligence (GAI). I have also spent time working at RealAI Inc., Huawei Noah's Ark Lab, Tencent AI Lab, Carnegie Mellon University, and Sensetime Inc. I am also the founder of the OpenRL Lab and the TARTRL group.

We are looking for self-motivated interns and full-timers who have a strong background in mathematics/computer science and are eager to get involved in cutting-edge, fundamental AI research. Please feel free to drop me an email if you are interested in collaborating with me.

Publications & Preprints

(* equal contribution)

Highlight
CogVLM2: Visual Language Models for Image and Video Understanding
Wenyi Hong, Weihan Wang, Ming Ding, Wenmeng Yu, Qingsong Lv, Yan Wang, Yean Cheng, Shiyu Huang, Junhui Ji, Zhao Xue, Lei Zhao, Zhuoyi Yang, Xiaotao Gu, Xiaohan Zhang, Guanyu Feng, Da Yin, Zihan Wang, Ji Qi, Xixuan Song, Peng Zhang, Debing Liu, Bin Xu, Juanzi Li, Yuxiao Dong, Jie Tang
arXiv:2408.16500, 2024
[PDF] [Code] [Huggingface] [BibTeX]
@article{hong2024cogvlm2,
  title={CogVLM2: Visual Language Models for Image and Video Understanding},
  author={Hong, Wenyi and Wang, Weihan and Ding, Ming and Yu, Wenmeng and Lv, Qingsong and Wang, Yan and Cheng, Yean and Huang, Shiyu and Ji, Junhui and Xue, Zhao and others},
  journal={arXiv preprint arXiv:2408.16500},
  year={2024}
}
Highlight
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang*, Jiayan Teng*, Wendi Zheng, Ming Ding, Shiyu Huang, Jiazheng Xu, Yuanming Yang, Xiaohan Zhang, Xiaotao Gu, Guanyu Feng, Da Yin, Wenyi Hong, Weihan Wang, Yean Cheng, Yuxuan Zhang, Ting Liu, Bin Xu, Yuxiao Dong, Jie Tang
arXiv:2408.06072, 2024
[PDF] [Code] [Huggingface] [BibTeX]
@article{yang2024cogvideox,
  title={CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer},
  author={Yang, Zhuoyi and Teng, Jiayan and Zheng, Wendi and Ding, Ming and Huang, Shiyu and Xu, Jiazheng and Yang, Yuanming and Hong, Wenyi and Zhang, Xiaohan and Feng, Guanyu and others},
  journal={arXiv preprint arXiv:2408.06072},
  year={2024}
}
Highlight
LVBench: An Extreme Long Video Understanding Benchmark
Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, Ji Qi, Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang
arXiv:2406.08035, 2024
[PDF] [Project] [Code] [Huggingface] [BibTeX]
@misc{wang2024lvbench,
  title={LVBench: An Extreme Long Video Understanding Benchmark},
  author={Weihan Wang and Zehai He and Wenyi Hong and Yean Cheng and Xiaohan Zhang and Ji Qi and Shiyu Huang and Bin Xu and Yuxiao Dong and Ming Ding and Jie Tang},
  year={2024},
  eprint={2406.08035},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Talks

Projects

Patents

Honors & Awards

Competitions

Services

Organizer for:
NeurIPS 2023 Workshop on New in ML

Reviewer for:
AAAI 2025, NeurIPS 2024, ICML 2024, ICLR 2024, AAAI 2024, NeurIPS 2023, AISTATS 2023, AAAI 2023, ICLR 2023, NeurIPS 2022, ICML 2022, AISTATS 2022, AAAI 2022, ICLR 2022, NeurIPS 2021, ICML 2021, AAAI 2021, NeurIPS 2020

Teaching

2020 Spring, TA in Big Data and Machine Intelligence, instructed by Zhen Chen
2019 Fall, TA in Big Data and Machine Intelligence, instructed by Zhen Chen
2019 Spring, TA in Machine Learning, instructed by Prof. Jun Zhu
