Xiaoyu Shen (沈晓宇)

Assistant Professor

xyshen@eitech.edu.cn

Background:

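Xiaoyu Shen received his bachelor's degree from the Software Institute of Nanjing University in 2015 and then earned his Ph.D. at the Max Planck Institute for Informatics in Germany, advised by Gerhard Weikum, a member of Academia Europaea and a pioneer of knowledge graphs, and by language modeling expert Dietrich Klakow. His research covers efficient inference for large models, dynamic sparsification, multimodal alignment, instruction following in complex retrieval scenarios, and question answering applications.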


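In September 2020 he joined the Amazon Alexa AI research and development center in Berlin as a Machine Learning Scientist, where he led the Alexa customer-service question answering project, which serves 400 million users. In November 2023 he joined the Eastern Institute of Technology, Ningbo (tentative name) as an assistant professor, associate researcher, and doctoral supervisor, co-supervising Ph.D. students with Shanghai Jiao Tong University, the University of Science and Technology of China, and The Hong Kong Polytechnic University.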


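He has published more than 60 papers at top artificial intelligence and natural language processing conferences such as ACL, ICML, and ICLR, with more than 3,700 citations. His honors include the COLING Best Demo Paper Award, the ACL Special Theme Paper Award, and a CIS-RAM Best Paper Finalist nomination. He has given invited talks at the University of Tokyo, the University of Cambridge, Google Brain, and elsewhere. He is a committee member for several top conferences and journals, including ACL, EMNLP, AISTATS, NeurIPS, and TOIS, an area chair for ACL 2023 and ACL 2025, and a topic editor for "High-Performance Computing for AI in Big Model Era".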


Research Areas:

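Streaming Multimodal Large Models
This direction studies how to build multimodal large models that continuously process asynchronous, dynamic inputs such as text, speech, image, and video streams. It focuses on temporal consistency in information fusion, low-latency inference, and resource scheduling in real-time environments, with the goal of bringing multimodal models into interactive systems and continuous-perception scenarios.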


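Instruction-Following Retrieval Systems
This direction explores how retrieval systems can go beyond understanding query content and accurately follow the conditions, preferences, and constraints expressed in complex natural language instructions, such as requiring a given format, filtering out specific information, or controlling the ordering and completeness of returned results. The aim is to make retrievers more flexible, more controllable, and better at completing tasks.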


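Dynamic Parameter Compression for LLMs
This direction studies how to selectively activate model parameters during inference, adapting to the complexity of each input sample or to the dynamics of the inference process. The resulting sparse computation and dynamic pruning at inference time aim to improve speed and energy efficiency while preserving the accuracy and generality of large models.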


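Optimization of Chain-of-Thought Reasoning
This direction systematically studies the chain-of-thought process that large models use on complex reasoning tasks, including structural modeling of reasoning chains, control of reasoning granularity, and dynamic selection of reasoning paths. The aim is to improve accuracy, interpretability, and efficiency, supporting the practical use of large models in complex decision-making and reasoning-intensive tasks.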


Education:

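2015-2021: Ph.D. (major in natural language processing), Department of Computer Science, Saarland University, and Max Planck Institute for Informatics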

2011-2015: B.Sc. (major in software engineering), Department of Software Engineering, Nanjing University


Work Experience:

2020-2023: Amazon Alexa AI, Machine Learning Scientist


Academic Experience:

2018/5-2018/9: Artificial Intelligence Research Center, RIKEN, Visiting Scholar

2016/9-2017/1: Department of Computer Science, University of Tokyo / National Institute of Informatics, Visiting Scholar


Academic Service (selected):

Topic editor for "High-Performance Computing for AI in Big Model Era"

Area chair for the question answering track of ACL 2023


Awards and Honors:

  • CIS-RAM 2024 Best Paper Finalist

  • ACL 2023 Special Theme Paper Award

  • Chinese Government Award for Outstanding Self-Financed Students Abroad, 2020

  • COLING 2020 Best Demo Paper Award

  • Max Planck Society Doctoral Scholarship

  • Nanjing University Outstanding Graduation Project

  • Nanjing University Outstanding Graduate

  • National Scholarship


Representative Publications:

Overview

More than 60 papers at top artificial intelligence conferences


Publication Information and Citation Data

Google Scholar:

http://scholar.google.com/citations?hl=en&user=BWfPrE4AAAAJ


10 representative publications (* denotes corresponding author)

  1. Shen, Xiaoyu*, Akari Asai, Bill Byrne, and Adria De Gispert. "xPQA: Cross-Lingual Product Question Answering in 12 Languages." In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pp. 103-115. 2023.

  2. Zhu, Dawei, Xiaoyu Shen*, Marius Mosbach, Andreas Stephan, and Dietrich Klakow. "Weaker Than You Think: A Critical Look at Weakly Supervised Learning." In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 14229-14253. 2023.

  3. Tang, Ze, Xiaoyu Shen*, Chuanyi Li, Jidong Ge, Liguo Huang, Zhelin Zhu, and Bin Luo. "AST-trans: Code summarization with efficient tree-structured attention." In Proceedings of the 44th International Conference on Software Engineering, pp. 150-162. 2022.

  4. Su, Hui, Weiwei Shi, Xiaoyu Shen*, Zhou Xiao, Tuo Ji, Jiarui Fang, and Jie Zhou. "RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining." In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 921-931. 2022.

  5. Chang, Ernie, Xiaoyu Shen*, Hui-Syuan Yeh, and Vera Demberg. "On Training Instance Selection for Few-Shot Neural Text Generation." In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp. 8-13. 2021.

  6. Su, Hui, Xiaoyu Shen*, Zhou Xiao, Zheng Zhang, Ernie Chang, Cheng Zhang, Cheng Niu, and Jie Zhou. "MovieChats: Chat like Humans in a Closed Domain." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6605-6619. 2020.

  7. Shen, Xiaoyu*, Ernie Chang, Hui Su, Cheng Niu, and Dietrich Klakow. "Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7155-7165. 2020.

  8. Shen, Xiaoyu*, Yang Zhao, Hui Su, and Dietrich Klakow. "Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator." In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3762-3773. 2019.

  9. Shen, Xiaoyu*, Jun Suzuki, Kentaro Inui, Hui Su, Dietrich Klakow, and Satoshi Sekine. "Select and Attend: Towards Controllable Content Selection in Text Generation." In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 579-590. 2019.

  10. Shen, Xiaoyu*, Hui Su, Wenjie Li, and Dietrich Klakow. "Nexus network: Connecting the preceding and the following in dialogue generation." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4316-4327. 2018.