Research Overview
Currently I'm interested in the following areas from different perspectives:
- Machine intelligence
  - Multimodal (audio, speech, vision, etc.) perception, reasoning
  - Language models, agents, collaboration and orchestration, alignment
  - Causality, first-principles thinking
  - World models, autonomous systems
  - Universal representation learning
- Computational cognitive science
  - Human-like learning, thinking
  - Social cognition, theory of mind in agents
  - Rapid generalization, data efficiency
Selected Publications
Speech World Model: Causal State-Action Planning with Explicit Reasoning for Speech
Xuanru Zhou, Jiachen Lian, Henry Hong, Xinyi Yang, Gopala Anumanchipalli
Preprint.
Unlocking Strong Supervision: A Data-Centric Study of General-Purpose Audio Pre-Training Methods
Xuanru Zhou, Yiwen Shao, Wei-Cheng Tseng, Dong Yu
Preprint. A unified, general-purpose audio pre-training framework capable of bridging speech, music, and environmental sounds.
Towards Accurate Phonetic Error Detection through Phoneme Similarity Modeling
Xuanru Zhou, Jiachen Lian, Cheol Jun Cho, Tejas Prabhune, Shuhe Li, William Li, Rodrigo Ortiz, Zoe Ezzes, Jet Vonk, Brittany Morin, Rian Bogley, Lisa Wauters, Zachary Miller, Maria Gorno-Tempini, Gopala Anumanchipalli
2025 Interspeech. A phonetic error detection system for pronunciation evaluation and articulatory feedback.
[Project Page] (Oral Presentation)
Automatic Detection of Articulatory-Based Disfluencies in Primary Progressive Aphasia
Jiachen Lian, Xuanru Zhou, Zoe Ezzes, Jet Vonk, Brittany Morin, David Baquirin, Zachary Miller, Maria Luisa Gorno Tempini, and Gopala Krishna Anumanchipalli
2025 JSTSP. An efficient AI Agent for Language Screening and Spoken Language Learning.
[Project Page]
SSDM: Scalable Speech Dysfluency Modeling
Jiachen Lian, Xuanru Zhou, Zoe Ezzes, Jet Vonk, Brittany Morin, David Baquirin, Zachary Miller, Maria Luisa Gorno Tempini, and Gopala Krishna Anumanchipalli
2024 NeurIPS. An AI Agent for Speech Therapy and Spoken Language Learning. A foundation model for scientific research, engineering deployment and business development.
[Project Page] (NeurIPS Scholar Award)
Time and Tokens: Benchmarking End-to-End Speech Dysfluency Detection
Xuanru Zhou, Jiachen Lian, Cheol Jun Cho, Zoe Ezzes, Jet M.J. Vonk, Brittany T. Morin, David Paul Galang Baquirin, Zachary A. Miller, Maria Luisa Gorno-Tempini, Gopala Anumanchipalli
Preprint. An open-source benchmark for dysfluency modeling.
[Project Page]
[Code]
SoK: Dataset Copyright Auditing in Machine Learning Systems
Linkang Du*, Xuanru Zhou*, Min Chen, Chusong Zhang, Zhou Su, Peng Cheng, Jiming Chen, Zhikun Zhang
2025 IEEE S&P
Stutter-Solver: End-to-end Multi-lingual Dysfluency Detection
Xuanru Zhou, Cheol Jun Cho, Ayati Sharma, Brittany Morin, David Baquirin, Jet Vonk, Zoe Ezzes, Zachary Miller, Maria Luisa Gorno Tempini, Jiachen Lian, and Gopala Krishna Anumanchipalli
2024 SLT. Multi-lingual Co-Dysfluency Detector with Articulatory Simulation
[Code] (Student Grant Award)
YOLO-Stutter: End-to-End Region-Wise Speech Dysfluency Detection
Xuanru Zhou, Anshul Kashyap, Steve Li, Ayati Sharma, Brittany Morin, David Baquirin, Jet Vonk, Zoe Ezzes, Zachary Miller, Maria Luisa Gorno Tempini, Jiachen Lian, and Gopala Krishna Anumanchipalli
2024 Interspeech. Dysfluency Modeling as Object Detection.
[Code] (ISCA Student Grant Award)
Selected Awards
2024 NeurIPS Scholar Award
Outstanding Undergraduate Graduate of Zhejiang University
2024 ISCA Student Travel Award
2024 IEEE SLT Student Travel Grant
Zhejiang Provincial Government Scholarship
Trivia
My favorite artist is Lady Gaga; go support her new album MAYHEM! I'm also a fan of K-pop music.
Before I went to middle school, I was a member of the city swimming team.
I enjoy playing with dogs and cats.