Yunxiang Mo 莫云翔

I am an undergraduate at the Hong Kong University of Science and Technology (HKUST), pursuing a double major in Computer Science and Mathematics with an Extended Major in Artificial Intelligence (CGA: 4.1 / 4.3). I am fortunate to be advised by Prof. Yangqiu Song and Dr. Tianshi Zheng at the HKUST KnowComp Group.

My research interests center on natural language processing, with a focus on the reasoning and evaluation of large language models and vision-language models. I am especially interested in abductive and multimodal reasoning — how models form, defend, and revise hypotheses under ambiguity.

I am currently looking for a research-exchange position in the U.S. for the upcoming term. If you are a faculty member working on related topics and have an opening, I would be glad to chat — please feel free to reach out via email.

🔥 News

2026.03: 🎉 An extended version of DixitWorld was accepted to ACL 2026 (AC meta-review 9/10). [OpenReview]
2026.01: 🎉 ScaleCUA was accepted to ICLR 2026 as an Oral. [arXiv]
2025.10: 🎉 DixitWorld received a Spotlight at the BlackboxNLP Workshop at EMNLP 2025. [arXiv]

💼 Experience

2026.04 – present | Research Intern, Stanford HAI, Stanford, CA, USA

Mentored by Dr. Fang Wu in the groups of Prof. Yejin Choi and Prof. Jure Leskovec.

2025.04 – present | Undergraduate Researcher, HKUST KnowComp Group, Hong Kong SAR, China

Mentored by Dr. Tianshi Zheng in the group of Prof. Yangqiu Song.

2025.06 – 2025.08 | Machine Learning Engineer Intern, Beijing Ingenic Semiconductor Co., Ltd., Beijing, China

Developed and optimized ML models for embedded and on-chip AI scenarios; built training, evaluation, and inference pipelines in PyTorch; deployed models to edge devices under tight latency and memory constraints.

2025.01 | Intern, Benchmark Architectural Design Co., Ltd.

Developed front-end modules with the MFC framework for an internal mini-program project; handled UI design, event handling, and system debugging in a small team.

📚 Publications

Robust Decision-Making for LLM Agents in Multi-Turn Reasoning

Under review

LLM agents in multi-turn reasoning frequently collapse into self-locking loops, where approximate belief tracking causes them to revisit the same hypotheses without making epistemic progress. We formalize the structural conditions under which such loops arise and show that the failure mode persists across frontier models even when standard information-seeking objectives are applied. To address it, we propose a training-free, distributionally robust information-gain objective that explicitly hedges against belief-tracking error and restores exploratory progress without any fine-tuning. We evaluate the method on multi-turn reasoning, planning, and decision-making benchmarks across both open- and closed-source LLM agents.

A Multi-Domain LLM Benchmark for Scientific Hypothesis Generation

Under review

Scientific hypothesis generation is an open-ended, multi-step task that current LLM benchmarks evaluate poorly: free-text outputs are scored inconsistently, and most setups exclude the literature-grounded reasoning that real scientists rely on. We construct a benchmark spanning multiple scientific disciplines, paired with an anchored five-dimension rubric whose criteria include coherence, factual consistency, and the presence of boilerplate or hedging language. The benchmark supports two evaluation modes, direct prompting and an agentic mode that allows tool-augmented literature search, making it possible to tell whether a performance gain comes from the underlying model or from the surrounding agent scaffold.

DixitWorld: Evaluating Multimodal Abductive Reasoning in Vision-Language Models with Multi-Agent Dixit Gameplay

Yunxiang Mo, Tianshi Zheng, Qing Zong, Jiayu Liu, Baixuan Xu, Yauwai Yim, Chunkit Chan, Jiaxin Bai, Yangqiu Song.

ACL 2026 AC 9/10

Extended version of the workshop paper below — adds a Medium difficulty tier (252 vs. 168 QA items), a 72B-parameter scaling ablation, and calibration/sensitivity analyses.

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Zhaoyang Liu, Jingjing Xie, Zichen Ding, Zehao Li, Bowen Yang, Zhenyu Wu, Xuehui Wang, Qiushi Sun, Shi Liu, Weiyun Wang, Shenglong Ye, Qingyun Li, Xuan Dong, Yue Yu, Chenyu Lu, Yunxiang Mo, Yao Yan, Zeyue Tian, Xiao Zhang, Yuan Huang, Yiqian Liu, Weijie Su, Gen Luo, Xiangyu Yue, Biqing Qi, Kai Chen, Bowen Zhou, Yu Qiao, Qifeng Chen, Wenhai Wang.

ICLR 2026 Oral

My contribution: data pipeline and cross-platform workflow components in the open-source codebase.

DixitWorld: Evaluating Multimodal Abductive Reasoning in Vision-Language Models with Multi-Agent Dixit Gameplay

Yunxiang Mo, Tianshi Zheng, Qing Zong, Jiayu Liu, Baixuan Xu, Yauwai Yim, Chunkit Chan, Jiaxin Bai, Yangqiu Song.

EMNLP 2025 Workshop Spotlight

Original workshop version; extended version accepted to ACL 2026 Main (above).

🛠️ Projects

Facere — AI-Native Hardware Design Agent

A CLI for end-to-end hardware design — schematics, PCB layouts, and physical simulation.

With Yiping Zhao, Jiahao Zhang, and Zichen Lin. Independently developed.

A terminal-native agent that drafts and edits schematics and PCB layouts alongside KiCad 9. We built a hardware-aware MCP server backed by a curated 153-motif schematic corpus, paired with a sister PCB physical-simulation package, so the agent can plan, edit, and verify designs end-to-end. Distributed as a single-command bootstrap installer.

PastPaper Master — AI Past-Exam Tutor

A web app that helps HKUST students practice past exam papers.

With Yiping Zhao and Jiahao Zhang. Independently developed.

A full-stack AI tutoring tool for HKUST students practicing with past exam papers. GPT-4o auto-segments and tags every question in an uploaded PDF; Qwen-plus generates a per-question knowledge primer, a scaffolded hint, and a step-by-step solution. The workbench features side-by-side PDF↔question navigation, OCR-based grading of photographed handwritten answers, automatic variant-problem generation, and an error book with spaced review.

🏆 Honors and Awards

University’s Scholarship Scheme for Continuing Undergraduate Students, HKUST, 2024. Top 1% of continuing undergraduates.
S.S. Chern Class, HKUST. Honor for top academic performance across all mathematics coursework.
Dean’s List Honor, HKUST, 2024 & 2025. GPA above 3.7.

🤝 Academic Services

(Coming soon.)

🎓 Teaching

Teaching Assistant, Discrete Mathematics — HKUST.
Teaching Assistant, Exploring Artificial Intelligence — HKUST.