Profile
Overview
I am a Master's student in Cyberspace Security at USTC, advised by Prof. Qi Chu and Prof. Nenghai Yu.
My research focuses on LLM evaluation, safety, alignment, and realistic benchmark design, with recent work on dialogue, search, reasoning, and multimodal systems. From Nov. 2024 to Dec. 2025, I was an algorithm intern at ByteDance Seed, where I worked on evaluation-centric systems for large-model applications.
- Position: M.Eng. student at USTC; former algorithm intern at ByteDance Seed (2024.11–2025.12).
- Research: LLM evaluation, safety, alignment, reasoning benchmarks, and AI security.
- Approach: Build realistic testbeds and evaluation frameworks that reveal hidden model weaknesses.
Highlights
News
-
2026-02
WorldTravel introduced a realistic multimodal travel-planning benchmark spanning 150 real-world scenarios and 2,000+ rendered webpages, revealing a sharp drop in feasibility from text-only to multimodal settings.
arXiv Paper
-
2025-11
DiscoX introduced an expert-domain discourse-level translation benchmark focused on document coherence, terminology consistency, and cross-sentence faithfulness.
arXiv Paper
-
2025-11
MME-CC introduced a multimodal benchmark for cognitive-capacity evaluation that stresses reasoning-intensive visual-language understanding rather than shallow perception.
arXiv Paper
-
2025-09
-
2025-09
-
2025-02
-
2025-01
Hello Again! was accepted to NAACL 2025 for its study of long-term personalized dialogue with memory retrieval and dynamic persona modeling across sessions.
Accepted Paper
-
2024-06
The preprint version of Hello Again! introduced a model-agnostic personalized dialogue agent for long-term memory and persona-aware interaction.
-
2023-12
Joined LDS Lab at USTC and began research on LLM evaluation, dialogue systems, and safety-oriented questions.
Research
Publications
-
2026 ICLR
-
2026 ICLR
-
2025 NAACL
Lead authorship. Hello Again! LLM-powered Personalized Agent for Long-term Dialogue
-
2025 EMNLP
Preprints & Manuscripts
-
2026 arXiv
Recent preprint
Lead authorship. WorldTravel: A Realistic Multimodal Travel-Planning Benchmark with Tightly Coupled Constraints
-
2026 Manuscript
Lead authorship. Agent4Weakness: An Agentic Framework for In-Depth Model Weakness Discovery
-
2026 Manuscript
Lead authorship. When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors
-
2025 arXiv
-
2025 arXiv
Background
Experience
-
2025–present
M.Eng., Cyberspace Security
University of Science and Technology of China
Research on LLM safety, evaluation, alignment, and benchmark design.
-
2024.11–2025.12
Algorithm Intern
ByteDance Seed, Seed-Evaluation Team
Developed realistic evaluation pipelines and benchmark suites for large-model applications.
-
2023-2024
Research Intern
NExT++ Lab, National University of Singapore
Conducted research on LLM-based dialogue agents for long-term personalization and memory-aware response generation.
-
2021-2025
B.Eng., Information Security
University of Science and Technology of China
Coursework and early research in information security, machine learning, and AI security.
Honors
Awards
- Outstanding Graduate Award, USTC, 2025
- Outstanding Student Bronze Award, USTC, 2024
- Wang Xiaomo Talent Program Scholarship (×4), USTC, 2021–2024