Yingli ZHOU

I received my Ph.D. in March 2026 from the School of Data Science at The Chinese University of Hong Kong, Shenzhen, advised by Prof. Yixiang Fang. I am a postdoctoral researcher at LIRIS, CNRS / Université Claude Bernard Lyon 1 (Lyon, France), supported by the ERC-Advanced Go-Y Project, working with Prof. Angela Bonifati (ACM Fellow, SIGMOD Chair) on graph data management and reliable data infrastructure for AI.

I received my M.Eng. and B.Eng. from Harbin Institute of Technology in 2022 and 2020, respectively, under the supervision of Prof. Yunming Ye. I was a visiting student at the National University of Singapore in 2025, where I worked with Prof. Xiaokui Xiao.

Research Interests

My research agenda is to build causality-aware graph data management and reliable graph data infrastructure for AI. I study how graph structure, causal signals, and scalable data systems can make AI applications more trustworthy, explainable, and efficient.

Current directions include:

  • Causality-aware graph data management: causal property graphs, causal provenance queries, and scalable intervention analysis over graph data.
  • Reliable graph data infrastructure for AI: graph-based RAG, structured retrieval, graph memory, and evaluation pipelines for knowledge-intensive AI systems.
  • Scalable graph analytics: efficient algorithms for densest subgraph discovery, community search, clique counting/listing, and graph similarity.
  • Large models for data systems: LLM-powered and pretrained methods for data analysis, database optimization, and DBMS testing.

I am open to collaborations, invited talks, and visiting opportunities in data management, database systems, and graph-based AI infrastructure. Please reach out via email.

Selected Publications

2026

HAMMER: An Automatic RAG Tuning System via Hierarchical Memory-Guided Monte Carlo Tree Search

Yingli Zhou, Zixuan Wang, Yixiang Fang

SIGMOD'26, Proceedings of the ACM on Management of Data

HAMMER studies how to automatically tune retrieval-augmented generation systems through hierarchical memory and search, connecting graph-based retrieval, system optimization, and reliable AI pipelines.

A Semantics-aware Approach for Graph Edit Distance Estimation over Knowledge Graphs

Yingli Zhou, HuiZhong Wang, Chenhao Ma, Yixiang Fang

VLDB'26, Proceedings of the VLDB Endowment

This work develops semantics-aware graph similarity estimation for knowledge graphs, supporting robust graph data management when structure alone is insufficient.

Efficient Anchored Densest Subgraph Discovery: Improved Time Complexity and Practical Performance

Yingli Zhou, Youran Sun, Yixiang Fang

SIGMOD'26, Proceedings of the ACM on Management of Data

This paper advances scalable graph analytics by improving the theory and practice of anchored densest subgraph discovery, a core primitive for graph data exploration.

2025

In-depth Analysis of Graph-based RAG in a Unified Framework

Yingli Zhou, Yaodong Su, Youran Sun, Shu Wang, Taotao Wang, Runyuan He, Yongwei Zhang, Sicong Liang, Xilin Liu, Yuchi Ma, Yixiang Fang

VLDB'25, Proceedings of the VLDB Endowment

This work provides a unified framework for benchmarking and analyzing graph-based RAG, clarifying when graph structure improves retrieval and generation over noisy, evolving data.

Recent News

  • March 2026: I received my Ph.D. from CUHK-Shenzhen and will continue my research as an ERC postdoctoral researcher at LIRIS, CNRS / Université Lyon 1.
  • February 2026: Two papers were accepted to SIGMOD 2026.
  • January 2026: One paper was accepted to VLDB 2026.
  • September 2025: One paper was accepted to VLDB 2026.
  • March 2025: Our paper on the systematic benchmark and analysis of graph-based RAG became available on arXiv. [arXiv]

Selected Open-source Projects

DIGIMON / GraphRAG: the first unified graph-based RAG prototype system for structured retrieval and reasoning over complex data. [papers]
EraRAG: the first graph-based RAG system to handle evolving documents. [arXiv]
BookRAG: a hierarchical structure-aware index-based approach for retrieval-augmented generation on complex documents. [arXiv]
MicroWorld: a lightweight system for turning multi-modal event materials into structured graphs, agent populations, and inspectable social simulations. [site]

Personal Information

  • I have been a big fan of Jay Chou for around 18 years. I also grew up listening to Avril Lavigne and Coldplay, and their music has been a big part of my life.
  • I love watching football, Dota 2, and LoL. My favorite national teams are Portugal and France. As for clubs, I support Manchester City, and I used to support Real Madrid when Cristiano Ronaldo played there.
  • I enjoy running in my free time, and my current goal is to complete a marathon.