Yingli ZHOU

I’m a Ph.D. candidate in the School of Data Science at The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), advised by Prof. Yixiang Fang and working closely with Prof. Chenhao Ma and Wensheng Luo. I received my Master’s degree from Harbin Institute of Technology Shenzhen in June 2022, under the supervision of Prof. Yunming Ye. Before that, I received my Bachelor’s degree from Harbin Institute of Technology in June 2020.

I am always open to collaborations and visiting opportunities. Please do not hesitate to contact me if you are interested!

Interests

My research interests mainly focus on large-scale data management and data mining, particularly graph data management and Large Language Models (LLMs) for data management. Specifically, my research spans the following topics:

  • Build efficient and lightweight 🚀 graph-based retrieval-augmented generation (RAG) methods and systems to enhance the factual accuracy, adaptability, interpretability, and trustworthiness of next-generation language models.
  • Design simple yet effective algorithms 💫 for graph mining, utilizing linear programming and spectral methods, focusing on densest subgraph discovery and graph clustering.
  • Develop tools 🔧 or systems 🔨 that leverage LLMs for data analysis tasks, such as Data2Insight, data cleaning, and LLM-empowered data processing systems.

I am working hard 😭😭😭 to produce impactful 🔥 and novel work 🌟. In addition, I am passionate about open-source communities and familiar with database kernels (such as TiDB).

Graph-based RAG Projects

News

  • 2025.03 💥💥 Our paper on the systematic benchmark and analysis of graph-based RAG is available on arXiv! [arXiv]
  • 2025.01 💥💥 One paper, “Efficient Historical Butterfly Counting in Large Temporal Bipartite Networks via Graph Structure-aware Index”, has been accepted by VLDB 2025!
  • 2024.12 💥💥 One paper, “In-depth Analysis of Densest Subgraph Discovery in a Unified Framework”, has been accepted by VLDB 2025!
  • 2024.11 💥💥 One paper, “PRICE: A Pretrained Model for Cross-Database Cardinality Estimation”, has been accepted by VLDB 2025!
  • 2024.06 💥💥 One paper, “Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information Networks”, has been accepted by VLDB 2024!
  • 2024.06 💥💥 Our paper on the systematic benchmark and analysis of densest subgraph discovery is available on arXiv! [arXiv]
  • 2024.04 💥💥 One paper, “Efficient Parallel D-core Decomposition at Scale”, has been accepted by VLDB 2024!