林宇翔

Yuxiang Lin (Lín, Yǔ Xiáng)

I am a Class of 2026 undergraduate student at Duke Kunshan University (DKU), which also grants me a Duke University degree upon graduation. I major in Applied Mathematics and Computational Sciences with a track in Computer Science. I am seeking a PhD position.

I am interested in the intersection of computer science and discrete mathematics, specifically:

Combinatorial and graph algorithms: I became interested in them during algorithmic programming contests such as NOIP and ICPC, which I have participated in since middle school. I have also worked on combinatorial structures such as matroids and the sunflower lemma in my undergraduate research.
Program analysis and verification: I enjoy applying formal methods to reason about program behavior and ensure correctness and security. Currently, I am using KLEE, a symbolic execution engine, to detect side-channel vulnerabilities in code.

I also have extensive experience with large language models (LLMs), both in research and in production. Nevertheless, I still prefer the "hand-crafted" side of computer science—algorithms and systems built through logic and structure rather than statistical learning.

Outside of academics, I enjoy the freedom of traveling on two wheels. My greatest adventure so far has been a two-week, 2,000-mile solo motorcycle trip through the Blue Ridge Parkway and three national parks. I am also a metalhead and still learning to play the guitar.

Email: yl925@duke.edu Github: theta-lin CV: (PDF) LinkedIn (less updated): yuxiang-lin

Research

Program Analysis for Side-Channel Attacks — Prof. Danfeng Zhang (Duke CS+ Summer Program)

Durham, U.S. · May 2025 – Present

Timing side-channel attacks exploit variations in a program's execution time to infer sensitive information. Our work focuses on detecting secret-dependent control flow and memory access in LLVM bitcode—cases where the value of secret inputs can influence which paths a program takes or which memory locations it touches, both of which may be indirectly observable by an attacker.

Prior work such as CtChecker identifies potential side-channel vulnerabilities using static taint analysis. We aim to overcome the limitations of purely static approaches by leveraging symbolic execution. Unlike static analysis, symbolic execution can produce concrete inputs that trigger a timing leak, and it is generally sound (i.e., it reports only true positives, aside from implementation constraints).

We are extending KLEE, a symbolic execution engine, to automatically detect secret-dependent control flow and memory access. Our modifications introduce product-program construction directly into KLEE's source code. In addition, we have developed automated benchmarking tools to validate detected vulnerabilities at the binary level, compare them with results from CtChecker, and visualize the findings.

CS+ project final presentation (PDF)

Product programs — The product program construction for control-flow vulnerabilities.

Matroids and Sunflower Lemma — Prof. Italo Simonelli (Undergraduate Signature Work)

Kunshan, China · Jan 2024 – Present

A matroid is a combinatorial structure that generalizes the concept of linear independence. It consists of a ground set E and a collection I of subsets of E, called independent sets, that satisfy three axioms. I am proposing a new combinatorial structure called a weak matroid, which relaxes these axioms.

Analogous to how graphic matroids arise from undirected graphs, I have defined graphic weak matroids using directed graphs. I am currently studying the properties of these graphic weak matroids, utilizing C++ and the Boost Graph Library for enumeration and experimentation.

A sunflower is a family of sets in which all pairwise intersections of the sets are the same, and that intersection is called the core. The parts of each set outside the core are called petals. The sunflower lemma states that any sufficiently large family of fixed-size sets necessarily contains a sunflower. I am currently exploring a variant of the sunflower lemma that connects to the structure of weak matroids.

ChatDKU — DKU Edge Intelligence Lab (Prof. Bing Luo , DKU Summer Research Scholars)

May 2024 — Jan 2025

As a university, DKU has a vast amount of institutional knowledge, but it's often siloed in different websites and documents. We created ChatDKU to be a central, privacy-first AI assistant for the DKU community. To ensure data privacy and security, the entire system is deployed locally on DKU's own servers, using a high-performance backend stack built with vLLM and SGLang for scalable model inference, Text Embeddings Inference for embeddings before vector search, and Flask for backend serving.

The system is a modular agentic architecture composed of three key components: a Planner, a set of Tools (like search), and a Memory module. This architecture powers our agentic Retrieval-Augmented Generation (RAG) loop: the agent plans how to answer a query, repeatedly uses tools to search for information, synthesizes findings into its memory, and only answers when it assesses the information is sufficient. We built this architecture using DSPy, which acts as a barebone framework for composing these modules. DSPy's most crucial feature is its ability to run automated prompt optimization, which allowed us to fine-tune the complex reasoning steps. We also utilized LlamaIndex for unified data ingestion and querying.

This project taught me a critical lesson about engineering in the real world: frameworks like DSPy and LlamaIndex are fantastic for rapid prototyping, but their abstractions become a major bottleneck in production. They make it very difficult to access and tune the underlying prompts, yet direct prompt engineering is one of the most effective ways to improve performance.

I had to go "under the hood" to solve this. My contributions focused on moving beyond the limitations of these frameworks:

Observability: I manually integrated Arize Phoenix with our custom agent architecture to visualize the actual prompts being sent to the LLM.
Direct Prompt Control: After identifying poorly performing prompts, I modified DSPy's underlying code to directly inject our own optimized prompts.
Upstream Contributions: I debugged several issues we hit during development and contributed our fixes back to the open-source LlamaIndex and DSPy repositories.

This experience highlighted another key takeaway: while there are many working "toy examples" online, using frameworks like DSPy or LlamaIndex in production reveals a significant number of bugs and stability issues. Also, we found LlamaIndex's retriever modules in particular were too abstract and hard to control. I have replaced that part of the stack entirely by implementing a unified search service using Redis, which combines vector search and keyword search. Finally, I used k6 for load testing the entire stack to ensure it could handle large-scale use by the community.

Additionally, I found it interesting that there is no asynchronous recursive website scrapers supporting features such as black/white list and SAML login (at least not at the time I worked on ChatDKU). Therefore, I also implemented a simple scraper with these features in Python with AIOHTTP.

ChatDKU website (DKU internal network only)

ChatDKU screenshot 1 — Left and middle: ChatDKU interface. Right: ChatDKU launch video.

ChatDKU screenshot 2 — Left and middle: ChatDKU interface. Right: ChatDKU launch video.

Publications

Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Information-Seeking Agents

J. Shao, Y. Lin, M. P. Lohani, Y. Miao, B. Luo. arXiv e-prints, arXiv:2509.22391, 2025.

Most benchmarks for LLM search agents only grade the final answer, which hides major flaws in reasoning. We developed SeekBench, a process-level benchmark and dataset to evaluate an agent's deeper epistemic competence.

Our contributions were:

A Process-Level Benchmark: We created a structured trace schema to evaluate the entire reasoning process (reasoning, search, evidence), enabling scalable, step-by-step evaluation.
A New Metrics Framework: We defined and formalized three core competencies: Groundedness (is reasoning based on facts?), Recovery (does it adapt to bad search results?), and Calibration (does it know when to stop?).
Large-Scale Findings: Our evaluation (over 28,000 traces) revealed that standard accuracy metrics are misleading. We found that RL-trained agents excel at gathering evidence but struggle with reasoning, and that a striking 72.7% of reasoning steps weren't grounded in any evidence at all.

SeekBench dataset overview diagram — Figure 1 in paper: Overview of the SeekBench dataset and annotation schema.

Projects

See my GitHub page for my select course projects.

Affordable Embroidery Machine

Sep 2023 — May 2025

Designed an affordable, scalable CNC embroidery machine. Created CAD models in FreeCAD, 3D-printed the prototype, and used Arduino for motion control.

FedCampus

Jan 2024 — Apr 2024

Maintained the Django backend for FedCampus, a federated learning and analytics platform used by 90 active DKU users.

Teaching

Teaching Assistant, COMPSCI 201 (Introduction to Programming and Data Structures)

Fall 2023 · Duke Kunshan University

Held weekly office hours and provided in-depth support to 20 students mostly regarding programming in Java and implementation of data structures.

Peer Tutor, STATS 102 (Introduction to Data Science)

2023 — 2024 · Duke Kunshan University

Trained in Tutor Essentials (CRLA-aligned) and conducted bi-weekly sessions covering NumPy, matplotlib, and scikit-learn.

Awards

Silver Medal — ICPC Asia Xi'an Regional (2023)
Gold Medal — ICPC Asia Hefei Regional (2022)
Bronze Medal — ICPC Asia-East Continent Final (2022)
DKU Student Leader of the Year Award — (2023)
National Encouragement Scholarship — (2023)