AI4PhysSci Lab: Evaluator & Agent

Evaluator & Agent are highly collaborative modules that work in close collaboration with experiment teams.

Evaluator: AI for Experimental Design

The goal of this direction is to transform experimental design by replacing trial-and-error with few-shot, minimal-data adaptive optimization. Approaches such as Bayesian optimization (BO), active learning, and reinforcement learning allow efficient exploration of complex, high-dimensional experimental spaces. In our research, we build models with uncertainty-aware algorithms, outlier management, low-dimensional representations, and domain-specific priors to ensure practical performance in noisy, constrained real-world chemistry experiments. By forming feedback loops with experimentalists, we hope to provide a general tool for achieving closed-loop discovery in chemistry, biology, and materials science.

- ODBO (Outlier-detected Bayesian Optimization): an ML protocol for protein directed evolution that integrates low-dimensional, function-value-based protein encoding and search-space prescreening with BO and outlier-detection surrogate modeling to efficiently navigate large, noisy sequence spaces and recommend high-fitness variants at minimal experimental cost.
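The closed-loop idea above can be illustrated with a minimal sketch of generic Bayesian optimization. This is not ODBO: it assumes a single scalar design variable on a fixed candidate grid, a Gaussian-process surrogate with hand-picked RBF hyperparameters, an upper-confidence-bound (UCB) acquisition rule, and a hypothetical noisy assay `run_experiment` standing in for the wet-lab measurement. Outlier handling, protein encodings, and domain priors are deliberately left out.

```python
import math
import random


def rbf(a, b, length=0.2):
    """Squared-exponential kernel on scalar inputs (length scale is an assumption)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular factor L of a small positive-definite matrix A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def back_sub_t(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(xs, ys, noise=1e-2):
    """Fit a GP with a constant mean (the sample mean) and fixed noise variance."""
    y_mean = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = back_sub_t(L, forward_sub(L, [y - y_mean for y in ys]))
    return L, alpha, y_mean


def predict(model, xs, x_new):
    """Posterior mean and standard deviation of the latent function at x_new."""
    L, alpha, y_mean = model
    k_star = [rbf(x, x_new) for x in xs]
    mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(L, k_star)
    var = max(1.0 - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)


def run_experiment(x, rng):
    """Hypothetical noisy assay with its optimum near x = 0.7."""
    return math.exp(-((x - 0.7) ** 2) / 0.02) + rng.gauss(0.0, 0.05)


def closed_loop(n_rounds=15, beta=2.0, seed=0):
    """Alternate between fitting the surrogate and measuring the UCB-best candidate."""
    rng = random.Random(seed)
    candidates = [i / 50 for i in range(51)]
    xs = [0.0, 0.5, 1.0]  # initial design
    ys = [run_experiment(x, rng) for x in xs]
    for _ in range(n_rounds):
        model = fit_gp(xs, ys)
        tested = set(xs)
        untested = [c for c in candidates if c not in tested]

        def ucb(c):
            mean, std = predict(model, xs, c)
            return mean + beta * std

        x_next = max(untested, key=ucb)
        xs.append(x_next)
        ys.append(run_experiment(x_next, rng))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], xs


if __name__ == "__main__":
    best_x, best_y, history = closed_loop()
    print(f"best condition: x={best_x:.2f}, measured value={best_y:.3f}")
    print(f"experiments run: {len(history)}")
```

The `beta` parameter trades exploration against exploitation: larger values favor untested regions where the surrogate is uncertain, which matters when each experiment is expensive and the budget is a few dozen measurements.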

Problem-driven Fine-tuning and Benchmark Construction for Scientific LLM Agents

In this topic, we plan to construct domain-specific datasets by combining outputs from upstream simulation, emulation, and prediction layers with curated data mined from the literature. This targeted approach ensures that fine-tuning aligns with real scientific tasks, enabling LLM agents to learn domain-relevant reasoning, symbolic logic, and experimental workflows grounded in chemistry. We are also interested in developing scientist-centric evaluation metrics for scientific LLMs (e.g., NatureLM). Both efforts will involve close collaboration with computer science groups to bridge foundation models and scientific discovery.

- “The impact of large language models on scientific discovery: a preliminary study using GPT-4”: benchmarks LLM4Sci performance across a variety of tasks.

Scientific Multi-agent System

Our scientific multi-agent system research aims to blend AI reasoning, simulation, and automation layers so that different agents specialize in planning experiments, coordinating complex workflows, and validating results across the physical sciences. These projects emphasize modular “agents” that exchange intermediate results, re-plan when simulations fail, and bring in human domain expertise only when necessary. A representative effort is:

- Li, W., Ren, J., Cheng, L., Gong, C. Autonomous Quantum Simulation through Large Language Model Agents. arXiv:2601.10194 (2026).
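The plan, simulate, validate, and re-plan pattern described above can be sketched as a small control loop. Everything here is hypothetical and is not taken from the cited work: `planner`, `simulator`, and `validator` are stand-in functions rather than LLM-backed agents, the "simulation" is a toy rule that only converges with enough iterations, and `ask_human` is an optional callback used only after automated retries are exhausted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Result:
    ok: bool
    energy: Optional[float] = None
    note: str = ""


def planner(attempt: int) -> dict:
    """Hypothetical planning agent: allow more solver iterations on each retry."""
    return {"max_iter": 50 * (attempt + 1)}


def simulator(plan: dict) -> Result:
    """Hypothetical simulation agent: a toy run that fails below 150 iterations."""
    if plan["max_iter"] < 150:
        return Result(ok=False, note="not converged")
    return Result(ok=True, energy=-1.0)


def validator(result: Result) -> bool:
    """Hypothetical validation agent: accept only converged, negative energies."""
    return result.ok and result.energy is not None and result.energy < 0.0


def run_workflow(
    max_attempts: int = 3,
    ask_human: Optional[Callable[[List[Tuple[dict, Result]]], dict]] = None,
) -> Tuple[Result, List[Tuple[dict, Result]]]:
    """Re-plan after failed runs; escalate to a human only as a last resort."""
    log: List[Tuple[dict, Result]] = []
    for attempt in range(max_attempts):
        plan = planner(attempt)
        result = simulator(plan)
        log.append((plan, result))  # intermediate results shared between agents
        if validator(result):
            return result, log
    if ask_human is not None:
        plan = ask_human(log)  # the human sees the full history of failed attempts
        result = simulator(plan)
        log.append((plan, result))
        return result, log
    return Result(ok=False, note="automated attempts exhausted"), log


if __name__ == "__main__":
    result, log = run_workflow()
    print(f"succeeded: {result.ok} after {len(log)} attempt(s)")
```

Keeping the shared `log` explicit is one way to let each agent, or a human called in late, see why earlier attempts failed before proposing the next plan.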