Andy Zhao – Academic Homepage

News

[Feb 2026] Recipient of Travel Grant for Research Conference
[Oct 2025] Selected for the MATS 9.0 (Winter 2025) cohort under Neel Nanda of Google DeepMind; completed a project on Interpretable Feature Circuits in LLM Chain‑of‑Thought Reasoning during Training Phase
[Sep 2025] Started serving as a Teaching Assistant for EE 510, Graduate Level Algebra for Engineers at USC
[Jul 2025] Invited talk at Samsung Advanced ML Lab on “Evaluation and Benchmark of Agentic System and Multi‑Agent LLM Post‑Training”
[May 2025] Started a research internship at Samsung Research in Mountain View, focusing on agentic systems
[Feb 2025] Recipient of Early Career Innovator Award
[Sep 2024] Awarded Ming Hsieh Fellowship
[Feb 2024] Awarded Viterbi Graduate Fellowship and Annenberg School Award

Research Highlights

Selected recent projects and publications that capture my interests in reinforcement learning, strategic decision making and LLM post‑training. Additional research I did before my PhD can be found on ResearchGate and Google Scholar.

Episodic Zero‑Sum Game Learning
IEEE TAC 2026

Develops sample‑efficient algorithms for learning in episodic zero‑sum games under unknown rewards and dynamics, with provable regret bounds and empirical performance on simulated autonomous driving scenarios.

Fine‑Tuning LLMs for Strategic Multi‑Agent Reasoning
AISTATS 2026

Introduces a framework for aligning large language models with game‑theoretic reasoning tasks, combining preference optimisation with reinforcement learning in Markov games.

Generalized Quantal Response Equilibrium: Existence and Efficient Learning
NeurIPS 2025

Establishes existence and proposes a scalable learning algorithm for a broad class of quantal response equilibria, enabling robust modelling of bounded‑rational agents in multi‑agent systems.

End‑to‑End Learning for Non‑Markovian Control
ICML 2024

Presents an RL framework for solving non‑Markovian optimal control problems by learning latent representations and controllers jointly, bridging model‑based control and deep RL.

Experience

Samsung Research America

Mountain View, CA

Generative AI Research Intern, Advanced ML Lab (AWS Bedrock, Kendra, LangChain/LangGraph)

May 2025 – Present

• Built & optimized LLM search; improved top‑k relevance by 35% & reduced latency by 40%.

• Designed & deployed LLM recommender; boosted CTR by 20%.

University of Southern California

Los Angeles, CA

Research Assistant

Aug 2024 – Present

• Developed multi‑agent general‑ and zero‑sum game algorithms for LLM post‑training & autonomous systems.

• Designed multi‑agent & multi‑objective direct preference optimisation for LLM alignment.

LLM Stealth Startup (MIT Momentum Accelerator)

Boston, MA & San Francisco, CA

Founding Engineer

Aug 2023 – Aug 2024

• Post‑trained an LLM on 1,000 docs for 100+ users, enabling friendships & dating.

• Built personalised event AI with LangGraph; retention +500%.

Ameriprise Financial

Boston, MA

Data Scientist

Feb 2023 – Aug 2023

• Built models to forecast European real‑estate growth at 500 m resolution.

• Improved test accuracy by 50% with deep nets; added interpretability via boosting & tailored metrics.

MIT Operations Research Center

Boston, MA

Research Assistant

Sep 2022 – Feb 2023

• Proposed IP approximation algorithms beating state‑of‑the‑art on NP‑hard problems.

• Developed ML scheme maximizing patient survival across treatments; improved AUC to 80%.

Alpha Square Group

New York, NY

Quantitative Research Intern

Jun 2022 – Aug 2022

• Built NLP models summarising alternative data & quantifying keywords.

• Built scoring system for 500+ deals in primary‑market investments.

Education (Teaching / Coursework / Website)

University of Southern California

Los Angeles, CA

PhD Candidate in Electrical Engineering & Computer Science, Center of Autonomy & Artificial Intelligence, GPA: 4.0/4.0

2023 – Present

• Awards include the Ming Hsieh Fellowship, Viterbi Graduate Fellowship, Annenberg School Award and Early Career Innovator Award.

Massachusetts Institute of Technology

Cambridge, MA

Master of Science in Business Analytics & Operations Research, GPA: 5.0/5.0

2022 – 2023

• Second place in the MIT Analytics Lab Contest (40 teams); Top 10 in the MIT Hackathon (1,000+ attendees).

University of California, Berkeley

Berkeley, CA

Bachelor of Science in Computer Science, Statistics, Mathematics & Data Science, GPA: 3.9/4.0

2018 – 2022

• Awards: first place in the California Actuarial League competition; finalist in the Mathematics Contest in Modeling.

• Honors Theses: Data Science Honors – time‑series modelling in finance; EECS Honors – PCA & facial recognition.