Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures
Yutong Gao*, Qinglin Meng*, Yuan Zhou, Liangming Pan
My research focuses on machine learning theory, reinforcement learning, and large language models. I am interested in the theoretical foundations of learning, including statistical learning theory, online learning, and algorithms with provable guarantees.
* Equal contribution
Steve Hanneke, Qinglin Meng, Shay Moran, Amirreza Shaeiri (alphabetical order)
Steve Hanneke, Qinglin Meng, Amirreza Shaeiri (alphabetical order)
Shuang Qiu*, Boxiang Lyu*, Qinglin Meng*, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan