Synthetic Biology Journal ›› 2023, Vol. 4 ›› Issue (3): 464-487. DOI: 10.12211/2096-8280.2023-008

• Invited Review •


Research progress of artificial intelligence in designing protein structures

CHEN Zhihang, JI Menglin, QI Yifei

  1. School of Pharmacy, Fudan University, Shanghai 201203, China
  • Received: 2023-01-13 Revised: 2023-03-15 Online: 2023-06-30 Published: 2023-07-05
  • Corresponding author: QI Yifei
  • About the authors: CHEN Zhihang (b. 1998), male, master's student. Research interest: artificial intelligence for protein design. E-mail: zhihangchen21@m.fudan.edu.cn
    JI Menglin (b. 2000), male, master's student. Research interest: artificial intelligence for protein design. E-mail: 22211030067@m.fudan.edu.cn
    QI Yifei (b. 1983), male, associate researcher, master's supervisor. Research interests: simulation of biomacromolecular structure and function, and artificial intelligence for drug design. E-mail: yfqi@fudan.edu.cn





Proteins are essential to life because they carry out a great variety of biological functions. A protein's sequence determines its three-dimensional structure, and therefore its physiological function. Proteins with specific functions have important applications in many fields such as biomedicine, where they are used in drug design and delivery. Protein engineering and directed evolution have long been used to improve the activity and stability of proteins. These methods, however, are both complex and expensive, as they require a large number of biological experiments for validation. Computational protein design (CPD) allows amino acid sequences to be designed from desired protein functions and structures and, more intriguingly, allows the generation of proteins not found in nature. Conventional CPD uses energy functions and optimization algorithms to design protein sequences. In recent years, with the rapid development of artificial intelligence (AI) techniques, the accumulation of big data, and advances in high-speed computing, AI has made great progress and has been successfully applied to CPD. In this review, we present a systematic overview of recent applications of AI in protein design, organized by input constraints and sampling-space size into three categories: fixed-backbone design, flexible-backbone design, and sequence-structure generation. We focus on algorithms and protein feature encoding, describe how dataset size and architectural improvements affect model performance, and showcase several enzymes, antibodies, and binding proteins that were successfully designed using these models. The advantages of AI over traditional CPD methods are also discussed. Finally, we highlight challenges in AI-aided protein design and propose possible strategies to address them.

Key words: protein design, protein engineering, artificial intelligence, deep learning, protein sequence and structure
