Synthetic Biology Journal (合成生物学), 2023, Vol. 4, Issue (3): 422-443. DOI: 10.12211/2096-8280.2023-004

• Invited Review •

Design of synthetic biology components based on artificial intelligence and computational biology

Sheng WANG1, Zechen WANG1,2, Weihua CHEN1, Ke CHEN1, Xiangda PENG1, Fafen OU1, Liangzhen ZHENG1,3, Jinyuan SUN1,4, Tao SHEN1, Guoping ZHAO3

  1. Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
  2. Shandong University, Jinan 250100, Shandong, China
  3. Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
  4. Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2023-01-11 Revised: 2023-04-03 Online: 2023-06-30 Published: 2023-07-05
  • Contact: Sheng WANG
  • About the authors: Sheng WANG (born 1983), male, Ph.D., CEO of Shanghai Zelixir Biotech Company Ltd. and visiting researcher at the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. Research interests: deep-learning-based protein structure prediction and AI-based synthetic biology. E-mail: wangsheng@zelixir.com
    Zechen WANG (born 1997), male, Ph.D. candidate. Research interests: deep-learning-based prediction of protein-ligand interactions and virtual screening. E-mail: wangzechen@mail.sdu.edu.cn
    Weihua CHEN (born 1982), male, Ph.D. candidate. Research interests: synthetic biology, gene synthesis and assembly. E-mail: chenweihua@zelixir.com
  • Funding:
    Cultivation Project of the International Big Science Program, Chinese Academy of Sciences (153D31KYSB20170121)

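Summary:

Synthetic biology designs and constructs entirely new biological components, devices, and systems by integrating existing information according to established principles, or redesigns existing natural biological systems. Its core lies in designing, modifying, reconstructing, or manufacturing biological components, biological reaction systems, and metabolic pathways and processes, and even in creating cells and organisms capable of life activities, thereby offering new technical solutions to several major challenges that human development faces in the environment, resources, and energy. From DNA recombination to genetic circuit design, the development of synthetic biology has brought new solutions to many fields, and high-quality catalytic and regulatory components are the foundation for designing efficient, robust systems. However, synthetic biology components are usually natural biological macromolecules whose inherent complexity limits their engineering, so the potential of synthetic biotechnology has not been fully realized. The rise and development of artificial intelligence (AI) and computational biology are expected to help the technology deliver more of its value. This article introduces the design of different types of components based on AI and computational biology, focusing on the design of and frontier advances in three classes of components (catalytic, regulatory, and sensing) and on research progress in applying component engineering in synthetic biology.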

Abstract:

The primary objective of synthetic biology is to conceptualize, engineer, and construct novel biological components, devices, and systems based on established principles and existing information, or to reconfigure existing natural biological systems. Its core concept encompasses the design, modification, reconstruction, or fabrication of biological components, reaction systems, and metabolic pathways and processes, and even the creation of cells and organisms with living functions. This burgeoning field offers innovative technologies for addressing sustainable-development challenges in the environment, resources, and energy. Synthetic biology has made significant progress in numerous areas, ranging from DNA recombination to gene circuit design, yet its full potential remains underexplored. The emergence and application of artificial intelligence (AI) can help extend synthetic biology to more applications. From a synthetic biology perspective, the essence of life is rooted in digitalization and designability. This article reviews current advances in computational biology, particularly AI, that make synthetic biology more efficient and effective, focusing on the development of biocatalysts, regulators, and sensors. De novo enzyme design has been successfully implemented with the Rosetta software, and AI shows significant potential for generating novel structures and protein sequences with diverse functions. The reprogramming of natural enzymes for specific purposes is also crucial for synthetic biology applications. By employing various force fields and sampling techniques, promiscuity and thermal stability can be modified to meet specific requirements rather than those of the natural hosts.
AI can be integrated into the life cycle of synthetic biology through an active learning paradigm, which enables alterations in enzyme specificity, and it shows potential for predicting mutation effects more accurately and rapidly than force-field-based methods. The rapidly decreasing cost of sequencing has facilitated high-throughput characterization of cis-regulators, primarily DNA and RNA. Concurrently, more trans-regulators have been identified in sequenced genomes. This expanding wealth of data serves as a driving force for AI. AI models have successfully predicted the strength of promoters, ribosome binding sites (RBSs), and enhancers, and have generated artificial promoters and RBSs. Recent progress in RNA structure prediction is expected to aid the design of RNA elements. Sensors, which are vital for genetic circuits and for other applications such as toxin detection, typically involve interactions among various molecules, including nucleic acids, proteins, small organic molecules, and metal ions. Consequently, sensor design requires integrating diverse computational biology tools to balance accuracy and computational cost. As the pool of data keeps growing, we anticipate that AI will be increasingly applied to the design of more bio-parts.

Key words: artificial intelligence, synthetic biology, computational biology, protein design, biological sequences

CLC number: