Synthetic Biology Journal, 2023, Vol. 4, Issue 4: 629-650. DOI: 10.12211/2096-8280.2022-073

• Invited Review •

Advances and applications of evolutionary analysis and big-data guided bioinformatics in natural product research

Fanzhong ZHANG1,2, Changjun XIANG1,2,3, Lihan ZHANG1,2   

  1. Key Laboratory of Precise Synthesis of Functional Molecules of Zhejiang Province, Department of Chemistry, School of Science, Westlake University, Hangzhou 310030, Zhejiang, China
  2. Institute of Natural Sciences, Westlake Institute for Advanced Study, Hangzhou 310024, Zhejiang, China
  3. Department of Chemistry, Fudan University, Shanghai 200243, China
  • Received: 2022-12-08 Revised: 2023-02-21 Online: 2023-09-14 Published: 2023-08-31
  • Contact: Lihan ZHANG

Title in Chinese: 进化与大数据导向生物信息学在天然产物研究中的发展及应用

Author names in Chinese: 张凡忠1,2, 相长君1,2,3, 张骊駻1,2
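  • About the authors: Fanzhong ZHANG (born 1991), female, postdoctoral fellow. Research interests: isolation and identification of microbial natural products, and evolution-guided genome mining of PKSs and NRPSs in bacteria. E-mail: zhangfanzhong@westlake.edu.cn
    Lihan ZHANG (born 1989), male, distinguished research fellow and doctoral supervisor. Research interests: the molecular diversity and biosynthetic evolution of natural products, and evolution-guided biosynthetic engineering. E-mail: zhanglihan@westlake.edu.cn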
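  • Funding:
    National Natural Science Foundation of China (22177092); Zhejiang Provincial Leading Innovation and Entrepreneurship Team program (2020R01004); Hangzhou Science and Technology Development Plan (20201203B122)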

Abstract:

Nature has invented a myriad of natural products through billions of years of evolution. Natural products possess unique structural features selected by evolutionary pressure and serve as a treasure trove for drug discovery. The rapid growth of microbial genomic data now provides new opportunities for evolutionary and big data analysis of biosynthetic gene clusters, which not only gives us a clearer picture of the global landscape of natural products but also enables us to trace their evolutionary trajectories. Such a holistic understanding of natural products can facilitate phylogeny-guided genome mining, allow better functional prediction of biosynthetic enzymes, and even open the door to biosynthetic redesign that creates non-natural molecules through evolution-guided engineering. The essence of evolutionary and large-scale bioinformatics is that it visualizes the entire sequence space of the family under analysis, together with its distribution. Big data-driven bioinformatics therefore has the potential to answer challenging questions such as "How many natural products remain to be discovered?" and "How long can natural product discovery be sustained?" This review summarizes recent advances in the application of evolution- and big data-guided bioinformatics to natural product research from several perspectives: ① natural product discovery; ② functional and structural prediction of biosynthetic enzymes and their products; ③ bioengineering, with an emphasis on assembly line enzymes such as polyketide synthases and non-ribosomal peptide synthetases. Because of their modular domain architecture, assembly line enzymes have been the main targets of genome mining. Phylogenetic analysis of their domains has proven to be a powerful and effective way to predict enzymatic function and substrate specificity.
Recently, the evolutionary mechanisms of assembly line enzymes have been investigated, and several evolution-guided engineering strategies have been shown to reprogram assembly lines with much higher efficiency, providing a potential breakthrough for the bioproduction of complex polyketides and peptides. Non-modular enzymes are also discussed through selected representative examples. Finally, we present the current challenges and future prospects of big data-driven natural product research.

Key words: natural products, evolution, genome mining, biosynthetic engineering, bioinformatics

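Abstract (translated from the Chinese):

Billions of years of evolution in nature have given rise to a rich resource of natural products, providing a vast molecular treasure trove for drug research and development. Evolution-guided bioinformatics methods are playing an increasingly important role in microbial natural product research. The rapid growth of microbial genomic data offers new opportunities for big data analysis and evolutionary analysis of biosynthetic gene clusters. These analyses not only give us a clearer view of the overall landscape of natural products but also reveal the principles by which natural products evolve, making it possible to use evolutionary analysis and big data resources to mine novel natural products as drug leads, to understand biosynthetic enzymes, and even to design and engineer biosynthetic systems that create non-natural molecules. This review summarizes recent advances in the application of evolution- and big data-guided bioinformatics to natural product research, emphasizing the use of evolution and big data in the functional prediction of biosynthetic enzymes, evolutionary mechanisms, genome mining, and biosynthetic engineering. Finally, it analyzes the problems currently faced and offers an outlook on future trends.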

