Synthetic Biology Journal, 2021, Vol. 2, Issue 5: 697-715. DOI: 10.12211/2096-8280.2021-012

• Invited Review •

Applications and prospects of genome mining in the discovery of natural products

Qian YANG¹, Botao CHENG¹, Zhijun TANG¹, Wen LIU¹,²

  1. State Key Laboratory of Bioorganic and Natural Products Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
  2. Huzhou Center of Bio-Synthetic Innovation, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Huzhou 313000, Zhejiang, China
  • Received: 2021-01-27  Revised: 2021-04-05  Online: 2021-11-19  Published: 2021-11-19
  • Corresponding author: Wen LIU
  • About the authors: Qian YANG (born 1994), female, Ph.D., postdoctoral researcher. Research interests: natural product chemistry and the discovery of novel natural products by genome scanning. E-mail: yangqian117@sioc.ac.cn
    Wen LIU (born 1971), male, research professor and doctoral supervisor. Research interests: biosynthesis of complex natural products (genetics, biochemistry, and chemistry); combinatorial biosynthesis aimed at improving yield and structural diversity; and the discovery of novel natural products by genome scanning. E-mail: wliu@mail.sioc.ac.cn
  • Funding:
    National Key Research and Development Program of China (2019YFA0905400); Strategic Priority Research Program (Category B) of the Chinese Academy of Sciences (XDB20020200); K. C. Wong Pioneer Talent Program (王宽诚率先人才计划)

Abstract:

Natural products have long been an abundant source of lead compounds for new drugs, but traditional isolation and analysis techniques for obtaining novel natural products can no longer meet the demands of drug discovery. Genomic data are used to identify potential drug targets and to explore biosynthetic pathways for natural products that were previously overlooked. Genome sequencing has revealed a wealth of untapped chemical diversity in microorganisms and plants. Genome sequences carry a large amount of information, from functional enzymes to conserved patterns and signatures, and even potential structures and features that can be interpreted to hunt for new biocatalysts. With the advent of the genomic era, computational genome mining has become an important part of the discovery of novel natural products as drug leads. Meanwhile, with the development of high-throughput sequencing and the establishment of DNA databases, genome mining methods and tools have contributed to the discovery and characterization of these natural products. Despite the diversity of natural products, the biosynthetic rules, and thus the biosynthetic machinery, for many of these compounds are often remarkably conserved, as reflected in the high amino acid sequence similarity of core biosynthetic enzymes such as polyketide synthases (PKSs), nonribosomal peptide synthetases (NRPSs), and many others. In addition, most natural products are thought to be produced by the host to kill or limit the growth of competitors by inhibiting or inactivating essential housekeeping enzymes. Accumulating knowledge of self-resistance mechanisms, exploited for instance by mining for self-resistance enzymes (SREs), has therefore promoted research on natural products.
Moreover, phylogeny-guided mining offers a way to rapidly screen large numbers of microbial genomes or metagenomes for new biosynthetic gene clusters (BGCs) of interest, and many web tools and databases have been developed and used by researchers to mine for key enzymes. This paper reviews recent advances in genome mining tools, databases, and approaches, focusing on ways of mining the BGCs of natural products (secondary metabolites), from classical genome mining to resistance-based and phylogeny-guided mining, and gives a short overview of the status and prospects of genome mining in the discovery of novel natural products.

Key words: genome mining, natural products, web tools, databases

CLC number: