Synthetic Biology Journal, 2021, Vol. 2, Issue (3): 384-398. DOI: 10.12211/2096-8280.2020-085

• Invited Review •

The pivotal biochemical methods in DNA data storage

GAO Yanmin, TANG Mengtong, LIU Qian, QIAO Hongyan, WANG Taoxue, QI Hao   

  1. Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China
  • Received: 2020-11-30 Revised: 2021-02-08 Online: 2021-07-13 Published: 2021-06-30
  • Contact: QI Hao


郜艳敏, 唐梦童, 刘倩, 乔宏艳, 王桃雪, 齐浩   

  • About the author: GAO Yanmin (born 1990), female, Ph.D. candidate. Research interests: DNA data storage and nucleic acid detection.


With rapid progress in biotechnology, especially array-based DNA synthesis and next-generation sequencing (NGS), DNA has demonstrated great advantages in data storage capacity, storage stability, and repeatable reading. However, the biochemical methods currently used to manipulate large-scale oligonucleotide (oligo) pools carrying digital information still face major challenges. For example, DNA integrity and stability are affected by preservation conditions such as temperature and humidity. Oligo dropout and mutation (substitution, insertion, or deletion) are amplified by biased manipulations, including chemical synthesis, amplification (PCR), and NGS. Large unevenness in oligo copy number means that more sequencing resources are required to recover all necessary strands in the pool. In addition, missing sequences and base errors increase the cost of decoding. As a result, DNA data storage is still confined to the laboratory. From the perspective of biochemical methods for manipulating large-scale oligo pools, we summarize the causes of biochemical problems such as heterogeneity in oligo copy number, mutation, and DNA decay during microarray DNA synthesis, storage, and amplification. We also summarize a series of biochemical methods developed to address these problems, from oligo synthesis to amplification. These methods include improved synthesis processes, adjusted chemical process parameters, modified oligo pool normalization methods, optimized PCR conditions, PCR variants (emulsion PCR), and novel isothermal amplification (strand displacement amplification). In addition, measures should be taken in the encoding strategy to mitigate oligo copy unevenness and aid error correction. Moreover, we demonstrate the feasibility and efficiency of these biochemical methods in reducing the above-mentioned problems in DNA data storage.
Finally, we discuss the challenges that remain in existing DNA data storage. With the development of biotechnology and of encoding and decoding strategies, we believe that these bottleneck issues will be solved and that DNA data storage will reach real-world application in the near future.
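
The claim that uneven oligo copy number raises the sequencing effort needed to recover every strand can be illustrated with a small calculation. The Python sketch below is not taken from the review: it assumes each read is drawn independently with probability proportional to an oligo's copy number (a Poisson approximation), and the function names `expected_missing` and `reads_needed`, the pool size of 1000 oligos, and the 10-fold under-representation of one tenth of the pool are all hypothetical choices made for illustration.

```python
import math

def expected_missing(weights, reads):
    """Expected number of oligos that receive zero reads.

    Assumes each read samples oligo i with probability w_i / sum(w),
    so the chance that oligo i is never seen is about exp(-reads * p_i).
    """
    total = sum(weights)
    return sum(math.exp(-reads * w / total) for w in weights)

def reads_needed(weights, max_missing=1.0):
    """Smallest read count whose expected dropout is at most max_missing."""
    lo, hi = 0, 1
    # Double the upper bound until the dropout target is met.
    while expected_missing(weights, hi) > max_missing:
        lo, hi = hi, hi * 2
    # Binary search between the last failing and first passing read count.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if expected_missing(weights, mid) > max_missing:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical pools of 1000 oligos (relative copy numbers, arbitrary units).
uniform_pool = [1.0] * 1000
# Every tenth oligo is present at one tenth of the typical copy number.
skewed_pool = [0.1 if i % 10 == 0 else 1.0 for i in range(1000)]

if __name__ == "__main__":
    n_uniform = reads_needed(uniform_pool)
    n_skewed = reads_needed(skewed_pool)
    print("reads needed, uniform pool:", n_uniform)
    print("reads needed, skewed pool: ", n_skewed)
    print("fold increase:", round(n_skewed / n_uniform, 1))
```

Under these assumptions the rarest oligos dominate the required read count, which is why pool normalization and low-bias amplification reduce sequencing cost.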

Key words: DNA data storage, array-based DNA synthesis, oligo copy unevenness, oligo pool normalization, amplification bias, PCR, isothermal amplification



