合成生物学 (Synthetic Biology Journal)

• Invited Review •

DNA存储的关键技术:编码、纠错、随机访问与安全性

徐怀胜1, 石晓龙2, 刘晓光3, 徐苗苗4   

  1. 广州商学院现代信息产业学院,广东 广州 511363
  2. 广州大学计算科技研究院,广东 广州 510006
  3. 广州体育学院运动健康学院,广东 广州 510620
  4. 广州中医药大学体育健康学院,广东 广州 510006
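  • Received: 2024-08-26  Revised: 2024-10-15  Published: 2024-10-17
  • Corresponding author: Miaomiao XU
  • About the authors: Huaisheng XU (born 1996), male, master's degree, teaching assistant. Research interest: DNA information storage. E-mail: hsxu@gcc.edu.cn
    Miaomiao XU (born 1992), female, faculty-track postdoctoral fellow. Research interests: synthetic biology and DNA storage. E-mail: miaomiaoxu@gzucm.edu.cn
  • Funding:
    National Natural Science Foundation of China (32300964)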

Key technologies of DNA storage: encoding, error correction, random access, and security

Huaisheng XU1, Xiaolong SHI2, Xiaoguang LIU3, Miaomiao XU4   

  1. School of Modern Information Industry, Guangzhou College of Commerce, Guangzhou 511363, Guangdong, China
  2. Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, Guangdong, China
  3. College of Sports and Health, Guangzhou Sport University, Guangzhou 510620, Guangdong, China
  4. School of Physical Education and Health, Guangzhou University of Chinese Medicine, Guangzhou 510006, Guangdong, China
  • Received: 2024-08-26  Revised: 2024-10-15  Online: 2024-10-17
  • Contact: Miaomiao XU

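Abstract (translated from the Chinese):

DNA information storage is a new storage technology that uses DNA molecules as the data carrier: information is encoded by synthesizing DNA with specific sequences and read back by sequencing. Compared with traditional magnetic, optical, and electronic storage media, DNA storage offers significant advantages in storage density, data retention time, and energy efficiency, and it is not susceptible to electromagnetic interference. As the global volume of data surges, DNA storage has gradually become a research hotspot owing to its efficient storage capacity, potentially low maintenance cost, and chemical properties that make it easy to synthesize. This article first introduces the basic workflow of DNA storage and then reviews the key technologies involved, in particular research progress on encoding strategies, error correction, random access, and DNA information encryption. It discusses the current state of DNA storage technology and its main challenges, such as high cost and slow write and read speeds, and proposes possible directions for technical improvement. It also looks ahead to the future of DNA storage, emphasizing its potential applications and revolutionary impact in the era of big data and identifying the key technical bottlenecks that must be resolved before commercial application.

Keywords (translated from the Chinese): DNA information storage, DNA synthesis, information encoding, DNA nanotechnology, synthetic biology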

Abstract:

DNA information storage is a new storage technology that uses DNA molecules as data carriers. Information is encoded by synthesizing DNA with specific sequences and read out by sequencing. Compared with traditional magnetic, optical, and electronic storage media, DNA storage has significant advantages in storage density, data retention time, and energy efficiency, and it is not easily affected by electromagnetic interference. With the rapid growth of global data, DNA storage has gradually become a research hotspot owing to its efficient storage capacity, potentially low maintenance cost, and chemical properties that make it easy to synthesize. However, the technology is still at an early stage of development, and many technical bottlenecks remain. For example, realizing the ultra-high storage density and long-term stability that make DNA storage attractive requires reducing the synthesis error rate and improving encoding efficiency. Understanding the existing key technologies, namely DNA encoding, error correction, random access, and DNA information encryption, helps identify and address their shortcomings and thereby promotes further innovation. Encoding strategy is one of the core aspects of DNA storage, as it directly determines storage efficiency, read accuracy, and error-correction capability. Efficient and stable DNA information storage therefore requires more advanced encoding algorithms that increase storage density, reduce synthesis and sequencing error rates, and ensure data accuracy and integrity. Moreover, security is becoming increasingly important for DNA information storage, particularly with respect to data confidentiality and privacy protection.
As a potential data carrier, DNA must address challenges in data encryption, information hiding, and tamper resistance to ensure data confidentiality and integrity. Integrating modern cryptographic techniques with DNA storage to build a secure and reliable storage system has therefore become a key research focus in this field. This article first introduces the basic process of DNA storage and then reviews the key technologies involved, especially research progress on encoding strategies, error correction, random access, and DNA information encryption. It also discusses the current development status and main challenges of DNA storage technology. For example, DNA data storage in the laboratory is small in scale and slow, with a single synthesis operation lasting about 24 hours. Moreover, most DNA storage steps rely on manual work by experimenters, which makes it difficult to automate the storage and reading process. With advances in synthetic biology and in encoding and decoding methods, we believe these bottlenecks will be resolved in the near future, moving the technology from the laboratory to practical applications and advancing its industrialization.

Key words: DNA information storage, DNA synthesis, information encoding, DNA nanotechnology, synthetic biology

CLC number (Chinese Library Classification):