合成生物学 (Synthetic Biology)

• Project Summary •

Multi-party collaborative secured DNA data storage and access: toward a hybrid DNA-silicon storage facility

LIU Jiakun1,2, YOU Di3, XIANYU Yunlei4, QU Qiang1

  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
  2. School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen 518055, Guangdong, China
  3. School of Biotechnology, East China University of Science and Technology, Shanghai 200237, China
  4. College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, Zhejiang, China
  • Received: 2025-06-30 Revised: 2025-09-03 Online: 2025-09-04
  • Corresponding author: QU Qiang
  • About the authors: LIU Jiakun (b. 1987) is an associate researcher at the School of Biomedical Engineering, Shenzhen University Medical School. Main research areas: development of CRISPR-enabled technologies and artificial synthetic cells.
    QU Qiang (b. 1985) is a research professor and doctoral supervisor at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. Main research areas: blockchain, database systems, and data intelligence systems.
  • Funding:
    National Key R&D Program of China, Synthetic Biology Key Special Project (2020YFA0909100)

Abstract:

With the advent of the big data era, traditional silicon-based storage faces bottlenecks such as low density, high energy consumption, and short lifespan. DNA storage, with its ultra-high density (theoretically on the order of exabytes per gram, EB/g) and millennium-scale stability, has emerged as a disruptive alternative. Since 2012, scientists such as George Church and Sri Kosuri have used DNA as a data storage medium, and DNA data storage alliances of industry and academic organizations have been founded in many countries to promote its adoption. Because writing and reading data in DNA is far slower than in electronic computers, DNA storage is currently best suited to cold data. As DNA sequencing and synthesis technologies advance, DNA-based computers may one day become practical. Controlling access to DNA data storage systems is critical. Traditional cybersecurity measures, such as passwords and two-factor authentication, may not be sufficient to protect genetic information. Multi-factor authentication (MFA) can make access control more robust: by requiring multiple stages of authentication, organizations can reduce the risk of unauthorized use of DNA storage systems. However, existing research focuses predominantly on single-party scenarios, leaving key problems unresolved in collaborative settings, including secure multi-party access, efficient encoding and decoding, and biocompatibility. Against this backdrop, the "Synthetic Biology" key special project of the National Key R&D Program of China funded the Young Scientist Project "Research on Secure Multi-party Access Methods for Synthetic Genetic Information." The project proposes MSP-DNA, a DNA storage system that combines a hybrid symmetric-asymmetric encryption and encoding architecture with a biocompatibility design for engineered strains, achieving for the first time dynamic data management and enhanced biosafety in multi-party collaborative scenarios.
By establishing a Merkle-DAG-based incremental storage model, the system reduces gene editing operations by over 90%. The BO-DNA encoding algorithm developed in the project uses chaotic mapping optimization to substantially suppress non-specific hybridization errors and achieves a storage density of 1.77 bits per nucleotide (nt). Coupled with a CRISPR-Cas/Argonaute dual-nuclease authentication platform, the system retrieves information at 0.1 fM sensitivity. The results show that data stability in the engineered actinomycete chassis under extreme environments is substantially improved. The encryption method effectively resists multiple cryptographic attacks and can correct some of the errors introduced during DNA storage. This research provides key technological foundations for next-generation bio-electronic hybrid storage infrastructure, addressing the joint challenge of long-term "cold data" preservation and real-time "hot data" access.
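
The incremental idea behind a Merkle-DAG store can be illustrated with a minimal Python sketch. It is not the MSP-DNA implementation: the `MerkleDagStore` class, the 4-byte chunk size, and the choice of SHA-256 are assumptions made purely for illustration, and each newly stored object stands in for one write (synthesis or editing) operation.

```python
import hashlib
import json


def digest(data):
    """Content address: SHA-256 hex digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


class MerkleDagStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}  # maps hash -> stored bytes

    def put(self, data):
        """Store data under its hash; report whether it was new."""
        key = digest(data)
        is_new = key not in self.objects
        if is_new:
            self.objects[key] = data
        return key, is_new

    def commit(self, payload, chunk_size=4):
        """Store payload as leaf chunks plus a root node listing their hashes.

        Returns (root_hash, number_of_new_objects_written).
        """
        new_writes = 0
        child_hashes = []
        for i in range(0, len(payload), chunk_size):
            key, is_new = self.put(payload[i:i + chunk_size])
            child_hashes.append(key)
            new_writes += int(is_new)
        # The root is itself content-addressed, so any change in a child
        # produces a different root hash.
        root, is_new = self.put(json.dumps(child_hashes).encode())
        new_writes += int(is_new)
        return root, new_writes

    def read(self, root):
        """Reassemble a payload by following the root's child hashes."""
        child_hashes = json.loads(self.objects[root])
        return b"".join(self.objects[h] for h in child_hashes)


store = MerkleDagStore()
v1 = b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHH"  # 8 chunks of 4 bytes
v2 = b"AAAABBBBCCCCXXXXEEEEFFFFGGGGHHHH"  # only the 4th chunk differs
root1, writes1 = store.commit(v1)
root2, writes2 = store.commit(v2)
print(writes1, writes2)
```

In this toy run the second commit writes 2 new objects (one changed chunk and a new root) instead of 9, because unchanged chunks are shared by hash. The numbers describe this example only, not the reduction reported in the abstract.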

Key words: DNA storage, Multi-party secure collaboration, Chaotic encryption, Merkle-DAG model, Biocompatible encoding, CRISPR-Cas/Argonaute dual nuclease authentication

CLC number: