| Feb 25, 2025 |
Bacteria encapsulated in microspheres create living, searchable DNA data storage system |
| (Nanowerk Spotlight) The exponential growth of global data is outpacing conventional storage technologies, driving interest in alternative approaches. DNA offers an exceptionally high theoretical storage density, surpassing silicon-based methods by several orders of magnitude. However, despite its advantages in longevity and stability, high synthesis costs remain a major barrier to widespread adoption. |
| Advances in DNA nanotechnology have enabled more sophisticated approaches to DNA-based storage, but challenges remain in both preservation and retrieval efficiency. Current DNA storage systems can be broadly categorized into in vitro methods, which protect DNA at the molecular level, and in vivo methods, which embed DNA into living cells. Each method has distinct limitations, and efficient random access retrieval of specific data from large DNA databases remains a major challenge. |
| Researchers at Tsinghua University have now developed a method that transforms bacterial cells into searchable, long-lasting "files" through a system they call Engineered Living Memory Microspheroids (ELMMs). This approach, detailed in Advanced Materials ("Engineered Living Memory Microspheroid-Based Archival File System for Random Accessible In Vivo DNA Storage"), creates microspheres containing bacteria with data-encoded plasmids that can be sorted, stored at room temperature, and selectively accessed using fluorescence markers. The system combines the advantages of both in vivo and in vitro DNA storage while addressing key limitations that have hindered each approach individually. |
| Writing-storage-access-reading-restore closed-loop workflow for the engineered living memory microspheroid-based archival file system. a) The generic framework for DNA-based storage. b) Our DNA storage system consists of seven important steps: i) Data encoding and synthesizing. Color images are converted to 26 × 26 black and white bitmap files, which are encoded into DNA sequences via established methods and synthesized via a commercial method. ii) Set selected features. Files are categorized into 1–3 feature sets on the basis of plasmid functions in our library, primarily using fluorescent protein expression for easy identification. iii) Information integration and preservation. DNA sequences are integrated into selected plasmid vectors and fabricated into uniform gel microspheres, termed engineered living memory microspheroids (ELMMs), via droplet microfluidics. These ELMMs serve as files, forming a living DNA file database through mixing, centrifugation, and lyophilization. The entire file database can be securely preserved at room temperature in a dry environment. iv) Random access. File retrieval is achieved by rehydrating lyophilized ELMMs in a lysogeny broth (LB) medium and querying via the fluorescence-activated sorting (FAS) method. Target files can be retrieved directly, whereas nontargeted files can be recycled back into the database. v) Copying. The target files can be cultured, and single colonies can emerge on the culture plates. Some bacteria are refabricated into microspheroids via microfluidics to replenish the original database. vi) Reading and vii) Decoding. The remaining bacteria undergo Sanger sequencing followed by a decoding process to obtain the original data, completing the write‒store‒access‒read‒restore cycle. (Image: Reprinted with permission by Wiley-VCH Verlag) |
| Previous DNA storage systems have primarily followed two separate paths—keeping DNA molecules in protective containers (in vitro) or incorporating them into living organisms (in vivo). In vitro methods excel at preservation but struggle with efficient data retrieval, while in vivo storage enables better data manipulation but requires continuous cultivation and risks environmental release of genetically modified microorganisms. The ELMM system bridges this divide by physically encapsulating bacteria in hydrogel microspheres, creating standardized units that can be preserved without refrigeration and accessed with precision. |
| "Unlike conventional in vivo storage methods that rely on co-cultivating bacterial strains, this unit facilitates the physical encapsulation and isolation of various GMMs, supporting DNA storage at room temperature and reducing energy consumption," explain the researchers, led by Prof. Zhuo Xiong. "Additionally, it allows for direct retrieval of specific files or subsets from a database through physical sorting." |
| The researchers demonstrated their system by encoding five simple black and white images as DNA sequences, each tagged with descriptive labels like "yellow," "animal," or "aquatic." These tags correspond to specific fluorescent proteins expressed by the bacterial cells—essentially creating a key-value pairing system where the cell's visual properties indicate the data it contains. |
| To construct these living memory units, the team integrated synthesized DNA sequences into plasmids containing genes for fluorescent proteins like mCherry and EGFP. They transformed bacteria with these plasmids and encapsulated them in hydrogel microspheres using droplet microfluidics—a process completed in just five minutes, dramatically faster than traditional encapsulation methods that can require days. |
| This speed represents a significant advancement over previous approaches. For instance, earlier methods using silica microspheres needed up to four days for encapsulation, and extracting information from these containers proved equally complex. The ELMM system also avoids a critical flaw in silica-based methods: degradation of surface barcodes that frequently leads to retrieval failures. |
| When data access is needed, the freeze-dried microspheres are rehydrated and sorted using fluorescence-activated sorting (FAS). This technique allows direct physical separation of files based on their fluorescent signatures, with unselected files recycled back into the database. The selected bacteria can then be cultured to produce multiple copies—some refabricated into new microspheres to replenish the database, others sequenced to read the stored information. |
| The team tested this retrieval system with increasingly challenging scenarios. Even when mixing just 10 target files with 10,000 non-target files, they achieved over 50% sorting accuracy in a single pass. More sophisticated retrieval operations using Boolean logic combinations like "NOT yellow," "yellow AND animal," and multi-channel sorting also demonstrated high precision. |
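The Boolean queries described above can be pictured as predicates over each microsphere's fluorescence signature. The following is a minimal sketch of that idea, not the researchers' implementation: the channel names, the tag-to-channel assignment, and the five file names are all hypothetical, since the article does not say which fluorescent protein encodes which tag. It also assumes ideal sorting, with every microsphere classified correctly.

```python
from dataclasses import dataclass

# Hypothetical assignment of descriptive tags to fluorescence channels.
TAG_CHANNEL = {"yellow": "channel_1", "animal": "channel_2", "aquatic": "channel_3"}


@dataclass(frozen=True)
class Microsphere:
    file_name: str       # hypothetical label for the stored image
    channels: frozenset  # fluorescence channels that read as "on"


def has(tag):
    """Predicate: the microsphere fluoresces in the channel assigned to `tag`."""
    channel = TAG_CHANNEL[tag]
    return lambda m: channel in m.channels


def NOT(p):
    return lambda m: not p(m)


def AND(p, q):
    return lambda m: p(m) and q(m)


def sort_pool(pool, query):
    """Split a pool into (retrieved, recycled), mimicking one ideal sorting pass."""
    retrieved = [m for m in pool if query(m)]
    recycled = [m for m in pool if not query(m)]
    return retrieved, recycled


# Hypothetical database of five files with made-up signatures.
pool = [
    Microsphere("duck", frozenset({"channel_1", "channel_2", "channel_3"})),
    Microsphere("banana", frozenset({"channel_1"})),
    Microsphere("cat", frozenset({"channel_2"})),
    Microsphere("whale", frozenset({"channel_2", "channel_3"})),
    Microsphere("house", frozenset()),
]

yellow_animals, recycled = sort_pool(pool, AND(has("yellow"), has("animal")))
not_yellow, _ = sort_pool(pool, NOT(has("yellow")))

print([m.file_name for m in yellow_animals])  # ['duck']
print([m.file_name for m in not_yellow])      # ['cat', 'whale', 'house']
```

Returning the non-matching microspheres alongside the matches mirrors the recycling step, in which unselected files go back into the database.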
| A key advantage of the ELMM approach lies in its storage stability. After lyophilization (freeze-drying), the microspheres remained viable during three months of room temperature storage. Upon rehydration, they retained over 97% of their original fluorescence and perfectly preserved the encoded information. The system even withstood seven consecutive lyophilization-rehydration cycles while maintaining retrievable data and consistent physical properties essential for accurate sorting. |
| The theoretical capabilities of this system are substantial, with the potential to sort and retrieve an extremely large number of distinct file types. The researchers calculate a theoretical maximum retrieval speed of 196.72 MB/s from the system's DNA storage capacity and sorting rate. Each ELMM stores 254,886 bp of DNA, with an encoding density of 0.125 bytes per base pair and an effective information percentage (EIP) of 90%, yielding a file size of approximately 28.7 KB per ELMM. The sorting speed is 70,000 ELMMs per second. |
| However, in practice, the team reports experimental retrieval speeds of 550,375 bytes per second (≈0.55 MB/s) due to real-world inefficiencies, such as sorting losses and system overhead. |
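The figures quoted above can be checked with a few lines of arithmetic. This sketch uses only numbers from the article and assumes 1 KB = 1,000 bytes and 1 MB = 1,000,000 bytes; the variable names are illustrative. Note that simply multiplying the per-ELMM payload by the sorting rate gives roughly 2,007 MB/s, about ten times the quoted 196.72 MB/s. The article does not state what accounts for the difference, so the sketch reports both values rather than guessing at the missing factor.

```python
# Figures quoted in the article.
BP_PER_ELMM = 254_886          # base pairs of DNA stored per ELMM
BYTES_PER_BP = 0.125           # encoding density (equivalent to 1 bit per base pair)
EIP = 0.90                     # effective information percentage
SORT_RATE_PER_S = 70_000       # ELMMs sorted per second
QUOTED_MAX_MB_S = 196.72       # theoretical maximum quoted in the article
EXPERIMENTAL_BYTES_S = 550_375 # experimental retrieval speed

# Payload per ELMM: raw capacity scaled by the effective information percentage.
payload_bytes = BP_PER_ELMM * BYTES_PER_BP * EIP
print(f"payload per ELMM: {payload_bytes / 1000:.1f} KB")  # 28.7 KB

# Direct product of payload and sorting rate (assumes 1 MB = 1e6 bytes).
direct_mb_s = payload_bytes * SORT_RATE_PER_S / 1e6
print(f"payload x sorting rate: {direct_mb_s:.1f} MB/s")

# How the direct product compares with the quoted theoretical maximum.
ratio_to_quoted = direct_mb_s / QUOTED_MAX_MB_S
print(f"direct product / quoted maximum: {ratio_to_quoted:.1f}x")

# Experimental speed as a fraction of the quoted theoretical maximum.
experimental_fraction = EXPERIMENTAL_BYTES_S / (QUOTED_MAX_MB_S * 1e6)
print(f"experimental / quoted maximum: {experimental_fraction:.2%}")
```

On these numbers, the experimental speed is under 1% of the quoted theoretical maximum, which gives a sense of how much the real-world inefficiencies mentioned above currently cost.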
| "By utilizing N optical channels, to retrieve 2^N file types, each with a minimum of 10 copies, ELMM offers a digital-to-biological information solution, ensuring the preservation, access, replication, and management of files within large-scale DNA databases," the researchers write. |
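The file-type count in this quote follows if each optical channel is read as simply on or off, so that N channels distinguish 2^N signatures. The sketch below enumerates those signatures under that binary-readout assumption. The helper name is hypothetical, and mCherry and EGFP stand in as the two channels only because the article names them.

```python
from itertools import product


def fluorescence_signatures(channels):
    """Return every on/off combination of the given channels (2**N in total)."""
    return [
        dict(zip(channels, states))
        for states in product((False, True), repeat=len(channels))
    ]


# Two channels named in the article; binary on/off readout is an assumption.
signatures = fluorescence_signatures(["mCherry", "EGFP"])
for signature in signatures:
    print(signature)

MIN_COPIES = 10  # minimum copies per file type, from the quote

# Distinct file types and minimum microsphere count for a few channel counts.
for n in (2, 3, 4, 8):
    file_types = 2 ** n
    print(n, file_types, file_types * MIN_COPIES)
```

The number of addressable file types doubles with each added channel, while the minimum microsphere count grows in step at ten copies per type.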
| Despite these advances, the current implementation has limitations. The storage density, while exceeding hard disk drives by approximately two orders of magnitude, remains below the theoretical maximum for bare DNA storage. The researchers acknowledge environmental concerns about genetically modified organisms, though the encapsulation strategy significantly reduces these risks by physically containing the bacteria. |
| Future improvements could incorporate magnetic particles or other functional materials into the matrix, refine the microsphere structure, or leverage advances in synthetic biology and artificial intelligence to enable more sophisticated biological information manipulation. The researchers also envision standardized interfaces between digital and biological information systems to create integrated hybrid storage solutions. |
| As information creation continues to outpace conventional storage capabilities, the ELMM system offers a promising approach to long-term data preservation that reduces energy consumption while enabling precise retrieval. The technology represents an important link between digital data and biological storage, potentially paving the way for sustainable management of rarely accessed but critically important information in an era of yottabyte-scale challenges. |
By Michael Berger
Michael is the author of four books by the Royal Society of Chemistry: Nano-Society: Pushing the Boundaries of Technology (2009), Nanotechnology: The Future is Tiny (2016), Nanoengineering: The Skills and Tools Making Technology Invisible (2019), and Waste not! How Nanotechnologies Can Increase Efficiencies Throughout Society (2025)
Copyright © Nanowerk LLC