Twist Bioscience, Illumina and Western Digital form alliance with Microsoft to advance data storage in DNA

Twist Bioscience, Illumina and Western Digital announced the formation of an alliance with Microsoft to advance the field of DNA data storage. These founding companies, alongside member organizations, will work together to create a comprehensive industry roadmap that will help the industry achieve interoperability between solutions and help establish the foundations for a cost-effective commercial archival storage ecosystem for the explosive growth of digital data.

“DNA is an incredible molecule that, by its very nature, provides ultra-high-density storage for thousands of years,” said Emily M. Leproust, Ph.D., CEO and co-founder of Twist Bioscience. “By joining with other technology leaders to develop a common framework for commercial implementation, we drive a shared vision to build this new market solution for digital storage.”

DNA data storage has the potential to deliver a true low-cost archival data storage solution. While current storage technologies have limited longevity and require data migration for long-term data storage, DNA provides a stable storage medium that is durable for thousands of years when properly stored. In addition, DNA enables cost-effective and rapid duplication. Importantly, it is incredibly dense, with 10 full-length digital movies fitting into a volume the size of a single grain of salt. Digital data stored in DNA can be kept in a variety of containers, including capsules and pellets, or encased in glass beads.

“At Microsoft Research, we proactively address the future challenges of technology, with sustainability in mind,” commented Karin Strauss, Ph.D., senior principal research manager at Microsoft. “In collaboration with University of Washington, we have demonstrated a fully automated end-to-end system capable of storing and retrieving data from DNA, and we have separately stored 1GB of data in DNA synthesized by Twist and recovered data from it. We’re encouraged by the potential for more sustainable data storage with DNA and look forward to collaborating with others in the industry to explore early commercialization of this technology.”

By 2024, 30% of digital businesses will mandate DNA storage trials, addressing the exponential growth of data poised to overwhelm existing storage technology.

“There is an unmet need for a new long-term archival storage medium that keeps up with the rate of digital data growth,” said Steffen Hellmold, vice president corporate strategic initiatives, Western Digital. “We estimate that almost half of the data storage solutions shipped in 2030 will be used to archive data as the overall temperature of data is cooling down. We are committed to providing a full portfolio of storage solutions addressing the demand for hot, warm and cold storage.”

“A key component of a DNA data storage system is its ability to read back the digital information when needed,” stated Alex Aravanis, M.D., Ph.D., chief technology officer at Illumina. “We believe Illumina’s innovative sequencing technology will be critical in enabling this market at commercial scale and look forward to collaborating with other leaders in their respective fields to make this a viable, long-term solution for archival storage.”

Twist Bioscience, Illumina, Western Digital and Microsoft are joining the Alliance as founding members. In addition to developing an industry roadmap, the DNA Data Storage Alliance plans to develop use cases in various markets and industries, and to educate the larger storage community in order to promote adoption of this future solution. The following organizations have joined the Alliance as members:

The Alliance welcomes additional participation from organizations within DNA and data storage industries that would like to contribute to this emerging ecosystem.

How to Store Digital Data in DNA

To store data in DNA, a data file is first converted from its digital sequence of 0’s and 1’s into a DNA sequence of A’s, C’s, T’s and G’s. The DNA data file is then synthesized (“written”) in short segments of DNA (200 to 300 bases long) and stored. In addition to storing part of the data file, each short segment contains an index to indicate its place within the overall data file. To retrieve the data, the segments are sequenced (“read”) and then decoded back into the original file. One feature of the indexing system is that it allows part of the file to be biologically recovered (“random access”) before sequencing, so only data of interest is sequenced. In addition, all data is recovered error-free because error-correcting algorithms are used during the encode/decode process.
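
The encode, index and decode steps described above can be illustrated with a short Python sketch. It is a simplified teaching example, not the codec used by any Alliance member: the 2-bits-per-base mapping, the 2-byte index and the 48-byte payload (chosen so that a full segment is 200 bases long) are all assumptions made here for illustration. It also omits error correction and the biological random-access step, which the text above says real systems rely on.

```python
# Illustrative sketch only: a naive mapping, NOT any vendor's actual encoding scheme.
BASES = "ACGT"        # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T
INDEX_BYTES = 2       # hypothetical index size (position of the segment in the file)
PAYLOAD_BYTES = 48    # hypothetical payload size: (2 + 48) bytes * 4 bases = 200 bases


def bytes_to_bases(data: bytes) -> str:
    """Convert each byte into four bases, two bits per base."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases: every four bases become one byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def encode(data: bytes) -> list[str]:
    """Split a file into short segments, each prefixed with its index ("write")."""
    segments = []
    for n, start in enumerate(range(0, len(data), PAYLOAD_BYTES)):
        chunk = data[start:start + PAYLOAD_BYTES]
        segments.append(bytes_to_bases(n.to_bytes(INDEX_BYTES, "big") + chunk))
    return segments


def decode(segments: list[str]) -> bytes:
    """Rebuild the file from segments read in any order ("read")."""
    chunks = {}
    for seg in segments:
        raw = bases_to_bytes(seg)
        chunks[int.from_bytes(raw[:INDEX_BYTES], "big")] = raw[INDEX_BYTES:]
    return b"".join(chunks[i] for i in sorted(chunks))


if __name__ == "__main__":
    message = b"DNA data storage" * 10
    strands = encode(message)
    # Segments in a pool have no inherent order; the index restores it.
    assert decode(list(reversed(strands))) == message
    print(len(strands), "segments; first segment:", strands[0])
```

Because each segment carries its own index, the segments can be read back in any order and still be reassembled. A production system would add error-correcting codes and would typically constrain the sequences (for example, avoiding long runs of the same base), neither of which is attempted in this sketch.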