DNA can store digital data, such as visual and audio files
Science Picture Co / Alamy
Artificial intelligence can read data stored in DNA strands within 10 minutes, rather than the days that previous methods required, bringing DNA storage closer to practical use in computing.
“DNA can store vast amounts of data in an extremely compact form and remain intact for thousands of years,” says Daniella Bar-Lev at the University of California, San Diego. “Additionally, DNA is naturally replicable, offering a unique advantage for long-term data preservation.”
But retrieving the information encoded within DNA is a monumental challenge because the strands are mixed and jumbled together when stored. During the data-encoding process, individual strands are sometimes replicated imperfectly, and some fragments may be lost entirely. As a result, reading data stored in DNA can resemble reconstructing a book from a box filled with shredded, typo-ridden pages.
“Traditional methods struggle with this chaos, requiring days of processing,” says Bar-Lev. The new approach “streamlines this with AI trained to spot patterns in the noise”, she says.
Bar-Lev and her colleagues developed an AI-powered method called DNAformer that can quickly and accurately decode jumbled DNA sequences. The system includes a deep learning AI model trained to reconstruct DNA sequences, a separate computer algorithm that identifies and corrects errors and a third decoding algorithm that converts everything back into digital data while fixing any remaining mistakes.
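To make the three stages concrete, here is a toy sketch in Python. It is not DNAformer: a simple per-position majority vote stands in for the deep learning model, the two-bits-per-base mapping and the function names `encode`, `consensus` and `decode` are assumptions made for illustration, and it only handles substitution errors in equal-length copies. Real reads can also gain or lose bases, which is part of what makes the actual problem so much harder.

```python
from collections import Counter

# Assumed mapping for illustration: two bits per base (A=00, C=01, G=10, T=11).
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def consensus(reads: list[str]) -> str:
    """Rebuild one strand from noisy copies by majority vote at each position.

    Assumes every read has the same length and contains substitutions only.
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


def decode(strand: str) -> bytes:
    """Turn every group of four bases back into one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


original = encode(b"Hi")
# Three hypothetical noisy copies, each with one wrong base in a different place.
noisy_reads = ["CTGACGGC", "CAGACGGA", "CAGTCGGC"]
recovered = decode(consensus(noisy_reads))
print(original, recovered)
```

Because each error lands at a different position, the other two copies outvote it and the original bytes come back intact. The sketch leaves out the error-correcting code that a real system would layer on top to catch mistakes the vote misses.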
In experiments, DNAformer could read 100 megabytes of DNA-stored data nearly 90 times faster than the next fastest method – which was developed with traditional, rules-based computing algorithms – while achieving better or comparable accuracy. The decoded data included a coloured image of test tubes, a 24-second audio clip of astronaut Neil Armstrong’s famous moon landing speech and written text about why DNA is a promising data storage medium.
The team plans to develop versions of DNAformer tailored to newer techniques for encoding data into DNA, says Omer Sabary at Technion – Israel Institute of Technology.
“Crucially, because our approach does not rely on specific [DNA] synthesis or sequencing methods, it can be adapted to future, as-yet-undeveloped technologies that may be more commercially viable,” he says.