As a society, we are creating more data than ever before, meaning we're at risk of running out of traditional storage.
As a result, there is a race to create new ways to store our photos and files. Although the concept of storing such data on DNA isn't new, a pair of researchers from Columbia University and the New York Genome Center has now used the technique to store an operating system, movie, and other files on biological molecules.
"DNA has the potential to provide large-capacity information storage," Yaniv Erlich and Dina Zielinski write. "Here we report a storage strategy, called DNA Fountain, that is highly robust and approaches the information capacity per nucleotide." The system was able to store files in oligonucleotides (short DNA molecules) and then retrieve the data.
In total, six files were encoded onto the DNA: an operating system, the film Arrival of a Train at La Ciotat, a $50 Amazon gift card, a computer virus, a Pioneer plaque, and a 1948 academic paper. Although the files were relatively small – totalling 2,146,816 bytes – the paper says the method stored 60 per cent more data on DNA than previous approaches.
Using an algorithm, dubbed DNA Fountain, the academics compressed these six files into a master document and then split the data into binary code. The strings of binary were packaged into 'droplets' and mapped to the nucleotide DNA bases: A, G, C and T. A statement published alongside the work says the algorithm detected letter combinations that create errors and added a barcode to the droplets so they could be reassembled.
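To make the mapping and screening steps more concrete, here is a minimal Python sketch. It is not the researchers' implementation: the two-bits-per-base table, the four-byte barcode, the run-length and GC-content thresholds, and every function name are assumptions chosen for illustration. It also leaves out how the real algorithm selects and combines pieces of the file into each droplet.

```python
# Illustrative sketch only: the mapping, thresholds and names are assumptions,
# not details taken from the article.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def bytes_to_dna(data):
    """Map each pair of bits to one nucleotide base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand):
    """Reverse the mapping: four bases back to one byte."""
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def passes_screen(strand, max_run=3, gc_low=0.45, gc_high=0.55):
    """Reject strands with long single-letter runs or skewed GC content.

    Both checks stand in for the 'letter combinations that create errors';
    the exact limits here are assumed values.
    """
    if not strand:
        return False
    run = 1
    for prev, cur in zip(strand, strand[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    gc_fraction = sum(base in "GC" for base in strand) / len(strand)
    return gc_low <= gc_fraction <= gc_high


def make_droplet(seed, payload):
    """Prefix the payload with a 4-byte barcode (the seed) and encode as DNA."""
    return bytes_to_dna(seed.to_bytes(4, "big") + payload)


def read_droplet(strand):
    """Decode a strand back into its barcode and payload."""
    raw = dna_to_bytes(strand)
    return int.from_bytes(raw[:4], "big"), raw[4:]
```

In a scheme like this, an encoder would keep generating candidate droplets and discard any whose strand fails `passes_screen`, so only sequences that are easy to synthesise and read are sent for synthesis. The barcode travels with each strand, so the decoder can tell which droplet it is holding regardless of the order in which strands are sequenced.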
In the research paper, the academics added that they have achieved a "perfect decoding" of the data and that within coming decades "DNA might become an economically viable solution for long-term, high-latency storage". Overall, 72,000 DNA strands were created, with startup Twist Bioscience converting the digital data into DNA. To retrieve it, the scientists used DNA sequencing techniques.
The team also looked at the potential for the system. "Finally, we explored the limit of our architecture in terms of bytes per molecule and obtained a perfect retrieval from a density of 215 petabytes per gram of DNA, orders of magnitude higher than previous reports." However, before any DNA data storage system can be created and used in computer systems, the cost needs to be reduced. The DNA synthesis used in the system cost the researchers $3,500 per megabyte.
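For a rough sense of scale, the figures quoted in this article can be combined in a few lines of Python. This is back-of-the-envelope arithmetic, not a calculation from the paper. It assumes a megabyte means 1,000,000 bytes and that the quoted synthesis price applies evenly to the stored files; the variable names are purely illustrative.

```python
# Figures quoted in the article.
TOTAL_BYTES = 2_146_816        # combined size of the six files
STRANDS = 72_000               # DNA strands created
COST_PER_MEGABYTE = 3_500      # US dollars for DNA synthesis

# Assumption: 1 megabyte = 1,000,000 bytes.
BYTES_PER_MEGABYTE = 1_000_000

megabytes_stored = TOTAL_BYTES / BYTES_PER_MEGABYTE
estimated_synthesis_cost = megabytes_stored * COST_PER_MEGABYTE
average_bytes_per_strand = TOTAL_BYTES / STRANDS

print(f"Data stored: {megabytes_stored:.2f} MB")
print(f"Estimated synthesis cost: ${estimated_synthesis_cost:,.0f}")
print(f"Average file data per strand: {average_bytes_per_strand:.1f} bytes")
```

On these assumptions, synthesising roughly two megabytes comes to about $7,500, and each strand carries on average about 30 bytes of the original files.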