An Exploit to Combat Bringing the Dinosaurs Back

Scientists at the University of Washington have pioneered a way to use DNA to store and transmit malware.

If your initial reaction is “What!!??”, you’re not alone.

The scientists, Peter Ney, Karl Koscher, Lee Organick, Luis Ceze and Tadayoshi Kohno, started with the four basic building blocks of our double-helix blueprint: the nucleobases known as thymine (T), adenine (A), guanine (G) and cytosine (C). If you reach deep and remember your high school biology class, you’ll remember that each nucleobase, attached to a sugar and a phosphate group, forms a nucleotide, and that chains of nucleotides make up DNA strands. The scientists simply linked this genetic code to binary code by assigning a two-bit pattern of 1s and 0s to each of the four nucleobases. And hey presto! Here’s some computer-executable malware, encoded into DNA.
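To make that mapping concrete, here is a minimal Python sketch of how bytes could be turned into a string of bases and back. It assumes a two-bits-per-base scheme (A=00, C=01, G=10, T=11); that assignment and the function names `bytes_to_dna` and `dna_to_bytes` are illustrative assumptions, not necessarily the researchers' own tooling, and the sketch ignores the practical constraints of synthesizing real DNA.

```python
# Assumed mapping for illustration: two bits per nucleobase.
BITS_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand (length a multiple of four) back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = bytes_to_dna(b"hi")
    print(strand)                # the two bytes as eight bases
    print(dna_to_bytes(strand))  # round-trips to the original bytes
```

Under this scheme every byte costs four bases, so even a small payload becomes a fairly long strand.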

All too easy, my friends (well not really).

So what does it mean? Can a strand of hair or drop of saliva set off a world-wide ransomware crisis?

Well, no—I say that with a tinge of regret, because while that would be horrible, it’s so futuristic and sci-fi that it appeals to the Neuromancer fan in me.

As we reported elsewhere, at the heart of this is the sequencing software that geneticists use to read and convert DNA sequences into computer files that can be analyzed.

“The modern DNA sequencing and analysis pipeline is large, complicated and computationally-intensive,” the researchers said in their paper. “DNA is pre-processed in a wet lab and analyzed with a high-throughput sequencer (itself a computer) that performs image analysis. It is then common to conduct a wide range of computational tasks with the raw output from the sequencer using many software utilities.”

The researchers, using an altered version of that sequencing software that contains a known vulnerability, experimented to see if they could use synthetic DNA samples to compromise the system and infect it with malware.

The first step was to synthesize a real, biological DNA sequence embedded with a malicious exploit. Then it would be sent through the regular sequencing and analytics process. A successful exploit would result in arbitrary code execution.

As for how the bad guys could use this, they would first have to uncover a vulnerability in the target system, then write a specific malicious exploit for it that could be embedded within a DNA sample. Then they would have to create that DNA sample and then have the lab sequence and analyze it in order to execute the code.

That’s a lot of specialized steps (and the team would need a working knowledge of genetics), so this isn’t for the script kiddies. But a successful compromise would mean gaining the ability to sabotage sequencing efforts, a potential competitive tool when it comes to medical research, or the ability to identify the species of a sample, which could be commercially sensitive in domains like drug discovery.

It could also be useful if someone wanted to, say, disrupt a plan to bring the dinosaurs back to life and house them on a jungle island so children can come visit. We should know by now that would probably have some bad consequences, right? No? Seriously, the answer to that question is no?

I’m curious whether this could be used to invade individuals’ privacy like never before, by siphoning off their genetic code. Could perpetrators gain access to individuals’ information, such as which cancers someone is susceptible to, making it theoretically possible to build a long-con murder plot around that? At the rate that we’re all sending off cheek swabs to Ancestry.com and the like, the attack surface seems large.

The researchers discovered a side channel that does leak genetic information. However, of the 235 million bases represented in the target sample, only 16,521 were recoverable (about 0.007% of the data). To put that in context, the human genome contains around 3.2 billion bases. So the answer is: probably not. Whew.
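For anyone who wants to check the arithmetic, here is a quick Python sketch using only the figures quoted above. The helper name `percent` is hypothetical, and the comparison against a full human genome is an illustrative extra rather than a number from the researchers.

```python
# Figures quoted in the article.
RECOVERED_BASES = 16_521
SAMPLE_BASES = 235_000_000
HUMAN_GENOME_BASES = 3_200_000_000  # approximate


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Share of the target sample that leaked through the side channel.
    print(f"{percent(RECOVERED_BASES, SAMPLE_BASES):.4f}% of the sample")
    # Illustrative only: the same number of bases against a whole human genome.
    print(f"{percent(RECOVERED_BASES, HUMAN_GENOME_BASES):.5f}% of a human genome")
```

Either way, the leaked share is a tiny sliver of the data.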
