Nebula Genomics’ Blockchain-Based DNA Data-sharing and Analysis Platform

One of the most powerful aspects of blockchain technology is that it is generic and not limited to a few vertical applications. Recent examples reported here on Block Explorer News from widely differing fields include using a blockchain to store publications, to manage real-time drone flight data, and as the basis of a mobile voting platform. To that growing list can be added the important domain of personal genomics – the sequencing and analysis of an individual’s DNA – thanks to a new service from the company Nebula Genomics:

“We will spur genomic data growth by significantly reducing the costs of personal genome sequencing, enhancing genomic data protection, enabling buyers to efficiently acquire genomic data, and addressing the challenges of genomic big data. We will accomplish this through decentralization, cryptography, and utilization of the blockchain.”

The potential of personal genomics has been clear for some time. The hope is that sequencing the DNA of many individuals will improve the diagnosis and treatment of existing diseases, and reveal ways to prevent future health problems. Personal genomics potentially allows personalized therapies, tailored precisely to individuals on the basis of their DNA. And by combining millions of genomes it will be possible to understand diseases better, and come up with new drugs to treat them.

Current approaches have significant problems. The cost of sequencing an individual’s complete DNA has dropped significantly in recent years to around $1,000, and is expected to fall below $100 in the near future. However, the equipment required is still expensive, which means that sequencing is typically carried out by a few large organizations. These organizations not only store the results in a central database, which represents a security risk for such sensitive information, but also typically retain ownership of the sequence data. Financial benefits from discoveries made using the genomic data also generally stay with the organizations or companies that hold the DNA. Nebula Genomics hopes to address all those problems using blockchain technology.

The central idea of the company’s approach is that individuals retain ownership and control of their sequenced genomic data, but sell access to it in a secure way. All data-sharing records are stored immutably in the Nebula blockchain, which is based on Ethereum and plays a key role in mediating transactions between individuals and the companies that wish to use their genomic data. These buyers will typically be pharmaceutical and biotech companies, which currently buy DNA data in bulk from existing genomics companies or set up their own sequencing programs.
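
To make the idea of immutably stored data-sharing records more concrete, here is a minimal Python sketch of a hash-chained, append-only log. It is not how Ethereum or the Nebula blockchain is implemented; the record fields (`owner`, `buyer`, `dataset`) and the function names are hypothetical, and the only point is that each entry commits to the one before it, so a later change to any record becomes detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def record_hash(record: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the previous entry."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain: list, record: dict) -> None:
    """Append a data-sharing record that commits to the current chain tip."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev,
                  "hash": record_hash(record, prev)})


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != record_hash(entry["record"], prev):
            return False
        prev = entry["hash"]
    return True


# Hypothetical records: pseudonymous addresses, no personal information.
log = []
append_record(log, {"owner": "addr-1", "buyer": "pharma-A", "dataset": "genome"})
append_record(log, {"owner": "addr-2", "buyer": "pharma-A", "dataset": "survey"})
print(verify_chain(log))
```

A real blockchain adds consensus among many nodes on top of this structure, which is what prevents a single party from simply rewriting the whole chain.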

Using the Nebula network, companies would buy tokens from Nebula Genomics with fiat money and use them to pay individuals for access to their DNA. While sharing data and receiving payments, individuals remain pseudonymous: Nebula network addresses are cryptographic identifiers that are not associated with any personal information. Individuals would in turn use the tokens to pay Nebula Genomics for the genome sequencing. In addition, companies could pay individuals tokens for completing surveys that provide health information to be used alongside their genomic data. The use of Ethereum smart contracts allows companies to create customized surveys:

“data buyers may choose to pay all survey participants an equal amount of Nebula tokens or alternatively define different token amounts that will be awarded for different combinations of responses. For example, if a survey participant is found to be affected by a condition that is of interest to the data buyer, the highest token reward will be automatically paid out. Responses that suggest that the survey participant is not affected by the condition in question will trigger a lower token payment. Contradictory responses indicating dishonesty will not be rewarded.”
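
The payout rules in that passage can be modeled in a few lines. The following Python sketch is an illustration of the logic only, not Nebula's smart contract code (which would run on Ethereum): the `RewardRule` class, the two question names and the token amounts are all assumptions, and "contradictory" is modeled simply as a main question and a consistency-check question that disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardRule:
    """Token amounts set by a (hypothetical) data buyer for one survey."""
    affected: int    # highest reward: participant has the condition of interest
    unaffected: int  # lower reward: participant does not have the condition


def survey_payout(answers: dict, rule: RewardRule) -> int:
    """Return the tokens owed for one participant's answers.

    `answers` holds two booleans (assumed question names):
      has_condition             - "Do you have condition X?"
      check_never_had_condition - "Have you never had condition X?"
    Consistent answers give opposite values for the two questions.
    """
    reports_condition = answers["has_condition"]
    denies_in_check = answers["check_never_had_condition"]

    # Contradictory responses are not rewarded.
    if reports_condition == denies_in_check:
        return 0
    return rule.affected if reports_condition else rule.unaffected


rule = RewardRule(affected=10, unaffected=2)
print(survey_payout({"has_condition": True,
                     "check_never_had_condition": False}, rule))
```

Paying every participant the same amount, the other option the passage mentions, corresponds to setting `affected` and `unaffected` to the same value.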

The main Nebula network is built on top of the Blockstack framework: “an open-source effort to re-decentralize the internet; it builds a new internet for decentralized applications and enables users to own their application data directly.” It, too, depends on blockchain technology for critical aspects:

“Identity is user-controlled and utilizes the blockchain for secure management of keys, devices and usernames. When users login with apps, they are anonymous by default and use an app-specific key, but their full identity can be revealed and proven at any time. Keys are for signing and encryption and can be changed as devices need to be added or removed.”

“Under the hood, Blockstack provides a decentralized domain name system (DNS), decentralized public key distribution system, and registry for apps and user identities.”
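
The app-specific keys mentioned in that description can be illustrated with a generic key-derivation sketch. Blockstack's actual derivation scheme is not described in this article, so the code below is an assumption: it uses a plain HMAC-SHA256 over the app's domain name, and the function name `derive_app_key` and the example domains are hypothetical. It shows why separate apps cannot link the same user by key, while the user, who holds the master secret, can always re-derive any of them.

```python
import hashlib
import hmac


def derive_app_key(master_secret: bytes, app_domain: str) -> bytes:
    """Derive a 32-byte key that is unique to one app.

    Assumption: HMAC-SHA256 keyed with the user's master secret,
    applied to the app's domain name. Without the master secret, keys
    for different apps cannot be linked to each other.
    """
    return hmac.new(master_secret, app_domain.encode("utf-8"),
                    hashlib.sha256).digest()


# Hypothetical master secret; in practice this would come from secure
# random generation and never leave the user's devices.
master = b"example-master-secret-not-for-real-use"

key_a = derive_app_key(master, "dna-analysis.example")
key_b = derive_app_key(master, "survey-app.example")

print(key_a.hex())
print(key_b.hex())
```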

The user-centric approach of Blockstack fits well with the philosophy behind Nebula Genomics. As a result, individuals not only retain control of their sequence data but are free to store it on any service that supports the Blockstack storage system. This portability ensures that users of the Nebula platform are not locked into the company, and can use their data outside the Nebula network. Nebula’s DNA software will be available as a Blockstack distributed app that is executed locally on a user’s personal data, allowing individuals to analyze their own DNA.

When other companies are granted permission by individuals to use personal genomic data within the Nebula Genomics framework, there are additional privacy safeguards. Once access to an individual’s DNA sequence has been purchased and recorded in the Nebula blockchain, the data is sent in an encrypted form to special compute nodes, which use two advanced technologies to protect sensitive personal data: Intel’s Software Guard Extensions (SGX) and homomorphic encryption.

SGX operates by allocating a region of hardware-protected memory, called an enclave, where code and data reside. By restricting processing of genomic data to enclaves on compute nodes, the risk of privacy loss is reduced. SGX can be combined with homomorphic encryption of DNA sequences to speed up certain operations. Homomorphic encryption allows data to be pre-processed without needing to decrypt it. It is then passed to an enclave for decryption and analysis. The Blockstack underpinning means that secure compute nodes can be operated by Nebula Genomics, by companies that have bought access to an individual’s genomic data, or by any third party that complies with the overall architecture.
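
To show what "pre-processed without needing to decrypt it" can mean, here is a toy Python sketch of additively homomorphic encryption using the Paillier cryptosystem. The article does not say which homomorphic scheme Nebula uses, so Paillier is an assumption chosen for simplicity; the primes are tiny and hard-coded, which makes this completely insecure and suitable only as an illustration. The example adds up encrypted per-variant allele counts (hypothetical values) without ever seeing them in the clear.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (generator g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


def decrypt(n: int, private_key, c: int) -> int:
    """Recover the plaintext; only the key holder (e.g. an enclave) can."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


# Toy primes, far too small for real use.
n, private_key = keygen(1009, 1013)

# Hypothetical risk-allele counts (0, 1 or 2) at four variant positions.
allele_counts = [1, 0, 2, 1]
ciphertexts = [encrypt(n, a) for a in allele_counts]

# An untrusted node sums the counts while they stay encrypted.
encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, c)

# Decryption would happen only where the private key lives.
print(decrypt(n, private_key, encrypted_total))
```

In the architecture described above, the final decryption step is the part that would take place inside an SGX enclave.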

As the above indicates, Nebula Genomics’ platform addresses several key challenges involving the storage and processing of highly personal data in a way that leaves the individual in control. Since similar problems exist across most industries, this suggests that these kinds of blockchain-based frameworks could be applicable far beyond the world of DNA sequencing and analysis.

Featured image by Nebula Genomics.

Glyn Moody

Glyn Moody is a freelance journalist who writes and speaks about privacy, surveillance, digital rights, open source, copyright, patents and general policy issues involving digital technology. He started covering the business use of the Internet in 1994, and wrote the first mainstream feature about Linux, which appeared in Wired in August 1997. His book, "Rebel Code," is the first and only detailed history of the rise of open source, while his subsequent work, "The Digital Code of Life," explores bioinformatics - the intersection of computing with genomics.
