All Big Data is equal, but some Big Data may be more equal than others

Biosystems Analytics

We are in the era of Big Data in human genomics: a vast treasure trove of information on human genetic variation either is, or soon will be, available. This ranges from older projects such as HapMap and 1000 Genomes to the in-progress 100,000 Genomes Project in the UK. Two technologies have made this possible: massively parallel “next generation” sequencing, in which each individual’s DNA is fragmented and amplified into billions of pieces; and powerful computational algorithms that use these fragments (or “reads”) to identify all the “variants” in each individual – any positions where their sequence differs from the “reference genome”.
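The idea behind variant calling from reads can be sketched very simply. The toy function below (an illustrative sketch, not how any real variant caller works – real tools must handle alignment errors, base quality, diploid genotypes, and far more) piles up aligned reads over a reference sequence and reports positions where the consensus base differs. All names and the example data are invented for illustration.

```python
from collections import Counter

def call_snvs(reference, reads, min_depth=2):
    """Naive SNV calling sketch.

    reference: reference sequence as a string.
    reads: list of (start, sequence) pairs, already aligned to the reference.
    Returns (position, ref_base, observed_base) for each putative SNV.
    """
    pileup = {}  # position -> list of bases observed in reads at that position
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup.setdefault(start + offset, []).append(base)

    snvs = []
    for pos, bases in sorted(pileup.items()):
        if len(bases) < min_depth:
            continue  # too little coverage to call anything confidently
        consensus, _ = Counter(bases).most_common(1)[0]
        if consensus != reference[pos]:
            snvs.append((pos, reference[pos], consensus))
    return snvs

reference = "ACGTACGT"
# Three short reads, all carrying an A where the reference has a T at position 3
reads = [(0, "ACGAACG"), (1, "CGAACGT"), (2, "GAACG")]
print(call_snvs(reference, reads))  # prints [(3, 'T', 'A')]
```

Even this toy version shows why the problem is computationally demanding at scale: a real genome has billions of positions, each covered by many reads, and every one must be examined.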

With existing tools this has become a relatively straightforward task. Identification of single nucleotide polymorphisms or variants (SNVs) – single-base differences between an individual and the reference genome – is beginning to become routine, especially for medically relevant variants. A project I worked on with…

