We are in the era of Big Data in human genomics: a vast treasure trove of information on human genetic variation either is or soon will be available, ranging from older projects such as HapMap and the 1000 Genomes Project to the in-progress 100,000 Genomes Project in the UK. Two technologies have made this possible: the advent of massively parallel “next-generation” sequencing, in which each individual’s DNA is fragmented and amplified into billions of pieces; and powerful computational algorithms that use these fragments (or “reads”) to identify all the “variants” in each individual, that is, any positions where their sequence differs from the “reference genome”.
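To make that second step concrete, here is a deliberately naive sketch in Python of the core idea: pile up the aligned reads over the reference and report positions where the reads’ consensus base disagrees with the reference base. The function name `call_snvs` and the toy data are my own illustration, not any real tool’s API; production callers such as GATK or bcftools use probabilistic models with base and mapping qualities rather than a simple majority vote.

```python
from collections import Counter

def call_snvs(reference: str, reads: list[tuple[int, str]], min_depth: int = 3):
    """Report (position, ref_base, alt_base) for simple substitutions.

    `reads` is a list of (start_position, sequence) pairs, assumed to be
    already aligned to `reference` with no insertions or deletions.
    """
    # Tally the bases that the reads place at every reference position.
    pileup: dict[int, Counter] = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup.setdefault(start + offset, Counter())[base] += 1

    snvs = []
    for pos, bases in sorted(pileup.items()):
        depth = sum(bases.values())
        if depth < min_depth:
            continue  # too few overlapping reads to trust this position
        alt, count = bases.most_common(1)[0]
        # Call a variant only when the majority base differs from the reference.
        if alt != reference[pos] and count > depth / 2:
            snvs.append((pos, reference[pos], alt))
    return snvs

if __name__ == "__main__":
    ref = "ACGTACGTAC"
    # Three overlapping toy reads, each carrying a C at position 4 (ref has A).
    reads = [(0, "ACGTC"), (2, "GTCCG"), (3, "TCCGT")]
    print(call_snvs(ref, reads))  # [(4, 'A', 'C')]
```

Even this toy version shows why the problem is computationally demanding at scale: a real genome has billions of positions and each must be checked against every read that covers it.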
With existing tools this has become a relatively straightforward task. Identifying single nucleotide variants (SNVs), single-base differences between an individual and the reference genome, is beginning to become routine, especially for medically relevant variants. A project I worked on with…