Chromosomes: The ENCODE Project seeks to unravel mysteries of the human genome.

Translating the human instruction manual


Humans are a complicated bunch. We all have different hopes and aspirations, likes and dislikes, dreams and realities. We like to think of ourselves as the most complicated and advanced species on this earth.

Given this presumed complexity, the results of the Human Genome Project, a massive effort to sequence the entirety of human DNA and a precursor of the ENCODE project, came as a surprise: they revealed a remarkably small number of protein-coding genes.

The Human Genome Project was expected to demonstrate that humans have a huge number of genes in our DNA, but as it turns out, we have only about 21,000 protein-coding genes. Drosophila, the annoying little flies that hover over our overripe fruit, have 15,000. Daphnia, a water flea barely visible to the naked eye, has 31,000. So what makes a human so complex?

The Human Genome Project was intended to uncover the blueprint of our species: the instruction manual that would tell us who we are and how we work. Disappointingly, the information discovered was incomplete; the project gave us only the index page of that manual. These 21,000 genes account for little more than one per cent of our entire genome. The remaining 99 per cent is often referred to as ‘junk.’ Identifying a role for this ‘junk DNA’ was the task undertaken by the ENCODE (Encyclopedia of DNA Elements) consortium.

The ENCODE consortium, formed soon after the completion of the Human Genome Project, includes 32 research groups and 440 scientists. ENCODE published its work in 30 papers across several journals earlier this month.

One of the key findings of the ENCODE project was that a large portion of our genome has a functional role—as high as 80 per cent, though some believe this number is exaggerated. According to critics, the real proportion of our genome that is functionally active is around 20 to 30 per cent.

Nevertheless, this functionally active portion of our genome contains many regulatory sites that control the expression of the previously identified 21,000 genes.

“Think of these regulatory sites as switches,” Daniel Tessier, vice-president of the Technology Centres at McGill University and Genome Quebec, said. “The Human Genome Project gave us a two-dimensional version of the human genome, while ENCODE gives us a three-dimensional version of our genome. We now know the regulatory hotspots that control our protein coding genes.”

The preliminary data from the ENCODE project were first published in 2007. Since then, advances in techniques and tools have enabled researchers to delve further into the vast expanses of our genome. The work involved investigating how the four nucleotides (A, C, T and G), the building blocks of our DNA, form regulatory sequences that interact with other sites in our genome.

The work published this month arises from studies on multiple laboratory cell lines, “the appropriate model system,” according to Tessier, “since we can’t really go around sampling humans.” The work of the ENCODE project is far from complete, and the next steps will involve human samples.

“For instance,” Tessier said, this would involve “taking two different biopsies from a person suffering from cancer, so that we can compare and contrast changes in the regulatory regions, and how these relate to the development of disease.”

Tessier said that work such as this could lead to personalized medicine.

“In 10 years, we might all have a little chip in our health care cards that holds our personal human genome sequence,” he said. “Personalized medicine is a buzzword right now, but it is what the future holds. This work sets the foundation for the future, similar to how we are now enjoying the fruits of technology that were created a couple of decades ago.”

The ENCODE consortium has drawn some criticism, notably for its broad definition of ‘functional,’ which produced the large figure of 80 per cent. Additionally, according to CNN, the ENCODE project has already received over $288 million in funding. Given the amount of work left to fully map out the three-dimensional structure of our genome, completing ENCODE could be costly. Despite these drawbacks, researchers argue that the data coming out of this project are fundamental to understanding the true nature of our genomes.

“The Human Genome Project was an endeavour similar to putting man on the moon,” Tessier said. “It is just as elaborate, extravagant and forward thinking.”

This month’s deluge of data from the ENCODE project is only the beginning; there are still countless questions to be asked about the vast universe that lies in each and every one of our cells. The ENCODE Project puts us on the cusp of an incredibly exciting phase in our understanding of the human species.