DNA has more than genes; non-coding regions come into the spotlight

Somdatta Karak

A global collaborative study published in Nature emphasises investigating non-coding regions of DNA to understand what defines primates and what distinguishes humans among them. The study holds significance not only for human health but also for the conservation of endangered primate species, providing insights into genetic health and population dynamics.

Sub-adult male lion-tailed macaque from Western Ghats, PC: G Umapathy.

What distinguishes a primate from other mammals? What sets a human apart from other primates? While outward appearance sets the premise, scientists have wanted to know where those differences stem from. Earlier studies that looked for these answers in genes, the DNA regions that code for proteins, have not been conclusive. For example, the 20,000 genes present in humans are also found in other primates.

Hence, a global collaborative work, now published in Nature, argues for looking at the non-coding regions of DNA to solve these mysteries. These are regions of DNA that regulate gene function by increasing or decreasing a gene’s expression, or that allow different variants of a protein to be made from the same gene. The study was led by Kyle Kai-How Farh at Illumina Artificial Intelligence Laboratory, Illumina, San Diego, USA, Tomas Marques Bonet at Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain and Jeffrey Rogers at Baylor College of Medicine, Houston, USA, with Indian primate samples contributed by G Umapathy at CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India.

The strategy of their study was conceptually straightforward: compare the genomes of primates, including humans, with those of other mammals, and find whether any signatures are unique to different animal groups.

Finding signatures would mean those regions of DNA hold information that is key to those families of animals. But first they had to procure high-quality genome sequences of primates that represent their vast diversity of species.

Until this study, genome sequences were available for only 52 primate species out of 500+ known species. The research team needed to collect primate genome samples from the different tropical countries where primates are found. These span Central and South America, Africa and Madagascar, and South and South-East Asia. This endeavour brought together over a hundred scientists across these continents to compile genomes of the primates available in their regions and create a database of genome sequences of 239 primate species, covering 86% of genera from all primate families.

Of these, 17 species are found in India; most of them are endemic to the country and threatened with extinction. G Umapathy, Chief Scientist at the Laboratory for the Conservation of Endangered Species-CCMB, and Manu Shivakumara and Mihir Trivedi, PhD researchers, collected the samples and sequenced them.

“Comparing these DNA sequences, we found 111,318 non-coding regions that are conserved only in primate genomes and are not present in other mammals. We call these primate-specific evolutionarily constrained sequences. The significance of most of these genomic regions was previously unclear, but our experimental evidence supports their role in regulating gene expression. We also find 93 mutations in these regions known to be associated with complex human traits and disorders,” said Manu.

Igor Ulitsky, Professor at the Weizmann Institute of Science, is an expert who uses experimental methods and computational biology to understand non-coding DNA. He said,

“Until recently, only a few genomes of non-human primates were available and their quality was poor. Recent developments in the scale and affordability of DNA sequencing enable reconstruction of hundreds of primate genomes with high quality.”

This study takes the computational analysis of these sequences to the next level by building and analysing a comparative map of these genomes, with a particular focus on identifying specific regions and letters within the genomes that are evolving more slowly than others, or more slowly than expected by chance, indicating that these sequences are important within primates. He also added, “The data allows the authors to call regions that are evolving slowly and are, thus, probably functionally important at unprecedented resolution. This enables more powerful interpretation of the variation occurring within human populations, and its effects on differences between humans.”

Beyond human health, Umapathy believes this study will be important for the conservation of endangered primates as well. He said, “Mutations arising in these constrained DNA regions and their prevalence can tell us about the genetic health of populations of endangered species.”

Written By

Somdatta is trained in life sciences, and loves working with educators and students. She is an ex-Teach for India fellow. In her current role, she leads science communication and public outreach at CSIR-Centre for Cellular and Molecular Biology, Hyderabad. She …