Reconstructing demography and natural selection from high-coverage complete genomes: application to the GCAT dataset

Research Leaders:
Francesc Calafell and Elena Bosch

The main evolutionary forces, in particular natural selection and demography (changes in effective population size through time), leave detectable footprints in the human genome that can be recovered with a range of methods. GCAT has produced a unique resource: >800 complete genomes, sequenced at ~30X coverage, from volunteers residing in Catalonia.

This project proposes to retrieve the evolutionary information contained in them with a twofold aim: reconstructing the demographic history of this population, and detecting genomic regions that may have been under natural selection. We aim to estimate the historical trajectory of effective population size, including its increases (population expansions) and decreases (which can reach the point of genetic bottlenecks, although these are unlikely to have occurred here). By comparing haplotypes shared with reference populations, we can detect external contributions brought by immigration (such as the French diaspora of the Wars of Religion of the 16th and 17th centuries).

As for natural selection, we aim to identify the genetic footprints of recent adaptations in the Catalan population and the selective pressures that drove them. To do so, we will use classical neutrality tests to detect strong selective sweeps, as well as specific approaches to detect polygenic adaptation. By exploring selective signals and subtle aggregate changes in allele frequencies across the multiple loci contributing to immunological and metabolic traits, we aim to recognize the genomic footprint of relevant epidemics in our recent history, such as the plague or the 1918 influenza pandemic, and of possible past periods of famine.

This project will leverage the whole-genome information produced by GCAT to infer the demographic history and the biological adaptations of the population currently residing in Catalonia. Since this population has heterogeneous recent geographical origins, we need to take into account what is known about those origins for the GCAT volunteers, namely their birthplaces and those of their parents and grandparents.