Intriguing papers published in the previous month, with highlights.
Functional annotation
“Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs … As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation.”
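The idea of scoring a variant against a sequence-based motif model can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the four-position weight matrix, the uniform background frequency, and the function names are all hypothetical, and the DNaseI footprinting step is left out entirely.

```python
import math

# Hypothetical 4-position motif: per-position base probabilities (each row sums to 1).
TOY_PWM = [
    {"A": 0.80, "C": 0.05, "G": 0.10, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds_score(site, pwm=TOY_PWM, background=BACKGROUND):
    """Sum of log2(p_motif / p_background) over the positions of a candidate site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match motif length")
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, site))


def binding_score_change(ref_site, position, alt_base):
    """Score difference (alt minus ref) when one base in the site is substituted."""
    alt_site = ref_site[:position] + alt_base + ref_site[position + 1:]
    return log_odds_score(alt_site) - log_odds_score(ref_site)


# A C>T substitution at the motif's most C-specific position lowers the score.
delta = binding_score_change("ACGT", position=1, alt_base="T")
print(f"score change: {delta:.2f} bits")
```

A strongly negative change suggests the alternate allele weakens the predicted site; a real analysis would also need a calibrated threshold for what counts as disrupted binding.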
“Functional annotations can be used to improve power to detect associations. To increase power by incorporating the information that these annotations hold, we suggest a weighting scheme based on the observed enrichment using a weighted Bonferroni correction similar to that suggested by Roeder et al. for linkage analysis … we note that the number of significant signals using the enrichments as weights (n = 166) was not far from being optimal (n = 185), and this approach had more power than the standard Bonferroni method (n = 146) to detect associations … The commonly accepted P = 5 × 10^−8 threshold is outdated and will not be applicable in future GWAS.”
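A weighted Bonferroni correction of the general kind described in this quote can be sketched as below. It assumes the usual formulation in which non-negative weights are rescaled to average 1 and test i is declared significant when p_i ≤ α·w_i/m. The p-values and weights are made up for illustration; in the paper the weights come from observed enrichment per annotation class.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Return a list of booleans: True where p_i <= alpha * w_i / m.

    Weights are rescaled to average 1, so the per-test thresholds still sum
    to alpha and the family-wise error rate stays controlled at alpha.
    """
    m = len(p_values)
    if m == 0 or len(weights) != m:
        raise ValueError("p_values and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    scaled = [w * m / total for w in weights]
    return [p <= alpha * w / m for p, w in zip(p_values, scaled)]


# Hypothetical example: the first variant sits in an enriched annotation class.
p_values = [0.02, 0.001, 0.2, 0.6]
standard = weighted_bonferroni(p_values, [1.0, 1.0, 1.0, 1.0])
weighted = weighted_bonferroni(p_values, [4.0, 1.0, 1.0, 1.0])
print(standard, weighted)
```

The trade-off is visible in the thresholds: up-weighting one class makes its variants easier to detect but tightens the threshold for everything else.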
“Here we develop an unsupervised approach to integrate these different annotations into one measure of functional importance that, unlike most existing methods, is not based on any labeled training data. We show that the resulting meta-score has better discriminatory ability using disease-associated and putatively benign variants from published studies (in both coding and noncoding regions) than the recently proposed CADD score.”
Gene expression
“After sequencing 2-kb promoter regions of 472 genes in 410 healthy adults, we performed a quadratic regression of rare variant count on bins of peripheral blood transcript abundance from microarrays. The overall burden test results are consistent with rare and private regulatory variants driving high or low transcription at specific loci, potentially contributing to disease.”
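The shape of such a test can be illustrated with a small least-squares fit of y = b0 + b1·x + b2·x², where x is the expression bin and y is the rare variant count. A positive b2 means more rare variants at both extremes of expression. This is a sketch only: the bins and counts are toy values, the helper name is hypothetical, and no significance testing is done.

```python
def fit_quadratic(x, y):
    """Ordinary least squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Power sums give the entries of X^T X (3x3) and X^T y (length 3).
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


# Toy data: expression bins centered on zero, rare variant counts per bin.
bins = [-2, -1, 0, 1, 2]
rare_variant_counts = [7, 4, 3, 4, 7]
b0, b1, b2 = fit_quadratic(bins, rare_variant_counts)
print(f"b0={b0:.2f} b1={b1:.2f} b2={b2:.2f}")
```

The fit needs at least three distinct bin values; with fewer, the normal equations are singular and the division fails.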
GWAS
G = E: What GWAS Can Tell Us about the Environment, Gage et al. PLOS Genet
“as large, richly phenotyped cohort studies (e.g., UK Biobank) emerge, it will become possible to identify modifiable exposures from genetic data and to dissect those pathways within the same cohort … A failure to appreciate this point will hamper our ability to translate the results of GWAS into health benefits, by focusing attention on possible biological pathways when, in fact, the target for intervention could be a modifiable environmental or behavioural exposure.”
Schizophrenia risk from complex variation of complement component 4, Sekar et al. Nature
“Schizophrenia’s strongest genetic association at a population level involves variation in the major histocompatibility complex (MHC) locus, but the genes and molecular mechanisms accounting for this have been challenging to identify. Here we show that this association arises in part from many structurally diverse alleles of the complement component 4 (C4) genes … These results implicate excessive complement activity in the development of schizophrenia and may help explain the reduced numbers of synapses in the brains of individuals with schizophrenia.”
A Robust Example of Collider Bias in a Genetic Association Study, Day et al. AJHG
“In summary, we have demonstrated that adjusting for causally associated covariates can create apparently highly robust, but actually biologically spurious, associations. The extent of this collider bias is almost perfectly inversely related to the strength of the exposure-collider association. Consideration of causal inference modeling and unadjusted test statistics is therefore of great importance in the design and interpretation of genetic (and non-genetic) association studies.”
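A small simulation makes the mechanism concrete. It is a generic sketch, not a reproduction of the paper's analysis: the variable names, effect sizes, and sample size are assumptions. A genotype score and an outcome are drawn independently, both feed into a third variable (the collider), and adjusting for that variable induces an association between the two.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def partial_corr(a, b, c):
    """Correlation of a and b after linearly adjusting both for covariate c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
genotype = [rng.gauss(0, 1) for _ in range(n)]  # stand-in for a standardized genotype score
outcome = [rng.gauss(0, 1) for _ in range(n)]   # drawn independently of genotype
# The collider is caused by both genotype and outcome, plus noise.
collider = [g + y + rng.gauss(0, 1) for g, y in zip(genotype, outcome)]

unadjusted = pearson(genotype, outcome)
adjusted = partial_corr(genotype, outcome, collider)
print(f"unadjusted r = {unadjusted:.3f}, adjusted for collider r = {adjusted:.3f}")
```

With these assumed effect sizes the unadjusted correlation should sit near zero, while the adjusted one is clearly negative (about -0.5 in expectation), even though genotype has no effect on the outcome.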
Popgen
Ancient gene flow from early modern humans into Eastern Neanderthals, Kuhlwilm et al. Nature
“We conclude that in addition to later interbreeding events, the ancestors of Neanderthals from the Altai Mountains and early modern humans met and interbred, possibly in the Near East, many thousands of years earlier than previously thought.”
“A script necessary to convert the input produced by samtools v0.1.19 to be compatible with PLINK was not run when merging the ancient genome, Mota, with the contemporary populations SNP panel, leading to homozygote positions to the human reference genome being dropped as missing data (the analysis of admixture with Neandertals and Denisovans was not affected) … the geographic extent of the genetic impact of this migration was overestimated: The Western Eurasian backflow mostly affected East Africa and only a few Sub-Saharan populations; the Yoruba and Mbuti do not show higher levels of Western Eurasian ancestry compared to Mota.”
The Kalash Genetic Isolate? The Evidence for Recent Admixture, Hellenthal et al. AJHG
“These observations indicate that, contrary to the claim of Ayub et al. that the ancestors of the Kalash have been isolated from the ancestors of other extant populations for over 8,000 years, there is in fact strong evidence that they have not been isolated over this time frame.”
“we sequenced the genomes of four Biaka Pygmies … we fit models using the joint allele frequency spectrum … Our two best-fit models both suggest ancient divergence between the ancestors of the farmers and Pygmies, 90,000 or 150,000 yr ago.”
“Our inference method rejects the hypothesis that the ancestors of [anatomically modern humans] were genetically isolated in Africa, thus providing the first whole genome-level evidence of African archaic admixture. Our inferences also suggest a complex human evolutionary history in Africa, which involves at least a single admixture event from an unknown archaic population into the ancestors of AMH, likely within the last 30,000 yr.”