Better late than never
GWAS/causal mechanisms
The Supplemental Materials to this paper contain the clearest and most thorough description of cutting-edge GWAS analyses I have read recently.
“Here we describe an alternative experimental approach to identify functional risk variants based on three recent innovations in genetics and molecular biology: (i) the prioritization of GWAS-identified risk variants in regulatory elements such as distal enhancers annotated based on genome-scale epigenetic data; (ii) the generation of genetically controlled isogenic pluripotent stem cell lines in which specific disease-associated genetic variants are the sole modified experimental variable using efficient gene-editing technologies such as the CRISPR/Cas9 system; and (iii) the analysis of cis-acting effects of candidate variants on allele-specific gene expression through deletion or exchange of disease-associated regulatory elements.”
“It is also striking to note how many genetic variants influence multiple traits but without a consistent correlation in effect sizes … Another possibility is that a given genetic variant often influences the function of multiple cell types through separate molecular pathways or that the effects of a variant on two related phenotypes vary according to an individual’s environmental exposures.”
Regulatory mechanisms
“For most traits, evidence of increased connectivity between perturbed genes extended to variants that did not pass the genome-wide significance threshold, indicating that regulatory network information will be useful for prioritizing candidate variants.”
“Altogether, 3,601 of our bQTLs have been previously implicated by GWAS, either directly or indirectly via LD (r² > 0.8 in YRI). These represent 2,282 different disease-associated variants, 995 (44%) of which are associated with bQTLs for multiple factors. Interestingly, this is 8.0-fold higher than the overall fraction of bQTLs associated with multiple factors, suggesting that variants affecting multiple TFs are more likely to impact phenotypes … intersecting GWAS loci with all binding sites of a TF may yield misleading results because these overlaps are dominated by SNPs with no effect on TF binding.”
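The enrichment arithmetic in this quote is easy to check. The sketch below recomputes the 44% figure from the quoted counts and backs out the genome-wide baseline implied by the 8.0-fold statement. The baseline is an inference from the quoted numbers, not a value reported in the excerpt, and the variable names are my own.

```python
# Counts taken directly from the quoted passage.
gwas_variants = 2282        # disease-associated variants linked to bQTLs
multi_factor_variants = 995 # of those, associated with bQTLs for multiple factors
fold_enrichment = 8.0       # quoted enrichment over all bQTLs

# Fraction of GWAS-linked variants that affect binding of multiple factors.
gwas_fraction = multi_factor_variants / gwas_variants

# Baseline fraction among all bQTLs implied by the quoted fold enrichment
# (an inference, not a number stated in the excerpt).
implied_baseline = gwas_fraction / fold_enrichment

print(f"GWAS-linked multi-factor fraction: {gwas_fraction:.1%}")
print(f"Implied baseline among all bQTLs:  {implied_baseline:.1%}")
```

In other words, if the quoted figures are taken at face value, only about one in twenty bQTLs overall would be associated with multiple factors.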
“Surprisingly, enhancers with low-affinity binding sites can mediate robust tissue specific patterns of gene expression when they are organized with optimal syntax. Such enhancers may be a vastly underappreciated feature of the regulatory genome.”
“We introduce an open source package Basset to apply CNNs [deep convolutional neural networks] to learn the functional activity of DNA sequences from genomics data. We trained Basset on a compendium of accessible genomic sites mapped in 164 cell types by DNase-seq and demonstrate greater predictive accuracy than previous methods. Basset predictions for the change in accessibility between variant alleles were far greater for GWAS SNPs that are likely to be causal relative to nearby SNPs in linkage disequilibrium with them.”
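To make the "change in accessibility between variant alleles" idea concrete, here is a minimal sketch of the general recipe: one-hot encode the reference and alternate sequences, score both with a sequence model, and take the difference. This is not Basset's API or architecture. `toy_model` is a hypothetical stand-in (a single convolutional filter, ReLU, max pooling, sigmoid), and the filter weights and example sequence are invented for illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix in A, C, G, T order."""
    idx = [BASES.index(b) for b in seq.upper()]
    out = np.zeros((len(seq), 4), dtype=np.float32)
    out[np.arange(len(seq)), idx] = 1.0
    return out

def toy_model(x, weights):
    """Hypothetical stand-in for a trained CNN: one filter, ReLU, max pool, sigmoid."""
    k = weights.shape[0]
    scores = np.array([np.sum(x[i:i + k] * weights)
                       for i in range(x.shape[0] - k + 1)])
    pooled = np.maximum(scores, 0.0).max()
    return 1.0 / (1.0 + np.exp(-pooled))

def variant_delta(seq, pos, alt, weights):
    """Predicted score for the alternate allele minus the reference allele."""
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    return toy_model(one_hot(alt_seq), weights) - toy_model(one_hot(seq), weights)

# Illustrative filter that matches the motif GATA exactly (an assumption).
motif_filter = one_hot("GATA")
ref_seq = "CCGATACC"

# An A>C change at position 3 disrupts the motif, so the score should drop.
delta = variant_delta(ref_seq, 3, "C", motif_filter)
print(f"predicted change: {delta:+.4f}")
```

With a real trained model, the same difference score would be computed for every SNP in an LD block, and the SNPs with the largest absolute change would be the candidates worth prioritizing.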
“The resulting models accurately predict individual enhancer–promoter interactions across multiple cell lines with a false discovery rate up to 15 times smaller than that obtained using the closest gene … Most of this signature is not proximal to the enhancers and promoters but instead decorates the looping DNA.”
Gene expression
RNA splicing is a primary link between genetic variation and disease, Li et al. Science
“~65% of expression quantitative trait loci (eQTLs) have primary effects on chromatin, whereas the remaining eQTLs are enriched in transcribed regions … splicing QTLs are major contributors to complex traits, roughly on a par with variants that affect gene expression levels.”
Imputing Gene Expression in Uncollected Tissues Within and Beyond GTEx, Wang et al. AJHG
“By analyzing data from nine selected tissue types in the GTEx pilot project, we demonstrated that harnessing expression quantitative trait loci (eQTLs) and tissue-tissue expression-level correlations can aid imputation of transcriptome data from uncollected GTEx tissues. More importantly, we showed that by using GTEx data as a reference, one can impute expression levels in inaccessible tissues in non-GTEx expression studies.”
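The core idea, combining an eQTL genotype with expression measured in an accessible tissue to predict expression in an uncollected one, can be sketched with ordinary least squares. This is not the authors' method. The data are simulated, the effect sizes are arbitrary, and every name here (`genotype`, `blood_expr`, `target_expr`) is a hypothetical placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data for one gene (all effect sizes are arbitrary assumptions).
genotype = rng.binomial(2, 0.3, size=n).astype(float)    # eQTL dosage: 0, 1, or 2
blood_expr = 0.5 * genotype + rng.normal(0.0, 1.0, n)     # accessible tissue
target_expr = (0.8 * genotype + 0.6 * blood_expr
               + rng.normal(0.0, 0.5, n))                 # inaccessible tissue

# Design matrix: intercept, eQTL dosage, and expression in the collected tissue.
X = np.column_stack([np.ones(n), genotype, blood_expr])

# Fit on a reference panel where both tissues were measured,
# then impute the target tissue for held-out individuals.
train, test = slice(0, 150), slice(150, n)
coef, *_ = np.linalg.lstsq(X[train], target_expr[train], rcond=None)
imputed = X[test] @ coef

accuracy = np.corrcoef(imputed, target_expr[test])[0, 1]
print(f"held-out correlation between imputed and true expression: {accuracy:.2f}")
```

The reference panel plays the role GTEx plays in the paper: it is where the relationship between genotype, accessible-tissue expression, and target-tissue expression is learned before being applied to a study that lacks the target tissue.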