Molecular triangulation: bridging linkage and molecular-network information for identifying candidate genes in Alzheimer's disease.


A major challenge in human genetics is identifying the molecular basis of common heritable disorders. In contrast to rare single-gene diseases, multifactorial disorders are thought to arise from the combined effect of multiple gene variants, such that any single variant may have only a modest effect on disease susceptibility. We present a method to identify genes that may harbor a significant proportion of the genetic variation that predisposes individuals to a given multifactorial disorder. First, we perform an automated literature analysis that predicts physical interactions (edges) among candidate disease genes (seed nodes, selected on the basis of prior information) and other molecular entities. We derive models of molecular networks from this analysis and map the seed nodes to them. We then compute the graph-theoretic distance (the minimum number of edges that must be traversed) between the seed nodes and all other nodes in the network. We assume that nodes that are found in close proximity to multiple seed nodes are the best disease-related candidate genes. To evaluate this approach, we selected four seed genes, each with a proven role in Alzheimer’s disease (AD). The method performed well in predicting additional network nodes that match AD gene candidates identified manually by an expert. We also show that the method prioritizes among the seed nodes themselves, rejecting false-positive seeds that are derived from (noisy) whole-genome genetic-linkage scans. We propose that this strategy will provide a valuable means to bridge genetic and genomic knowledge in the search for genetic determinants of multifactorial disorders.
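The distance-based prioritization described above can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`build_graph`, `bfs_distances`, `rank_candidates`), the `max_distance` cutoff, and the scoring rule (number of nearby seeds first, then total distance) are assumptions chosen for illustration. The node labels in the toy network are hypothetical placeholders, not real genes.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distances(graph, source):
    """Return the minimum number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rank_candidates(graph, seeds, max_distance=2):
    """Rank non-seed nodes by proximity to multiple seeds.

    Assumed scoring (illustrative only): a node scores one point for each
    seed within max_distance edges; ties are broken by the smaller sum of
    those distances, then by node name.
    """
    per_seed = {seed: bfs_distances(graph, seed) for seed in seeds}
    ranked = []
    for node in graph:
        if node in seeds:
            continue
        near = [
            per_seed[seed][node]
            for seed in seeds
            if node in per_seed[seed] and per_seed[seed][node] <= max_distance
        ]
        if near:
            ranked.append((node, len(near), sum(near)))
    ranked.sort(key=lambda item: (-item[1], item[2], item[0]))
    return ranked


# Toy network with hypothetical labels: S1..S3 are seeds, A..C are other nodes.
toy_edges = [("S1", "A"), ("S2", "A"), ("S3", "A"), ("S1", "B"), ("B", "C")]
toy_graph = build_graph(toy_edges)
toy_ranking = rank_candidates(toy_graph, {"S1", "S2", "S3"}, max_distance=1)
```

In this toy network, node `A` is adjacent to all three seeds and therefore ranks above `B`, which is adjacent to only one. A real analysis would use a network derived from the literature and would need a principled choice of cutoff and scoring rule.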

Proceedings of the National Academy of Sciences of the United States of America