Matiasz NJ, Wood J, Wang W, Silva AJ, Hsu W
IEEE International Conference on Bioinformatics and Biomedicine, 2017.
Publication year: 2017

Biologists synthesize research articles into coherent models, ideally causal models, which predict how systems will respond to interventions. But it is challenging to derive causal models from articles alone, without primary data. To enable causal discovery using only literature, we built software for annotating empirical results in free text and computing valid explanations, expressed as causal graphs. This paper presents our meta-analytic pipeline: with the “research map” schema, we annotate results in literature, which we convert into logical constraints on causal structure; with these constraints, we find consistent causal graphs using a state-of-the-art causal discovery algorithm based on answer set programming. Because these causal graphs show which relations are underdetermined, biologists can use this pipeline to select their next experiment. To demonstrate this approach, we annotated neuroscience articles and applied a “degrees-of-freedom” analysis for concisely visualizing features of the causal graphs that remain consistent with the evidence. This model space is often too large for a machine to compute quickly, or for a researcher to examine exhaustively.
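The idea of turning annotated results into constraints and then asking which causal graphs remain consistent can be illustrated with a toy sketch. The code below is not the pipeline described in the paper: it replaces the answer set programming solver with brute-force enumeration over all directed acyclic graphs on a few variables, and it assumes a simplified constraint form, (cause, effect, expected), meaning "a directed path from cause to effect exists" or "does not exist". The function names, the constraint encoding, and the variables A, B, and C are all hypothetical and chosen only for illustration.

```python
from itertools import product


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (repeated source removal)."""
    remaining = set(nodes)
    while remaining:
        # A source is a node with no incoming edge from a node still remaining.
        sources = {
            n for n in remaining
            if not any(v == n and u in remaining for u, v in edges)
        }
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining -= sources
    return True


def has_path(edges, src, dst):
    """Return True if a directed path of length >= 1 leads from src to dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and v not in seen:
                if v == dst:
                    return True
                seen.add(v)
                stack.append(v)
    return False


def consistent_graphs(nodes, constraints):
    """Enumerate every DAG over `nodes` that satisfies all path constraints.

    Each constraint is (cause, effect, expected): an assumed, simplified
    stand-in for an annotated result, not the research map schema itself.
    """
    pairs = [(u, v) for u in nodes for v in nodes if u != v]
    graphs = []
    for mask in product([False, True], repeat=len(pairs)):
        edges = frozenset(p for p, keep in zip(pairs, mask) if keep)
        if not is_acyclic(nodes, edges):
            continue
        if all(has_path(edges, a, b) == expected for a, b, expected in constraints):
            graphs.append(edges)
    return graphs


def edge_status(nodes, graphs):
    """Classify each ordered pair as present, absent, or underdetermined."""
    if not graphs:
        raise ValueError("no graph is consistent with the constraints")
    status = {}
    for u in nodes:
        for v in nodes:
            if u == v:
                continue
            seen = {(u, v) in g for g in graphs}
            if len(seen) == 2:
                status[(u, v)] = "underdetermined"
            else:
                status[(u, v)] = "present" if True in seen else "absent"
    return status


# Hypothetical evidence: A affects B, and B affects C.
nodes = ["A", "B", "C"]
constraints = [("A", "B", True), ("B", "C", True)]
graphs = consistent_graphs(nodes, constraints)
for pair, label in edge_status(nodes, graphs).items():
    print(pair, label)
```

In this toy case the direct edge from A to C is the only relation the evidence leaves open, which loosely mirrors how a summary of the consistent model space could point to a next experiment. Brute-force enumeration grows super-exponentially with the number of variables, so it is only practical for a handful of nodes.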