2014 Volume 41 Issue 1 Pages 41-64
While randomized controlled experiments are often considered the gold standard for predicting causal relationships between variables, they become expensive when the goal is to understand the complete set of causal relationships governing a large set of variables, and ethical or practical constraints may make it impossible to manipulate certain variables. To address these scenarios, procedures have been developed that use conditional independence relationships among passively observed variables to predict which variables may or may not be causally related to one another. Until recently, most of these procedures assumed that the data consisted of a single dataset of i.i.d. observations, but in practice researchers often have access to multiple similar datasets, e.g. from multiple labs studying the same problem, which measure slightly different variable sets and in which recording conventions and procedures may vary. This paper discusses recent state-of-the-art approaches for predicting causal relationships from multiple observational and experimental datasets in these contexts.