Measures of Early-life Behavior and Later Psychopathology in the LifeCycle Project - EU Child Cohort Network: A Cohort Description

Background: The EU LifeCycle Project was launched in 2017 to combine, harmonize, and analyze data from more than 250,000 participants in pregnancy and birth cohorts across Europe and Australia. The purpose of this cohort description is to provide a detailed overview of the major measures within mental health domains that are available in 17 European and Australian cohorts participating in the LifeCycle Project.
Methods: Data on cognitive, behavioral, and psychological development have been collected on participants from birth until adulthood through questionnaire and medical data. We developed an inventory of the available data by mapping individual instruments, domain types, and age groups, providing the basis for statistical harmonization across mental health measures.
Results: The mental health data in LifeCycle contain longitudinal and cross-sectional data from birth throughout the life course, covering domains across a wide range of behavioral and psychopathology indicators and outcomes, including executive function, depression, ADHD, and cognition. These data span a unique combination of qualitative data collected through behavioral/cognitive/mental health questionnaires and examination, as well as data from biological samples and indices in the form of imaging (MRI, fetal ultrasound) and DNA methylation data. Harmonized variables on a subset of mental health domains have been developed, providing the statistical equivalence of measures required for longitudinal meta-analyses across instruments and cohorts.
Conclusion: Mental health data harmonized through the LifeCycle Project can be used to study life-course trajectories and exposure-outcome models that examine early-life risk factors for mental illness and to develop predictive markers for later-life disease.

Effects of early-life exposures on later-life mental health are well known, but more research is needed to elucidate the pathways from stressors to outcomes. The LifeCycle Project - EU Child Cohort Network, a Horizon 2020 project, is a pan-European and Australian initiative comprising 19 pregnancy and birth cohorts, established to study exposure-to-outcome associations and trajectories across the life course (https://lifecycle-project.eu/). 1 In general, studies in LifeCycle aim to construct developmental trajectories, develop risk assessment models, measure developmental adaptations, and evaluate mediating epigenetic effects to better understand the consequences of early-life exposures to stressors for risk factors and diseases in adulthood. The large sample sizes achieved through this consortium provide the high statistical power needed for more accurate estimates and more robust findings.
Mental health is one of the main outcomes within the LifeCycle Project. 1 While mortality rates for many noncommunicable diseases have steadily declined in some populations over the past few decades, such as coronary heart disease 2,3 and chronic obstructive pulmonary disease, 4 the global burden of mental illness is on the rise. 5 The impact of mental illness on disability and socioeconomic prosperity is increasing around the world, and it is predicted that mental illness will contribute more to disability-adjusted life years (DALYs) than any other category of diseases by the year 2030. 6 An understanding of how mental health impacts and mediates disease risk and prognosis for other conditions is also beginning to emerge, with recent meta-analyses revealing significantly higher risks for cardiovascular 7 and metabolic 8 diseases linked to severe mental illness.
This cohort description focuses on the extensive work done to catalogue and harmonize variables related to cognitive, behavioral, and psychological development within the broader LifeCycle consortium. 1 It is well recognized that experiences in early life play an important part in shaping later mental health, 9 and the data within the LifeCycle Project permit analyses of these associations. LifeCycle includes many pregnancy and birth cohorts that prospectively collected data on offspring from conception and across different ages of child, adolescent, and adult development. The availability of data from multiple follow-up assessments is essential for probing questions about causality and linking early-life stressors with later-life mental health symptoms and outcomes.
The mental health studies in LifeCycle aim to investigate epidemiological interrelations between early-life exposures, behavior, and cognition, with later mental and physical health. Towards this end we have harmonized measures from 17 LifeCycle cohorts to enable studies that examine how environmental stressors in utero and in early childhood affect, or are associated with, psychological trajectories, behaviors, and mental outcomes throughout childhood, adolescence, and adulthood. Additionally, we are examining the nature and degree of mediation of these associations through epigenetic changes and brain development (Figure 1). To our knowledge, the data compiled for these studies within LifeCycle represent the largest ongoing consolidation of childhood behavior, psychopathology, and cognition data to date, encompassing more than 200 multidimensional and multi-informant established mental health measures collected from at least 250,000 participants. The geographic coverage is broad, spanning much of northern, western, central, and southern Europe, as well as Western Australia (Figure 2). Mental health data from more than 250,000 children are available (as of June 2021), including either mother-child or mother-father-child cohorts, and the study population is diverse with respect to the age of the participants, cohort types, and data collection periods (Table 1).
The participating cohorts include child participants with follow-up data ranging from birth until adulthood (Table 2). Questionnaires, medical records, doctor diagnoses, and registries were variably used across the cohorts to collect data at different ages, but all of the cohorts collected baseline data during pregnancy or at birth and included a follow-up data collection at least once by the time the child participant was 24 months of age. Although the regularity of follow-up differs substantially across cohorts, ranging from annually to many years apart, at least half of the cohorts performed some type of follow-up data collection for all incremental age groups up until 6 years of age. The overlapping age ranges enable comprehensive comparative analyses of mental health constructs between and within the populations to which these index children belong.

Main outcome measures
Psychological, motor, and cognitive measures
Mental and cognitive disorders comprise some of the most frequently diagnosed conditions in children under 18 years of age. The combined data resource will contain information pertaining to the children from more than 200 mental health measures, covering eight clinical domains across 60 dimensions (eTable 1). A majority of these measures assess domains under a broad banner of 'mental health', encompassing psychological, cognitive, and behavioral functions and development (67.0%; 136 of 203) and covering dimensions such as neurodevelopmental disorders, internalizing and externalizing symptoms, temperament, and mental diagnoses. Further domains, which overlap because a single measure can cover more than one domain, include language skills (31.0%; 63 of 203), executive functions (29.1%; 59 of 203), memory (11.3%; 23 of 203), and general intelligence (8.4%; 17 of 203) (eTable 1). There are many commonalities between mental health domain types and significant overlap in the age groups with measures in specific domains (Figure 3). This makes it possible to harmonize the data. 30 Most of the cohorts continue to follow their participants, so the availability of harmonized data is expected to increase with time.
There are a number of approaches to harmonizing data, and several of these have been described and successfully implemented in large collaborations. 10,31-33 The LifeCycle Project has developed a protocol to generate harmonized variables across a selection of important cognitive and mental health domains. This harmonization approach creates standardized scores and percentiles for important domains, such as internalizing and externalizing symptoms, ADHD and ASD symptoms and diagnosis, and language and motor functions. Percentiles and standardized scores were used because they allow the pooling of mental health outcome data collected using different scales or instruments. One of the biggest harmonization challenges this project faced was obtaining a thorough inventory of the available mental health data in individual cohorts, which was overcome by mapping the available data by instrument, measure, age group, and domain. A subset of cohorts has also employed items from the same mental health, cognitive, and motor function measures, and these data can be pooled or co-analyzed without the need for harmonization (Figure 4). All of the measures harmonized thus far by age and cohort can be found in the LifeCycle online catalogue (https://catalogue.lifecycle-project.eu/).
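The idea of pooling scores from different instruments through standardized scores and percentiles can be illustrated with a short sketch. This is not the LifeCycle harmonization protocol itself: the function names, the cohort labels, and the choice to standardize within each cohort using the population standard deviation and mid-rank percentiles are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_scores(raw):
    """Standardize raw instrument scores to mean 0 and SD 1 within one cohort."""
    m = mean(raw)
    sd = pstdev(raw)  # population SD; an assumption, a protocol may specify otherwise
    if sd == 0:
        raise ValueError("scores have no variance and cannot be standardized")
    return [(x - m) / sd for x in raw]


def percentile_ranks(raw):
    """Mid-rank percentile (0-100) of each score within one cohort."""
    n = len(raw)
    ranks = []
    for x in raw:
        below = sum(1 for y in raw if y < x)
        equal = sum(1 for y in raw if y == x)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


def harmonize(cohorts):
    """Return (cohort, z-score, percentile) rows on a common scale.

    `cohorts` maps a hypothetical cohort label to raw scores that may come
    from different instruments with different score ranges.
    """
    pooled = []
    for name, raw in cohorts.items():
        for z, p in zip(z_scores(raw), percentile_ranks(raw)):
            pooled.append((name, z, p))
    return pooled


# Two hypothetical cohorts measured on instruments with different ranges.
example = {"cohort_a": [2, 4, 6], "cohort_b": [10, 20, 30]}
rows = harmonize(example)
```

After standardization, a participant at the middle of either hypothetical cohort receives the same z-score and percentile, even though the raw scales differ. Whether such scores are truly comparable still depends on the measurement equivalence of the underlying instruments.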

PLANNED ANALYSES
Early-life exposures - lifestyle, migration, socioeconomic, and urban environment
The LifeCycle online catalogue 10 also contains information on harmonized data on diverse measures of exposures early in life. These will enable the analysis of risk models for mental health that assess the nature and impact of indirect and direct exposures experienced in early life and comorbidities on adverse mental health symptoms and other health conditions. Comprehensive exposure-outcome analyses will also be used to develop predictive markers for mental health in children and adolescents, which may help shape the prediction of mental disorders, allowing for targeted early intervention.
Table footnotes: a ALSPAC follow-up data is based on number of parents completing at least some of the questionnaire(s) on young person up to age 7 years, and number of children attending clinic from age 7 years and onwards. b CHOP follow-up data is based on number of children with at least one anthropometric measurement at the considered age. c DNBC follow-up data at 5 years based on a subsample, selected based on parental alcohol characteristics. d Parent-reported data. e Self-reported data. f DNBC data collection for 18-year follow-up is currently ongoing. g EDEN follow-up data is based on number of children with at least one neurodevelopment assessment at the considered age. h Teacher-reported data. i Clinical data. j NINFEA baseline data refers to no. pregnant women recruited.

Mediating pathways - brain development
Early life is a particularly vulnerable time window for brain development. The vital stages of neurogenesis, proliferation, and migration occur almost exclusively during fetal development, and experience-dependent brain connectivity (ie, myelination) is largely shaped and completed in early childhood. 34 Research has repeatedly linked brain structure, volume, and connectivity indicators to a number of behavioral and cognitive outcomes. 35-37 However, study samples are often limited in size and population diversity, and only a few longitudinal studies exist. 38 A subset of cohorts in LifeCycle have participant data on structural brain imaging (ALSPAC, n = 950; Generation R, n ≈ 4,000 20; NFBC1966, n = 1,000; NFBC1986, n = 600), and will be contributing information on neuroanatomical markers, such as total brain volume, cortical grey matter, white matter volume, ventricular volume, and volumes of subcortical brain structures, including the hippocampus and amygdala. In addition, structural and functional connectivity metrics have been assessed. Data have been collected through neuroimaging techniques, such as fetal ultrasound and magnetic resonance imaging (MRI) in childhood and adulthood. These data enable LifeCycle to describe changes in structural and functional development of the brain from fetal life and infancy and to subsequently associate this brain development in early life with psychopathology outcomes in childhood, adolescence, and adulthood.

Mediating pathways - epigenetics
An increasing number of studies are beginning to demonstrate the importance of epigenetic modification in mediating the risk of disease, including mental health outcomes. Epigenetically modified loci have been linked to a wide range of mental disorders, such as schizophrenia, 39 as well as childhood-onset disorders, such as ADHD 40 and ASD, 41 but conflicting and non-replicated associations mean that the causal relationships remain poorly understood. 42 LifeCycle mental health studies can currently analyze DNA methylation data on 14,368 offspring cohort participants (Figure 5), measured at birth (cord or placenta blood; N = 7,783), childhood (0-12 years; N = 3,055), adolescence (12-18 years; N = 2,680), or adulthood (>18 years; N = 850). Six of the thirteen contributing cohorts additionally contain longitudinal epigenetic data (ALSPAC, CHOP [multiple age groups in childhood], EDEN, Generation R, INMA, and RHEA). The particular focus will be to identify epigenetic mechanisms that mediate the effect of early-life exposures on behavioral and cognitive development, as well as mental health outcomes, such as ASD, ADHD, depression, and anxiety. This means it will be possible to track epigenetic changes over time in participants with behavioral and/or neurodevelopmental outcomes and to study causal relationships between environmental exposures in pregnancy or early life and later-life mental health outcomes mediated by DNA methylation.

Framework for collaborative analyses
LifeCycle aims to perform most of the analyses through DataSHIELD. 43,44 With the recent launch of the platform and its analytical features for use with LifeCycle harmonized data, a number of novel collaborative studies have begun to form within the theme of mental health. Examples of planned and ongoing exposure-outcome analyses include infant feeding patterns and school-age externalizing behaviors; maternal smoking in pregnancy and adverse child behaviors; associations among sleep, behavior, and cognition; sibling effects and prematurity; and socioeconomic inequalities and general mental health trajectories. Results from these studies are currently pending, but the work so far has already shown that independent participant data resources have been successfully harmonized and can be co-analyzed. The quantity and breadth of mental health and cognitive data that have been mapped and harmonized by the LifeCycle mental health research group constitute a singular resource for developmental studies of mental health. These data will play an important role in replicating previous findings with enhanced statistical power, expanding upon previous associations through larger and more diverse samples, and developing novel models to describe how multi-faceted early-life exposures can shape and influence the landscape of mental health in later life.

STRENGTHS AND LIMITATIONS
There are many strengths inherent in large consortia such as LifeCycle. 1 Key among these is that LifeCycle is building the EU Child Cohort Network, a sustainable research network that will enable continued exploitation of the LifeCycle data, metadata, and collaborative progress beyond the usual timelines of a funded grant. Another important strength is the ability to study age differences and age-related mental health and cognitive changes; this developmental aspect will help to understand the long- and short-term consequences of early-life exposures, and how other factors, such as epigenetic changes, may mediate later health outcomes. Geographic diversity is also a key feature; it provides enhanced location coverage and generalizability of results and also facilitates intra- and inter-population comparisons. This makes it possible to make more reliable causal inferences due to different confounding structures. The number of critical mental health domains covered is another strength, allowing for exposure-outcome research into many important and well-studied areas within this field. The availability of the harmonization protocols, coupled with the extensive overview of mental health measures, including detailed information on the dimensions and age ranges across cohorts, provides users with an integrated catalogue of psychological, cognitive, and psychomotor data in participating cohorts. Furthermore, the use of DataSHIELD enables a flexible and data-secure approach that allows new cohorts and centers to link into the analysis network and contribute their own data, and allows newly harmonized data to be added as these are collected and updated. This open-source analysis platform "takes the analysis to the data, not the data to the analysis", providing researchers with the ability to remotely analyze data from multiple datasets without being able to access the data itself. 44,45 Removing the need to physically share data externally means participating cohorts avoid the privacy-protection concerns and other issues that arise when participant data are sent internationally to multiple users, which addresses some important ethico-legal considerations that are often associated with individual-level data sharing and analysis.
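The principle of taking "the analysis to the data" can be sketched in a few lines. The example below is only an illustration of the general idea and does not use the DataSHIELD API (DataSHIELD itself is used from R): each site releases non-identifying aggregates instead of participant rows, and the analyst combines those aggregates. The function names, the dictionary layout, and the minimum-count threshold of 5 are hypothetical.

```python
from math import sqrt

MIN_COUNT = 5  # hypothetical disclosure threshold, not a DataSHIELD setting


def site_summary(values):
    """Run inside a cohort's own server: return aggregates only, never raw rows."""
    n = len(values)
    if n < MIN_COUNT:
        raise ValueError("too few participants to release a summary")
    return {"n": n, "sum": sum(values), "sum_sq": sum(v * v for v in values)}


def pooled_mean_sd(summaries):
    """Run by the analyst: combine per-site aggregates into a pooled mean and SD."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    pooled_mean = total / n
    # Sample variance from sufficient statistics; clamp tiny negative rounding error.
    variance = (total_sq - n * pooled_mean * pooled_mean) / (n - 1)
    return pooled_mean, sqrt(max(variance, 0.0))


# Hypothetical harmonized scores held at two separate sites.
site_a = site_summary([1, 2, 3, 4, 5])
site_b = site_summary([6, 7, 8, 9, 10])
pooled_mean, pooled_sd = pooled_mean_sd([site_a, site_b])
```

The pooled mean and standard deviation match what would be obtained from the combined individual-level data, yet only counts, sums, and sums of squares leave each site. Real federated platforms add authentication and further disclosure controls that this sketch omits.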
The heterogeneity of the psychological and cognitive measures available presents a potential limitation. Depending on the specific research question under investigation and the measurement equivalence of constructs between different instruments, robust harmonization 30,32 of certain measures may not be possible or may be limited to a small number of cohorts. This reduces the sample size or the range of participant ages that can be included. Within-country geographical bias of many of the cohorts may also present a weakness. Specifically, the urban-centric nature of many of the studies could mean that the generalizability of findings will be somewhat skewed, and population-level inferences will need to take this bias into account. Furthermore, DNA methylation and brain imaging data are available for fewer than 10% of the total study participants. These smaller sample sizes may limit the number and strength of associations that can be found, as well as the distribution of participant ages and geographic and ethnic origins. However, the cohort studies are continuously expanding and adding new data on their participants, including phenotypic, genetic, epigenetic, and biological data. The collaborative groundwork laid by LifeCycle will make it possible to continue building upon the analyses that have been performed and help to mitigate some of the limitations that have been described.

DATA ACCESS
LifeCycle has developed an application procedure for data use proposals as described by Jaddoe et al. 1 It should be noted that approvals for data use and associated fees remain under the purview of the participating cohorts. This is the case regardless of whether one applies through LifeCycle or directly to the cohort, and these practices may vary across cohorts. The project strives to conduct as many analyses as possible within DataSHIELD. DataSHIELD is freely available to download and use (http://www.datashield.ac.uk/). This enables external cohorts to collaborate with LifeCycle and perform co-analyses. For more information, please visit the official website for the LifeCycle Project (https://lifecycle-project.eu/), or refer to the consortium design paper. 1 In some cases, data sharing and transfer agreements will need to be developed. These may vary due to country-specific practices and restrictions, as outlined by local General Data Protection