The LEAD (Lung, Heart, Social, Body) Study: Objectives, Methodology, and External Validity of the Population-Based Cohort Study

Background The Lung, hEart, sociAl, boDy (LEAD) Study (ClinicalTrials.gov; NCT01727518; http://clinicaltrials.gov) is a longitudinal, observational, population-based Austrian cohort that aims to investigate the relationship between genetic, environmental, social, developmental and ageing factors influencing respiratory health and comorbidities through life. The general working hypothesis of LEAD is that the interaction of these genetic, environmental and socioeconomic factors influences lung development and ageing, the risk of occurrence of several non-communicable diseases (respiratory, cardiovascular, metabolic and neurologic), as well as their phenotypic (ie, clinical) presentation. Methods From 2011 to 2016, LEAD invited a random sample (stratified by age, gender, and residential area) of Vienna inhabitants (urban cohort) and all the inhabitants of six villages from Lower Austria (rural cohort). Participants will be followed up every four years. A number of investigations and measurements were obtained in each of the four domains of the study (Lung, hEart, sociAl, boDy), including data to screen for lung, cardiovascular and metabolic diseases, osteoporosis, and cognitive function. Blood and urine samples are stored in a biobank for future investigations. Results A total of 11,423 participants (47.6% male, 52.4% female), aged 6–80 years, have been included in the cohort. Compared to governmental statistics, the external validity of LEAD with respect to age, gender, citizenship, and smoking status was high. Conclusions The LEAD cohort has been established following high quality standards; it is representative of the Austrian population and offers a platform to understand lung development and ageing as a key mechanism of human health both in early and late adulthood.


INTRODUCTION
Chronic respiratory diseases, such as chronic obstructive pulmonary disease (COPD) and asthma, are among the most prevalent, severe and costly human diseases. COPD is currently the 4th leading cause of death worldwide and is expected to be the 3rd within a decade. 1 It affects about 10% of the adult population in European countries. However, estimates in Austria almost triple that figure. 2,3 Tobacco smoking is the main risk factor for COPD in developed countries, but recent epidemiological studies indicate that about a quarter of individuals with chronic airflow limitation are never-smokers, and that early life events can also contribute to its pathogenesis. 4 The influence of other environmental exposures, either at work or at home, in urban and rural populations, is unclear. 5 Likewise, asthma is the most prevalent chronic disease in childhood and affects about 5% of the population (children and adults). 6 It is estimated that there are 300 million asthma patients worldwide. 7 Recently, attention has focused on lung developmental issues during pregnancy, infancy, and adolescence as important drivers of respiratory diseases later in life. 8 Both genetic 9 and environmental factors, such as active 10 and passive smoking, 11 environmental pollution, 12 recurrent bronchopulmonary infections, 13 socioeconomic status, 14 occupation, and diet, can alter normal lung development. Low birth weight is associated with lower lung function later in life. 15,16 Yet, the potential elements and pathogenic mechanisms that control and influence lung development are not well established. 13,17,18 The Framingham cohort study, among others, has shown that suboptimal lung function is the best predictor of respiratory and, of note, cardiovascular health during adulthood, 19 but the relationship between respiratory health and other health conditions is unclear. 20
Non-communicable diseases (NCDs) are the major global health problem of the 21st century. 21 NCDs are caused by complex gene-environment interactions across the lifespan, from fetus to old age, 22,23 and include, among others, chronic respiratory diseases, cardiovascular diseases, metabolic diseases, osteoporosis, and neuropsychiatric diseases. NCDs often coexist in the same patient. 24 For instance, cardiovascular diseases, skeletal muscle dysfunction, osteoporosis, metabolic syndrome, and depression are highly prevalent in patients with COPD 25 and contribute significantly to limiting their quality of life and worsening their prognosis. 26-30 Asthma is also associated with an increased prevalence of comorbidities, although they frequently remain undiagnosed. 31
The LEAD (Lung, hEart, sociAl, boDy) study (ClinicalTrials.gov; NCT01727518; http://clinicaltrials.gov) is a longitudinal, observational, population-based cohort that aims to investigate the relationship between genetic, environmental, developmental, and ageing factors influencing respiratory health and comorbidities through life. The general working hypothesis of LEAD is that the interaction of genetic, environmental and socioeconomic factors influences lung development, the risk of occurrence of several major respiratory and cardiovascular diseases and other NCDs, as well as their phenotypic (ie, clinical) presentation.
Accordingly, the general goal of LEAD is to provide valid scientific information that contributes to a better understanding of how genetic, environmental and socioeconomic factors influence: (1) normal and pathologic lung growth, development and ageing (ie, the natural history of lung function in normal and pathological conditions); (2) the risk of developing major NCDs such as COPD, asthma, cardiovascular diseases, metabolic diseases (such as diabetes and metabolic syndrome), osteoporosis, and neuropsychiatric diseases (mental disorders including anxiety, depression and impaired cognitive function); and (3) the phenotypic heterogeneity and complexity of chronic respiratory diseases, COPD in particular, and their relation with coexisting comorbidities. Within this general framework, different specific projects of LEAD will develop their own working hypotheses and specific goals in four domains (Lung, hEart, sociAl, boDy). Here, we describe in detail the methodology used and the external validity (ie, representativeness of the general Austrian population) of LEAD.

Ethics
The local Ethics Committee of Vienna approved the study (protocol number: EK-11-117-0711). Participants signed informed consent; for children under the age of 18, consent had to be signed by their parents/legal representative.

Study design
LEAD (ClinicalTrials.gov; NCT01727518; http://clinicaltrials.gov) is a longitudinal, observational, population-based cohort study with stratified samples from Vienna (urban population) and Lower Austria (rural population). Recruitment and first study visits started in February 2012 and finished in September 2016. A flow chart with numbers for initial recruitment, inclusion, and main measurements is presented as Figure 1. In total, 11,423 males and females aged 6-80 years participated and completed the first visit, including the main measurements described in detail in Table 1. LEAD is directed by a Steering Committee consisting of four Austrian academic clinical physicians (Marie-Kathrin Breyer, Robab Breyer-Kohansal, Sylvia Hartl, and Otto Burghuber). An international advisory board of five respiratory experts (Alvar Agusti, Torben Sigsgaard, Michael Studnicka, Claus Vogelmeier, and Emiel Wouters) was convened. All of them discussed and agreed on the design of the study and the interpretation of the results, and have full access to the LEAD database.

Recruitment strategy
For the urban cohort, we used the national inhabitants' register to invite a randomized stratified sample (by age, gender, and residential area) of Vienna inhabitants to participate. For the rural cohort, because of the low total number of inhabitants, every registered inhabitant of six villages from Lower Austria was invited. Selected urban and rural participants received a personalized invitation letter. If the selected person did not respond within 30 days, a maximum of two further personalized invitation letters were sent at regular intervals. Because of the national Austrian data protection act, we did not have access to other contact details (eg, telephone number). Exclusion criteria were pregnancy, current breast feeding, or poor German language skills. Individuals participated voluntarily without any reimbursement.
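The stratified random draw described above can be illustrated with a minimal Python sketch. It assumes proportional allocation (the same sampling fraction in every stratum), which the text does not specify; the register records, field names, stratum labels, and the 20% fraction are all hypothetical and are not LEAD data.

```python
import random
from collections import defaultdict

def stratified_sample(register, fraction, seed=0):
    """Draw the same fraction at random from every (age band, gender, area) stratum.

    Proportional allocation is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in register:
        key = (person["age_band"], person["gender"], person["area"])
        strata[key].append(person)
    invited = []
    for key in sorted(strata):
        members = strata[key]
        k = round(len(members) * fraction)  # number of invitations in this stratum
        invited.extend(rng.sample(members, k))
    return invited

# Hypothetical toy register: 12 strata with 50 inhabitants each.
register = []
for band in ("6-17", "18-44", "45-80"):
    for gender in ("f", "m"):
        for area in ("district_A", "district_B"):
            for _ in range(50):
                register.append({"id": len(register), "age_band": band,
                                 "gender": gender, "area": area})

invited = stratified_sample(register, fraction=0.2)
print(len(invited), "invitation letters")
```

Sampling within each stratum keeps the invited sample's age, gender, and area distribution aligned with the register, independent of who eventually responds.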

Follow-up
Every participant will be assessed every 4 years from the individual's first study visit, with all measurements described in detail in Table 1. In case of loss to follow-up, the national inhabitants' register provides information on emigration or death. As the study is designed longitudinally, participants lost to follow-up will be replaced through re-recruitment (by age, gender, and residential area).

Measurements
All measurements were performed at the LEAD study centre of the Ludwig Boltzmann Institute for COPD and Respiratory Epidemiology at the Otto Wagner Hospital in Vienna, Austria. Table 1 details all measurements obtained in each participant according to her/his age. To screen for major NCDs, a number of investigations and measurements were obtained in each of the four domains of the study (Lung, hEart, sociAl, boDy), as detailed in Table 1 and only briefly discussed below. Blood and urine samples were analyzed for routine clinical measurements and stored in a biobank for future investigations.

Lung domain
Lung function measures included pre- and post-bronchodilation spirometry and static lung volumes, effort-independent measures of oscillatory resistance, and carbon monoxide lung diffusing capacity (DLCO). Measurements were obtained according to international recommendations, 32 and the reference values used correspond to those of the Global Lung Function Initiative (GLI). 33 Skin Prick Tests for major allergens (see Table 1) were obtained in every participant. Smoking history and exposure to environmental tobacco smoke (ETS) were recorded. History of respiratory diseases, allergy, and related medication from the individual and spouses, as well as respiratory symptoms, were collected using a questionnaire.
To investigate COPD and asthma in more detail, a subgroup of participants was invited for additional measurements. Every participant with a positive Skin Prick Test, doctor-diagnosed asthma or allergy, or elevated blood eosinophils was re-invited for bronchial provocation, fractional exhaled nitric oxide (FeNO) testing, and an Asthma Control Test™ (subgroup 1). Every participant with a ratio of forced expiratory volume in the first second (FEV1) to forced vital capacity (FVC) below the Lower Limit of Normal by GLI 33 or below 70% was re-invited for a 6-minute walking test, COPD and health-related questionnaires, and Alpha 1-Antitrypsin testing (subgroup 2). Details are explained in Figure 2.
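The two re-invitation rules can be written as a short decision function. The following Python sketch is only an illustration: the field names are hypothetical, the GLI lower limit of normal is assumed to be supplied from an external reference-equation calculator, and the cut-off that defines "elevated" blood eosinophils is not stated in the text, so it is represented as a precomputed flag.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    # All field names are hypothetical, chosen for this illustration.
    skin_prick_positive: bool
    doctor_diagnosed_asthma_or_allergy: bool
    elevated_blood_eosinophils: bool  # threshold not given in the text
    fev1_l: float                     # FEV1 in litres
    fvc_l: float                      # FVC in litres
    fev1_fvc_lln: float               # GLI lower limit of normal for FEV1/FVC

def subgroups(p: Participant) -> set:
    """Return the set of subgroups (1 and/or 2) a participant is re-invited to."""
    groups = set()
    # Subgroup 1: any allergy- or asthma-related criterion is met.
    if (p.skin_prick_positive
            or p.doctor_diagnosed_asthma_or_allergy
            or p.elevated_blood_eosinophils):
        groups.add(1)
    # Subgroup 2: FEV1/FVC below the GLI lower limit of normal or below 0.70.
    ratio = p.fev1_l / p.fvc_l
    if ratio < p.fev1_fvc_lln or ratio < 0.70:
        groups.add(2)
    return groups

obstructed = Participant(False, False, False, fev1_l=2.0, fvc_l=4.0, fev1_fvc_lln=0.65)
allergic = Participant(True, False, False, fev1_l=3.2, fvc_l=4.0, fev1_fvc_lln=0.70)
print(subgroups(obstructed), subgroups(allergic))
```

A participant can satisfy both rules at once, which is why the function returns a set rather than a single label.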

HEart domain
Cardiovascular measurements included arterial blood pressure, automated electrocardiogram, carotid-femoral pulse wave velocity, and blood pressure measurements at both the upper and the lower extremities. History of cardiovascular diseases and events and related medication from the individual and spouses were collected using a questionnaire.

SociAl domain
To evaluate environmental risk factors for chronic respiratory diseases, data from seventeen monitoring units of the Environment Agency Austria were used to determine the average concentrations of particulate matter 10 (PM10) and nitrogen oxide (NOx) within 10 m³ of every participant's home and workplace. A map showing the monitoring units in Vienna and the average exposure to PM10 and NOx in the year 2015 is presented as Figure 3. Socioeconomic status (income, education, occupation) of the individual or parents/legal representative (if underage) was collected. To study the presence of neuropsychiatric diseases such as anxiety, depression, and impaired cognitive function, we used standardized questionnaires and test modules (as detailed in Table 1).

BoDy domain
To determine the presence of diabetes and metabolic syndrome, we measured fasting glucose in peripheral venous blood, glycated hemoglobin (HbA1c), body mass index, waist circumference, fat mass/fat-free mass, blood lipid profiles, and blood pressure. The presence of osteoporosis was determined using dual-energy X-ray absorptiometry. History of diabetes, metabolic syndrome, and osteoporosis and related medication from the individual and spouses were collected using a questionnaire.

Quality control
To guarantee optimal quality of the whole process, the following measures were implemented from the outset: (1) all clinical examiners (medical students) were trained by senior staff (medical doctors/lung function technicians) and were supervised regularly; (2) standardized protocols were predetermined for every single measurement, based on international standards; (3) all questionnaires are interview based; (4) a web tool was designed (www.linkthat.eu) with control mechanisms that prevent inconclusive data entry (eg, typing errors, implausible numbers), and all data from study equipment are uploaded directly by data transfer to avoid manual data input errors; (5) study monitoring is performed periodically and includes extensive plausibility checks as well as regular monitoring reports of the interim data set; and (6) special emphasis is placed on quality control of every measurement report, including a rigorous post hoc quality control of all original measurements by the steering committee.

Table 1 (excerpt, hEart domain). Measurement: parameter or method; device or software.
Arterial blood pressure 50: by sphygmomanometer; Hokanson S12™ and DS400 Aneroid™
12-lead electrocardiogram 51: Cardiac infarction injury score; Cardiosoft, GE Healthcare®, Austria
Non-invasive applanation tonometry 52: arterial stiffness by carotid-femoral pulse wave velocity (PWV) and Augmentation time Index by pulse wave analysis (PWA); SphygmoCor, Novomed®, Austria
Ankle-brachial index 53: by sphygmomanometer and Doppler probe at the upper and the lower extremities; ELCAT® GmbH, Germany

External validity
To determine the "representativeness" of the population included in LEAD compared to the Austrian population, we did extensive external validity testing. First, we compared LEAD results with the most recent Austrian population data published by the Governmental Statistic Department in 2015. 34 Second, the LEAD study population was compared with the Austrian Governmental Microcensus in terms of age, gender, citizenship, educational level, and smoking status. 35,36 The Microcensus is a quarterly survey of randomly chosen Austrian households, covering males and females aged 15-80 years, that records socio-demographic details and, in a subsample, self-reported health status including smoking status. Because participation is compulsory (refusal is punished by a fee), the Microcensus survey has a very high response rate (>99%) and high reliability.

Statistical analysis
For this manuscript, continuous data are described using arithmetic means, standard deviations (SD) and ranges; binary and ordinal data are described by frequencies and percentages. In general, the level of significance (α) was set to 5% (Bonferroni-Holm correction applied whenever necessary). To analyze external validity ("representativeness"), α was set to 0.1% (99.9% confidence intervals) to prevent spuriously significant results caused only by the large sample sizes. All statistical analyses were performed using SPSS 24.0 (IBM Corp, Armonk, NY, USA). 37 In future analyses of specific LEAD projects, we plan to use knowledge-driven and data-driven (unbiased) analysis, including principal component analysis, cluster analysis, and network analysis, to understand the complexity and heterogeneity of the cohort and of different subgroups of participants. 22,38,39

RESULTS
Table 2 presents the major baseline characteristics of the LEAD study cohort stratified by age. We included 1,344 children aged 6 to <18 years (male 54.4%) and 10,079 adults ≥18 years (male 46.65%). Smoking exposure was high: almost one fifth of children aged 6 to <18 years had been exposed to environmental tobacco smoke, and more than half of the adult participants were former or current smokers (56%). Former male smokers have a higher exposure to cigarette smoke than current smokers; therefore, we stratified all smokers using pack years (PY), showing that 19.5% have or have had a high cigarette consumption (>20 PY). All in all, most participants (86.9%) have some history of tobacco smoke exposure (passive and/or active).
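The text does not spell out how pack years were computed. The sketch below assumes the conventional definition (packs of 20 cigarettes smoked per day multiplied by the number of years smoked) and applies the >20 PY cut-off mentioned above; the function names and the example smoking histories are hypothetical.

```python
CIGARETTES_PER_PACK = 20  # conventional pack size, assumed here

def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Cumulative exposure: packs smoked per day multiplied by years smoked."""
    return cigarettes_per_day / CIGARETTES_PER_PACK * years_smoked

def high_consumption(py: float, cutoff: float = 20.0) -> bool:
    """True when cumulative exposure exceeds the >20 PY cut-off used in the text."""
    return py > cutoff

# Hypothetical smoking histories, not LEAD data.
for cigs, years in [(20, 25), (10, 20), (20, 20)]:
    py = pack_years(cigs, years)
    print(cigs, years, py, high_consumption(py))
```

Expressing exposure in pack years puts former and current smokers on one cumulative scale, which is what allows them to be stratified together.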

External validity
Compared to other epidemiological studies, 40 the overall response rate in LEAD was low (total 8.7%, male: 7.7%, female 9.8%). The low participation rate is probably related to the very rigorous Austrian data protection law, which prohibits iterative invitations and/or telephone contact. The LEAD study is a single-centre investigation, with some participants having to travel long distances for their examinations (with no monetary incentive). As residential area was part of the recruitment strategy, a homogeneous, proportionally weighted population sample of the various Viennese districts had to be guaranteed. When comparing participation rates between Lower Austria, districts far away from, and districts nearby the LEAD study center, no differences were found (8.7% vs 6.5% vs 7.1%; all P-values non-significant). It is known that participation rates in health surveys have decreased over the past decades, 40 and the perceived personal benefit of participating in a health survey in Austria may be low because of the high provision of health care. Despite this, the external validity ("representativeness") of the LEAD cohort is very high. Table 3 shows that the demographic characteristics of LEAD participants are almost identical to those of the general Austrian population, stratified by age and gender.
Table 4 compares the sociodemographic data (citizenship and educational level) of the LEAD cohort (15-80 years) with those of the Austrian Governmental Microcensus (15-80 years). The LEAD cohort was very similar in terms of citizenship but not in terms of educational level, which was shifted towards higher education. Finally, it is important to note that the smoking history of the LEAD study cohort was well matched with the results from the Austrian Microcensus in both male and female participants (Figure 4).
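The 99.9% confidence intervals used for the external validity comparisons can be sketched as follows. This is an illustration under stated assumptions: the text does not say which interval method was used in SPSS, so a simple normal-approximation (Wald) interval is shown; the cohort size and male share come from the abstract, while the reference proportion is a hypothetical placeholder, not a published Austrian figure.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes: int, n: int, alpha: float = 0.001):
    """Normal-approximation (Wald) confidence interval for a proportion.

    alpha = 0.001 gives the 99.9% level used for the external validity tests.
    """
    p = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

n_total = 11423                   # cohort size reported in the abstract
n_male = round(0.476 * n_total)   # 47.6% males, as reported in the abstract
low, high = proportion_ci(n_male, n_total)

reference_share = 0.49            # HYPOTHETICAL population value, for illustration
consistent = low <= reference_share <= high
print(f"99.9% CI for male share: {low:.4f} to {high:.4f}; "
      f"reference inside interval: {consistent}")
```

Widening the interval from 95% to 99.9% makes the comparison more conservative, so that small deviations from the reference population are less likely to be flagged as significant merely because the cohort is large.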

1. To obtain all measurements in the entire study population, including valid lung function testing (body plethysmography in particular) for proper comparability across age groups, only participants ≥6 years could be included. In addition, radiation exposure by dual-energy X-ray absorptiometry is prohibited in those <6 years of age. Therefore, data on lung function development in the first years of life will not be available in this study.
2. Information on the individual's early life exposures, such as birth history, childhood respiratory infections, diagnoses, and maternal/paternal data on occupational exposure, migration, life style, and smoking behavior, is documented only via questionnaires and retrospectively. In those <18 years, its validity is supported by the accompanying parents at the study visit. In those >18 years, this information may be lacking because of gaps in the individual's memory or lack of knowledge.
3. The LEAD cohort shows higher education levels compared to the general Austrian population. This is a well-known challenge in population-based studies that rely on voluntary participation. 41 Education is a well-described and recognized factor influencing lung function, in both childhood and adulthood. 42 This weakness is acknowledged as a major limitation of the LEAD cohort. To at least reduce this effect, the authors will define the socioeconomic status of the individual (for adults) and of the mother/father (for children) as recommended, including education, occupation, and income. 43

DISCUSSION
The ongoing LEAD study includes a large and carefully characterized cohort that is representative of the general population of Austria with respect to age, gender, and smoking status. The high quality of the dataset is supported by protocol-based standard measurements, trained personnel, and interview-based questionnaires obtained in all participants; by direct data transfer from the equipment into the study-specific web tool, which prevents manual data input errors; and by regular study monitoring reports, interim data sets, and plausibility checks. In addition, rigorous post hoc quality control of every single measurement report is done by the steering committee.
The representativeness of the LEAD study cohort is supported by the comparison with the governmental data of the Austrian population, which shows an almost identical distribution of age, gender and citizenship. The shift towards higher education levels in the LEAD study cohort is also seen in other epidemiological studies and is likely due to an increased health awareness among more highly educated participants. 44
The LEAD study has some clear strengths: (1) to our knowledge, it is among the biggest longitudinal population-based cohort studies providing pre- and post-bronchodilation spirometry, body plethysmography, and diffusing capacity results to better characterize chronic respiratory disease, as recommended by the European Respiratory Society 45; (2) to validate prospectively lung function trajectories from infancy to late adulthood, 4 the LEAD cohort included participants from 6 to 80 years, and most of the risk factors recently discussed 13,46,47 have been considered in the analysis; (3) to study the impact of environmental factors on lung function, data from seventeen environmental monitoring units in Vienna are used to calculate the average concentrations of particulate matter 10 (PM10) and nitrogen oxide (NOx) within 10 m³ of every participant's home and workplace address using the emission and combustion model 48; (4) to cover risk factors based on symptoms, socioeconomic status, smoking status, and self-rated general health, a questionnaire tailored to the LEAD study cohort was generated for different age groups based on validated questionnaires (for detailed information see Table 2); in addition, anxiety and depression, quality of life, and cognitive function were also evaluated; (5) to minimize recall bias for undiagnosed or self-reported coexisting diseases, the LEAD study measures key organ function data to screen for cardiovascular and metabolic diseases, body composition data for muscle/fat distribution, osteoporosis, and a screening test for cognitive function, an approach that provides a framework for exploring the relationships between lung health, age, and multi-morbidity in a broader context; and, finally, (6) LEAD has created a blood/urine biobank, hosted by the Medical University of Vienna, which is open to future international collaborations to better understand genetic risk factors and biomarkers influencing lung function development and decline.
In summary, the LEAD cohort has been established following high quality standards; it is representative of the Austrian population and offers a platform to understand lung development and ageing as a key mechanism of human health in both early and late adulthood.