Annals of Clinical Epidemiology
Online ISSN : 2434-4338
Introduction to Matching in Case-Control and Cohort Studies
Masao Iwagami, Tomohiro Shinozaki

2022 Volume 4 Issue 2 Pages 33-40


Matching is a technique through which patients with and without an outcome of interest (in case-control studies) or patients with and without an exposure of interest (in cohort studies) are sampled from an underlying cohort to have the same or similar distributions of some characteristics. This technique is used to increase the statistical efficiency and cost efficiency of studies. In case-control studies, besides time in risk set sampling, controls are often matched for each case with respect to important confounding factors, such as age and sex, and covariates with a large number of values or levels, such as area of residence (e.g., post code) and clinics/hospitals. In the statistical analysis of matched case-control studies, fixed-effect models such as the Mantel-Haenszel odds ratio estimator and conditional logistic regression model are needed to stratify matched case-control sets and remove selection bias artificially introduced by sampling controls. In cohort studies, exact matching is used to increase study efficiency and remove or reduce confounding effects of matching factors. Propensity score matching is another matching method whereby patients with and without exposure are matched based on estimated propensity scores to receive exposure. If appropriately used, matching can improve study efficiency without introducing bias and could also present results that are more intuitive for clinicians.


Matching is mainly used in observational studies, including case-control and cohort studies. Matching is a technique by which patients with and without an outcome of interest (in case-control studies) or patients with and without an exposure of interest (in cohort studies) are sampled from an underlying cohort to have the same or similar distributions of characteristics such as age and sex.

The main purpose of matching is to increase study efficiency for data collection and subsequent statistical analysis. Matching helps researchers reduce the volume of data for collection without much loss of information (i.e., improving cost efficiency) and obtain more precise estimates than simple random sampling of the same number of patients (i.e., improving statistical efficiency). In addition, in cohort studies, matching can remove or reduce confounding effects of matching factors.

This paper aims to introduce basic principles of matching in case-control and cohort studies, with some recent examples.


2.1.  Unmatched Case-Control Sampling

A case-control study is a design used to compare levels of exposure between cases and controls, which are defined by the status of the outcome of interest. In typical case-control studies, cases are all patients with an outcome in an underlying cohort, and controls are selected using one of several strategies, as explained below. Although outcome-dependent sampling (also called “biased sampling”) introduces selection bias, collecting data only for cases and controls still enables researchers to estimate some associational measures (such as the risk ratio, odds ratio, and rate ratio) that would be obtained in the underlying cohort study, unless the sampling of controls depends on their exposure status. Specifically, the exposure-outcome odds ratio under cumulative incidence sampling (also called exclusive sampling) of controls is expected to equal the odds ratio in the underlying cohort (Fig. 1A), the odds ratio under case-cohort sampling (also called inclusive sampling) is expected to equal the risk ratio in the underlying cohort (Fig. 1B), and the odds ratio under risk set sampling (also called concurrent sampling) is expected to equal the rate ratio (or hazard ratio, depending on the analysis model) in the underlying cohort (Fig. 1C). Note that none of the above interpretations of odds ratios requires a rare disease assumption. Researchers can even recover other associational measures (such as risk differences) in the underlying cohort from each sampling design if auxiliary data on patients not selected as cases or controls are available [1, 2].
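As a worked illustration of the first two identities, the following minimal sketch uses a hypothetical cohort (all counts and the sampling fraction are invented for illustration and do not come from any study cited here). It computes the expected exposure odds ratio when controls are drawn from non-cases at the end of follow-up (cumulative incidence sampling) and from the whole cohort at baseline (case-cohort sampling). Risk set sampling is not shown because it requires person-time data.

```python
from fractions import Fraction

# Hypothetical cohort (assumed numbers): 1,000 exposed with 200 cases,
# 2,000 unexposed with 200 cases.
exposed_total, exposed_cases = 1000, 200
unexposed_total, unexposed_cases = 2000, 200

def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio comparing cases with controls."""
    return Fraction(case_exp) * ctrl_unexp / (Fraction(case_unexp) * ctrl_exp)

# Measures in the full underlying cohort.
cohort_rr = Fraction(exposed_cases, exposed_total) / Fraction(unexposed_cases, unexposed_total)
cohort_or = odds_ratio(exposed_cases, unexposed_cases,
                       exposed_total - exposed_cases,
                       unexposed_total - unexposed_cases)

# Expected control counts when a fraction f of eligible people is sampled,
# independently of exposure status (f is an arbitrary assumed value).
f = Fraction(1, 10)

# Cumulative incidence sampling: controls come from people without the outcome.
or_cumulative = odds_ratio(exposed_cases, unexposed_cases,
                           f * (exposed_total - exposed_cases),
                           f * (unexposed_total - unexposed_cases))

# Case-cohort sampling: controls come from everyone in the cohort at baseline.
or_case_cohort = odds_ratio(exposed_cases, unexposed_cases,
                            f * exposed_total,
                            f * unexposed_total)

print("cohort OR:", cohort_or, "| cumulative incidence sampling OR:", or_cumulative)
print("cohort RR:", cohort_rr, "| case-cohort sampling OR:", or_case_cohort)
```

The sampling fraction cancels in both odds ratios, which is why the identities hold regardless of how common the outcome is (20% and 10% in this hypothetical cohort).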

Fig. 1 Graphical representation of cumulative incidence sampling (A), case-cohort sampling (B), and risk set sampling (C) for 10 example patients in a cohort. ● indicates an outcome onset and time at selection as a case. ○ indicates time at selection as a control.

In Fig. 1, patients are followed up from when they enter the cohort, regardless of the calendar date. This is a common timeframe used in cohort studies such as randomized controlled trials, registry-based cohort studies, and hospital-based cohort studies. Meanwhile, in population-based cohort studies, calendar time is often used as a time frame, where a risk set sampling is usually used to sample controls for each case at the same calendar time (Fig. 2).

Fig. 2 Graphical representation of a risk set sampling for 10 example patients in a population-based cohort. ● indicates an outcome onset and time at selection as a case. ○ indicates time at selection as a control.

In a study requiring primary data collection, case-control study designs are efficient because only information on cases and selected controls, instead of all people in the underlying cohort, is collected and used for statistical analysis. Especially for rare outcomes, a cohort study recruiting many people to observe a sufficient number of outcomes is not feasible. However, a case-control design would still be feasible, with reduced costs and efforts.

In a study with the secondary use of existing cohort data, case-control sampling is usually unnecessary [3]. Such post-hoc sampling would miss the opportunity to estimate absolute risks of the outcome in the cohort, which is an important indicator in evidence-based medicine or policymaking. However, case-control study designs are still sometimes used if researchers want to (i) collect additional data on confounding factors by reviewing medical records or questionnaires, (ii) use stored samples to measure new biomarkers, (iii) adjudicate individual study endpoints whose assessment and classification require special expertise, or (iv) conveniently assess triggers of an acute event by flexibly modeling the exposure window at varying proximities to the event of interest [4].

2.2.  Purpose of Matching in Case-Control Studies

Similar to cohort studies, case-control studies typically require confounder adjustment using stratified analysis or regression modeling. To further improve statistical efficiency in adjusted analyses, case-control studies may match controls on confounders to be adjusted for, i.e., sampling a control(s) with an identical (or nearly identical) value of confounders for each case. When the total number of cases and controls to be sampled is fixed, the adjusted odds ratio estimates are likely to be less variable (i.e., more statistically efficient) in case-control data matched on strong confounders than in unmatched data.

Besides common confounding factors such as age and sex, area of residence (e.g., post code) or clinics/hospitals (which patients are registered to or visit) are sometimes matched between cases and controls. If variables with a large number of values or levels (e.g., over 1,000 post codes or clinics/hospitals) are adjusted for as “surrogate” confounders in the statistical analysis, at least one case and one control in each area (or clinic/hospital) are needed; otherwise, the data are discarded in the fixed-effect models (stratification). Although a case and control may rarely come from the same area (or clinic/hospital) in unmatched case-control sampling, matching can ensure that the pairs (or sets) of cases and controls are derived from the same area (or clinics/hospitals). Consequently, the odds ratio adjusted for these variables can be efficiently estimated.

Caution is needed regarding the effect of case-control matching on confounding: matching itself does not adjust for confounding factors but rather introduces selection bias [5]. Therefore, as explained later, statistical analysis with fixed-effect adjustment, such as the Mantel-Haenszel odds ratio estimator and conditional logistic regression models, is necessary to estimate an unbiased confounder-adjusted odds ratio. In addition, if a case and its controls become too similar because they are matched on too many variables, statistical efficiency in the fixed-effect analysis will be reduced, which is called over-matching [6]. Thus, it is generally not recommended to match on many variables in case-control studies.

2.3.  Choice of Matching Ratio

Because the number of cases (which are often rare diseases) is usually much smaller than that of potential controls, the matching ratio (i.e., ratio of cases:controls in each matched set) is often set to 1:n. If the ratio is set to 1:1, the design is called a pair-matched case-control study. In practice, many studies set the matching ratio to 1:4 or 1:5, whereas other studies opt to set it to a large ratio, such as 1:7 [7] and 1:10 [8]. In unmatched case-control settings, the gain of statistical power sharply increases until the ratio 1:4 or 1:5 and then slowly increases thereafter [9]. However, this may not be always true in matched case-control settings. The matched case-control design generally requires stratification on matching factors, which completely discards the information of matched sets of cases and controls with concordant exposure (i.e., a set containing people exposed only or people unexposed only). Thus, a 1:4 or 1:5 matching ratio may still have substantial power loss if (i) cases and controls in the same strata of matching factors have similar exposure patterns or (ii) exposure is rare (e.g., <15%) in an underlying cohort [10].
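The diminishing return from additional controls can be illustrated with a widely used rule of thumb that is not stated in this paper: when the true odds ratio is close to 1, a design with m controls per case has an approximate relative efficiency of m/(m+1) compared with a design with unlimited controls per case. The sketch below tabulates this quantity under that assumption only. It does not capture the two situations described above (similar exposure patterns within strata, or rare exposure), in which larger ratios may still be worthwhile.

```python
from fractions import Fraction

def relative_efficiency(m):
    """Approximate efficiency of a 1:m design relative to unlimited controls.

    Rule-of-thumb value m / (m + 1), assumed to hold near the null
    (odds ratio close to 1); it is an approximation, not an exact result.
    """
    return Fraction(m, m + 1)

for m in (1, 2, 4, 5, 10):
    eff = relative_efficiency(m)
    print(f"1:{m:<2d} relative efficiency = {float(eff):.2f}")
```

Under this approximation, moving from 1:1 to 1:4 raises the relative efficiency from 0.50 to 0.80, whereas moving from 1:5 to 1:10 adds less than 0.08.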

Sometimes, the prespecified number of controls cannot be found for a case. For example, in a case-control study planning 1:4 matching, fewer than four controls may be available for some cases. However, it is not necessary to exclude these matched sets when matching factors or matched sets of cases and controls are stratified in the analysis. A mixture of sets with different matching ratios will not result in a biased estimate as long as an adequate adjustment for matching factors is adopted.

2.4.  Choice of Matching With and Without Replacement

It is necessary to decide whether the same individual can be sampled repeatedly as a control (called matching with replacement) or only once (called matching without replacement). Researchers need to choose one of the two for the main analysis, weighing the drawback of matching without replacement (i.e., not finding a sufficient number of controls, so that many sets do not achieve the prespecified 1:n matching ratio) against the drawback of matching with replacement (i.e., decreased statistical efficiency if the same individual is repeatedly included as a control). If the number of potential controls is much larger than the number of cases, the choice would not make a big difference in the estimated odds ratios. Notably, in risk set sampling, (i) people with the outcome (i.e., cases) should remain eligible for selection as controls until they become a case, so that controls represent the underlying cohort, and (ii) the same person should remain eligible for selection as a control several times at different time points, meaning that matching without replacement is biased [11].
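The two requirements of risk set sampling can be sketched as follows. This is a minimal illustration with a hypothetical six-person cohort, and the function name, data layout, and handling of ties are assumptions rather than a method taken from the cited references. For each case, controls are drawn from everyone still under follow-up at the case's event time, so future cases are eligible and the same person can be drawn again for a later case.

```python
import random

# Hypothetical cohort: follow-up time from cohort entry and whether
# follow-up ended with the outcome (event=True) or censoring (event=False).
cohort = [
    {"id": 1, "time": 2.0, "event": True},
    {"id": 2, "time": 5.0, "event": True},
    {"id": 3, "time": 6.0, "event": False},
    {"id": 4, "time": 3.0, "event": False},
    {"id": 5, "time": 7.0, "event": True},
    {"id": 6, "time": 8.0, "event": False},
]

def risk_set_sample(cohort, n_controls, seed=0):
    """Sample up to n_controls controls per case from that case's risk set."""
    rng = random.Random(seed)
    cases = sorted((p for p in cohort if p["event"]), key=lambda p: p["time"])
    matched_sets = []
    for case in cases:
        t = case["time"]
        # Risk set: everyone still followed at time t, except the case itself.
        # People who become cases later are included, and nobody is removed
        # after being used as a control for an earlier case.
        risk_set = [p["id"] for p in cohort
                    if p["time"] >= t and p["id"] != case["id"]]
        k = min(n_controls, len(risk_set))
        matched_sets.append({"case": case["id"], "time": t,
                             "controls": rng.sample(risk_set, k)})
    return matched_sets

for matched_set in risk_set_sample(cohort, n_controls=2, seed=1):
    print(matched_set)
```

In this hypothetical cohort, person 5 is eligible as a control for the cases at times 2.0 and 5.0 before becoming a case at time 7.0, and the last case has only one person left in its risk set, which yields a mixture of matching ratios.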

2.5.  Statistical Analysis in Matched Case-Control Studies

To remove the selection bias artificially introduced by case-control matching, it is necessary to “stratify” data on matching factors in the statistical analysis. One traditional method is the Mantel-Haenszel odds ratio estimator, which stratifies on the matching factors themselves (e.g., subgroups by age group and sex, if controls are matched on these factors) or on matched sets (e.g., each pair of a case and control). The Mantel-Haenszel estimator adjusts for matching factors as fixed effects and estimates a common odds ratio assumed to be constant across strata. It consistently estimates the common odds ratio even when each stratum contains sparse data (e.g., only two patients, one case and one control, in each stratum), provided that the number of strata increases. Adjusting for confounding factors besides the matching factors by additional stratification within the matching factor strata is infeasible.
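The Mantel-Haenszel estimator can be written as the sum over strata of a·d/n divided by the sum over strata of b·c/n, where a and b are the numbers of exposed and unexposed cases, c and d are the numbers of exposed and unexposed controls, and n is the stratum size. The following minimal sketch applies it to hypothetical pair-matched data (the pair counts are invented for illustration). It also shows that pairs with concordant exposure contribute nothing, so the estimate reduces to the ratio of the two types of discordant pairs.

```python
from fractions import Fraction

def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio.

    Each stratum is (a, b, c, d): exposed cases, unexposed cases,
    exposed controls, unexposed controls.
    """
    numerator = Fraction(0)
    denominator = Fraction(0)
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += Fraction(a * d, n)
        denominator += Fraction(b * c, n)
    return numerator / denominator

# Hypothetical 1:1 matched data, one stratum per pair.
pairs = (
    [(1, 0, 0, 1)] * 30    # case exposed, control unexposed (discordant)
    + [(0, 1, 1, 0)] * 10  # case unexposed, control exposed (discordant)
    + [(1, 0, 1, 0)] * 40  # both exposed (concordant)
    + [(0, 1, 0, 1)] * 20  # both unexposed (concordant)
)

or_all_pairs = mantel_haenszel_or(pairs)
or_discordant_only = mantel_haenszel_or(pairs[:40])
print("MH odds ratio, all pairs:", or_all_pairs)
print("MH odds ratio, discordant pairs only:", or_discordant_only)
```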

As another method, it is much more common to use a conditional logistic regression model, which estimates the common stratum- and covariate-specific odds ratio by stratifying on matching factors while adjusting for other confounders as covariates [12]. For example, when a control is matched on age and hospital of a case, stratification of matched pairs using the conditional logistic regression model will eliminate the confounding effect of these matching variables. Additionally, medical conditions that may confound the exposure-outcome relationship within the age-hospital strata can be adjusted for by including them as covariates in the model without introducing unnecessary bias.
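To make the conditional likelihood concrete, the following minimal sketch fits a conditional logistic regression model with a single covariate by Newton-Raphson. It assumes one case per matched set and uses the same kind of hypothetical pair-matched data as a textbook example; the function name and data layout are illustrative, and in practice an established statistical package would be used, particularly when additional covariates are included. Each set contributes exp(βx_case) divided by the sum of exp(βx) over all members of the set, so stratum-specific intercepts cancel.

```python
import math

def fit_conditional_logit(matched_sets, max_iter=50, tol=1e-10):
    """Maximize the conditional likelihood for one covariate.

    matched_sets: list of (case_x, [control_x, ...]) with one case per set.
    Returns the estimated log odds ratio.
    """
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0
        information = 0.0
        for case_x, control_xs in matched_sets:
            xs = [case_x] + list(control_xs)
            weights = [math.exp(beta * x) for x in xs]
            total = sum(weights)
            mean = sum(w * x for w, x in zip(weights, xs)) / total
            mean_sq = sum(w * x * x for w, x in zip(weights, xs)) / total
            score += case_x - mean               # observed minus expected
            information += mean_sq - mean ** 2   # within-set variance
        if information == 0.0:
            break  # no set is informative (all sets have concordant exposure)
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical 1:1 matched data with a binary exposure (1 = exposed).
matched_sets = (
    [(1, [0])] * 30    # case exposed, control unexposed
    + [(0, [1])] * 10  # case unexposed, control exposed
    + [(1, [1])] * 40  # concordant pairs add nothing to the likelihood
    + [(0, [0])] * 20
)

beta_hat = fit_conditional_logit(matched_sets)
odds_ratio = math.exp(beta_hat)
print(f"conditional odds ratio = {odds_ratio:.3f}")
```

For pair-matched data with a single binary exposure, the conditional maximum likelihood estimate equals the ratio of discordant pairs (30/10 here), which agrees with the Mantel-Haenszel estimate for the same data.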

Notably, simple adjustments of matching factors by including them as covariates in (unconditional) logistic regression models are not recommended in matched case-control studies. For example, an unconditional logistic regression model for an outcome, including exposure and age as covariates in age-matched case-control data, provides a biased estimate of age-adjusted odds ratio, even if the model correctly specifies the association between the outcome and covariates in an underlying cohort [13]. This is because the selection bias induced by matching distorts the association between the outcome and matching factors, resulting in residual bias owing to model misspecification.

Finally, time at matching (time from cohort entry, calendar time, or possibly age as time from birth) can be considered one of the “matching factors” in risk set sampling. If the hazard of disease incidence varies with time and the exposure prevalence changes during follow-up, time should be accounted for as a “confounder.” To do so, one can use the Mantel-Haenszel odds ratio estimator or a conditional logistic regression model, which estimates the hazard ratio constant over time (and across other matching factors, if any) that would be modeled by the Cox proportional hazards model in an underlying cohort.


3.1.  Example 1: A Case-Control Study with Primary Data Collection

Hayashi et al. conducted a case-control study to identify factors associated with calciphylaxis (calcific uremic arteriolopathy), a rare and fatal complication characterized by painful skin ulceration and necrosis, in patients undergoing hemodialysis for end-stage renal disease [14]. The researchers representing the Japanese Calciphylaxis Study Group sent questionnaires to hemodialysis centers in Japan and included 28 cases with a definitive diagnosis of calciphylaxis. For each case, two controls matched for age and hemodialysis duration were randomly selected from the same dialysis center. Clinical information, including known and unknown (but suspected) risk factors for calciphylaxis, was collected for cases and controls. Univariable logistic regression analyses showed that warfarin therapy, lower serum albumin levels, higher plasma glucose levels, and higher serum calcium levels were significantly associated with calciphylaxis. A multivariable logistic regression analysis showed that warfarin therapy and lower serum albumin levels (per 1 g/dL decrease) were still significantly associated with calciphylaxis, with an adjusted odds ratio of 10.1 (95% confidence interval [CI] 1.63–62.7) and 12.7 (95% CI 2.35–68.6), respectively.

3.2.  Example 2: A Case-Control Study with Secondary Use of Existing Cohort Data

Iwagami et al. conducted a case-control study to identify medical diagnoses strongly associated with the incidence of long-term care needs certification, using linked medical and long-term care insurance data from two cities in Japan [15]. The participants were aged ≥75 years, had no previous long-term care needs certification, and had at least one medical insurance claim record during the study period. Cases were newly certified people for long-term care needs during the study period, whereas controls were randomly selected in a 1:4 ratio and matched for age category, sex, city, and calendar date (index date). Multivariable conditional logistic regression analysis was conducted to estimate the association between 22 categories of medical diagnoses recorded during the period of exposure definition (past 6 months of index date) and new long-term care needs certification, under the assumption that exposures are independent of each other. Among 38,338 eligible people, 5,434 people newly received long-term care needs certification and were matched with 21,736 controls. In the multivariable conditional logistic regression analysis, the adjusted odds ratio (95% CI) was the largest for femur fractures (8.80 [6.35–12.20]), followed by dementia (6.70 [5.96–7.53]), pneumonia (3.72 [3.19–4.32]), hemorrhagic stroke (3.31 [2.53–4.34]), Parkinson’s disease (2.74 [2.07–3.63]), and other fractures (2.68 [2.38–3.02]).


4.1.  Rationale for Matching in Cohort Studies

Matching can also be used in cohort studies. Patients with and without the exposure of interest are matched on some patient characteristics and compared for the incidence of outcomes. Matching is rarely used in observational cohort studies with primary data collection (with some exceptions such as sibling design and spouse survey) probably because most observational cohort studies are conducted without pre-specifying a certain exposure, for a wide range of research questions. In contrast, matching is sometimes used in cohort studies with the secondary use of existing databases to reduce computational burden by selecting a subset of data without sacrificing statistical precision. In addition, unlike case-control matching, cohort matching removes or reduces the confounding effects of matching factors [5].

A matched cohort study may also be conducted from a practical viewpoint: it would provide an intuitive presentation of patient characteristics in “comparable” exposure groups matched on important confounding factors such as age, sex, and calendar time. As crude absolute measures (such as risks and rates) during the follow-up period are easily summarized in exposed and unexposed patients, clinicians unfamiliar with statistical analysis can grasp the difference between the two groups in a non-statistical manner.

In cohort studies, patients with and without the exposure of interest at the start of the follow-up, such as smoking and use of a certain drug, are matched in a 1:1 or 1:n ratio (Fig. 3). In practice, exposure is dichotomized (i.e., presence or absence of exposure, rather than the level of exposure), and the exposure status of selected patients is assumed to remain unchanged during the follow-up period.

Fig. 3 Graphical representation of a matched-pair cohort study for 10 example patients in a cohort. Solid lines indicate that people are exposed, dotted lines denote that people are not exposed, and ● indicates the incidence of outcome.

In population-based cohort studies, the exposure status may change according to the calendar time. For example, people without diabetes may be diagnosed as having the disease one day. As another example, patients who have never used a certain drug with potential carcinogenic effects before may start taking it one day. In such situations, researchers can create matched sets of patients with and without the exposure of interest at the same calendar time (Fig. 4). However, the exposure status of the matched sets is assumed to remain unchanged during the follow-up period. In the presence of time-varying exposures, survival analysis with time-dependent covariates or by censoring the follow-up of a patient when his/her exposure status changes may provide estimates of associational measures (e.g., hazard ratios) free from time-related biases [16]. However, in general, a matched cohort study is unsuitable if the exposure status frequently changes between “on” and “off” in the same patient. Furthermore, although the method exists [17], causal interpretation of associational measures for time-varying exposures estimated in matched cohort studies requires additional consideration.

Fig. 4 Graphical representation of a matched-pair cohort study for 10 example patients in a population-based cohort. Solid lines denote that people are exposed, dotted lines denote that people are not exposed, ▼ indicates the timing of the matched-pair cohort inclusion in the exposed group, ▽ indicates the timing of the matched-pair cohort inclusion in the non-exposed group, and ● indicates the incidence of outcome.

4.2.  Choice of Matching Factors, Matching Ratio, and Matching With or Without Replacement

Matching factors in the secondary use of existing databases often include age (age category or age within a range, such as ±2 years), sex, area of residence (e.g., post code) or clinics/hospitals (which patients are registered to or visit), and calendar time. Although cohort matching on known confounders typically leads to an efficiency gain in adjusted estimates, there are exceptions depending on associational measures (e.g., risk difference or ratio) and underlying models (e.g., additive or multiplicative risk models) [18]. If the statistical efficiency is rather worsened by matching, the resulting estimates suffer from “over-matching.”

Regarding the matching ratio, 1:4 or 1:5 is sometimes chosen in matched-pair cohort studies, whereas 1:1 may be chosen more frequently to prioritize simplicity and intuitiveness. Mixed matching ratios (meaning that, for example, some pairs are matched in a ratio of 1:4, whereas other pairs are matched by a ratio of 1:3, 1:2, or 1:1 between exposed and unexposed people) will not cause bias if matching variables or matched sets are adjusted for in the analysis. In contrast, as such varying matching ratios do not balance the distributions of matching factors in exposed and unexposed people, the unadjusted comparison in the matched cohort still suffers from confounding bias.

Matching with or without replacement remains the choice of researchers, although matching without replacement may be more intuitive for clinicians.
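The choices discussed in this subsection can be sketched as a small matching routine. This is a minimal illustration with hypothetical patients, in which exposed patients are matched to unexposed patients on sex and on age within ±2 years, without replacement; the function name, the random ordering of candidates, and the data are assumptions made for the example. Matching on calendar time and area of residence is omitted for brevity.

```python
import random

def match_cohort(exposed, unexposed, ratio=1, age_caliper=2, seed=0):
    """Match each exposed patient to up to `ratio` unexposed patients.

    Matching factors: identical sex and age within +/- age_caliper years.
    Matching is without replacement: a selected patient is removed from
    the pool of candidates.
    """
    rng = random.Random(seed)
    available = list(unexposed)
    rng.shuffle(available)  # random choice among eligible candidates
    matched = {}
    for person in exposed:
        candidates = [u for u in available
                      if u["sex"] == person["sex"]
                      and abs(u["age"] - person["age"]) <= age_caliper]
        chosen = candidates[:ratio]
        for u in chosen:
            available.remove(u)
        matched[person["id"]] = [u["id"] for u in chosen]
    return matched

# Hypothetical patients.
exposed = [
    {"id": "E1", "sex": "F", "age": 60},
    {"id": "E2", "sex": "M", "age": 72},
    {"id": "E3", "sex": "F", "age": 45},
]
unexposed = [
    {"id": "U1", "sex": "F", "age": 61},
    {"id": "U2", "sex": "F", "age": 58},
    {"id": "U3", "sex": "M", "age": 70},
    {"id": "U4", "sex": "M", "age": 75},
    {"id": "U5", "sex": "F", "age": 62},
    {"id": "U6", "sex": "M", "age": 73},
]

matched = match_cohort(exposed, unexposed, ratio=1)
print(matched)
```

In this example no unexposed woman aged 43 to 47 years exists, so E3 remains unmatched. With `ratio` greater than 1, some exposed patients may receive fewer matches than others, which produces the mixed matching ratios described above.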

4.3.  Statistical Analysis in Matched-Pair Cohort Study

Unlike case-control matching, non-mixed cohort matching completely or partially removes the confounding effect of matching factors without introducing additional selection bias. Hence, fixed-effect models for matched sets (e.g., conditional logistic regression, stratified Poisson, or stratified Cox regression models) may or may not be an option. Other possible statistical methods include i) covariate adjustment for matching factors, ii) random-effect adjustment for matched sets, and iii) marginal regression modeling without stratification on matching factors but with cluster-robust variance accounting for matched sets as clusters. The differences between fixed-effect models and other possible statistical methods are the estimand and modeling assumptions of the analysis [19].

Caution is needed in the sense that matching can only “balance” distributions in sampled (i.e., matched) data, and such balance is easily affected by additional adjustment for or stratification on other variables. Therefore, ignoring matching factors (i.e., adjusting for additional unmatched variables without adjusting for matching factors) would cause bias in estimates [20].

In some matched-pair cohort studies, the observation time of a patient is prematurely terminated immediately after the follow-up of his/her matched counterpart ends owing to an event or censoring [21]. The impact of such termination is minimal when adopting stratified Cox models. However, in statistical methods other than stratified Cox models, termination is not generally encouraged because information is then discarded in an irremediable manner [22].

4.4.  Propensity Score Matching

Although the aforementioned matching method is specifically called exact matching, matching based on the propensity score, which is the probability of receiving exposure within the confounder stratum to which a patient belongs, is another type of matching method known as marginal matching [13]. Propensity score matching was featured in a previous paper of this seminar series [23]. Briefly, patients with and without exposure are matched based on their estimated propensity scores for receiving exposure at a certain time point, mostly at the time of cohort inclusion. Consequently, the distributions of the measured confounding factors defining the propensity scores are balanced between the two groups in the propensity score-matched samples. Researchers using this method should be aware of the theoretical subtleties in propensity score matching, such as the lack of justification for interval estimation for propensity score-matched estimates using off-the-shelf software [24] and bias owing to additional adjustment for risk factors not balanced by propensity score matching [25].
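The matching step itself can be sketched as follows, assuming that propensity scores have already been estimated (for example, by a logistic regression model, which is not shown). The greedy nearest-neighbor algorithm, the 1:1 ratio, the caliper of 0.05, and all scores below are assumptions chosen for illustration; other algorithms and calipers are used in practice.

```python
def propensity_score_match(exposed, unexposed, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    exposed, unexposed: dicts mapping patient id to estimated propensity score.
    A pair is formed only if the scores differ by at most `caliper`.
    """
    available = dict(unexposed)
    pairs = []
    # Process exposed patients in order of increasing propensity score.
    for exposed_id, score in sorted(exposed.items(), key=lambda kv: kv[1]):
        if not available:
            break
        nearest = min(available, key=lambda uid: abs(available[uid] - score))
        if abs(available[nearest] - score) <= caliper:
            pairs.append((exposed_id, nearest))
            del available[nearest]  # each unexposed patient is used once
    return pairs

# Hypothetical estimated propensity scores.
exposed_scores = {"E1": 0.30, "E2": 0.55, "E3": 0.90}
unexposed_scores = {"U1": 0.28, "U2": 0.33, "U3": 0.52, "U4": 0.60, "U5": 0.10}

pairs = propensity_score_match(exposed_scores, unexposed_scores)
print(pairs)
```

Here E3 has no unexposed patient within the caliper and is therefore excluded, which illustrates that the matched sample may represent only a subset of exposed patients.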


5.1.  Example 1: A Cohort Study with Exact Matching

Ohbe et al. conducted a population-based matched cohort study to examine the risk of cardiovascular events after a spouse’s intensive care unit (ICU) admission, using the JMDC claims database, which includes employees of relatively large Japanese companies and their family members in Japan [26]. Among 1,082,208 eligible married couples (2,164,416 spouses), the researchers identified 7,815 spouses of patients who were admitted to the ICU for more than 2 days. From the rest of the study population, they randomly selected a non-exposure group with a ratio of one spouse in the exposure group to four individuals in the non-exposure group, matched for age, sex, and medical insurance status on the same date (index date). When examining the primary outcome, the percentage of any visits for cardiovascular diseases 1–4 weeks after the spouse’s ICU admission was 2.7% (210/7,815) in the exposure group and 2.1% (666/31,250) in the non-exposure group, with an adjusted odds ratio of 1.27 (95% CI, 1.08–1.50). Secondary outcomes, which included any hospitalization for cardiovascular disease or hospitalization for severe cardiovascular events, were also significantly more frequent in the exposure group. The odds ratios became closer to 1 (i.e., the null association) 4 weeks after the index date. Thus, the authors concluded that ICU admission of a spouse can be a risk factor for cardiovascular events 1–4 weeks after the date of the spouse’s ICU admission.
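As noted earlier, crude absolute measures are easy to summarize in a matched cohort. The short calculation below recomputes the crude percentages from the counts reported above and derives a crude odds ratio. The crude odds ratio is our own arithmetic for illustration and is not the adjusted estimate reported by the authors, whose analysis accounted for matching.

```python
# Counts reported for the primary outcome in Example 1.
exposed_events, exposed_total = 210, 7815
unexposed_events, unexposed_total = 666, 31250

risk_exposed = exposed_events / exposed_total
risk_unexposed = unexposed_events / unexposed_total

# Crude (unadjusted) odds ratio from the 2 x 2 table.
odds_exposed = exposed_events / (exposed_total - exposed_events)
odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
crude_or = odds_exposed / odds_unexposed

print(f"risk in exposure group:     {100 * risk_exposed:.1f}%")
print(f"risk in non-exposure group: {100 * risk_unexposed:.1f}%")
print(f"crude odds ratio:           {crude_or:.2f}")
```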

5.2. Example 2: A Cohort Study with Propensity Score Matching

Nagasu et al. conducted a registry-based propensity score-matched cohort study using the Japan Chronic Kidney Disease Database (J-CKD-DB) [27] to examine the protective effects of sodium-glucose cotransporter 2 (SGLT2) inhibitors on kidneys compared with other glucose-lowering drugs. The researchers identified patients with CKD who started SGLT2 inhibitors or other glucose-lowering drugs. On the day of initiation, they calculated a propensity score for SGLT2 inhibitor initiation for each patient and created a 1:1 propensity score-matched cohort (n = 1,033 pairs). Regarding the primary outcome, during follow-up, the mean annual rates of estimated glomerular filtration rate (eGFR) change were −0.47 (95% CI −0.63 to −0.31) and −1.22 (95% CI −1.41 to −1.03) mL/min/1.73 m² per year in the SGLT2 inhibitor and other glucose-lowering drug groups, respectively (P < 0.001). Regarding the secondary outcome, there were 30 patients with a composite kidney outcome (50% eGFR decline or end-stage kidney disease) in the SGLT2 inhibitor group (14 events/1,000 patient-years) and 73 in the other glucose-lowering drug group (36 events/1,000 patient-years), with a hazard ratio of 0.40 (95% CI 0.26–0.61). Thus, compared with other glucose-lowering drugs, the initiation of SGLT2 inhibitors was associated with a significantly lower rate of eGFR decline and a lower risk of composite kidney outcome.


We have provided an overview and some recent examples of matching in case-control and cohort studies. Matching in case-control studies can increase study efficiency, including both cost and statistical efficiency. Nevertheless, caution is still warranted because inappropriate sampling of controls or statistical analysis without stratification would result in a biased estimate. In cohort studies, exact matching can increase efficiency and remove or reduce the confounding effect of matching factors, whereas propensity score matching can be used to balance the distributions of measured confounding factors between exposed and unexposed individuals. If appropriately used, matching can improve study efficiency without introducing bias and can present results that are more intuitive for clinicians.


We would like to thank Dr. Hiroyuki Ohbe of the Department of Clinical Epidemiology and Health Economics, School of Public Health, The University of Tokyo, and Dr. Motohiko Adomi in the Department of Epidemiology, Harvard T.H. Chan School of Public Health, for their critical reading of the manuscript and feedback.


No potential competing interests relevant to this paper are reported.

© 2022 Society for Clinical Epidemiology

This article is licensed under a Creative Commons [Attribution-NonCommercial-NoDerivatives 4.0 International] license.