Article ID: 211024
The olfactory receptor (OR) gene family comprises hundreds of intact and disrupted genes in humans. The composition and copy number variation (CNV) of disrupted and intact OR genes among individuals are expected to cause variation in olfactory perception. However, little is known about OR genetic variation in many human populations. In this study, we used targeted capture enrichment and massively parallel short-read sequencing to examine genetic variation in OR genes, and in neutral genome regions as references, in 69 anonymized unrelated Japanese individuals. The capture probes were designed for 398 intact OR genes in the human reference genome hg38 and for 85 neutral references. Probes were also designed for four unannotated and 99 ‘nearly-intact’ (hg38-pseudo) OR genes in hg38, and for 53 chimpanzee OR genes in the Pantro3.0 genome database with no orthologs in hg38. All the hg38 OR genes and one Pantro3.0 OR gene were retrieved. The mean sequencing depth was significantly higher than that of the 1000 Genomes Project. A total of 30 OR genes from the hg38-intact and hg38-pseudo categories were newly found to be segregating pseudogenes. One hg38-pseudo OR gene was intact in all individuals. CNV was detected in 63 OR genes. Tajima’s D analysis of OR genes and neutral references was consistent with balancing selection maintaining allelic differences in intact OR genes. These results demonstrate that targeted capture using probes with a diversity-oriented design is far more effective than a whole-genome approach at retrieving OR genes and achieving high-depth sequencing, and thus at revealing polymorphisms in the OR multigene family. The composition of OR genes in the human reference genome hg38 does not necessarily represent that in many humans, implying higher perceptual variation than previously thought. The current study should encourage further investigation with a similar approach at a global scale.
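For readers unfamiliar with the statistic, the following is a minimal sketch of how Tajima’s D can be computed from per-site allele counts, using the standard formula from Tajima (1989). It is not the authors’ pipeline, which the abstract does not describe. The function name `tajimas_d`, its input format (one alternate-allele count per site, plus the number of sampled chromosomes `n`), and the example values are all assumptions made for illustration.

```python
from math import comb, sqrt
from typing import Optional, Sequence


def tajimas_d(allele_counts: Sequence[int], n: int) -> Optional[float]:
    """Tajima's D for one region (illustrative sketch, hypothetical interface).

    allele_counts: alternate-allele count at each site (0..n).
    n: number of sampled chromosomes (e.g. 2 x number of diploid individuals).
    Returns None when the region has no segregating sites.
    """
    if n < 4:
        # The variance term of D is zero for n < 4, so D is undefined.
        raise ValueError("Tajima's D needs at least 4 sampled chromosomes")

    # Keep only segregating sites (both alleles present in the sample).
    seg = [k for k in allele_counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return None

    # pi: mean number of pairwise differences, summed over sites.
    pi = sum(k * (n - k) for k in seg) / comb(n, 2)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares pi with Watterson's estimator S / a1, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Hypothetical toy data: 10 chromosomes, 5 segregating sites.
    print(tajimas_d([1, 1, 1, 1, 1], 10))  # rare variants only: D < 0
    print(tajimas_d([5, 5, 5, 5, 5], 10))  # intermediate frequencies: D > 0
```

An excess of intermediate-frequency alleles raises the pairwise diversity relative to the number of segregating sites and gives a positive D, which is the pattern expected under balancing selection. Comparing D for OR genes against D for neutral reference regions from the same sample helps separate selection from demographic effects, since demography affects both sets of loci.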