Unraveling Genetic Factors in Cleft Lip and Palate Among Japanese Patients: A Two Decade Effort
Watanabe A
Published on: 2023-12-14
Abstract
Objective
The development of cleft lip and palate (CL/P) has been studied around the world, but no single gene for cleft lip and palate has yet been identified. There are racial differences in the development of this disease, and its pathogenesis in the Japanese may have a complex pathomechanism. This two-decade study project, initiated in 2003, aimed to investigate the genetic underpinnings and potential racial variations contributing to the pathogenesis of cleft lip and palate in Japanese subjects.
Design
The project employed Sanger sequencing for mutation analysis, case-control single nucleotide polymorphism (SNP) association analysis, the family-based Transmission Disequilibrium Test (TDT), genome-wide association analysis (GWAS), and next-generation sequencing.
Setting
Participants
The study encompassed various Japanese and Vietnamese populations and analyzed candidate genes known for their links to cleft lip and palate (e.g. RYK, EPHB2, EPHB3, TGF-B3, DLX3, PAX9, CLPTM1, TBX10, PVRL1, TBX22, IRF6, WNT5A, WNT9B, TP63, MSX1, TFAP2A, DLX4, MN1).
Interventions
Methods included mutation analysis, SNP association analysis, genome-wide association analysis (GWAS), and next-generation sequencing to uncover novel genetic variants.
Main Outcome Measure(s): The primary outcomes included the identification of impactful missense mutations, novel SNPs, and potential genetic variants associated with cleft lip and palate in Japanese participants.
Results
Significant findings included impactful missense mutations, novel SNPs, and regions of interest in GWAS.
Conclusion
The study advanced our understanding of genetic factors linked to cleft lip and palate in Japanese subjects but did not yield a definitive pathogenesis explanation. It underscored the role of racial variations. Further research using long-read sequencers is needed to explore epigenomic factors and rare variants, deepening our grasp of the condition and its genetic intricacies.
Keywords
Cleft Lip and Palate; Genetic Factors; Mutation Analysis; Genome-Wide Association Studies; Next-Generation Sequencing.
Introduction
Cleft lip and palate (CL/P) is a multifactorial genetic disorder influenced by a combination of genetic and environmental factors (Figure 1). Despite extensive research, the fundamental cause of this condition remains enigmatic. The heritable nature of CL/P is suggested by the relative risk (λs) in siblings of affected individuals, derived from recurrence and morbidity rates, which typically ranges from 5 to 10 [1]. While familial clusters have been reported, the Japanese population exhibits a distinctive epidemiological pattern characterized by sporadic occurrences and notably high prevalence [2]. Signaling molecules and morphogens involved in multiple signaling pathways, such as WNT, TGF/BMP, and FGF, have been researched worldwide for their involvement in the etiology of cleft lip and cleft lip and palate [3-8]. However, this signaling pathway theory does not apply to the etiology of this disease in Asians.
Consequently, delving into the pathogenesis of CL/P in Japanese individuals becomes a paramount endeavor for unraveling the intricate origins of this condition. In 2003, we initiated a comprehensive project aimed at investigating the etiology of CL/P with a specific focus on Asian populations, where this condition exhibits a notably high prevalence. Our project received approval from the Ethical Review Committee (No. 584) of Nagasaki University, Aichi-Gakuin University, and Tokyo Dental University.
This initiative coincided with the culmination of the Human Genome Project, a milestone in genomics research. During this period, both in Japan and internationally, research into multifactorial genetic diseases using sequencing technologies to identify SNPs largely concentrated on candidate genes with well-established biochemical functions, such as those associated with diabetes, hypertension, and allergic diseases. This focus stemmed from the limitations in available research methodologies and scientific knowledge, which made it exceedingly challenging to conduct research that explored linkage disequilibrium across the entire genome. In this report, we detail the evolution of our research techniques and our team’s progress in unraveling the underlying causes of CL/P within the Japanese population.
Figure 1: Models For the Pathogenesis of Single-Gene or Multifactorial Genetic Diseases.
Sanger Sequencing Approach For Mutational Analysis, Case-Control Association Analysis Of SNPs, And The Transmission Disequilibrium Test (TDT)
This study was performed in collaboration with Nagasaki University, Tokyo Dental College, and Aichi-Gakuin University. A total of 145 Japanese patients and their parents (271 individuals) were included, along with 210 Vietnamese patients and their parents (294 individuals). Additionally, we included 474 healthy Japanese individuals and 474 healthy Vietnamese individuals as control groups. Our research focused on candidate genes that had been implicated in CL/P through knockout mouse models and through studies of European and American populations. These candidate genes were RYK (3q22), EPHB2 (1p36.1-35), EPHB3 (3q28-27), TGF-B3 (14q24), DLX3 (17q3), PAX9 (14q12-q13), CLPTM1 (19q13.2), TBX10 (11q13.1), PVRL1 (11q23), and TBX22 (Xq21.1).
Because the pathogenesis of the disease is uncertain, we used two study designs, each based on a different hypothesis. The first design assumed that the disease might be caused by a single gene locus, because it occurs within familial clusters, and focused on the analysis of mutations using direct sequencing. The second design assumed that the disease’s pathogenesis is multifactorial. For this design, we conducted a case-control association study of SNPs, followed by the Transmission Disequilibrium Test (TDT), a family-based test for linkage in the presence of linkage disequilibrium [9].
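The TDT reduces to a short calculation, sketched below under stated assumptions. Among heterozygous parents, let b be the number who transmitted the candidate allele to the affected child and c the number who did not; the statistic (b − c)²/(b + c) is compared against a chi-square distribution with one degree of freedom. The function name and the counts are hypothetical illustrations, not data or code from this project.

```python
from math import erfc, sqrt


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return the McNemar-type TDT statistic and its p-value (chi-square, 1 df).

    transmitted   -- heterozygous parents who passed the candidate allele
                     to the affected child (b)
    untransmitted -- heterozygous parents who passed the other allele (c)
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # For one degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, chosen only to illustrate the arithmetic.
chi2, p = tdt(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # chi2 = 4.00, p = 0.0455
```

Because only transmissions from heterozygous parents are counted, the test uses the untransmitted parental alleles as internal controls, which is why it is robust to population stratification in a way that a case-control comparison is not.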
We identified a novel missense mutation, 640A>G (S214G), in the PAX9 gene in a Japanese patient with CL/P, as well as in this patient’s sibling and mother. This missense mutation was located within a region that is conserved across species. PAX9 is classified as a non-clustered homeobox gene characterized by a DNA-binding paired domain. It is widely expressed in mesenchymal cells derived from the neural crest and plays a crucial role in the development of the palate and teeth.
Furthermore, our investigation revealed a 1335G>A (Y452C) missense mutation in the RYK gene in a Vietnamese patient and his father. To assess the functional impact of this mutation, the mutant RYK gene was introduced into an expression vector, and a colony formation assay was conducted in NIH3T3 cells. The assay demonstrated an approximately 50% reduction in activity compared to wild-type RYK [10]. Over 88% of RYK knockout mice, which lack this receptor-type tyrosine kinase, exhibited craniofacial abnormalities, including a ruptured secondary palate, stunted maxilla and mandible, a flat face, and a shorter nose. This finding underscores the potential significance of RYK in the development of craniofacial features.
In addition to these findings, the case-control study identified 27 novel SNPs located within the DLX3, TGF-B3, PAX9, and CLPTM1 genes, and a significant difference between patients and controls was observed for the TGF-B3 gene located on 14q24. TGF-B3 plays a vital role in various biological activities, such as cell proliferation, migration, differentiation, and the transformation of epithelial cells into mesenchymal tissue. It is also involved in palatal development: it is expressed in the median epithelial cells of the palate, and its expression diminishes promptly once the median epithelial suture has formed. The connection between TGF-B3 and CL/P has therefore been explored by numerous researchers, often drawing upon observations of cleft palate development in TGF-B3 knockout mice [11]. These collective findings emphasize the significance of TGF-B3 in the context of CL/P etiology.
While Filipino and Caucasian patients have shown positive results in the TDT for IRF6, our study yielded contrary outcomes in Japanese and Vietnamese patients. This discrepancy between populations underscores the complexity of the condition’s genetic underpinnings. Our findings indicate a potential involvement of RYK and PAX9 in the disease’s pathogenesis when CL/P is viewed as a single-gene disease. If CL/P is considered a multifactorial genetic disease, TGF-B3 emerges as a potential factor in its pathogenesis [12].
Numerous studies from different countries have identified genes associated with this condition. However, the exact mechanisms underlying the development of CL/P remain unclear. Moreover, our findings indicated that mutational analysis in Japanese patients has yielded few positive results compared to the previous studies conducted in other countries.
This suggests that the genetic factors contributing to CL/P in the Japanese population may differ from those reported elsewhere. Using Sanger sequencing to analyze all the previously reported genes is therefore impractical, as it would consume significant time and resources with no immediate outcomes. Alternative research approaches are warranted.
Genome-Wide Association Analysis (GWAS) For Cleft Lip and Palate
Multifactorial genetic diseases are primarily associated with specific SNPs in certain genes. Recent advancements in GWAS have significantly contributed to the identification of these SNPs associated with CL/P. To date, more than 100 susceptibility loci have been implicated in the development of CL/P [13]. GWAS plays a pivotal role in this process because of its capacity to genotype 500,000 to 1 million SNPs from an extensive pool of over 10 million SNPs spanning the entire human genome. This method enables the statistical exploration of associations between SNP frequencies and a wide range of diseases and quantitative traits.
We conducted GWAS of SNP genotyping data, utilizing 137 samples from patients with CL/P and 490 control samples. SNPs with suboptimal typing were systematically filtered out based on criteria, including minor allele frequency (MAF), Hardy-Weinberg equilibrium test (HWE), and SNP call rate. After rigorous quality control filtering, we identified 299,224 SNPs out of the initial 909,508 that met the conditions of MAF exceeding 5%, HWE P-value greater than 0.001, and SNP call rate over 95%.
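The three quality-control criteria above can be expressed as a per-SNP filter. The following is a minimal sketch, not the pipeline used in this project: the thresholds (MAF above 5%, HWE P-value above 0.001, call rate above 95%) come from the text, while the function name, the genotype counts, and the use of a one-degree-of-freedom chi-square test for HWE (rather than, for example, an exact test) are assumptions made for illustration.

```python
from math import erfc, sqrt


def snp_passes_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
                  min_maf: float = 0.05,
                  min_hwe_p: float = 0.001,
                  min_call_rate: float = 0.95) -> bool:
    """Apply MAF, Hardy-Weinberg, and call-rate filters to one SNP.

    n_aa, n_ab, n_bb -- counts of the three called genotypes
    n_missing        -- samples for which no genotype was called
    """
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)

    freq_a = (2 * n_aa + n_ab) / (2 * called)
    freq_b = 1.0 - freq_a
    maf = min(freq_a, freq_b)
    if maf == 0:
        return False  # monomorphic SNP: fails the MAF filter, HWE undefined

    # Chi-square goodness-of-fit against Hardy-Weinberg expectations (1 df).
    expected = (freq_a ** 2 * called,
                2 * freq_a * freq_b * called,
                freq_b ** 2 * called)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hwe_p = erfc(sqrt(chi2 / 2))

    return maf > min_maf and hwe_p > min_hwe_p and call_rate > min_call_rate


# Hypothetical genotype counts for 627 samples (137 cases + 490 controls).
print(snp_passes_qc(250, 300, 77, 0))    # common, in HWE, fully called
print(snp_passes_qc(600, 27, 0, 0))      # MAF below 5%
print(snp_passes_qc(300, 27, 300, 0))    # strong heterozygote deficit
```

In practice, HWE is often tested in controls only, since a true disease association can itself distort genotype frequencies among cases; the text does not say which samples were used.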
Further refinement involved examining the SNPs that showed significant differences by plotting diagrams on the Genotyping Console and selecting those with precise genotyping. Our analysis suggested that SNPs located within 1q32.2, 6q22.33, and 12q14.1 may be associated with the pathogenesis of this disease; however, none of them met the significance threshold for a strong association. Notably, the results of our GWAS in Japanese patients diverged from those previously reported for individuals from other racial backgrounds [14]. While this project did not identify a gene linked to CL/P development in the Japanese population, we consider it significant that a signal was detected at 1q32.2, the region containing IRF6. IRF6 has been reported as a causative gene for Van der Woude syndrome and for CL/P in other racial groups. Moreover, many of the identified loci explain little heritability, so their contribution to the development of CL/P remains unclear. This underscores the difficulty of uncovering the precise pathogenesis and genetic contribution.
Comparing the heritability estimated through GWAS with twin heritability reveals a gap that varies among traits. Twin heritability stands at 80%–90% for height, 30%–60% for coronary artery disease, and 25% for type 2 diabetes. In contrast, GWAS heritability was 5% for height, 2.8% for myocardial infarction, and 6% for type 2 diabetes [15]. This gap between twin heritability and GWAS heritability is referred to as an unexplained genetic factor (missing heritability). These comparisons illustrate the complex genetic underpinnings of multifactorial diseases and underscore the intricate nature of CL/P etiology in the Japanese population.
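As a rough worked example of this comparison, the sketch below subtracts the GWAS heritability from the twin heritability for each trait, using only the percentages quoted above. The function and variable names are hypothetical, and pairing coronary artery disease (twin estimate) with myocardial infarction (GWAS estimate) simply follows the way the figures are reported in the text.

```python
# (twin heritability low, high) and GWAS heritability, in percent,
# as quoted in the text.
TRAITS = {
    "height": ((80.0, 90.0), 5.0),
    "coronary artery disease / myocardial infarction": ((30.0, 60.0), 2.8),
    "type 2 diabetes": ((25.0, 25.0), 6.0),
}


def missing_heritability(twin_range: tuple[float, float],
                         gwas: float) -> tuple[float, float]:
    """Return the (low, high) gap between twin and GWAS heritability."""
    low, high = twin_range
    return low - gwas, high - gwas


for trait, (twin_range, gwas) in TRAITS.items():
    gap_low, gap_high = missing_heritability(twin_range, gwas)
    print(f"{trait}: {gap_low:.1f} to {gap_high:.1f} percentage points unexplained")
```

For height, for instance, 75 to 85 of the 80 to 90 percentage points estimated from twin studies are not accounted for by the GWAS estimate.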
Several factors contribute to this phenomenon. First, GWAS has primarily focused on common SNPs with a MAF of 5% or higher, which, while informative, may not individually exert a significant effect on disease susceptibility. Second, gene-gene and gene-environment interactions have not been examined in GWAS, so crucial factors in disease development may be overlooked [16]. Third, GWAS has often not explored epigenomic factors such as DNA methylation and histone modifications, which can play pivotal roles in regulating gene expression and disease susceptibility. In multifactorial genetic diseases such as CL/P, rare variants and gene interactions are believed to be involved in disease development. Therefore, we chose to perform genomic analysis using next-generation sequencing to identify novel gene mutations and variants that may have eluded previous GWAS investigations.
Targeting Cleft Lip and Palate-Associated Genes Using Next-Generation Sequencing
The introduction of next-generation sequencing, which determines targeted sequences by aligning DNA fragments of 100 to 300 base pairs in length to a template sequence, has revolutionized the analysis of genes within specific regions. Compared with the Sanger sequencing method, next-generation sequencing offers substantial time efficiency. In 2017, we used a next-generation sequencing approach for a trio analysis of 78 Japanese patients with CL/P and 18 with cleft palate (CP), along with their parents.
The targeted genes were IRF6 (1q32.2), WNT5A (3p14.3), WNT9B (17q21.32), TP63 (3q28), MSX1 (4p16), TFAP2A (6p24.3), PAX9 (14q12-q13), DLX3 (17q21.33), DLX4 (17q21.33), and MN1 (22q12.1). These genes were selected on the basis of our previous projects and of papers reporting genes associated with cleft lip and palate. Traditional sequencing approaches typically focus only on exons, the coding regions. Our analysis of these ten genes also covered introns, non-coding regions that can harbor enhancers and promoters regulating gene transcription. To focus the analysis on rare variants, variants with MAF exceeding 0.5% were excluded. All identified variants were subsequently confirmed using Sanger sequencing.
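The rare-variant filtering step can be sketched as follows, assuming each called variant carries a population allele frequency looked up in a reference database. The `Variant` class, the `keep_rare` function, and the example records are hypothetical; only the 0.5% threshold is taken from the text. Treating a variant that is absent from the database as rare is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    population_maf: Optional[float]  # None: not found in the reference database


def keep_rare(variants: list[Variant], max_maf: float = 0.005) -> list[Variant]:
    """Keep variants whose population MAF does not exceed max_maf.

    Variants absent from the database (population_maf is None) are kept,
    because a novel variant is by definition not a known common one.
    """
    return [v for v in variants
            if v.population_maf is None or v.population_maf <= max_maf]


# Hypothetical records; the frequencies are invented for illustration.
calls = [
    Variant("GENE_A", "c.100A>G", 0.12),    # common polymorphism: excluded
    Variant("GENE_B", "c.200C>T", 0.003),   # rare: kept
    Variant("GENE_C", "c.300G>A", None),    # not in database: kept
]

for variant in keep_rare(calls):
    print(variant.gene, variant.change)
```

Variants that survive such a filter are candidates only; as in the project described here, they would still need independent confirmation, for example by Sanger sequencing.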
We discovered a noteworthy missense mutation, 359C>T (P120L), in the DLX4 gene in a single patient with CL/P. This novel mutation had not been documented in existing databases. DLX genes play a pivotal role in cranial neural crest cells and regulate the development of various craniofacial structures, including teeth, cartilage, craniofacial bone, and connective tissue. The missense mutation identified in this patient was located within the homeodomain, a critical functional region responsible for the regulation of morphogenesis. The homeodomain plays a central role in activating or repressing the transcription of various genes, thereby influencing key developmental processes. In silico analysis using PolyPhen-2, SIFT, and PROVEAN indicated that this mutation had a detrimental impact on molecular function.
Furthermore, novel variants (WNT5A 639+918C>T, TFAP2A 45+2755C>G, WNT9B 904+3061A>G, TP63 1349+8281T>C, PAX9 4+272G>A) in highly conserved sequences within non-coding regions were observed in three patients with CL/P and two patients with CP. Highly conserved sequences in non-coding regions are known to be important in the evolution of organisms. They also serve as enhancers and promoters governing the expression levels of genes, and the variants found within them therefore necessitate further validation.
In summary, our genetic analysis of Japanese patients with CL/P using next-generation sequencing technology unveiled novel mutations and variants that had eluded previous GWAS [16]. Nonetheless, the underlying causes of CL/P onset remain enigmatic, presenting a continuing challenge in the field of craniofacial genetics.
Future Perspective
Genomic studies of multifactorial diseases such as CL/P rely on the aggregation of SNPs, each exerting a subtle influence on disease risk. The foundational approach involves examining SNP occurrence frequencies in both affected and unaffected cohorts to understand the causes of the disease. This approach aligns with the prevailing “common-disease common-variant hypothesis” (Figure 2), which suggests that multifactorial diseases are predominantly influenced by variants with high population frequencies.
Figure 2: Relationship Between Allele Frequency and Effect Size of Genomic Variants in Disease Pathogenesis.
The variants associated with cleft lip and cleft palate identified through GWAS were mainly located in non-coding regions, with only 4.9% found in coding regions [17]. Moreover, the variants identified in the GWAS generally showed weak effects, characterized by odds ratios ranging from 1.05 to 1.2.
The existing genetic framework for multifactorial diseases is based on two hypotheses. The first posits that GWAS-identified variants located within non-coding gene regions are pivotal due to their regulatory role in controlling gene expression, despite their subtle impacts. The second hypothesis postulates the existence of unknown rare variants that remained undetectable through GWAS. To test these hypotheses, a comprehensive analysis of the susceptibility loci associated with cleft lip and cleft palate as detected in GWAS using next-generation sequencing is imperative. This approach aims to unveil the intricate genetic landscape of this condition. Detecting rare variants using next-generation sequencing requires a large sample size, similar to the GWAS analysis. Therefore, it is imperative to generate whole-genome data for the Japanese population.
A genomic analysis using next-generation sequencing will serve as the foundation for genomic research to explore multifactorial diseases. This approach facilitates the mapping of diseases and loci. Furthermore, once the mechanisms governing disease expression are fully understood, it will enable the prediction of genetic risk scores.
Our project, “What causes cleft lip and palate in the Japanese population?”, has made significant progress alongside the development of analytical techniques. However, a conclusive determination is yet to be made. Looking ahead, our future endeavors will delve into epigenome abnormalities, encompassing areas such as DNA methylation, histone methylation, histone acetylation, and other structural abnormalities that have remained unexplored. In the long term, our aspiration is for cleft lip and cleft palate treatment to evolve toward a realm of healthcare where patients can exercise choices beyond the conventional, burdensome surgical interventions.
Acknowledgments
We would like to express our gratitude to Dr. Takeshi Uchiyama, Dr. Akira Yamaguchi of Tokyo Dental University, and Dr. Koichiro Yoshiura of Nagasaki University for their significant contributions to the successful execution of this project.
Funding Information
This work was supported by JSPS KAKENHI Grant-in-Aid for Scientific Research (C) JP20K10124.
References
1. Mossey PA, Modell B. Epidemiology of oral clefts 2012: An international perspective. Front Oral Biol. 2012; 16: 1-18.
2. Beaty TH, Ruczinski I, Murray JC, Marazita ML, Munger RG, Hetmanski JB, et al. Evidence for gene-environment interaction in a genome wide study of isolated, non-syndromic cleft palate. Genet Epidemiol. 2011; 35: 469-478.
3. Arwa B, Melita I. Orofacial Clefts: Genetics of Cleft Lip and Palate. 2023: 14, 1603.
4. Reynolds K, Zhang S, Sun B, Garland MA, Ji Y, Zhou CJ. Genetics and signaling mechanisms of orofacial clefts. Birth Defects Res. 2020; 112: 1588-1634.
5. Iwata J, Parada C, Chai Y. The mechanism of TGF-β signaling during palate development. Oral Dis. 2011; 17: 733-744.
6. Liu W, Sun X, Braut A, Mishina Y, Behringer RR, Mina M, Martin JF. Distinct functions for Bmp signaling in lip and palate fusion in mice. Development. 2005; 132: 1453-1461.
7. Ueharu H, Mishina Y. BMP Signaling during Craniofacial Development: New Insights into Pathological Mechanisms Leading to Craniofacial Anomalies. Front Physiol. 2023; 14.
8. Hammond NL, Brookes KJ, Dixon MJ. Ectopic Hedgehog Signaling Causes Cleft Palate and Defective Osteogenesis. J Dent Res. 2018; 97: 1485-1493.
9. Ewens WJ, Spielman RS. The transmission/disequilibrium test: history, subdivision, and admixture. Am J Hum Genet. 1995; 57: 455-464.
10. Watanabe A, Akita S, Tin NTD, Natsume N, Nakano Y, Niikawa N, et al. A Mutation in RYK is a Genetic Factor for Nonsyndromic Cleft lip and Palate. The Cleft Palate-Craniofacial Journal. 2006; 43: 310-316.
11. Proetzel G, Pawlowski SA, Wiles MV, Yin M, Boivin GP, Howles PN, et al. Transforming growth factor-beta 3 is required for secondary palate fusion. Nat Genet. 1995; 11: 409-414.
12. Ichikawa E, Watanabe A, Nakano Y, Akita S, Hirano A, Kinoshita A, et al. PAX9 and TGFB3 are susceptible to nonsyndromic cleft lip with or without cleft palate in the Japanese: Population-based and family-based candidate gene analyses. J Hum Genet. 2006; 51: 38-46.
13. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JPA, et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008; 9: 356-369.
14. Hikida M, Tsuda M, Watanabe A, Kinoshita A, Akita S, Hirano A, et al. No evidence of association between 8q24 and susceptibility to nonsyndromic cleft lip with or without palate in Japanese population. Cleft Palate Craniofac J. 2012; 49: 714-717.
15. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009; 461: 747-753.
16. Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet. 2010; 11: 446-450.
17. Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012; 337: 1190-1195.