Aim An in silico pathway analysis was performed in an attempt to identify new biomarkers for cervical carcinoma.
Methods Three publicly available Affymetrix gene expression data sets (GSE5787, GSE7803, GSE9750) were retrieved, comprising a total of 9 cervical cancer cell lines, 39 normal cervical samples, 7 CIN3 samples and 111 cervical cancer samples. An Agilent data set (GSE7410; 5 normal cervical samples, 35 samples from invasive cervical cancer) was selected as a validation set. Prediction analysis of microarrays was performed in the Affymetrix sets to identify cervical cancer biomarkers. We compared the lists of differentially expressed genes between normal and CIN3 samples on the one hand (n=1923) and between CIN3 and invasive cancer samples on the other hand (n=628).
Results Seven probe sets were identified that were significantly overexpressed (at least 2-fold increase in expression level, and false discovery rate <5%) in both CIN3 samples relative to normal samples and in cancer samples relative to CIN3 samples. Of these, five probe sets could be validated in the Agilent data set (P<0.001) comparing the normal with the invasive cancer samples, corresponding to the genes DTL, HMGB3, KIF2C, NEK2 and RFC4. These genes were additionally overexpressed in cervical cancer cell lines relative to the cancer samples. The literature on these markers was reviewed.
Conclusion Novel biomarkers in combination with primary human papilloma virus (HPV) testing may allow complete cervical screening by objective, non-morphological molecular methods, which may be particularly important in developing countries.
- cervical cancer
- cervical cancer screening
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
What is already known about this subject?
Human papilloma virus (HPV) infections play a crucial role in cervical cancer carcinogenesis.
Only a minority of patients with HPV infections develop invasive cervical cancer.
HPV testing is currently used for cervical cancer screening and can be improved using new biomarkers.
What does this study add?
DTL, HMGB3, KIF2C, NEK2 and RFC4 are significantly overexpressed in both CIN3 samples relative to normal samples and in cancer samples relative to CIN3 samples.
These genes were additionally overexpressed in cervical cancer cell lines relative to the cancer samples.
How might this impact on clinical practice?
These novel biomarkers in combination with primary HPV testing may allow complete cervical screening by objective, non-morphological molecular methods.
It is necessary to identify a cut-off expression level for these genes to create an algorithm for molecular screening.
Persistent infections with oncogenic human papilloma virus (HPV) play a central role in the carcinogenesis of carcinoma of the uterine cervix.1–4 However, only a small fraction of high-risk HPV-positive women will eventually develop a clinically relevant lesion as the majority of HPV infections induce low-grade precursor lesions that are cleared spontaneously.4 The molecular biological events involved in progression from low-grade lesions to high-grade lesions and invasive cancer are not well understood. Induction of genomic instability and global disruption of gene expression, particularly by the HPV E6 and E7 oncoproteins, has been implicated in HPV-associated carcinogenesis. Novel biomarkers that allow monitoring of these essential molecular events in histological and cytological specimens are likely to improve the detection of lesions that have a high risk of progression, in both primary screening and triage settings.1 4
Although polyvalent HPV vaccines seem promising for eradicating cervical cancer in the future, this disease will be with us for the next decades, as these vaccines are at present not generally available in the Western and developing world and are not entirely effective.2 This implies that screening will remain important as a strategy to reduce mortality caused by carcinoma of the cervix uteri. Rapid molecular methods for detecting HPV DNA have become commercially available over the last decade and have been introduced for HPV-based cervical cancer screening in some countries.1 4 Despite several advantages, HPV detection has a low positive predictive value for cervical cancer, implying that HPV-positive women need to be triaged with additional testing to determine optimal risk stratification and management.3 5 Disease-specific biomarkers such as p16(INK4A), HPV E6/E7 mRNA, topoisomerase 2a, Ki67 or novel methylation assays have been evaluated in various settings but are at this point not sufficiently validated to be incorporated in HPV-based screening programmes.1 In the PALMS Study it was recently demonstrated that p16/Ki67 dual-stained cytology combines superior sensitivity and non-inferior specificity over cytology for detecting CIN2+ in a group of 27 349 women attending routine cervical cancer screening.6 Reduced expression of the tumour suppressor CDKN2A leads to downstream overexpression of p16 in premalignant cervical cells.
Some small panels, such as the methylation status of cell adhesion molecule 1 (CADM1) and T-lymphocyte maturation associated protein (MAL), have shown promise as triage tests for HPV-positive women.5–10 However, a recent study assessing DNA methylation status of MAL, ADCYAP1, PAX1 and CADM1 in 205 patients with low-grade or high-grade CIN and cervical cancer demonstrated that ADCYAP1 and PAX1 had a relatively better discriminatory ability than did methylated MAL and CADM1, which illustrates that the best panel still needs to be discovered.11
Recently we conducted an in silico analysis looking for driver pathways on all publicly available Affymetrix data sets containing normal and pretreatment (pre)invasive cancer samples with relevant clinical information.12 In the present paper we addressed and validated the biomarkers which came up in this analysis and we reviewed the literature on the subject.
Materials and methods
Patient data sets
Patient data sets used were described previously.12 Briefly, all publicly available Affymetrix data sets (HGU133-series: GSE5787, GSE7803, GSE9750) containing normal and pretreatment (pre)invasive cervical cancer samples with relevant clinical information were retrieved from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/gds). The only available Agilent data set (GSE7410) fulfilling the same criteria was selected as a validation set. The three Affymetrix gene expression data sets comprised a total of 9 cervical cancer cell lines, 39 normal cervical samples, 7 CIN3 samples and 111 cervical cancer samples. The Agilent data set contained data of 5 normal cervical samples and 35 samples from invasive cervical cancer. No information was available on the HPV status of the samples.
Data normalisation and exploration
Data normalisation and exploration were described in detail in previous papers.12 13
Biomarker analysis for early diagnosis
To detect potential biomarkers for early diagnosis, we performed prediction analysis of microarrays comparing normal cervical samples to CIN3 samples and CIN3 samples to invasive cervical cancer samples as described previously.12 We also compared the lists of differentially expressed probe sets between normal versus CIN3 samples, CIN3 versus cancer samples and normal versus cancer samples (figure 1).
Immunohistochemical validation for high mobility group-box 3 (HMGB3) was performed on formalin-fixed paraffin-embedded sections of normal, CIN III and invasive cervical cancer samples using a polyclonal antibody against the C-terminal region of HMGB3 (Aviva Systems Biology, San Diego, USA), diluted 1/100 on a Dako autostainer.
Biomarker discovery for early detection
In order to identify biomarkers for potential early detection of cervical cancer, a biomarker discovery analysis of the Affymetrix microarrays was performed comparing: (A) expression profiles of the normal cervical samples versus the CIN3 samples; and (B) the normals versus the invasive cancer samples. Prediction analysis of genes comparing the top 100 most differentially expressed genes in the normal and invasive samples showed that CDKN2A, MAL, ECT2 and PPP1R3C came up as potential biomarkers (figure 2).12 Of these genes, CDKN2A, ECT2 and PPP1R3C could be validated in the Agilent GSE7410 data set (figure 3). Expression of CDKN2A and ECT2 was lower in the CIN III and invasive samples compared with the normals, making these genes less suitable for early detection. Expression of PPP1R3C is higher in CIN III and invasive cancer versus the normals, but levels of expression are similar in the premalignant and malignant samples in the Affymetrix data sets, which is also inconvenient in a triage setting. Therefore, we decided to adopt a second approach, which was described in detail in a previous paper.12 Briefly, we compared the lists of differentially expressed genes between normal and CIN3 samples on the one hand (n=1923) and between CIN3 and invasive cancer samples on the other hand (n=628) and looked for genes with higher expression in invasive cancer compared with CIN III and higher expression in CIN III compared with the normals. Seven probe sets were identified that were significantly overexpressed (at least 2-fold increased expression level, and false discovery rate <5%) in both CIN3 samples relative to normal samples and in cancer samples relative to CIN3 samples. Of these, six probe sets corresponded to six unique genes: Aurora kinase A (AURKA), denticleless E3 ubiquitin protein ligase homologue (DTL), HMGB3, kinesin family member 2C (KIF2C), NIMA (never in mitosis gene a)-related kinase 2 (NEK2) and RFC4.
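The overlap selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the Benjamini-Hochberg adjustment is an assumed choice of false discovery rate method (the text does not name one), the function names are hypothetical, and the probe identifiers, log2 fold changes and P values are made-up toy inputs.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def overexpressed(probes, log2_fc, p_values, min_fold=2.0, max_fdr=0.05):
    """Probe sets with at least `min_fold` increase and FDR below `max_fdr`."""
    q_values = benjamini_hochberg(p_values)
    threshold = math.log2(min_fold)
    return {
        probe
        for probe, fc, q in zip(probes, log2_fc, q_values)
        if fc >= threshold and q < max_fdr
    }


def progressive_markers(normal_vs_cin3, cin3_vs_cancer):
    """Probe sets overexpressed in both steps (normal to CIN3, CIN3 to cancer)."""
    return overexpressed(*normal_vs_cin3) & overexpressed(*cin3_vs_cancer)


# Toy inputs (assumed values, for illustration only).
probes = ["p1", "p2", "p3", "p4"]
step1 = (probes, [1.5, 2.0, 0.4, 1.2], [0.001, 0.002, 0.001, 0.04])
step2 = (probes, [1.1, 0.2, 1.8, 1.3], [0.01, 0.5, 0.001, 0.2])

print(sorted(progressive_markers(step1, step2)))
```

Taking the intersection of the two filtered lists, rather than filtering the normal versus cancer comparison alone, keeps only probe sets whose expression rises at each step of progression.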
Five probe sets could be validated in the Agilent data set (P<0.001) comparing the normal with the invasive cancer samples, corresponding to the genes DTL, HMGB3, KIF2C, NEK2 and RFC4 (figure 4). AURKA reached borderline significance (P=0.073) in a similar analysis. There was no difference in the expression of these genes between samples of patients with and without lymph node metastases in the Agilent samples. The above genes were additionally overexpressed in cervical cancer cell lines relative to the cancer samples, suggesting they are cancer cell intrinsic and thus can be considered as potential biomarkers for cervical cancer tailored to early diagnosis. Genes that showed significantly lower expression in (pre)cancer samples compared with normal samples were not retained as biomarkers, as the absence of a marker for (pre)malignancy may be caused by technical errors and therefore is more difficult to use in daily practice than the presence of the marker.
Validation of the data
Data validation was discussed in detail in a previous paper.12 In addition we performed immunohistochemical staining for HMGB3 in normal cervix, CIN III and invasive carcinoma and could show absent staining in normal cervix, absent to weak staining in CIN III and clear strong nuclear staining in invasive carcinomas (figure 5).
In the present in silico study we could identify six biomarkers which have the potential to be used for early detection of patients at risk of developing progressive cervical lesions: AURKA, DTL, HMGB3, KIF2C, NEK2 and RFC4. All these genes play pivotal roles in the control of proliferation and differentiation. Using a different methodology Koch and Wiese performed a similar microarray analysis of 24 normal and 102 cervical cancer biopsies from four pooled publicly available studies.14 They found seven probes which are induced in all cervical cancer stages and thereby confirmed the relevance of AURKA, DTL and NEK2 in addition to GINS1, PAK2, PRKDC and CEP55.
AURKA is a member of the evolutionarily conserved Aurora serine/threonine kinase family, which is important for maintaining genomic stability.15 16 Overexpression of AURKA promotes cell proliferation through G1/S cell cycle transition, contributes to antiapoptosis, and can cause polyploidy and chromosomal instability, xenograft tumour growth and chemoresistance.17 18 Simultaneous inhibition of AURKA and AURKB results in a dramatic decrease in spindle microtubule stability.19 The human AURKA gene maps to 20q13, a region frequently amplified in breast cancers, and the gene is also overexpressed in several tumours.20 Although it has been suggested that AURKA expression may predict the outcome of patients with cervical cancer, its precise function and molecular mechanism in cervical cancer pathogenesis remain unclear.15 Using immunohistochemical staining in 180 cervical cancer tissues, Sun et al showed that AURKA is overexpressed in cervical cancer and that expression is significantly correlated with tumour size (P=0.023), lymphovascular space involvement (P<0.001) and deep invasion (P=0.014).18 Twu et al performed immunohistochemical staining of AURKA and AURKB in 20 samples of normal cervix, 35 CIN III samples and 95 invasive cervical carcinoma samples (76 squamous and 19 adenocarcinomas) and could show that expression of these genes is significantly increased in invasive carcinoma and CIN III.21 Overexpression of AURKA was higher in squamous carcinoma compared with adenocarcinoma (50% vs 21%, P=0.023). There was correlation between AURKA and AURKB expression and survival.22 A screen of the human kinome has identified AURKA as being synthetically lethal on the background of HPV infection.23
DTL is an early checkpoint regulating gene interacting with p21.24 25 It is also known as CDT2 (CDC10-dependent transcript 2), DCAF2, L2DTL or RAMP.24 Checkpoint genes maintain genomic stability by arresting cells after DNA damage. Many of these genes also control cell cycle events in unperturbed cells. DTL/CDT2 is required for normal cell cycle control, primarily to prevent rereplication.25 DTL promotes genomic stability as an essential component of the CUL4-DDB1 complex that controls CDT1 levels.26 It has been shown that changes in the expression of TP53, which plays a major role in the pathogenesis of cervical cancer, affect its downstream miRNAs and their most important gene targets MEIS1, AGTR1, DTL, TYMS and BAK1 in head and neck squamous cell carcinoma.27 28 The DTL gene was found to be of functional relevance in the tumorigenesis of hepatocellular, gastric, colonic and breast carcinoma, and rhabdomyosarcoma, and may be of prognostic importance.25 28–32 According to the Human Protein Atlas project (www.proteinatlas.org) low DTL expression was found in cervical cancer but was also seen in normal cervical tissue.
NEK2 is a serine/threonine kinase involved in the regulation of centrosome duplication and spindle assembly during mitosis.33 34 Dysregulation of these processes causes chromosome instability and aneuploidy.33 There are three isoforms that result from alternate splicing of this gene, named NEK2A, NEK2B and NEK2C. NEK2A is 31% structurally identical to AURKA.33 Subcellular localisation analysis shows that NEK2A resides in both cell nucleus and cytoplasm.34 It displays a cell cycle dependent expression pattern, being low in G1, increasing through S and G2 to reach a peak in late G2/M and decreasing on entry into mitosis.35 Overexpression of NEK2 has been reported in cervical and other cancer cell lines and several neoplastic diseases such as preinvasive and invasive breast carcinomas, lung adenocarcinomas, testicular seminomas, liver cancer, pancreatic carcinomas, prostate carcinomas, and diffuse large B cell lymphomas.36–40 NEK2 is a poor prognostic factor in patients with breast cancer, non-small cell lung cancer, colonic carcinoma and pancreatic carcinoma.36–41 Because NEK2A has such a broad spectrum of roles in different cell processes, it is an attractive target for treatment.42
HMGB3, also known as HMG4 or HMG2A, is a recently discovered member of the high mobility group (HMG) protein superfamily, and is classified with HMGB1 and HMGB2 into the HMG-box subfamily.43 The 80% identity between HMG-box proteins suggests similar functions at a molecular level. HMGB1 and HMGB2 can interact with DNA and subsequently bend linear DNA, thereby facilitating nucleoprotein complex formation through alteration of local chromatin architecture.43–45 HMGB3 has been reported to be overexpressed in a variety of human cancers such as gastric cancer, oesophageal squamous cell carcinoma, bladder cancer, non-small cell lung cancer and breast cancer.46–51 Overexpression of HMGB3 is correlated with aggressive behaviour and poor prognosis in almost all of these tumour types.48–51 HMGB3 has been identified in HeLa cervical cancer cells.52 According to the Human Protein Cancer Atlas cervical cancers have variable staining for HMGB3 whereas staining is absent in normal cervix. No further data are available on the expression of HMGB3 in cervical cancer.
KIF2C encodes a kinesin-like protein that functions as a microtubule-dependent molecular motor.53 The encoded protein can depolymerise microtubules, thereby promoting the anaphase of mitotic chromosome segregation and may be required to coordinate the onset of sister centromere separation.53 KIF2C therefore plays an important role during cell proliferation and may be an essential gene in carcinogenesis. KIF2C can induce spontaneous CD4(+) T cell responses of the Th1-type which are tightly controlled by peripheral T regulatory cells, and may be an attractive target for antigen-specific immunotherapies.54 This hypothesis was confirmed by Lu et al who were able to show that tumour infiltrating lymphocytes recognising mutated KIF2C and POLA2 epitopes could mediate complete durable regressions in patients with metastatic melanoma.53 Comprehensive expression analysis in other human cancers demonstrated KIF2C overexpression in gastric, pancreatic, glioma, ovarian, head and neck, and breast cancers.55–59 According to the Human Protein Cancer Atlas most cervical cancers have variable staining for KIF2C whereas staining is absent in normal cervix.
The human replication factor C (RFC) is a multimeric protein consisting of five distinct subunits that are highly conserved through evolution.60 The RFC family functions as clamp loaders that load PCNA onto DNA in an ATP-dependent process during DNA synthesis.60 RFC is involved in DNA repair following DNA damage.61 The RFC4 gene, which encodes the fourth largest subunit of the RFC complex, has been reported to be deregulated in diverse malignancies including prostate cancer, head and neck squamous cell carcinomas, hepatocellular carcinoma, colonic carcinomas and cervical cancer.62–68 Using cDNA array comparative genomic hybridisation Narayan et al identified a number of over-represented and deleted genes in 29 cases of cervical carcinoma.64 This analysis exhibited frequent and robust upregulated expression for RFC4 and KIF4A among others such as EPHB2, CDCA8, MUC4, MMP1, MMP13 and AKT1 in cervical cancer compared with normal cervix. Comparing three primary isolates and two established cervical cancer cell lines to normal keratinocytes Kang et al could detect overexpression of RPA, RFC, PCNA and DNA polymerase, which seem to play a role in adeno-associated virus DNA replication.63 Increased expression of sine oculis homeobox homologue 1 (SIX1, a master regulator of DNA replication in cervical cancer cells) can be induced by the E7 oncoprotein of HPVs in cervical intraepithelial neoplasia and cervical cancer and can result in higher levels of expression of genes related to the initiation of DNA replication such as RFC4.69 Recently Niu et al showed that dysregulation of CDKN2A, IL1R2 and RFC4 may contribute to cervical cancer progression and may be potential diagnostic markers.70 Huang et al could identify and validate a seven-gene signature (consisting of UBL3, FGF3, BMI1, PDGFRA, PTPRF, NOL7 and RFC4) in, respectively, a training set (n=50) and a testing set (n=50) of invasive cervical cancer samples from 100 patients using a custom oligonucleotide microarray.
Multivariate analysis showed that International Federation of Gynecology and Obstetrics (FIGO) stage and the seven-gene signature are independent prognostic factors associated with relapse-free survival of patients with cervical cancer.71
Novel biomarkers in combination with primary HPV testing are needed to enable complete cervical screening by objective, non-morphological molecular methods, which may be particularly important in developing countries.3 5 The potential biomarkers in this paper should be further validated in cytology and histological samples of patients with normal cervix, cervical intraepithelial neoplasias and primary and recurrent cervical cancer. In particular, assessing the diagnostic performance of cytology and/or HPV positivity versus a quantitative polymerase chain reaction (PCR) measuring the expression of a panel of the above genes warrants further study. In an interesting study proving this concept Nischalke et al could demonstrate that measuring IGF2BP3, HOXB7 and NEK2 mRNA levels by PCR in addition to cytology has the potential to improve diagnostic precision to detect malignant biliary disorders from brush cytology specimens.41 Sensitivity and specificity were highest when the three diagnostic markers were combined with routine cytology. As the expression levels are not mutually exclusive between the normal and cancer samples, an algorithm using multiple biomarkers will have to be developed in the future to use this test for molecular screening of cervical cancer. Further studies are also needed to clarify whether some of the retained genes may also be used as therapeutic targets.72
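Because single-gene expression levels overlap between normal and cancer samples, one simple form such a multi-marker algorithm could take is a rule that calls a sample positive when enough markers exceed their cut-offs. The sketch below is purely hypothetical: the cut-off values, the "at least two markers" rule and the sample values are invented for illustration, and no such thresholds have been established in this study.

```python
def panel_positive(sample, cutoffs, min_hits=2):
    """Call a sample positive if at least `min_hits` markers reach their cut-off."""
    hits = sum(1 for gene, cutoff in cutoffs.items() if sample[gene] >= cutoff)
    return hits >= min_hits


def sensitivity_specificity(samples, labels, cutoffs, min_hits=2):
    """Sensitivity and specificity of the panel rule against known labels."""
    tp = fn = tn = fp = 0
    for sample, is_disease in zip(samples, labels):
        called = panel_positive(sample, cutoffs, min_hits)
        if is_disease and called:
            tp += 1
        elif is_disease:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical cut-offs and expression values (illustration only).
cutoffs = {"NEK2": 5.0, "RFC4": 6.0, "DTL": 4.0}
samples = [
    {"NEK2": 6.0, "RFC4": 7.0, "DTL": 3.0},  # disease
    {"NEK2": 4.0, "RFC4": 6.5, "DTL": 3.0},  # disease
    {"NEK2": 3.0, "RFC4": 5.0, "DTL": 2.0},  # normal
    {"NEK2": 5.5, "RFC4": 6.2, "DTL": 1.0},  # normal
]
labels = [True, True, False, False]

print(sensitivity_specificity(samples, labels, cutoffs, min_hits=2))
print(sensitivity_specificity(samples, labels, cutoffs, min_hits=1))
```

Varying `min_hits` or the individual cut-offs trades sensitivity against specificity, which is the choice a validated screening algorithm would need to settle on clinical samples.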
Contributors PAvD and SVL designed the study and performed the data acquisition and analysis. Interpretation of the data and drafting and revision of the manuscript was performed by PAvD, CR, CVB, XBT, PP, JPB and SVL.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests PAvD, CR and XBT received travel/expenses/congress grants from Roche.
Patient consent Not required.
Ethics approval The studies generating the gene array data sets (GSE5787, GSE7803, GSE9750, GSE7410) used were conducted in accordance with the ethical standards of the relevant institutional and/or national research committees and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Provenance and peer review Not commissioned; externally peer reviewed.