AUTHOR=Nim Hieu T. , Furtado Milena B. , Ramialison Mirana , Boyd Sarah E. TITLE=Combinatorial Ranking of Gene Sets to Predict Disease Relapse: The Retinoic Acid Pathway in Early Prostate Cancer JOURNAL=Frontiers in Oncology VOLUME=7 YEAR=2017 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2017.00030 DOI=10.3389/fonc.2017.00030 ISSN=2234-943X ABSTRACT=Background

Quantitative high-throughput data deposited in consortia such as the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) present both opportunities and challenges for computational analyses.

Methods

We present a computational strategy to systematically rank and investigate a large number (2^10 to 2^20) of clinically testable gene sets, using combinatorial gene subset generation and disease-free survival (DFS) analyses. This approach integrates protein–protein interaction networks, gene expression, DNA methylation, and copy number data, in association with DFS profiles from patient clinical records.

Results

As a case study, we applied this pipeline to systematically analyze the role of ALDH1A2 in prostate cancer (PCa). We have previously found this gene to have multiple roles in disease and homeostasis; here we investigate the role of the associated ALDH1A2 gene/protein networks in PCa, combining our methodology with PCa patient clinical profiles from the ICGC and TCGA databases. Relationships between gene signatures and relapse were analyzed using Kaplan–Meier (KM) log-rank analysis and multivariable Cox regression. Relative expression versus the pooled mean of the diploid population was used to calculate z-statistics. Gene/protein interaction network analyses generated 11 core genes associated with ALDH1A2; combinatorial ranking of the power set of these core genes identified two gene sets (out of 2^11 − 1 = 2,047 combinations) significantly correlated with disease relapse (KM log-rank p < 0.05). For the more significant of these two sets, referred to as the optimal gene set (OGS), patients with OGS alterations had a median survival of 62.7 months, compared to >150 months for patients without OGS alterations (p = 0.0248, hazard ratio = 2.213, 95% confidence interval = 1.1–4.098). Two genes comprising the OGS (CYP26A1 and RDH10) are strongly associated with ALDH1A2 in the retinoic acid (RA) pathways, suggesting a major role of RA signaling in early PCa progression. Our pipeline complements human expertise in the search for prognostic biomarkers in large-scale datasets.
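
The combinatorial step can be illustrated with a minimal sketch, which is not the authors' implementation. It enumerates every non-empty subset of 11 core genes (the power set minus the empty set, giving 2,047 candidates) and ranks them by a p-value. The gene labels `CORE_01` to `CORE_11` are placeholders because the abstract does not list the 11 genes, and `p_value_of` is a hypothetical callback standing in for whatever survival test (for example a KM log-rank test) scores each gene set.

```python
from itertools import combinations


def nonempty_subsets(genes):
    """Yield every non-empty subset of `genes` as a tuple, smallest subsets first."""
    for size in range(1, len(genes) + 1):
        yield from combinations(genes, size)


def rank_gene_sets(gene_sets, p_value_of, alpha=0.05):
    """Return (p_value, gene_set) pairs with p < alpha, most significant first.

    `p_value_of` is a hypothetical scoring function supplied by the caller;
    it is assumed to map a gene set to a survival-test p-value.
    """
    scored = [(p_value_of(gene_set), gene_set) for gene_set in gene_sets]
    significant = [(p, gene_set) for p, gene_set in scored if p < alpha]
    return sorted(significant, key=lambda item: item[0])


# Placeholder labels: the abstract does not name the 11 core genes.
core_genes = [f"CORE_{i:02d}" for i in range(1, 12)]

# All candidate gene sets: 2^11 - 1 non-empty subsets.
candidate_sets = list(nonempty_subsets(core_genes))
```

In use, `rank_gene_sets(candidate_sets, p_value_of)` would return only the gene sets passing the significance threshold. With real data, `p_value_of` would split patients by whether any gene in the set is altered and compare the DFS curves of the two groups.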