This research combines statistical and machine learning tools to analyze a high-throughput biological data set on the response to ionizing radiation. The data consist of two gene expression sets obtained in studies of radiosensitive and radioresistant breast cancer patients undergoing radiotherapy. The two sets were similar in design but differed in treatment dose. It is shown that mathematical adjustments to data preprocessing, differentiation and trend testing, and classification, coupled with current biological knowledge, make the analysis efficient and the results accurate. The analysis workflow was customized with three tools: batch effect filtration with empirical Bayes models, identification of gene trends with the Jonckheere–Terpstra test, and linear interpolation adjusted to specific gene profiles for multiple random validation. These non-standard techniques enabled sample classification with a success rate of 93.5% and the identification of potential biomarkers of radiation response in breast cancer, which were confirmed with an independent Monte Carlo feature selection approach and by literature references. The study shows that customized analysis workflows are a necessary step towards novel discoveries in complex fields such as personalized therapy.
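To illustrate the trend-testing step, the following is a minimal sketch of the Jonckheere–Terpstra test for a single gene. It is not the authors' implementation. It assumes the expression values are already split into groups listed in the hypothesized order of increasing expression (for example by dose), it uses the large-sample normal approximation without a tie correction in the variance, and the function name `jonckheere_terpstra` and the toy values are purely illustrative.

```python
import math
from itertools import combinations


def jonckheere_terpstra(groups):
    """One-sided Jonckheere-Terpstra test for an increasing trend.

    groups: list of lists of numbers, ordered by the hypothesized trend.
    Returns (J, z, p), where p is the upper-tail p-value from the
    normal approximation (variance formula assumes no ties).
    """
    # J counts, over every ordered pair of groups (earlier, later), the
    # observation pairs in which the later group has the larger value.
    # Tied pairs contribute one half.
    j_stat = 0.0
    for earlier, later in combinations(groups, 2):
        for x in earlier:
            for y in later:
                if x < y:
                    j_stat += 1.0
                elif x == y:
                    j_stat += 0.5

    sizes = [len(g) for g in groups]
    total = sum(sizes)

    # Mean and variance of J under the null hypothesis of no trend.
    mean = (total ** 2 - sum(n ** 2 for n in sizes)) / 4.0
    var = (total ** 2 * (2 * total + 3)
           - sum(n ** 2 * (2 * n + 3) for n in sizes)) / 72.0

    z = (j_stat - mean) / math.sqrt(var)
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return j_stat, z, p


# Toy expression values for one gene in three ordered groups (made up).
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
j, z, p = jonckheere_terpstra(toy_groups)
print(j, z, p)  # J = 27.0 and z = 3.0 for this perfectly ordered toy data
```

In a gene expression setting the function would be applied to each gene in turn, and the resulting p-values would then need a multiple-testing correction. With small groups or many tied values, an exact or permutation version of the test is a safer choice than the normal approximation used here.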