
Machine Learning in Early Genetic Detection of Multiple Sclerosis Disease: A Survey


Affiliations
1 College of Computing and Information Technology, Arab Academy for Science Technology and Maritime Transport, Cairo, Egypt
 

Multiple sclerosis is a leading cause of non-traumatic disability and one of the most common neurological disorders among young adults in many countries. In this work, we survey the use of machine learning methods for early genetic detection of multiple sclerosis, covering microarray data analysis and single nucleotide polymorphism data analysis, and explain in detail the machine learning methods used in the literature. In addition, this study presents future trends in next-generation sequencing data analysis for disease detection, includes sample datasets for each genetic detection method, and elaborates on the challenges facing genetic disease detection.

Keywords

Multiple Sclerosis, Machine Learning, Microarray, Single Nucleotide Polymorphism, Early Disease Detection, Next Generation Sequencing.



Authors

Nehal M. Ali
College of Computing and Information Technology, Arab Academy for Science Technology and Maritime Transport, Cairo, Egypt
Mohamed Shaheen
College of Computing and Information Technology, Arab Academy for Science Technology and Maritime Transport, Cairo, Egypt
Mai S. Mabrouk
College of Computing and Information Technology, Arab Academy for Science Technology and Maritime Transport, Cairo, Egypt
Mohamed A. AboRezka
College of Computing and Information Technology, Arab Academy for Science Technology and Maritime Transport, Cairo, Egypt

