
Logistic Regression for Breast Cancer Analysis


Affiliations
1 Department of Computer Science, The Northcap University, India
     



In this study, logistic regression is applied to mammogram data to diagnose breast cancer. The aim is to identify the clinical factors that contribute most to a higher probability of breast cancer. The sample data set, taken from the UC Irvine repository, contains mammogram samples collected in a survey conducted by a radiologist, and it is modeled using logistic regression. A 10-fold cross-validation is applied to the training data set to avoid overfitting. The classification table for 450 samples shows a correct classification rate of 96.6% for mammograms. This result is then compared with 30 validation samples, for which the correct classification rate is 68.9%. The simulation results suggest that the logistic regression model is able to map relationships among the attributes and to classify the samples accurately.
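The procedure summarized above (fit a logistic regression model, then estimate its accuracy with 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic two-feature samples rather than the UC Irvine mammogram records, the model is fitted by plain batch gradient descent, and the names `fit_logistic`, `predict`, and `k_fold_accuracy` are hypothetical helpers introduced only for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(y=1 | x) - y.
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    # Classify as 1 when the estimated probability is at least 0.5.
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0


def k_fold_accuracy(X, y, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Synthetic stand-in data (an assumption, NOT the mammogram data set):
# two overlapping Gaussian classes with two features each.
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(-1.0 * label, 1.0)])
    y.append(label)

scores = k_fold_accuracy(X, y, k=10)
mean_accuracy = sum(scores) / len(scores)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Because each fold is scored only on samples that were held out of training, the mean of `scores` is a less optimistic estimate than the accuracy measured on the training samples themselves. A gap of that kind is consistent with the difference between the 96.6% and 68.9% figures reported in the abstract.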

Keywords

Breast Cancer, Mammograms, Prediction, Logistic Regression, Factors, Accuracy.



Authors

Bhoomi Sharma
Department of Computer Science, The Northcap University, India
Abhimanyu Abhimanyu
Department of Computer Science, The Northcap University, India
Anuradha Anuradha
Department of Computer Science, The Northcap University, India
Yogita Gigras
Department of Computer Science, The Northcap University, India

