Elazab, Ahmed
- Adaptive Model to Estimate Most Significant Features for Oversampling Medical Data
Authors
1 Department of Computer Science, Fayoum University, Fayoum, 63514, EG
2 General Organization of Physical Planning, GOPP, Cairo, 11311, EG
Source
Artificial Intelligent Systems and Machine Learning, Vol 10, No 8 (2018), Pagination: 172-176
Abstract
Computerized classification plays an important role in determining cancer stage, so there is a growing need for automatic classification of cancer data. This paper develops a new model for stage classification of cancer data. The proposed model uses a decision tree data mining technique, trained on medical data, to classify cancer data into several stages based on the weights of the most significant features. To address the scarcity of medical data, the paper proposes a bootstrapping approach for oversampling the data. The proposed model also improves the gain ratio technique to predict the weights of the factors (attributes) that affect the staging of each patient case, before and after oversampling. The performance of the proposed model is evaluated with the aim of developing more cost-effective and easy-to-use systems that support clinicians. The experimental results show that the precision of the proposed model is 94% for the original dataset and 90% for the oversampled dataset. These results illustrate the promising capability of the model to detect breast cancer stages using a minimal dataset and a minimal set of attributes.
Keywords
Classification, Cancer Data, Decision Trees, Gain Ratio, Oversampling.
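The abstract above names two techniques, gain ratio for weighting attributes and bootstrapping for oversampling, without giving their details. The following is only a minimal sketch of the textbook versions of both, not the improved gain ratio or the exact resampling scheme of the paper. The function names, the toy attribute values, and the stage labels are all hypothetical.

```python
import math
import random
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Textbook gain ratio of one categorical attribute against class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def bootstrap_oversample(rows, target_size, seed=0):
    """Grow a dataset to target_size by sampling existing rows with replacement."""
    rng = random.Random(seed)
    extra = [rng.choice(rows) for _ in range(max(0, target_size - len(rows)))]
    return list(rows) + extra


# Hypothetical toy data: one attribute per list, one stage label per patient.
stages = ["I", "I", "II", "II"]
informative = ["small", "small", "large", "large"]
uninformative = ["yes", "no", "yes", "no"]

print(gain_ratio(informative, stages))
print(gain_ratio(uninformative, stages))
print(len(bootstrap_oversample(list(zip(informative, stages)), 10)))
```

Ranking attributes by a score of this kind before and after oversampling is one way to check whether resampling changes which features appear most significant.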
- Fraud News Detection for Online Social Networks
Authors
1 Institute of Statistical Studies and Research, Cairo University, EG
Source
Artificial Intelligent Systems and Machine Learning, Vol 10, No 8 (2018), Pagination: 177-182
Abstract
Social media now plays a vital role in every aspect of online life, including personal communication, business, and economics, and it has a serious effect on politics as well. The huge amount of information available, especially on microblogs, reflects a massive growth in the number of human users, seen in the unprecedented diversity of participants in terms of backgrounds, motivations, and languages. Social media has revolutionized the sharing of public information and has changed how participants use their devices and carry out their tasks.
Twitter, one of the most widely used online social networks, carries a huge volume of data and news, which draws attention to investigating the content of tweets. This paper discusses a proposed approach for determining the credibility of news spread on such social networks in two phases. The first phase detects fake users so that the news they post can be ignored. The second phase detects the credibility of the news content for the previously checked account users, using similarity measures and the most popular machine learning algorithms (support vector machine, decision tree, neural networks, naive Bayes, and random forest) to strengthen the credibility examination. The accuracy of the results of this phase is 99.8%. In the second phase, news content credibility is detected using the most popular similarity measures (Jaccard, Cosine, and Dice), with Jaccard reaching an accuracy of 95.4%.
Keywords
Fraud News, Support Vector Machine, Neural Networks, Naive Bayes, Random Forest, Fraud Text.
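The abstract above compares news content using the Jaccard, Cosine, and Dice similarity measures but does not say how the texts are represented. The following is a minimal sketch of the three measures, assuming a simple lowercase whitespace tokenization, which may differ from the preprocessing used in the paper. The function names and the two sample sentences are hypothetical.

```python
import math
from collections import Counter


def tokens(text):
    """Assumed preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Size of the shared token set divided by the size of the combined token set."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a, b):
    """Twice the number of shared tokens divided by the sum of both set sizes."""
    sa, sb = set(tokens(a)), set(tokens(b))
    denom = len(sa) + len(sb)
    return 2 * len(sa & sb) / denom if denom else 0.0


def cosine(a, b):
    """Cosine of the angle between the two term-frequency vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(n * n for n in ca.values()))
    norm_b = math.sqrt(sum(n * n for n in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical example: a tweet compared with a trusted headline.
tweet = "storm hits city"
headline = "storm hits coast"
print(jaccard(tweet, headline), dice(tweet, headline), cosine(tweet, headline))
```

All three scores range from 0 (no shared tokens) to 1 (identical token content), so one threshold-based credibility check can be tried with each measure in turn.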