A Critique Survey on Diverse Approaches of Web Content Mining

School of CSA, REVA University, India

Web mining discovers patterns in web data and is commonly divided into three categories: web usage mining, web structure mining, and web content mining. Web content mining covers the methods by which web spiders and web search engines gather data. Web structure mining examines the structure of a website, and web usage mining examines a user's browsing data. Web content mining, the second phase of web mining, deals with the extraction of images, graphs, text, and similar content. This work presents a brief survey of the techniques used in web content mining, reviewing approaches to multimedia mining, unstructured mining, structured mining, and semi-structured mining.
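
As a rough illustration of the kind of extraction described above (text and images as content, hyperlinks as raw material for structure mining), the following sketch uses only Python's standard library. The class name `ContentExtractor`, the helper `mine_page`, and the sample HTML are hypothetical and are not taken from the surveyed work; a real crawler would also need to fetch pages, respect robots.txt, and handle malformed markup.

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collects visible text, image sources, and hyperlinks from one HTML page."""

    def __init__(self):
        super().__init__()
        self.text = []    # visible text fragments (unstructured content)
        self.images = []  # <img src> values (multimedia content)
        self.links = []   # <a href> values (input for structure mining)
        self._skip = 0    # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style blocks.
        if not self._skip and data.strip():
            self.text.append(data.strip())


def mine_page(html):
    """Return the text, image sources, and links found in an HTML string."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return {
        "text": " ".join(parser.text),
        "images": parser.images,
        "links": parser.links,
    }


# Hypothetical sample page, used here instead of a live HTTP request.
SAMPLE_HTML = """<html><head><title>Demo</title><style>p {color: red}</style></head>
<body><h1>Web mining</h1><p>Content <a href="/about">about</a> us.</p>
<img src="chart.png" alt="graph"></body></html>"""

result = mine_page(SAMPLE_HTML)
print(result)
```

The extractor keeps text, images, and links in separate lists because these three outputs correspond to different mining tasks named in the abstract; summarization or keyword extraction would then operate on the `text` field only.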


Keywords: Web Mining, Web Content Mining, Multimedia Mining, Web Crawlers, Summarization, Information Extraction.
P. V. Varish
School of CSA, REVA University, India
C. K. Lokesh
School of CSA, REVA University, India
School of CSA, REVA University, India


