
Automatic Extraction of Corporate Names from Web Texts


Affiliations
1 Department of Information Science, University of Madras, Chennai 600 005, India
     



Names of corporate bodies are among the important entities that are of value in identifying and retrieving a web resource. This paper describes a program developed and implemented to identify and extract corporate names from web texts. The program uses trigger words to identify corporate names. The paper also reports an experiment that was carried out to test the feasibility and utility of the method. The results are analyzed and some problems for further research are identified.
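The trigger-word idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' trigger list or their rules for deciding where a name starts and ends, so everything below is an assumption made for illustration: the `TRIGGER_WORDS` and `CONNECTIVES` sets, the function name `extract_corporate_names`, and the heuristic of growing a candidate name outward from a trigger word across neighbouring capitalized tokens.

```python
import re

# Hypothetical trigger words; the paper's actual list is not given in the abstract.
TRIGGER_WORDS = {
    "University", "Institute", "Department", "Corporation",
    "Company", "Limited", "Ltd", "Inc", "Association",
}
# Lower-case words assumed to be allowed inside a name (an illustrative choice).
CONNECTIVES = {"of", "and", "for", "&"}


def _is_capitalized(token):
    """True if the token starts with an upper-case letter."""
    return token[:1].isupper()


def extract_corporate_names(text):
    """Return candidate corporate names found around trigger words."""
    # Split into word tokens and single punctuation tokens.
    tokens = re.findall(r"[A-Za-z&]+|[^\sA-Za-z&]", text)
    names = []
    i = 0
    while i < len(tokens):
        if tokens[i] not in TRIGGER_WORDS:
            i += 1
            continue
        # Grow left over directly preceding capitalized tokens.
        start = i
        while start > 0 and _is_capitalized(tokens[start - 1]):
            start -= 1
        # Grow right over capitalized tokens; accept a connective only
        # when a capitalized token follows it.
        end = i
        while end + 1 < len(tokens):
            nxt = tokens[end + 1]
            if _is_capitalized(nxt):
                end += 1
            elif (nxt in CONNECTIVES and end + 2 < len(tokens)
                  and _is_capitalized(tokens[end + 2])):
                end += 2
            else:
                break
        names.append(" ".join(tokens[start:end + 1]))
        i = end + 1
    return names


print(extract_corporate_names(
    "He studied at the University of Madras in Chennai."
))
# prints ['University of Madras']
```

A heuristic this simple has obvious limits. For instance, a capitalized sentence-initial word directly before a name would be absorbed into it, which is the kind of boundary problem a fuller method would need to handle.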



Authors

K. Sivasamy
Department of Information Science, University of Madras, Chennai 600 005, India
K. S. Raghavan
Department of Information Science, University of Madras, Chennai 600 005, India

