There is a vast amount of Arabic information on the Web, but it is organized as semi-structured text and cannot be used directly. This research proposes a semi-supervised technique that extracts binary relations between two Arabic named entities from the Web. Several works have addressed relation extraction from texts in Latin-script languages, but as far as we know, no existing work applies a semi-supervised technique to Arabic text. The goal of this research is to extract a large list or table of named entities and their relations in a specific domain. The user supplies a handful of relation instances as input. The system uses summaries returned by the Google search engine as its source text, and the input instances are used to extract patterns from these summaries. The output is a set of new entities and their relations. The results of four experiments show that precision and recall vary according to relation type. Precision ranges from 0.61 to 0.75, while recall ranges from 0.71 to 0.83. The best result is obtained for the (player, club) relation, with a precision of 0.72 and a recall of 0.83.
Keywords
Relation Extraction, Information Extraction, Pattern Extraction, Semi-Supervised, Arabic Language, Web Mining.
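The seed-to-pattern-to-new-instance loop summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extract_patterns`, `apply_patterns`, and `precision_recall` are hypothetical, the snippets are a hard-coded toy list standing in for Google search summaries, entities are assumed to be single tokens, and a pattern is assumed to be simply the text between the two entities, with no pattern scoring or filtering.

```python
import re


def extract_patterns(seeds, snippets):
    """Collect the text found between the two entities of each seed pair.

    Assumption: a pattern is just the (short) middle context; the paper's
    actual pattern representation is not specified in the abstract.
    """
    patterns = set()
    for e1, e2 in seeds:
        rx = re.compile(re.escape(e1) + r"(.{1,40}?)" + re.escape(e2))
        for text in snippets:
            for match in rx.finditer(text):
                patterns.add(match.group(1))
    return patterns


def apply_patterns(patterns, snippets):
    """Use each middle context to find candidate (entity1, entity2) pairs.

    Assumption: entities are single tokens matched by \\w+ (in Python 3 this
    covers Arabic letters in str patterns).
    """
    pairs = set()
    for middle in patterns:
        rx = re.compile(r"(\w+)" + re.escape(middle) + r"(\w+)")
        for text in snippets:
            for match in rx.finditer(text):
                pairs.add((match.group(1), match.group(2)))
    return pairs


def precision_recall(extracted, gold):
    """Standard precision and recall over sets of extracted pairs."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy stand-ins for search-engine summaries (illustrative only).
snippets = [
    "ميسي يلعب في برشلونة",      # "Messi plays in Barcelona"
    "رونالدو يلعب في يوفنتوس",   # "Ronaldo plays in Juventus"
    "القاهرة عاصمة مصر",         # "Cairo is the capital of Egypt"
]
seeds = {("ميسي", "برشلونة")}      # one seed (player, club) instance

patterns = extract_patterns(seeds, snippets)
new_pairs = apply_patterns(patterns, snippets) - seeds
print(new_pairs)
```

In this toy run, the single seed yields one pattern (the words between the player and the club), and applying that pattern to the other snippets proposes a new (player, club) pair. A realistic system would also need named-entity boundaries for multi-word names and some way to rank or discard noisy patterns, which this sketch omits.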