Morphology-based Entity and Relational Entity Extraction Framework for Arabic

Amin Jaber*, Fadi A. Zaraket**

*Purdue University, West Lafayette, IN

**American University of Beirut, Beirut 1107 2020, Lebanon

 

Rule-based techniques to extract relational entities from documents allow users to specify desired entities with natural language questions, finite state automata, regular expressions and structured query language. They require linguistic and programming expertise and lack support for Arabic morphological analysis. We present a morphology-based entity and relational entity extraction framework for Arabic (MERF). MERF requires basic knowledge of linguistic features and regular expressions, and provides the ability to interactively specify Arabic morphological and synonymity features, tag types associated with regular expressions, and relations and code actions defined over matches of subexpressions. MERF constructs entities and relational entities from matches of the specifications. We evaluated MERF with several case studies. The results show that MERF requires shorter development time and effort compared to existing application specific techniques and produces reasonably accurate results within a reasonable overhead in run time.