UTILIZING ROOTS AND PATTERNS TO IDENTIFY ARABIC NAMED ENTITIES

AHMED, ABDULMONEM and HANÇRLİOĞULLARI, AYBABA and TOSUN, ALİ RIZA (2022) UTILIZING ROOTS AND PATTERNS TO IDENTIFY ARABIC NAMED ENTITIES. Asian Journal of Mathematics and Computer Research, 29 (2). pp. 33-42. ISSN 2395-4213

Full text not available from this repository.

Abstract

Named Entity Recognition (NER) is a subtask of information extraction that seeks to recognize named entities in text and classify them into predefined categories, such as names of people, organizations, and geographic locations. The task has recently attracted considerable attention because it can improve the performance of a variety of NLP applications. Most problems in Question Answering and Summarization Systems, Information Retrieval and Extraction, Machine Translation, Video Annotation, Semantic Web Search, and Bioinformatics rely on named entity recognition. Arabic is an inflectional language, which allows for non-concatenative morphological operations on the root. The purpose of this study is to extract and recognize entity names from Arabic articles. We propose an algorithm that determines names from roots using patterns. We implemented it in Python and used the PyQt5 GUI package to view the results immediately and to modify and add patterns easily. For evaluation, we used a random sample of 400 names and 45 different patterns. The algorithm correctly and quickly identified 370 of the 400 names, a success rate of about 93% (92.5%). Names that follow the same patterns as the recognized names are handled in the same way and require no changes to the code or design. The names that our algorithm did not recognize have no roots in the list of known Arabic roots. Our research shows that the approach recognizes names that have roots with high speed and accuracy, but that it cannot identify nouns that are not of Arabic origin. We therefore recommend a hybrid method that combines multiple approaches.
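The root-and-pattern idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `extract_root` and `recognize`, the three-entry root list, and the four patterns are hypothetical, chosen only for illustration. The sketch assumes undiacritized input and the conventional template letters ف, ع, ل as placeholders for the three radicals of a root.

```python
# Placeholder letters conventionally used in Arabic morphological patterns;
# each one stands for one radical of a triliteral root.
PLACEHOLDERS = {"ف": 0, "ع": 1, "ل": 2}

# Tiny illustrative data (assumption): not the study's root list or its 45 patterns.
KNOWN_ROOTS = {"حمد", "كرم", "سعد"}
PATTERNS = ["فاعل", "فعيل", "مفعول", "مفعل"]


def extract_root(word, pattern):
    """Return the root implied by aligning word with pattern, or None if they do not align."""
    if len(word) != len(pattern):
        return None
    radicals = [None, None, None]
    for w_ch, p_ch in zip(word, pattern):
        if p_ch in PLACEHOLDERS:
            # Placeholder position: the word's letter is a radical of the root.
            radicals[PLACEHOLDERS[p_ch]] = w_ch
        elif w_ch != p_ch:
            # Literal pattern letter (for example a prefix or infix) must match exactly.
            return None
    if None in radicals:
        return None
    return "".join(radicals)


def recognize(word, patterns=PATTERNS, roots=KNOWN_ROOTS):
    """Return (root, pattern) for the first pattern that yields a known root, else None."""
    for pattern in patterns:
        root = extract_root(word, pattern)
        if root in roots:
            return root, pattern
    return None


print(recognize("محمود"))  # name built on the root حمد
print(recognize("لندن"))   # loanword with no Arabic root in the list
```

In this sketch a word such as لندن (London) matches no pattern, which mirrors the limitation the abstract reports for names that have no root in the list of known Arabic roots.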

Item Type: Article
Subjects: GO for STM > Mathematical Science
Depositing User: Unnamed user with email support@goforstm.com
Date Deposited: 21 Dec 2023 06:05
Last Modified: 21 Dec 2023 06:05
URI: http://archive.article4submit.com/id/eprint/2421
