Modeling of Arabic Language for Authorship Identification

the International Journal of Scientific & Technology Research • 2021

Publication Information

Authors Heba M. Khalil, Ahmed Taha, Tarek El-Shistawy

Keywords Not Available

Journal the International Journal of Scientific & Technology Research

Publisher Not Available

Volume 10

Issue 5

Pages 157–162

Publication Type International

Supplementary Materials Not Available

Abstract

With the vast volume of data now processed in digital form, both the need for and the capability of analysing this data for forensic
authorship authentication have increased. Research has concentrated on English, Spanish, and German. Arabic has received less
attention from the academic community because of the difficulty and length of Arabic sentences. This article presents a set of stylometric features derived
from the study of the parts of speech in many articles, including adjective ratio, sentence size, conjunctions, and others. These features are classified into two
categories: statistical features and linguistic features. The AdaBoost and Bagging ensemble approaches are proposed in this research to
maximise predictive performance on Arabic articles by combining multiple learners. The results indicate that the Bagging model achieves an average accuracy of
91.5%, while the AdaBoost model achieves the highest accuracy, 93.6%.
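
As a rough illustration of the statistical features the abstract mentions, the following sketch computes average sentence size and a conjunction ratio for an Arabic text. It is not the authors' implementation: the function name, the conjunction list, and the sentence-splitting rule are all assumptions made for this example. The adjective ratio is omitted because it would need a part-of-speech tagger, which the abstract does not name.

```python
import re

# Hypothetical, deliberately small set of standalone Arabic conjunctions.
# This list is an assumption for illustration; the paper's list is not given.
CONJUNCTIONS = {"و", "أو", "ثم", "لكن", "بل", "أم"}

# Split on Latin and Arabic sentence-final punctuation (assumed rule).
SENTENCE_END = re.compile(r"[.!?؟]+")


def stylometric_features(text):
    """Return two simple statistical features for one document."""
    # Tokenise each non-empty sentence on whitespace.
    sentences = [s.split() for s in SENTENCE_END.split(text) if s.strip()]
    words = [w for sentence in sentences for w in sentence]
    if not words:
        return {"avg_sentence_len": 0.0, "conjunction_ratio": 0.0}
    conjunction_count = sum(1 for w in words if w in CONJUNCTIONS)
    return {
        # Mean number of whitespace-separated tokens per sentence.
        "avg_sentence_len": len(words) / len(sentences),
        # Share of tokens that are standalone conjunctions.
        "conjunction_ratio": conjunction_count / len(words),
    }


sample = "ذهب الولد ثم عاد. قرأ الكتاب أو المجلة؟"
features = stylometric_features(sample)
```

Whitespace tokenisation misses conjunctions written as prefixes (for example "و" attached to the following word), so a realistic pipeline would likely need morphological segmentation first. Feature dictionaries like this one could then be turned into vectors and passed to ensemble classifiers such as AdaBoost or Bagging.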