A Machine Learning Approach to Arabic Native Language Identification

Authors

  • Seifeddine Mechti, Faculty of Economics and Management of Sfax, Tunisia; MIRACL Laboratory, Route de Tunis Km 10, B.P. 242, SFAX 302
  • Lamia Hadrich Belguith, Faculty of Economics and Management of Sfax, Tunisia; MIRACL Laboratory, Route de Tunis Km 10, B.P. 242, SFAX 302

Keywords:

Native language identification, standard deviation, machine learning

Abstract

Native Language Identification (NLI) is the task of identifying a writer’s native language (L1) based solely on their writing in a second language (L2). This paper presents a method for Arabic Native Language Identification (ANLI), that is, identifying the native language of learners of Arabic. The main contribution is the use of the standard deviation to optimize supervised learning over a large set of linguistic features extracted from the text. The feature selection stage improved the results, which outperform those of the best systems evaluated on the same corpus: the method reaches an accuracy of 45% versus 41% for the state of the art, despite the limited data and the lack of accurate tools for the Arabic language.
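
The abstract does not spell out how the standard deviation enters the feature selection stage. One plausible reading, offered here only as an illustrative sketch and not as the authors' actual procedure, is a filter that discards features whose values barely vary across learner texts before a supervised classifier is trained. The function names `select_by_std` and `project`, the threshold `min_std`, and the toy feature values are all hypothetical.

```python
from statistics import pstdev


def select_by_std(rows, min_std):
    """Return indices of feature columns whose population standard
    deviation across all texts is strictly greater than min_std.

    Assumption: low-variation features carry little signal for telling
    native languages apart; the paper may use a different criterion.
    """
    n_features = len(rows[0])
    keep = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        if pstdev(column) > min_std:
            keep.append(j)
    return keep


def project(rows, keep):
    """Keep only the selected feature columns in every row."""
    return [[row[j] for j in keep] for row in rows]


# Toy feature matrix (hypothetical values): one row per learner text,
# one column per linguistic feature.
X = [
    [0.10, 5.0, 1.0],
    [0.10, 7.0, 3.0],
    [0.10, 6.0, 8.0],
    [0.10, 5.5, 2.0],
]

keep = select_by_std(X, min_std=0.5)  # column 0 is constant, so it is dropped
X_reduced = project(X, keep)
```

The reduced matrix `X_reduced` would then be passed to whichever supervised learner is used; the threshold would normally be tuned on held-out data rather than fixed by hand.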

Published

2019-01-01

How to Cite

Seifeddine Mechti, & Lamia Hadrich Belguith. (2019). A Machine Learning Approach to Arabic Native Language Identification. International Journal of Computer Information Systems and Industrial Management Applications, 11, 9. Retrieved from https://cspub-ijcisim.org/index.php/ijcisim/article/view/430

Section

Original Articles