A Machine Learning Approach to Arabic Native Language Identification
Keywords:
Native language identification, standard deviation, machine learning
Abstract
Native Language Identification (NLI) is the task of identifying a writer's native language (L1) based only on their writing in a second language (L2). This paper presents a method for identifying the native language of learners of Arabic (Arabic NLI, or ANLI). The contribution of the method lies in using the standard deviation to optimize supervised learning, a technique that explores a multitude of linguistic features extracted from the text. A feature selection stage improved the results, which surpass those of the best systems evaluated on the same corpus: the method reaches an accuracy of 45% versus 41% for the state of the art, despite the limited data and the lack of accurate tools dedicated to the Arabic language.
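The abstract does not say how the standard deviation enters the feature selection stage. As an illustration only, the following minimal sketch shows one plausible reading: features whose values barely vary across texts are dropped before training. The function names, the toy matrix, and the `min_std` threshold are all hypothetical and are not taken from the paper.

```python
from statistics import pstdev


def select_by_std(rows, min_std):
    """Return indices of feature columns whose population standard
    deviation across all rows is at least min_std (assumed criterion)."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    return [i for i, col in enumerate(columns) if pstdev(col) >= min_std]


def project(rows, keep):
    """Keep only the selected feature columns in every row."""
    return [[row[i] for i in keep] for row in rows]


# Hypothetical toy feature matrix: each row is one learner text,
# each column is one linguistic feature (values are invented).
X = [
    [0.10, 5.0, 1.0],
    [0.10, 7.0, 1.0],
    [0.10, 3.0, 2.0],
    [0.10, 9.0, 2.0],
]

keep = select_by_std(X, min_std=0.4)  # threshold chosen arbitrarily
X_reduced = project(X, keep)
```

In this toy case the first column is constant, so it carries no information for a classifier and is removed; the reduced matrix would then be passed to whatever supervised learner is being used.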
License
Copyright (c) 2023 International Journal of Computer Information Systems and Industrial Management Applications

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.