Challenges of Using Deep Learning to Analyse Texts Written in the Algerian Arabic Dialect
Keywords:
deep learning, text recognition, Algerian dialect, ambiguity, semantic ambiguity
Abstract
The automatic processing of texts written in Arabic dialects, and in Algerian Arabic in particular, poses significant challenges for researchers and users, especially now that posts written in dialect are widespread on social media platforms. Several problems hinder the identification of words and expressions in these dialects: ambiguity arising from the lack of digitised corpora, issues related to orthographic conventions, and interference between dialectal writing and Modern Standard Arabic or foreign languages. To examine these problems, we studied a sample of Facebook posts published by students of the Department of Arabic Language and Literature at the University of Mascara. The posts were written in the Algerian dialect (Darja), and we recorded the difficulties that arose in recognising the words they contained. Difficulties of this kind have led researchers to rely on deep learning techniques, which have achieved a certain level of success.