A Hybrid Translation Pipeline for Low-Resource Dialects: Translating English to Ahirani Using NLLB and Rule-Based Adaptation

Neha Telrandhe; Rakesh Kadu

doi:10.70917/ijcisim-2026-2618

Authors

Neha Telrandhe Ramdeobaba University (Shri Ramdeobaba College of Engineering and Management), Nagpur, Maharashtra, India.
Rakesh Kadu Ramdeobaba University (Shri Ramdeobaba College of Engineering and Management), Nagpur, Maharashtra, India.

Keywords:

Low-resource languages, Machine translation, Ahirani dialect, Rule-based translation, Dictionary-based translation, NLLB, Multilingual models, Neural machine translation, Dialect adaptation, Hybrid approach

Abstract

Machine translation for low-resource languages remains a significant challenge because large-scale parallel corpora are unavailable and linguistic resources are limited. This paper presents an exploratory study that compares two translation approaches for a low-resource dialect (Ahirani): a basic rule-based method that uses a custom English-to-dialect dictionary, and a neural machine translation approach that uses Meta AI’s pretrained No Language Left Behind (NLLB-200) model. In the neural approach, NLLB-200 translates English into standard Marathi, and a post-processing step then adapts the output to the dialect through dictionary-based substitutions. This stepwise pipeline makes it possible to compare the two approaches on output quality, grammatical correctness, and contextual accuracy. The results highlight the strengths and limitations of both approaches and offer insight into the feasibility and challenges of applying neural models to dialectal translation in low-resource settings.
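The two-stage pipeline described in the abstract (English to standard Marathi with NLLB-200, then dictionary substitution toward the dialect) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the checkpoint name `facebook/nllb-200-distilled-600M`, the helper names `make_nllb_translator`, `adapt_to_dialect`, and `translate_to_dialect`, and the whole-token matching strategy are all hypothetical choices, and the lexicon entries in the usage note are placeholders rather than real Marathi or Ahirani words.

```python
import re
from typing import Callable, Dict


def make_nllb_translator(
    checkpoint: str = "facebook/nllb-200-distilled-600M",  # assumed; the abstract names no checkpoint
) -> Callable[[str], str]:
    """Build an English -> standard Marathi translator from an NLLB-200 checkpoint."""
    # Imported lazily so the dictionary step below works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="eng_Latn")
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    # NLLB selects the output language via a forced first token ("mar_Deva" = Marathi).
    target_id = tokenizer.convert_tokens_to_ids("mar_Deva")

    def translate(text: str) -> str:
        inputs = tokenizer(text, return_tensors="pt")
        output_ids = model.generate(
            **inputs, forced_bos_token_id=target_id, max_length=128
        )
        return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

    return translate


def adapt_to_dialect(marathi_text: str, lexicon: Dict[str, str]) -> str:
    """Replace whole whitespace-delimited tokens found in the lexicon.

    Tokens with attached punctuation are not matched; a fuller version
    would need proper tokenization and morphological handling.
    """
    # Split on whitespace but keep the separators so the original spacing survives.
    pieces = re.split(r"(\s+)", marathi_text)
    return "".join(lexicon.get(piece, piece) for piece in pieces)


def translate_to_dialect(
    text: str, translate: Callable[[str], str], lexicon: Dict[str, str]
) -> str:
    """Run the two stages in order: neural translation, then dictionary adaptation."""
    return adapt_to_dialect(translate(text), lexicon)
```

With a lexicon mapping standard-Marathi forms to their dialect counterparts, a call such as `translate_to_dialect("How are you?", make_nllb_translator(), lexicon)` would run both stages. Passing the translator in as a callable keeps the dictionary step testable on its own, which matters here because that step is where the dialect-specific knowledge lives.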
