Multilingual Natural Language Prompts and Code Generation: A Study on Large Language Model Cross-Linguistic Performance
DOI:
https://doi.org/10.70917/ijcisim-2026-2677
Keywords:
Large Language Models, Multilingual Code Generation, Cross-Linguistic Performance, Language Efficiency Score (LES), Code Efficiency Index (CEI)
Abstract
Large Language Models (LLMs) have become one of the most powerful techniques for code generation and have a profound impact on the contemporary software development process. However, while existing GPT models perform well in translating natural language prompts into executable code, their behavior when prompted in different natural languages has not been sufficiently studied, even though developers around the world speak and write in many different languages. Understanding this behavior is crucial for achieving fair AI-aided programming. Most prior work centers on English or on limited bilingual studies, leaving it uncertain how prompt language influences code quality and efficiency.
In this paper, we examine GPT-4.5’s cross-lingual performance in Python code generation. Our multilingual benchmark includes 30 algorithmic tasks across six computer science domains, tested in 30 languages. We measure execution time, efficiency, and accuracy using the Language Efficiency Score (LES) and the Code Efficiency Index (CEI), supported by clustering and correlation analysis. Across 900 samples, efficiency and execution stability show a strong correlation (r > 0.92), while prompt length has little impact. Prompts in Tamil, Ukrainian, and Japanese yield the most efficient code, whereas prompts in English, Persian, and Mandarin produce longer, slower scripts.
Our results show that the natural language of the prompt matters in LLM code generation, emphasizing the importance of multilingual-aware prompt engineering for efficiency and robustness in real-world software development.