A Unified Deep Learning Framework for Software Bug Category Prediction Using Mixed Embeddings
DOI:
https://doi.org/10.7091710.70917/ijcisim-2026-1968
Keywords:
Software Bug Categorization, Mixed Embeddings, Deep Learning, BERT, CodeBERT, Bug Report Classification, Software Defect Prediction, Bug Triaging
Abstract
Software bug classification is an important part of improving software maintenance and defect management in large-scale software development projects. Bug reports may contain a wide range of information, such as textual descriptions, code snippets, and stack traces, which can be invaluable in determining the nature of a software defect. The main limitation of traditional bug classification methods is that they rely on textual features alone and therefore fail to capture the structural information found in code and execution traces. This paper proposes a unified deep learning architecture that predicts the category of a software bug from mixed embeddings of the different elements of a bug report. The proposed approach uses BERT to create contextual embeddings of natural language descriptions and CodeBERT to extract embeddings of code snippets and stack traces. These embeddings are summed into a mixed embedding, which is then used to train a classifier that assigns bugs to UI/UX, performance, and security-related categories. The approach was evaluated experimentally on an open-source dataset of 6151 bug reports. The results indicate that the proposed model achieves better classification performance than a baseline model based on GloVe embeddings and an LSTM. Combining textual, code, and stack trace embeddings yields a richer feature representation and improves the usefulness of automated bug-triaging systems in software development settings.
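The fusion and classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the description, code snippet, and stack trace are each encoded into a separate vector of the same size (768 here, the hidden size of the base BERT and CodeBERT checkpoints), that fusion is an element-wise sum, and that the classifier is a single linear layer followed by softmax. The names `fuse_by_sum`, `classify`, and `EMBED_DIM` are hypothetical, and random vectors stand in for real encoder outputs.

```python
import math
import random

# Categories named in the abstract.
CATEGORIES = ["UI/UX", "performance", "security"]
# Assumption: hidden size of the base BERT and CodeBERT checkpoints.
EMBED_DIM = 768


def fuse_by_sum(text_vec, code_vec, trace_vec):
    """Element-wise sum of three equal-length embedding vectors."""
    if not (len(text_vec) == len(code_vec) == len(trace_vec)):
        raise ValueError("embeddings must share one dimension to be summed")
    return [t + c + s for t, c, s in zip(text_vec, code_vec, trace_vec)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(mixed_vec, weights, biases):
    """Hypothetical linear head: one weight row and one bias per category."""
    logits = [
        sum(w * x for w, x in zip(row, mixed_vec)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs


# Stand-in vectors: in a real pipeline these would come from BERT
# (description) and CodeBERT (code snippet, stack trace).
rng = random.Random(0)
text_vec = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]
code_vec = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]
trace_vec = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]

# Untrained, randomly initialised head, so the predicted label is arbitrary.
weights = [[rng.gauss(0, 0.01) for _ in range(EMBED_DIM)] for _ in CATEGORIES]
biases = [0.0] * len(CATEGORIES)

mixed = fuse_by_sum(text_vec, code_vec, trace_vec)
label, probs = classify(mixed, weights, biases)
print(label, [round(p, 3) for p in probs])
```

Summation keeps the mixed embedding at the same dimension as each input, which is only possible when all encoders share a hidden size. Concatenation would be an alternative that triples the classifier's input width.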