Research on Automatic Creation Method of Documentary Background Music Based on Deep Generative Modeling
DOI:
https://doi.org/10.70917/ijcisim-2026-0111
Keywords:
documentary background music; source separation model; music generation model; RNN; bidirectional LSTM
Abstract
To address the long production cycles and limited adaptability of documentary background music creation, this paper proposes an automatic composition method based on deep generative models. An RNN-based source separation model is designed to process the multi-track features of mixed audio simultaneously. A bidirectional LSTM music generation model is then constructed; its bidirectional temporal modeling captures the global structural features of music and thereby improves the generation of melodies and chords. Experiments use classic documentary soundtrack segments as training data, and model performance is validated through spectral analysis, spectrogram comparison, and human subjective evaluation. Results show that after 2,000 iterations the frequency distribution of the generated music converges toward that of the sample music, and that the bidirectional LSTM structure converges faster and produces better results than unidirectional models. In subjective evaluations, the model significantly outperforms the control model in naturalness (4.35 points), creativity (4.52 points), and musicality (4.47 points), and its scores differ from those of real music by less than 0.5 points.
License
Copyright (c) 2026 Hexi Wang, Mingjie Wang

This work is licensed under a Creative Commons Attribution 4.0 International License.