A HYBRID FRAMEWORK FOR AUTOMATIC MINUTES OF MEETING GENERATION FROM ONLINE MEETING VIDEOS
DOI: https://doi.org/10.7091710.70917/ijcisim-2026-1957
Keywords: Minutes of Meeting, Meeting Summarization, Automatic Speech Recognition, Whisper, Hybrid Summarization, Transformer Models, Natural Language Processing
Abstract
The rapid growth of online meeting platforms has increased the demand for automatic, consistent documentation of meeting discussions. Preparing Minutes of Meeting (MoM) manually is time-consuming, error-prone, and hard to scale in real-world organizational settings. This paper proposes a hybrid framework for automatically generating MoM from online meeting recordings by combining speech processing and natural language processing. The proposed system converts meeting videos into structured minutes through a multi-stage pipeline comprising audio extraction, transformer-based Automatic Speech Recognition using Whisper, time-based transcript segmentation, extractive key point identification, and abstractive summarization with a transformer-based large language model. Time-aware segmentation makes the processing of long meetings scalable, and the hybrid summarization strategy balances factual coverage against linguistic coherence. Experiments on real meeting recordings show that the proposed approach produces concise, structured meeting summaries. Quantitative evaluation with ROUGE metrics, together with qualitative analysis, is used to assess the feasibility of deploying the system in real-world settings. The framework is reproducible, computationally efficient on GPU hardware, and suitable for automatically documenting online meetings.
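To illustrate the time-based transcript segmentation stage named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the ASR output is a list of segments with `start`, `end` (seconds), and `text` fields, which is the shape of the `segments` list returned by the open-source Whisper package. The function name `segment_by_time` and the 300-second window are hypothetical choices; the abstract does not state the window length used.

```python
from typing import Dict, List


def segment_by_time(segments: List[Dict], window_s: float = 300.0) -> List[Dict]:
    """Group ASR segments into consecutive fixed-length time windows.

    Each input segment is assigned to the window containing its start time.
    The 300 s default is an illustrative assumption, not a value from the paper.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")

    chunks: List[Dict] = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        idx = int(seg["start"] // window_s)  # index of the window this segment falls in
        if not chunks or chunks[-1]["index"] != idx:
            # Open a new window; empty windows (silence) are simply skipped.
            chunks.append({
                "index": idx,
                "start": idx * window_s,
                "end": (idx + 1) * window_s,
                "texts": [],
            })
        chunks[-1]["texts"].append(seg["text"].strip())

    # Join the collected utterances into one text block per window.
    return [
        {"start": c["start"], "end": c["end"], "text": " ".join(c["texts"])}
        for c in chunks
    ]
```

Each returned chunk can then be passed independently to the extractive and abstractive summarization stages, which keeps the input to the language model bounded regardless of meeting length.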