Automatic summarization of YouTube video transcription text using term frequency-inverse document frequency

Rand Abdulwahid Albeer, Huda F. AL-Shahad, Hiba J. Aleqabie, Noor D. Al-shakarchy

Abstract


Automatic summarization is a technique for presenting key information quickly by condensing large bodies of material. It can be applied to both text and video, with different methods used to present the abstract of the subject. This research applies natural language processing to automatic text summarization of YouTube videos: each video is transcribed, and the summarization stages are then applied to the transcript. The term frequency-inverse document frequency (TF-IDF) method, which is based on the number of words and sentences in the text, was used to extract the important keywords for the summary. Some videos are long and tedious, or take a long time to present information that could be conveyed in a few minutes. The proposed system therefore summarizes a long video and presents its important information to the user as a few lines of text, which benefits students and researchers who do not have time to watch long videos in order to extract the useful data. The results were evaluated using the ROUGE method on the CNN-dailymail-master data set.
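To make the TF-IDF step concrete, the following is a minimal sketch of extractive summarization over a transcript, not the authors' implementation. It assumes that each sentence is treated as a "document" when computing inverse document frequency, that sentences are split naively on end punctuation, and that a tiny illustrative stop-word list is enough. The names `STOP_WORDS`, `split_sentences`, `tokenize`, and `summarize` are hypothetical.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; the paper's list is not given here).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in",
              "it", "this", "that", "for", "on", "with"}


def split_sentences(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase a sentence, extract word tokens, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n_docs = len(sentences)

    # Document frequency: the number of sentences that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Sentence score: sum over its terms of TF (relative frequency within
    # the sentence) times IDF (log of sentence count over document frequency).
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        score = sum((count / len(tokens)) * math.log(n_docs / df[term])
                    for term, count in tf.items())
        scores.append(score)

    # Pick the top-scoring sentence indices, then restore transcript order.
    top = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return " ".join(sentences[i] for i in sorted(top))
```

Calling `summarize(transcript_text, n_sentences=3)` on a transcript string would return three of its sentences joined by spaces. A real pipeline would likely use a proper sentence tokenizer and a fuller stop-word list, and could score the output against reference summaries with ROUGE.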

Keywords


Automatic text summarization; Stop words; Term frequency-inverse document frequency; Video transcript; Word frequency


DOI: http://doi.org/10.11591/ijeecs.v26.i3.pp1512-1519


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
