Open Access
Article
Video Summarization Using U-shaped Non-local Network
Shasha Zang1
Haodong Jin1, *
Qinghao Yu1
Sunjie Zhang1
Hui Yu2
Submitted: 22 Nov 2023 | Accepted: 5 Mar 2024 | Published: 26 Jun 2024

Abstract

Video summarization (VS) refers to the extraction of key clips carrying important information from long videos in order to compose short videos. Video summaries are derived by capturing temporal dependencies of variable range between video frames. Many VS methods have been proposed in recent years, but how to select key frames effectively remains a challenging issue. To this end, this paper presents a novel U-shaped non-local network that estimates the probability that each frame of the original video is selected for the summary. We exploit a reinforcement learning framework to enable unsupervised summarization of videos. Frames with high probability scores are included in the generated summary. Furthermore, a reward function is defined that encourages the network to select more representative and diverse video frames. Experiments conducted on two benchmark datasets with standard, enhanced and transmission settings demonstrate that the proposed approach outperforms state-of-the-art unsupervised methods.
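
The abstract does not state the form of the reward function. As a minimal illustration only, the Python sketch below assumes one formulation often used in unsupervised summarization: a diversity term (mean pairwise cosine dissimilarity among selected frames) plus a representativeness term (how closely every frame is matched by its nearest selected frame). The function names, the NumPy feature array, and the Bernoulli sampling step are all hypothetical and are not taken from the paper.

```python
import numpy as np


def diversity_reward(features, selected):
    """Mean pairwise cosine dissimilarity among selected frames (assumed form)."""
    x = features[selected]
    n = len(x)
    if n < 2:
        return 0.0
    # Assumes every feature vector is non-zero, so normalization is safe.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    dissim = 1.0 - x @ x.T
    off_diagonal = dissim.sum() - np.trace(dissim)
    return float(off_diagonal / (n * (n - 1)))


def representativeness_reward(features, selected):
    """exp(-mean distance from each frame to its nearest selected frame)."""
    x = features[selected]
    if len(x) == 0:
        return 0.0
    # Pairwise Euclidean distances, shape (num_frames, num_selected).
    dists = np.linalg.norm(features[:, None, :] - x[None, :, :], axis=2)
    return float(np.exp(-dists.min(axis=1).mean()))


def sample_summary(probs, rng):
    """Sample frame indices, keeping frame t with probability probs[t]."""
    return np.flatnonzero(rng.random(len(probs)) < probs)


def total_reward(features, selected):
    """Sum of the two assumed reward terms for one sampled summary."""
    return diversity_reward(features, selected) + representativeness_reward(
        features, selected
    )
```

In a policy-gradient setup, `probs` would come from the scoring network and `total_reward` would weight the log-probability of the sampled selection; that training loop is omitted here because the abstract gives no details of it.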

How to Cite
Zang, S., Jin, H., Yu, Q., Zhang, S., & Yu, H. (2024). Video Summarization Using U-shaped Non-local Network. International Journal of Network Dynamics and Intelligence, 3(2), 100013. https://doi.org/10.53941/ijndi.2024.100013
Copyright & License
Copyright (c) 2024 by the authors.

This work is licensed under a Creative Commons Attribution 4.0 International License.
