From "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. of the IEEE Int. Symposium on Multimedia (ISM), Dec. 2021. Written by Evlampios Apostolidis, ...