9:00
9:30
10:00
10:30
11:00
11:30
12:30
13:00
14:00
14:30
15:00
15:30
16:00
17:00
17:30
18:00
18:30
19:30
19:35
20:35
Coffee
Coffee
Coffee
Special Session: Learning from scarce data challenges in the media domain
Lunch
Lunch
Lunch
Special Session: Multimodal Signal processing technologies for Protecting people and environment against Natural Disasters
Coffee
Guided Tour
Reception
Dinner
Keynote 1
tbd
Best Papers
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto, Marcella Cornia, Lorenzo Baraldi and Rita Cucchiara
Hybrid Transformer Network for Deepfake Detection
Sohail Ahmed Khan and Duc-Tien Dang-Nguyen
An exploration into the benefits of the CLIP model for lifelog retrieval
Ly-Duyen Tran, Naushad Alam, Yvette Graham, Liting Zhou and Cathal Gurrin
Special Session: Multimodal Signal processing technologies for Protecting people and environment against Natural Disasters
BiasUNet: Learning Change Detection over Sentinel-2 Image Pairs
Maria Eirini Pegia, Anastasia Moumtzidou, Ilias Gialampoukidis, Björn Þór Jónsson, Stefanos Vrochidis and Ioannis Kompatsiaris
Wildfire Segmentation using Deep-RegSeg Semantic Segmentation Architecture
Rafik Ghali, Moulay Akhloufi, Wided Souidene Mseddi and Marwa Jmal
Ecological Impact Assessment Framework for areas affected by Natural Disasters
Gardyas Bidari Adninda, Kusrini Kusrini, Arief Setyanto, Renindya A Kartikakirana, Rhisa A Suprapto, Arif D Laksito, I Made A Agastya, Krishna Chandramouli, Andrea Majlingova, Yvonne Brodrechtová, Konstantinos Demestichas and Ebroul Izquierdo
Posters and Demos
Posters
- StyleGAN-based CLIP-guided Image Shape Manipulation | Yuchen Qian, Kohei Yamamoto and Keiji Yanai
- Streaming learning with Move-to-Data approach for image classification | Abel Kahsay Gebreslassie, Jenny Benois-Pineau and Akka Zemmari
- Analysing the Memorability of a Procedural Crime-Drama TV Series | Seán Cummins, Lorin Sweeney and Alan Smeaton
- A large-scale TV video and metadata database for French political content analysis and fact-checking | Frédéric Rayar, Mathieu Delalandre and Van-Hao Le
- Relational Database Performance for Multimedia: A Case Study | Björn Þór Jónsson, Aaron Duane and Nikolaj Mertz
- The Potential of Webcam Based Real Time Eye-Tracking to Reduce Rendering Cost | Isabel Kütemeyer
- Self-Supervised Spiking Neural Networks applied to Digit Classification | Benjamin Chamand and Philippe Joly
Demos
- A Virtual Reality Talking Avatar for Investigative Interviews of Maltreat Children | Syed Zohaib Hassan, Pegah Salehi, Michael Alexander Riegler, Miriam Sinkerud Johnson, Gunn Astrid Baugerud, Pål Halvorsen and Saeed Shafiee Sabet
- A Toolchain for Extracting and Visualising Road Traffic Data | Helmut Neuschmied, Florian Krebs, Stefan Ladstätter and Georg Thallinger
Multimedia understanding and classification
An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification
Lam Pham, Dat Ngo, Tho Nguyen, Phu Nguyen, Truong Hoang and Alexander Schindler
A Fine Grained Quality Assessment of Video Anomaly Detection
Jiang Zhou, Kevin McGuinness, Noel E. O'Connor and Joseph Antony
Learning Co-occurrence Features Across Spatial and Temporal Domains for Hand Gesture Recognition
Mohammad Rehan, Hazem Wannous, Jafar Alkheir and Kinda Aboukassem
Image analysis and enrichment
Sentiment analysis on 3D spaces using deep learning architectures
Konstantinos Chatzistavros, Theodora Pistola, Sotiris Diplaris, Konstantinos Ioannidis, Stefanos Vrochidis and Ioannis Kompatsiaris
Urban Image Geo-Localization Using Open Data on Public Spaces
Mathias Glistrup, Stevan Rudinac and Björn Þór Jónsson
A domain adaptive deep learning solution for scanpath prediction of paintings
Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani and Alessandro Bruno
Special Session: Computer-Assisted Clinical Applications
Segmenting partially annotated medical images
Nicolas Martin, Jean-Pierre Chevallet and Georges Quénot
Chest Diseases classification using CXR and deep ensemble learning
Adnane Ait Nasser and Moulay Akhloufi
Skin Cancer Detection using Ensemble Learning and Grouping of Deep Models
Takfarines Guergueb and Moulay Akhloufi
Multimedia Indexing and Retrieval
ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval
Nicola Messina, Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Giuseppe Amato and Rita Cucchiara
Improving Nearest Neighbor Indexing by Multitask Learning
Amorntip Prayoonwong, Ke-Long Zeng and Chih-Yi Chiu
Towards Human Performance on Sketch-Based Image Retrieval
Omar Seddati, Stéphane Dupont, Saïd Mahmoudi and Thierry Dutoit
Analysis of the Complementarity of Latent and Concept Spaces for Cross-Modal Video Search
Varsha Devi, Philippe Mulhem and Georges Quénot
Panel
tbd
Image processing and reconstruction
Real-time deblurring network for face AR applications
Juhwan Lee, Jongha Lee and Sangwook Yoo
Hyperspectral Image Reconstruction of Heritage Artwork Using RGB Images and Deep Neural Networks
Ailin Chen, Rui Jesus and Márcia Vilarigues
A survey for image based methods in construction: from images to digital twins
Ilias Koulalis, Nikolaos Dourvas, Theocharis Triantafyllidis, Konstantinos Ioannidis, Stefanos Vrochidis and Ioannis Kompatsiaris
Special Session: Learning from scarce data challenges in the media domain
RIAMAF: Region Information Reassembly Text Detection based on Multi-Feature Fusion
Hengyan Liu, Baolong Guo and Zekun Li
Learning to Detect Fallen People in Virtual Worlds
Fabio Carrara, Lorenzo Pasco, Claudio Gennaro and Fabrizio Falchi
Few-shot Object Detection as a Semi-supervised Learning Problem
Werner Bailer and Hannes Fassold
Deep Features for CBIR with Scarce Data using Hebbian Learning
Gabriele Lagani, Davide Bacciu, Claudio Gallicchio, Fabrizio Falchi, Claudio Gennaro and Giuseppe Amato
Keynote 2
tbd
Music Meets Science
Cultural event supported by SIGMM.
Trio concert | Schubert, Haydn
Performers: Olga Chepovetsky (CH), piano; François Pineau-Benois (FR), violin; Dorottya Standi (AT), cello


