Wednesday, Sep 14 | Thursday, Sep 15 | Friday, Sep 16

Welcome

9:00 – 9:30 (30 min)
Wednesday, Sep 14

Multimedia understanding and classification

9:00 – 10:00 (60 min)
Thursday, Sep 15

Image processing and reconstruction

9:00 – 10:00 (60 min)
Friday, Sep 16

Keynote 1

9:30 – 10:30 (60 min)
Wednesday, Sep 14
tbd

Coffee

10:00 – 10:30 (30 min)
Thursday, Sep 15

Coffee

10:00 – 10:30 (30 min)
Friday, Sep 16

Coffee

10:30 – 11:00 (30 min)
Wednesday, Sep 14

Image analysis and enrichment

10:30 – 11:30 (60 min)
Thursday, Sep 15

Best Papers

11:00 – 12:30 (90 min)
Wednesday, Sep 14

Special Session: Computer-Assisted Clinical Applications

11:30 – 12:30 (60 min)
Thursday, Sep 15

Keynote 2

11:30 – 12:30 (60 min)
Friday, Sep 16

Lunch

12:30 – 14:00 (90 min)
Thursday, Sep 15

Lunch

12:30 – 14:00 (90 min)
Wednesday, Sep 14

Closing

12:30 – 13:00 (30 min)
Friday, Sep 16

Lunch

13:00 – 14:30 (90 min)
Friday, Sep 16

Multimedia Indexing and Retrieval

14:00 – 15:30 (90 min)
Thursday, Sep 15

Posters and Demos

15:00 – 17:00 (120 min)
Wednesday, Sep 14
and coffee

Coffee

15:30 – 16:00 (30 min)
Thursday, Sep 15

Panel

16:00 – 17:30 (90 min)
Thursday, Sep 15

Steering Committee Meeting

17:30 – 18:30 (60 min)
Thursday, Sep 15
(closed session)

Guided Tour

18:00 – 19:30 (90 min)
Wednesday, Sep 14
Graz tour, ending at reception venue

Reception

19:30 – 22:00 (150 min)
Wednesday, Sep 14

Dinner

19:30 – 22:00 (150 min)
Thursday, Sep 15

Music Meets Science

19:35 – 20:35 (60 min)
Wednesday, Sep 14

Keynote 1

tbd

Best Papers

Retrieval-Augmented Transformer for Image Captioning
Sara Sarto, Marcella Cornia, Lorenzo Baraldi and Rita Cucchiara

Hybrid Transformer Network for Deepfake Detection
Sohail Ahmed Khan and Duc-Tien Dang-Nguyen

An exploration into the benefits of the CLIP model for lifelog retrieval
Ly-Duyen Tran, Naushad Alam, Yvette Graham, Liting Zhou and Cathal Gurrin

Special Session: Multimodal Signal processing technologies for Protecting people and environment against Natural Disasters

BiasUNet: Learning Change Detection over Sentinel-2 Image Pairs
Maria Eirini Pegia, Anastasia Moumtzidou, Ilias Gialampoukidis, Björn Þór Jónsson, Stefanos Vrochidis and Ioannis Kompatsiaris

Wildfire Segmentation using Deep-RegSeg Semantic Segmentation Architecture
Rafik Ghali, Moulay Akhloufi, Wided Souidene Mseddi and Marwa Jmal

Ecological Impact Assessment Framework for areas affected by Natural Disasters
Gardyas Bidari Adninda, Kusrini Kusrini, Arief Setyanto, Renindya A Kartikakirana, Rhisa A Suprapto, Arif D Laksito, I Made A Agastya, Krishna Chandramouli, Andrea Majlingova, Yvonne Brodrechtová, Konstantinos Demestichas and Ebroul Izquierdo

Posters and Demos

Posters

  • StyleGAN-based CLIP-guided Image Shape Manipulation | Yuchen Qian, Kohei Yamamoto and Keiji Yanai
  • Streaming learning with Move-to-Data approach for image classification | Abel Kahsay Gebreslassie, Jenny Benois-Pineau and Akka Zemmari
  • Analysing the Memorability of a Procedural Crime-Drama TV Series | Seán Cummins, Lorin Sweeney and Alan Smeaton
  • A large-scale TV video and metadata database for French political content analysis and fact-checking | Frédéric Rayar, Mathieu Delalandre and Van-Hao Le
  • Relational Database Performance for Multimedia: A Case Study | Björn Þór Jónsson, Aaron Duane and Nikolaj Mertz
  • The Potential of Webcam Based Real Time Eye-Tracking to Reduce Rendering Cost | Isabel Kütemeyer
  • Self-Supervised Spiking Neural Networks applied to Digit Classification | Benjamin Chamand and Philippe Joly

Demos

  • A Virtual Reality Talking Avatar for Investigative Interviews of Maltreat Children | Syed Zohaib Hassan, Pegah Salehi, Michael Alexander Riegler, Miriam Sinkerud Johnson, Gunn Astrid Baugerud, Pål Halvorsen and Saeed Shafiee Sabet
  • A Toolchain for Extracting and Visualising Road Traffic Data | Helmut Neuschmied, Florian Krebs, Stefan Ladstätter and Georg Thallinger

Multimedia understanding and classification

An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification
Lam Pham, Dat Ngo, Tho Nguyen, Phu Nguyen, Truong Hoang and Alexander Schindler

A Fine Grained Quality Assessment of Video Anomaly Detection
Jiang Zhou, Kevin McGuinness, Noel E. O'Connor and Joseph Antony

Learning Co-occurrence Features Across Spatial and Temporal Domains for Hand Gesture Recognition
Mohammad Rehan, Hazem Wannous, Jafar Alkheir and Kinda Aboukassem

Image analysis and enrichment

Sentiment analysis on 3D spaces using deep learning architectures
Konstantinos Chatzistavros, Theodora Pistola, Sotiris Diplaris, Konstantinos Ioannidis, Stefanos Vrochidis and Ioannis Kompatsiaris

Urban Image Geo-Localization Using Open Data on Public Spaces
Mathias Glistrup, Stevan Rudinac and Björn Þór Jónsson

A domain adaptive deep learning solution for scanpath prediction of paintings
Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani and Alessandro Bruno

Special Session: Computer-Assisted Clinical Applications

Segmenting partially annotated medical images
Nicolas Martin, Jean-Pierre Chevallet and Georges Quénot

Chest Diseases classification using CXR and deep ensemble learning
Adnane Ait Nasser and Moulay Akhloufi

Skin Cancer Detection using Ensemble Learning and Grouping of Deep Models
Takfarines Guergueb and Moulay Akhloufi

Multimedia Indexing and Retrieval

ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval
Nicola Messina, Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Giuseppe Amato and Rita Cucchiara

Improving Nearest Neighbor Indexing by Multitask Learning
Amorntip Prayoonwong, Ke-Long Zeng and Chih-Yi Chiu

Towards Human Performance on Sketch-Based Image Retrieval
Omar Seddati, Stéphane Dupont, Saïd Mahmoudi and Thierry Dutoit

Analysis of the Complementarity of Latent and Concept Spaces for Cross-Modal Video Search
Varsha Devi, Philippe Mulhem and Georges Quénot

Panel

tbd

Image processing and reconstruction

Real-time deblurring network for face AR applications
Juhwan Lee, Jongha Lee and Sangwook Yoo

Hyperspectral Image Reconstruction of Heritage Artwork Using RGB Images and Deep Neural Networks
Ailin Chen, Rui Jesus and Márcia Vilarigues

A survey for image based methods in construction: from images to digital twins
Ilias Koulalis, Nikolaos Dourvas, Theocharis Triantafyllidis, Konstantinos Ioannidis, Stefanos Vrochidis and Ioannis Kompatsiaris

Special Session: Learning from scarce data challenges in the media domain

RIAMAF: Region Information Reassembly Text Detection based on Multi-Feature Fusion
Hengyan Liu, Baolong Guo and Zekun Li

Learning to Detect Fallen People in Virtual Worlds
Fabio Carrara, Lorenzo Pasco, Claudio Gennaro and Fabrizio Falchi

Few-shot Object Detection as a Semi-supervised Learning Problem
Werner Bailer and Hannes Fassold

Deep Features for CBIR with Scarce Data using Hebbian Learning
Gabriele Lagani, Davide Bacciu, Claudio Gallicchio, Fabrizio Falchi, Claudio Gennaro and Giuseppe Amato

Keynote 2

tbd

Music Meets Science

Cultural event supported by SIG MM.

Trio concert | Schubert, Haydn

Performers: Olga Chepovetsky (CH), piano; François Pineau-Benois (FR), violin; Dorottya Standi (AT), cello.