9:00
9:30
10:00
10:30
11:00
11:30
11:40
12:30
12:40
13:00
13:50
14:00
14:30
15:00
15:20
15:50
16:20
16:30
17:20
17:30
18:00
18:30
19:30
Keynote 1: Multimedia in Global Free Knowledge Ecosystems
Coffee
Coffee
Special Session: Learning from scarce data challenges in the media domain
Coffee
Keynote 2: The Machine Learning of Time and Dynamics in Images, Videos, Simulations
Lunch
Lunch
Lunch
Special Session: Multimodal Signal processing technologies for Protecting people and environment against Natural Disasters
Coffee
Panel: Multimedia Indexing and Retrieval Challenges in Media Archives
Guided Tour
Dinner
Reception
Keynote 1: Multimedia in Global Free Knowledge Ecosystems (Miriam Redi, Wikimedia Foundation)
Chair: Stevan Rudinac
Best Papers
Chair: Werner Bailer
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto, Marcella Cornia, Lorenzo Baraldi and Rita Cucchiara
Hybrid Transformer Network for Deepfake Detection
Sohail Ahmed Khan and Duc-Tien Dang-Nguyen
An exploration into the benefits of the CLIP model for lifelog retrieval
Ly-Duyen Tran, Naushad Alam, Yvette Graham, Liting Zhou and Cathal Gurrin
Special Session: Multimodal Signal processing technologies for Protecting people and environment against Natural Disasters
Chair: Krishna Chandramouli
BiasUNet: Learning Change Detection over Sentinel-2 Image Pairs
Maria Eirini Pegia, Anastasia Moumtzidou, Ilias Gialampoukidis, Björn Þór Jónsson, Stefanos Vrochidis and Ioannis Kompatsiaris
Wildfire Segmentation using Deep-RegSeg Semantic Segmentation Architecture
Rafik Ghali, Moulay Akhloufi, Wided Souidene Mseddi and Marwa Jmal
Ecological Impact Assessment Framework for areas affected by Natural Disasters
Gardyas Bidari Adninda, Kusrini Kusrini, Arief Setyanto, Renindya A Kartikakirana, Rhisa A Suprapto, Arif D Laksito, I Made A Agastya, Krishna Chandramouli, Andrea Majlingova, Yvonne Brodrechtová, Konstantinos Demestichas and Ebroul Izquierdo
Posters and Demos
Chair: Mathias Lux
Posters
- StyleGAN-based CLIP-guided Image Shape Manipulation | Yuchen Qian, Kohei Yamamoto and Keiji Yanai
- Streaming learning with Move-to-Data approach for image classification | Abel Kahsay Gebreslassie, Jenny Benois-Pineau and Akka Zemmari
- Analysing the Memorability of a Procedural Crime-Drama TV Series | Seán Cummins, Lorin Sweeney and Alan Smeaton
- A large-scale TV video and metadata database for French political content analysis and fact-checking | Frédéric Rayar, Mathieu Delalandre and Van-Hao Le
- Relational Database Performance for Multimedia: A Case Study | Björn Þór Jónsson, Aaron Duane and Nikolaj Mertz
- The Potential of Webcam Based Real Time Eye-Tracking to Reduce Rendering Cost | Isabel Kütemeyer and Mathias Lux
- Self-Supervised Spiking Neural Networks applied to Digit Classification | Benjamin Chamand and Philippe Joly
Demos
- A Virtual Reality Talking Avatar for Investigative Interviews of Maltreat Children | Syed Zohaib Hassan, Pegah Salehi, Michael Alexander Riegler, Miriam Sinkerud Johnson, Gunn Astrid Baugerud, Pål Halvorsen and Saeed Shafiee Sabet
- A Toolchain for Extracting and Visualising Road Traffic Data | Helmut Neuschmied, Florian Krebs, Stefan Ladstätter and Georg Thallinger
Multimedia understanding and classification
Chair: Giuseppe Amato
An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification
Lam Pham, Dat Ngo, Tho Nguyen, Phu Nguyen, Truong Hoang and Alexander Schindler
A Fine Grained Quality Assessment of Video Anomaly Detection
Jiang Zhou, Kevin McGuinness, Noel E. O'Connor and Joseph Antony
Learning Co-occurrence Features Across Spatial and Temporal Domains for Hand Gesture Recognition
Mohammad Rehan, Hazem Wannous, Jafar Alkheir and Kinda Aboukassem
Image analysis and enrichment
Chair: Klaus Schöffmann
Sentiment analysis on 2D images of urban and indoor spaces using deep learning architectures
Konstantinos Chatzistavros, Theodora Pistola, Sotiris Diplaris, Konstantinos Ioannidis, Stefanos Vrochidis and Ioannis Kompatsiaris
Urban Image Geo-Localization Using Open Data on Public Spaces
Mathias Glistrup, Stevan Rudinac and Björn Þór Jónsson
A domain adaptive deep learning solution for scanpath prediction of paintings
Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani and Alessandro Bruno
Special Session: Computer-Assisted Clinical Applications
Chair: Klaus Schöffmann
Segmenting partially annotated medical images
Nicolas Martin, Jean-Pierre Chevallet and Georges Quénot
Chest Diseases classification using CXR and deep ensemble learning
Adnane Ait Nasser and Moulay Akhloufi
Skin Cancer Detection using Ensemble Learning and Grouping of Deep Models
Takfarines Guergueb and Moulay Akhloufi
Multimedia Indexing and Retrieval
Chair: Björn Þór Jónsson
ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval
Nicola Messina, Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Giuseppe Amato and Rita Cucchiara
Improving Nearest Neighbor Indexing by Multitask Learning
Amorntip Prayoonwong, Ke-Long Zeng and Chih-Yi Chiu
Towards Human Performance on Sketch-Based Image Retrieval
Omar Seddati, Stéphane Dupont, Saïd Mahmoudi and Thierry Dutoit
Analysis of the Complementarity of Latent and Concept Spaces for Cross-Modal Video Search
Varsha Devi, Philippe Mulhem and Georges Quénot
Panel: Multimedia Indexing and Retrieval Challenges in Media Archives
Chair: Georg Thallinger
- Brecht DECLERCQ, President of FIAT/IFTA (International Association of TV Archives), Digitisation and Acquisition Manager at meemoo, the Flemish Institute for Archives
- Richard WRIGHT, Preservation Guide
- Johan OOMEN, Netherlands Institute of Sound and Vision
- Christoph BAUER, Multimedia Archive of the Austrian Broadcasting Corporation
Brecht Declercq, MA, MSc is the President of FIAT/IFTA, the world association of media archives, and the Digitisation and Acquisition Manager at meemoo – The Flemish Institute for Archiving. He is responsible for the preservation of the Flemish audiovisual heritage, including one of the largest audiovisual digitisation programs currently going on globally. He worked for the Belgian public broadcaster VRT for almost 10 years in several digitisation, media asset management and access projects and led the FIAT/IFTA Preservation and Migration Commission from 2016 to 2019. He’s a frequent conference curator, presenter, guest lecturer, writer and reviewer. He advises policy makers, audiovisual archives and media organisations worldwide.
Richard Wright is an independent consultant on audiovisual preservation and access. His previous positions include Archive Preservation at BBC Research and Development (2007-2011) and Archive Technology Manager at BBC Information & Archives (1994-2007). He has been working on European standards for 30 years, including the widely used Broadcast Wave Format and the EBU guidance on core metadata (EBUCore). Since 1995 Richard has worked on European R&D projects, beginning with Euromedia which developed the first European audiovisual asset management system. He started the series of Presto projects (Presto, PrestoSpace, PrestoPRIME, Presto4U, Presto Centre) on audiovisual digitisation and digital preservation. Before joining the BBC, he worked in speech and hearing research, including speech recognition and synthesis, from 1967 to 1994.
Christoph Bauer was born in 1960 in Vienna, Austria, and studied at Vienna’s University of Economics. He holds several other qualifications, including cantor, pianist, organist, choir conductor, composer, IT developer and theologian. When he decided to quit fooling around, he joined ORF in 1981. He has acted as ORF Project Officer for several EC/IST/ICT/H2020/FAA projects (PRESTO, PRIMAVERA, FIRST, NODAL, PRESTOSPACE, eCHASE, PRESTOPRIME, DAVID, EUNOMIA, TailoredMedia, etc.) and is the senior specialist for preservation, digitization and restoration in the ORF archive department. In addition, he acts as system administrator for archive systems and AV digitization, workflow developer and AI mining specialist (audio and video). Christoph was chairman of the SNML-TNG Management Board (2011-2013), vice-chair of maa (Media-Archives-Austria Association) (2012-2016), member of the ARD K-ARL Expert group for Video-Mining, NKE for the EU-Project “Empowering Society” and lecturer at the University of Vienna. He is the current general secretary of maa (Media-Archives-Austria Association), member of the ARD medas Expert group Mining (AI) and member of the Digitization & Migration Commission of FIAT/IFTA.
Image processing and reconstruction
Chair: Werner Bailer
Real-time deblurring network for face AR applications
Juhwan Lee, Jongha Lee and Sangwook Yoo
Hyperspectral Image Reconstruction of Heritage Artwork Using RGB Images and Deep Neural Networks
Ailin Chen, Rui Jesus and Márcia Vilarigues
A survey for image based methods in construction: from images to digital twins
Ilias Koulalis, Nikolaos Dourvas, Theocharis Triantafyllidis, Konstantinos Ioannidis, Stefanos Vrochidis and Ioannis Kompatsiaris
Special Session: Learning from scarce data challenges in the media domain
Chair: Hannes Fassold
Learning to Detect Fallen People in Virtual Worlds
Fabio Carrara, Lorenzo Pasco, Claudio Gennaro and Fabrizio Falchi
Few-shot Object Detection as a Semi-supervised Learning Problem
Werner Bailer and Hannes Fassold
Deep Features for CBIR with Scarce Data using Hebbian Learning
Gabriele Lagani, Davide Bacciu, Claudio Gallicchio, Fabrizio Falchi, Claudio Gennaro and Giuseppe Amato
Keynote 2: The Machine Learning of Time and Dynamics in Images, Videos, Simulations (Efstratios Gavves, University of Amsterdam)
Chair: Stevan Rudinac
Music Meets Science
Cultural event supported by SIGMM.
Trio concert | Schubert, Haydn
Performers: Olga Chepovetsky (CH), piano; François Pineau-Benois (FR), violin; Dorottya Standi (AT), cello