Wang, Yi and Albrecht, Conrad M and Zhu, Xiao Xiang (2022) Self-supervised vision transformers for joint SAR-optical representation learning. In: International Geoscience and Remote Sensing Symposium (IGARSS), pp. 139-142. IGARSS 2022, 2022-07-17 - 2022-07-22, Kuala Lumpur, Malaysia. doi: 10.1109/IGARSS46834.2022.9883983.
Official URL: https://ieeexplore.ieee.org/document/9883983
Abstract
Self-supervised learning (SSL) has attracted much interest in remote sensing and Earth observation due to its ability to learn task-agnostic representations without human annotation. While most existing SSL works in remote sensing use ConvNet backbones and focus on a single modality, we explore the potential of vision transformers (ViTs) for joint SAR-optical representation learning. Based on DINO, a state-of-the-art SSL algorithm that distills knowledge from two augmented views of an input image, we combine SAR and optical imagery by concatenating all channels into a unified input. Subsequently, we randomly mask out the channels of one modality as a data augmentation strategy. During training, the model is fed optical-only, SAR-only, and SAR-optical image pairs, learning both intra- and inter-modality representations. Experimental results on the BigEarthNet-MM dataset demonstrate the benefits of both the ViT backbones and the proposed multimodal SSL algorithm DINO-MM.
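The channel concatenation and modality masking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the masking probability `p_mask`, the choice to mask by zeroing channels, and the channel counts (2 SAR, 12 optical) are all assumptions made for illustration, and images are represented as plain nested lists to keep the example dependency-free.

```python
import random

def concat_modalities(sar, optical):
    """Stack SAR and optical channels into one list of channels (hypothetical helper)."""
    return list(sar) + list(optical)

def zero_channel(channel):
    """Return an all-zero channel with the same height and width."""
    return [[0.0] * len(row) for row in channel]

def random_modality_mask(x, n_sar, rng, p_mask=0.5):
    """With probability p_mask, zero out every channel of one randomly chosen modality.

    Assumes the first n_sar channels are SAR and the rest are optical.
    Returns the new channel list and a label describing the resulting view.
    """
    if rng.random() >= p_mask:
        return list(x), "sar-optical"   # keep both modalities
    if rng.random() < 0.5:
        # Drop the optical channels, keep SAR.
        return list(x[:n_sar]) + [zero_channel(c) for c in x[n_sar:]], "sar-only"
    # Drop the SAR channels, keep optical.
    return [zero_channel(c) for c in x[:n_sar]] + list(x[n_sar:]), "optical-only"

rng = random.Random(0)

def random_image(channels, size=4):
    """Toy stand-in for an image patch: channels x size x size Gaussian noise."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
            for _ in range(channels)]

sar = random_image(2)        # assumed: 2 SAR channels
optical = random_image(12)   # assumed: 12 optical bands
x = concat_modalities(sar, optical)

# Two independently masked views of the same input, as would be passed
# to the two branches of a DINO-style self-distillation setup.
view_a, kind_a = random_modality_mask(x, n_sar=2, rng=rng)
view_b, kind_b = random_modality_mask(x, n_sar=2, rng=rng)
print(len(x), kind_a, kind_b)
```

Because each view is masked independently, a pair of views can be optical-only, SAR-only, or mixed, which is what lets a single model see all three combinations during training.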
Item URL in elib: | https://elib.dlr.de/190386/
---|---
Document Type: | Conference or Workshop Item (Speech)
Title: | Self-supervised vision transformers for joint SAR-optical representation learning
Authors: | Wang, Yi; Albrecht, Conrad M; Zhu, Xiao Xiang
Date: | 2022
Journal or Publication Title: | International Geoscience and Remote Sensing Symposium (IGARSS)
Refereed publication: | Yes
Open Access: | Yes
Gold Open Access: | No
In SCOPUS: | Yes
In ISI Web of Science: | No
DOI: | 10.1109/IGARSS46834.2022.9883983
Page Range: | pp. 139-142
Status: | Published
Keywords: | Self-supervised learning, vision transformer, multimodal representation learning, remote sensing
Event Title: | IGARSS 2022
Event Location: | Kuala Lumpur, Malaysia
Event Type: | International Conference
Event Start Date: | 17 July 2022
Event End Date: | 22 July 2022
Organizer: | IEEE GRSS
HGF - Research field: | Aeronautics, Space and Transport
HGF - Program: | Space
HGF - Program Themes: | Earth Observation
DLR - Research area: | Space (Raumfahrt)
DLR - Program: | R EO - Earth Observation
DLR - Research theme (Project): | R - Artificial Intelligence
Location: | Oberpfaffenhofen
Institutes and Institutions: | Remote Sensing Technology Institute > EO Data Science
Deposited By: | Wang, Yi
Deposited On: | 22 Nov 2022 13:14
Last Modified: | 24 Apr 2024 20:51