CMIR-NET: A deep learning based model for cross-modal retrieval in remote sensing

Chaudhuri, Ushashi and Banerjee, Biplab and Bhattacharya, Avik and Datcu, Mihai (2020) CMIR-NET: A deep learning based model for cross-modal retrieval in remote sensing. Pattern Recognition Letters, 131, pp. 456-462. Elsevier. doi: 10.1016/j.patrec.2020.02.006. ISSN 0167-8655.

PDF - Preprint version (submitted draft)
2 MB

Official URL: https://www.sciencedirect.com/science/article/abs/pii/S0167865520300453

Abstract

We address the problem of cross-modal information retrieval in the domain of remote sensing. In particular, we are interested in two application scenarios: i) cross-modal retrieval between panchromatic (PAN) and multi-spectral imagery, and ii) multi-label image retrieval between very high resolution (VHR) images and speech-based label annotations. Note that these multi-modal retrieval scenarios are more challenging than traditional uni-modal retrieval, given the inherent differences in distribution between the modalities. However, with the growing availability of multi-source remote sensing data and the scarcity of semantic annotations, the task of multi-modal retrieval has recently become extremely important. In this regard, we propose a novel deep neural network based architecture designed to learn a discriminative shared feature space for all the input modalities, suitable for semantically coherent information retrieval. Extensive experiments are carried out on the benchmark large-scale PAN/multi-spectral DSRSID dataset and the multi-label UC-Merced dataset. For the UC-Merced dataset, we additionally generate a corpus of speech signals corresponding to the labels. Superior performance with respect to the current state of the art is observed in all cases.
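
The idea of retrieval in a shared feature space can be illustrated with a minimal sketch. It is not the CMIR-NET architecture: it assumes that each modality has already been mapped into a common space by some encoder, and it ranks items of the other modality by cosine similarity. The embedding values, the item ids, and the helper names `cosine` and `retrieve` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def retrieve(query, gallery, k=2):
    """Return the ids of the k gallery items most similar to the query."""
    ranked = sorted(
        gallery,
        key=lambda item_id: cosine(query, gallery[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical shared-space embeddings (assumption: produced by
# modality-specific encoders that are not shown here).
pan_query = [0.9, 0.1, 0.0]  # embedding of a PAN query image
ms_gallery = {               # embeddings of multi-spectral images
    "ms_forest": [0.1, 0.9, 0.1],
    "ms_harbor": [0.8, 0.2, 0.1],
    "ms_runway": [0.0, 0.1, 0.9],
}

top = retrieve(pan_query, ms_gallery, k=2)
print(top)  # ['ms_harbor', 'ms_forest']
```

Because both modalities live in the same space, the same ranking function would serve the reverse direction (a multi-spectral query against a PAN gallery) without modification.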

elib URL of the record:

https://elib.dlr.de/140996/

Document type:

Journal article

Title:

CMIR-NET: A deep learning based model for cross-modal retrieval in remote sensing

Authors:

Authors	Institution or e-mail address	Author ORCID iD	ORCID Put Code
Chaudhuri, Ushashi	Indian Institute of Technology Bombay	NOT SPECIFIED	NOT SPECIFIED
Banerjee, Biplab	Indian Institute of Technology Bombay	NOT SPECIFIED	NOT SPECIFIED
Bhattacharya, Avik	Indian Institute of Technology Bombay	NOT SPECIFIED	NOT SPECIFIED
Datcu, Mihai	Mihai.Datcu (at) dlr.de	NOT SPECIFIED	NOT SPECIFIED

Date:

2020

Published in:

Pattern Recognition Letters

Refereed publication:

Open Access:

Gold Open Access:

No

In SCOPUS:

In ISI Web of Science:

Volume:

131

DOI:

10.1016/j.patrec.2020.02.006

Page range:

pp. 456-462

Publisher:

Elsevier

ISSN:

0167-8655

Status:

published

Keywords:

image and video processing, deep learning, remote sensing, cross-modal retrieval

HGF - Research field:

Aeronautics, Space and Transport

HGF - Program:

Space

HGF - Program theme:

Earth Observation

DLR - Research area:

Space

DLR - Program:

R EO - Earth Observation

DLR - Research theme (Project):

R - Remote Sensing and Geo Research

Location:

Oberpfaffenhofen

Institutes and Institutions:

Remote Sensing Technology Institute (Institut für Methodik der Fernerkundung) > EO Data Science

Deposited by:

Bratasanu, Ion-Dragos

Deposited on:

19 Feb 2021 18:05

Last modified:

19 Feb 2021 18:05
