
Refining action segmentation with hierarchical video representations

Ahn, Hyemin and Lee, Dongheui (2021) Refining action segmentation with hierarchical video representations. In: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021, pp. 16302-16310. IEEE. International Conference on Computer Vision, 11 Oct - 17 Oct 2021, Virtual. doi: 10.1109/ICCV48922.2021.01599. ISBN 978-166542812-5. ISSN 1550-5499.


Official URL: https://openaccess.thecvf.com/content/ICCV2021/html/Ahn_Refining_Action_Segmentation_With_Hierarchical_Video_Representations_ICCV_2021_paper.html

Abstract

In this paper, we propose the Hierarchical Action Segmentation Refiner (HASR), which refines temporal action segmentation results from various models by understanding the overall context of a given video in a hierarchical way. When a backbone model for action segmentation estimates how the given video can be segmented, our model extracts segment-level representations based on frame-level features, and extracts a video-level representation based on the segment-level representations. Based on these hierarchical representations, our model can refer to the overall context of the entire video and predict how segment labels that are out of context should be corrected. HASR can be plugged into various action segmentation models (MS-TCN, SSTDA, ASRF) and improves the performance of state-of-the-art models on three challenging datasets (GTEA, 50Salads, and Breakfast). For example, on the 50Salads dataset, the segmental edit score improves from 67.9% to 77.4% (MS-TCN), from 75.8% to 77.3% (SSTDA), and from 79.3% to 81.0% (ASRF). In addition, our model can refine the segmentation result of an unseen backbone model, which was not referred to when training HASR. This generalization performance makes HASR an effective tool for boosting existing approaches to temporal action segmentation. Our code is available at https://github.com/cotton-ahn/HASR_iccv2021.
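
The frame-to-segment-to-video hierarchy and the segmental edit score mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (to_segments, hierarchical_representations, edit_score) are hypothetical, and plain mean pooling is assumed here as a stand-in for the learned encoders that HASR trains. The edit score follows the common definition used in action segmentation work, a Levenshtein distance over segment label sequences normalized by the longer sequence.

```python
from itertools import groupby


def to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


def mean(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def hierarchical_representations(frame_features, frame_labels):
    """Frame features -> one vector per segment -> one vector per video.

    ASSUMPTION: mean pooling replaces the learned encoders of the paper.
    """
    segments = to_segments(frame_labels)
    segment_reprs = [mean(frame_features[s:e]) for _, s, e in segments]
    video_repr = mean(segment_reprs)
    return segments, segment_reprs, video_repr


def edit_score(predicted_labels, true_labels):
    """Segmental edit score in percent, computed on segment label sequences."""
    p = [label for label, _, _ in to_segments(predicted_labels)]
    g = [label for label, _, _ in to_segments(true_labels)]
    if not p and not g:
        return 100.0
    # Row-by-row Levenshtein distance between the two label sequences.
    prev = list(range(len(g) + 1))
    for i, p_label in enumerate(p, 1):
        cur = [i]
        for j, g_label in enumerate(g, 1):
            cur.append(min(prev[j] + 1,                        # deletion
                           cur[j - 1] + 1,                     # insertion
                           prev[j - 1] + (p_label != g_label)))  # substitution
        prev = cur
    return (1 - prev[-1] / max(len(p), len(g))) * 100


if __name__ == "__main__":
    labels = ["cut", "cut", "mix", "mix", "mix"]
    features = [[1.0], [3.0], [2.0], [4.0], [6.0]]
    print(hierarchical_representations(features, labels))
    print(edit_score(["a", "a", "b", "a", "c"], ["a", "a", "b", "b", "c"]))
```

In the second call, the prediction contains one spurious segment ("a" between "b" and "c"), which is the kind of out-of-context label a refiner is meant to correct; removing it would bring the predicted segment sequence in line with the ground truth.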

elib URL of the record: https://elib.dlr.de/147186/
Document type: Conference contribution (poster)
Additional information: This work has been partially supported by the Helmholtz Association.
Title: Refining action segmentation with hierarchical video representations
Authors:
Author | Institution or e-mail address | Author's ORCID iD | ORCID Put Code
Ahn, Hyemin | Hyemin.Ahn (at) dlr.de | https://orcid.org/0000-0001-8081-6023 | not specified
Lee, Dongheui | Dongheui.Lee (at) dlr.de | https://orcid.org/0000-0003-1897-7664 | not specified
Date: October 2021
Published in: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
Refereed publication: Yes
Open Access: Yes
Gold Open Access: No
In SCOPUS: Yes
In ISI Web of Science: Yes
DOI: 10.1109/ICCV48922.2021.01599
Page range: pp. 16302-16310
Publisher: IEEE
ISSN: 1550-5499
ISBN: 978-166542812-5
Status: published
Keywords: Video Action Segmentation; Computer Vision; Deep Learning
Event title: International Conference on Computer Vision
Event location: Virtual
Event type: international conference
Event date: 11 Oct - 17 Oct 2021
Organizer: IEEE Computer Society
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program topic: Robotics
DLR - Research area: Space
DLR - Program: R RO - Robotics
DLR - Research theme (project): R - Autonomous, learning robots [RO], R - Intuitive human-robot interface [RO]
Location: Oberpfaffenhofen
Institutes and Institutions: Institute of Robotics and Mechatronics (since 2013)
Deposited by: Ahn, Hyemin
Deposited on: 10 Dec 2021 09:22
Last modified: 20 Jul 2022 12:26
