
RNA contact prediction by data efficient deep learning

Taubert, Oskar and von der Lehr, Fabrice and Bazarova, Alina and Faber, Christian and Knechtges, Philipp and Weiel, Marie and Debus, Charlotte and Coquelin, Daniel and Basermann, Achim and Streit, Achim and Kesselheim, Stefan and Götz, Markus and Schug, Alexander (2023) RNA contact prediction by data efficient deep learning. Communications Biology, 6 (913). Springer Nature. doi: 10.1038/s42003-023-05244-9. ISSN 2399-3642.

Full text: PDF, published version (943 kB)

Official URL: https://www.nature.com/articles/s42003-023-05244-9

Abstract

On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatial adjacencies ("contact maps") as a proxy for 3D structure. Our model, BARNACLE, combines the utilization of unlabeled data through self-supervised pre-training and efficient use of the sparse labeled data through an XGBoost classifier. BARNACLE shows a considerable improvement over both the established classical baseline and a deep neural network. In order to demonstrate that our approach can be applied to tasks with similar data constraints, we show that our findings generalize to the related setting of accessible surface area prediction.
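
As an illustration of the "contact map" notion used in the abstract, the following minimal Python sketch derives a binary contact map from 3D coordinates. It is not the BARNACLE pipeline: the function name `contact_map`, the 10 Å distance cutoff, the minimum sequence separation of 4, and the toy coordinates are all assumptions chosen for the example, and the record does not state which definitions the paper uses.

```python
import math


def contact_map(coords, threshold=10.0, min_separation=4):
    """Return a symmetric boolean matrix of spatial contacts.

    coords: list of (x, y, z) tuples, one representative atom per nucleotide.
    threshold: distance cutoff in angstroms (assumed value for illustration).
    min_separation: minimum |i - j| along the sequence for a pair to count
        (assumed value; excludes trivially close neighbours).
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                # Contacts are undirected, so fill both triangles.
                contacts[i][j] = contacts[j][i] = True
    return contacts


# Hypothetical hairpin-like toy chain: four points out, four points back.
coords = [
    (0, 0, 0), (6, 0, 0), (12, 0, 0), (18, 0, 0),
    (18, 6, 0), (12, 6, 0), (6, 6, 0), (0, 6, 0),
]
cmap = contact_map(coords)
```

In this toy chain the two ends lie opposite each other, so `cmap[0][7]` is a contact, while `cmap[0][4]` is not because those two points are farther apart than the cutoff. A matrix of this kind is the sort of label that a contact predictor would be trained to reproduce from sequence information.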

Item URL in elib: https://elib.dlr.de/199460/
Document Type: Article
Title: RNA contact prediction by data efficient deep learning
Authors:
Authors | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Taubert, Oskar | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
von der Lehr, Fabrice | UNSPECIFIED | https://orcid.org/0009-0000-2134-6754 | 147328745
Bazarova, Alina | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Faber, Christian | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Knechtges, Philipp | UNSPECIFIED | https://orcid.org/0000-0002-4849-0593 | 147328746
Weiel, Marie | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Debus, Charlotte | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Coquelin, Daniel | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Basermann, Achim | UNSPECIFIED | https://orcid.org/0000-0003-3637-3231 | 147328747
Streit, Achim | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Kesselheim, Stefan | Jülich Supercomputing Centre | UNSPECIFIED | UNSPECIFIED
Götz, Markus | Karlsruher Institut für Technologie (KIT) | https://orcid.org/0000-0002-2233-1041 | UNSPECIFIED
Schug, Alexander | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Date: 6 September 2023
Journal or Publication Title: Communications Biology
Refereed publication: Yes
Open Access: Yes
Gold Open Access: Yes
In SCOPUS: Yes
In ISI Web of Science: Yes
Volume: 6
DOI: 10.1038/s42003-023-05244-9
Publisher: Springer Nature
ISSN: 2399-3642
Status: Published
Keywords: Machine Learning, Deep Learning, Molecular Modeling, RNA, Structure Prediction
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program Themes: Space System Technology
DLR - Research area: Space
DLR - Program: R SY - Space System Technology
DLR - Research theme (Project): R - Tasks SISTEC
Location: Köln-Porz
Institutes and Institutions: Institute of Software Technology > High-Performance Computing
Deposited By: von der Lehr, Fabrice
Deposited On: 24 Nov 2023 09:03
Last Modified: 29 Nov 2023 14:44
