Generalizability of Code Clone Detection on CodeBERT

Sonnekalb, Tim and Gruner, Bernd and Brust, Clemens-Alexander and Mäder, Patrick (2022) Generalizability of Code Clone Detection on CodeBERT. In: 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022. ACM. ASE 2022, 2022-10-10 - 2022-10-14, Michigan, USA. ISBN 978-145039475-8.

Abstract

Transformer networks such as CodeBERT already achieve very good results for code clone detection on benchmark datasets, so one could assume that this task has already been solved. However, code clone detection is not a trivial task, and semantic code clones in particular are difficult to detect. By evaluating on two different subsets of Java code clones from BigCloneBench, we show that the generalizability of CodeBERT decreases. We observe a significant drop in F1 score when we evaluate on code snippets and functionality IDs different from those used for model building.
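The evaluation idea in the abstract, testing on functionality IDs that were not seen during model building and comparing F1 scores, can be illustrated with a minimal sketch. This is not the authors' code: the `ClonePair` record, the `split_by_functionality` helper, and the toy labels are hypothetical names and data introduced only for illustration, and no CodeBERT model is involved.

```python
from typing import Iterable, List, NamedTuple, Sequence, Set, Tuple


class ClonePair(NamedTuple):
    """Hypothetical record for one labeled pair of code snippets."""
    snippet_a: str
    snippet_b: str
    functionality_id: int  # assumed: each pair carries one functionality ID
    is_clone: bool


def split_by_functionality(
    pairs: Iterable[ClonePair], held_out_ids: Set[int]
) -> Tuple[List[ClonePair], List[ClonePair]]:
    """Split pairs so that held-out functionality IDs appear only in the test set."""
    train, test = [], []
    for pair in pairs:
        (test if pair.functionality_id in held_out_ids else train).append(pair)
    return train, test


def f1_score(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """F1 = harmonic mean of precision and recall for the positive (clone) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration only).
pairs = [
    ClonePair("a1", "b1", functionality_id=1, is_clone=True),
    ClonePair("a2", "b2", functionality_id=2, is_clone=False),
    ClonePair("a3", "b3", functionality_id=3, is_clone=True),
    ClonePair("a4", "b4", functionality_id=3, is_clone=False),
]
train_pairs, test_pairs = split_by_functionality(pairs, held_out_ids={3})
```

Under this kind of split, a model would be fitted on `train_pairs` only and scored with `f1_score` on `test_pairs`, so that a gap between in-distribution and held-out F1 reflects how well it generalizes to unseen functionalities.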

Item URL in elib: https://elib.dlr.de/144942/
Document Type: Conference or Workshop Item (Speech)
Title: Generalizability of Code Clone Detection on CodeBERT
Authors:
Authors | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Sonnekalb, Tim | Tim.Sonnekalb (at) dlr.de | https://orcid.org/0000-0002-0067-1790 | UNSPECIFIED
Gruner, Bernd | Bernd.Gruner (at) dlr.de | https://orcid.org/0000-0002-4177-2993 | 152393238
Brust, Clemens-Alexander | clemens-alexander.brust (at) dlr.de | https://orcid.org/0000-0001-5419-1998 | 152393239
Mäder, Patrick | patrick.maeder (at) tu-ilmenau.de | https://orcid.org/0000-0001-6871-2707 | UNSPECIFIED
Date: 10 October 2022
Journal or Publication Title: 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
Refereed publication: Yes
Open Access: Yes
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: Yes
Publisher: ACM
ISBN: 978-145039475-8
Status: Published
Keywords: clone detection, transformer networks, bigclonebench, machine learning on code
Event Title: ASE 2022
Event Location: Michigan, USA
Event Type: International Conference
Event Start Date: 10 October 2022
Event End Date: 14 October 2022
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program Themes: Space System Technology
DLR - Research area: Space
DLR - Program: R SY - Space System Technology
DLR - Research theme (Project): R - Intelligent analysis and methods for safe software development, D - short study [DAT], D - short study [KIZ]
Location: Jena, Köln-Porz, Oberpfaffenhofen
Institutes and Institutions: Institute of Data Science; Institute of Data Science > Data Analysis and Intelligence
Deposited By: Sonnekalb, Tim
Deposited On: 05 Dec 2022 10:46
Last Modified: 24 Apr 2024 20:44
