El Baff, Roxanne and Santhanam, Sivasurya and Hecking, Tobias (2021) Quantifying Synergy between Software Projects using README Files Only. In: Proceedings of the 33rd International Conference on Software Engineering and Knowledge Engineering, 33. KSI Research Inc. and Knowledge Systems Institute Graduate School. The Thirty Third International Conference on Software Engineering and Knowledge Engineering (SEKE 2021), 1-10 July, Pittsburgh, USA (Online). doi: 10.18293/SEKE2021-162.
Abstract
Software version control platforms, such as GitHub, host millions of open-source software projects. Due to their diversity, these projects are an appealing realm for discovering software trends. In our work, we seek to quantify synergy between software projects by connecting them via their similar as well as their different software features. Our approach is based on Literature-Based Discovery (LBD), a method originally developed to uncover implicit knowledge in scientific literature databases by linking them through transitive connections. We tested our approach by conducting experiments on 13,264 open-source Python projects hosted on GitHub. An evaluation based on human ratings of a subset of 90 project pairs shows that our models can identify potential synergy between software projects by relying solely on their short descriptions (i.e., README files).
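The idea of linking projects through shared and differing features can be illustrated with a toy sketch. This is not the authors' model: the function name `transitive_links`, the project names, and the hand-written feature sets are all hypothetical, and real features would first have to be extracted from README text. It only shows the transitive pattern in the spirit of LBD, where a source project reaches another project through a feature they share, and the other project's remaining features are the candidate complement.

```python
from collections import defaultdict

def transitive_links(features_by_project, source):
    """Toy LBD-style linking (illustrative assumption, not the paper's method).

    For the source project, find every other project that shares at least
    one feature with it, and record both the shared features (the link)
    and the features the other project has that the source lacks.
    """
    # Invert the mapping: feature -> set of projects mentioning it.
    projects_by_feature = defaultdict(set)
    for project, features in features_by_project.items():
        for feature in features:
            projects_by_feature[feature].add(project)

    source_features = features_by_project[source]
    links = {}
    for feature in source_features:
        for other in projects_by_feature[feature]:
            if other == source:
                continue
            entry = links.setdefault(other, {"shared": set(), "new": set()})
            entry["shared"].add(feature)
            # Features the other project could contribute to the source.
            entry["new"] = features_by_project[other] - source_features
    return links

# Hypothetical feature sets, standing in for features mined from READMEs.
readme_features = {
    "proj_a": {"plotting", "csv"},
    "proj_b": {"csv", "sql"},
    "proj_c": {"sql", "orm"},
}
links = transitive_links(readme_features, "proj_a")
```

Here `proj_a` is linked to `proj_b` through the shared `csv` feature, while `proj_c` shares nothing with `proj_a` directly and is therefore not linked in this single step. Counting shared versus new features is only one possible way to turn such links into a score.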
| Item URL in elib: | https://elib.dlr.de/141909/ |
|---|---|
| Document Type: | Conference or Workshop Item (Other) |
| Additional Information: | To be published on July 1, 2021 |
| Title: | Quantifying Synergy between Software Projects using README Files Only |
| Authors: | El Baff, Roxanne; Santhanam, Sivasurya; Hecking, Tobias |
| Date: | July 2021 |
| Journal or Publication Title: | Proceedings of the 33rd International Conference on Software Engineering and Knowledge Engineering |
| Refereed publication: | Yes |
| Open Access: | Yes |
| Gold Open Access: | No |
| In SCOPUS: | No |
| In ISI Web of Science: | No |
| Volume: | 33 |
| DOI: | 10.18293/SEKE2021-162 |
| Publisher: | KSI Research Inc. and Knowledge Systems Institute Graduate School |
| Series Name: | Proceedings of the 33rd International Conference on Software Engineering and Knowledge Engineering |
| Status: | Accepted |
| Keywords: | repository mining, natural language processing, recommendation system, readme cluster |
| Event Title: | The Thirty Third International Conference on Software Engineering and Knowledge Engineering (SEKE 2021) |
| Event Location: | Pittsburgh, USA (Online) |
| Event Type: | International Conference |
| Event Dates: | 1-10 July |
| Organizer: | http://ksiresearchorg.ipage.com/seke/seke21.html |
| HGF - Research field: | Aeronautics, Space and Transport |
| HGF - Program: | Space |
| HGF - Program Themes: | Space System Technology |
| DLR - Research area: | Space (Raumfahrt) |
| DLR - Program: | R SY - Space System Technology |
| DLR - Research theme (Project): | R - Tasks SISTEC, R - Analytics and visualization of large space software systems |
| Location: | Köln-Porz, Oberpfaffenhofen |
| Institutes and Institutions: | Institute for Software Technology; Institute for Software Technology > Intelligent and Distributed Systems |
| Deposited By: | El Baff, Roxanne |
| Deposited On: | 26 May 2021 12:05 |
| Last Modified: | 26 May 2021 12:05 |