
Improving Sentence Retrieval Using Sequence Similarity

Boban, Ivan and Doko, Alen and Gotovac, Sven (2020) Improving Sentence Retrieval Using Sequence Similarity. Applied Sciences, 10 (12). Multidisciplinary Digital Publishing Institute (MDPI). doi: 10.3390/app10124316. ISSN 2076-3417.

PDF - Published version (787kB)

Official URL: https://www.mdpi.com/2076-3417/10/12/4316

Abstract

Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) or novelty detection. Because it resembles document retrieval but uses a smaller unit of retrieval, document retrieval methods such as term frequency-inverse document frequency (TF-IDF), BM25, and language modeling-based methods are also used for sentence retrieval. The effect of partial matching of words on sentence retrieval has not been analyzed. We think there is substantial potential for improving sentence retrieval methods if this approach is considered. We adapted TF-ISF, BM25, and language modeling-based methods to test partial matching of terms by combining sentence retrieval with sequence similarity, which allows matching of words that are similar but not identical. All tests were conducted using data from the novelty tracks of the Text Retrieval Conference (TREC). The scope of this paper was to find out whether such an approach is generally beneficial to sentence retrieval. However, we did not examine in depth how partial matching helps or hinders the finding of relevant sentences.
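
To make the idea of partial matching concrete, the following is a minimal sketch and not the authors' implementation. It assumes a TF-ISF-style weighting of the form commonly given in the sentence retrieval literature, and it assumes Python's `difflib.SequenceMatcher` ratio as the sequence similarity measure. The function names, the threshold of 0.7, and the toy sentences are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math
from difflib import SequenceMatcher


def similar(a, b, threshold=0.7):
    """True if two words are identical or close under difflib's ratio.

    Both the measure and the threshold are assumptions of this sketch.
    """
    return a == b or SequenceMatcher(None, a, b).ratio() >= threshold


def tf_isf_partial(query, sentence, collection, threshold=0.7):
    """TF-ISF-style score in which term counts include partially matching words."""
    n = len(collection)
    score = 0.0
    for term in set(query):
        tf_q = query.count(term)
        # Count sentence words that match the query term exactly or partially.
        tf_s = sum(1 for w in sentence if similar(term, w, threshold))
        # Sentence frequency: sentences with at least one matching word.
        sf = sum(
            1 for s in collection
            if any(similar(term, w, threshold) for w in s)
        )
        score += (
            math.log(tf_q + 1)
            * math.log(tf_s + 1)
            * math.log((n + 1) / (0.5 + sf))
        )
    return score


# Toy data (hypothetical), already tokenized and lowercased.
sentences = [
    "the retrieval of relevant sentences".split(),
    "airplanes fly over bremen".split(),
    "novelty detection uses sentence retrieval".split(),
]
query = "retrieving sentence".split()

ranked = sorted(
    sentences,
    key=lambda s: tf_isf_partial(query, s, sentences),
    reverse=True,
)
```

With the threshold set to 1.0 the function falls back to exact matching, so the first toy sentence, which contains "sentences" but not "sentence", receives no credit for that term. Lowering the threshold lets such near-identical word forms contribute to the score, which is the kind of effect the abstract proposes to test.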

Item URL in elib:https://elib.dlr.de/139266/
Document Type:Article
Title:Improving Sentence Retrieval Using Sequence Similarity
Authors:
Authors | Institution or Email of Authors | Author's ORCID iD
Boban, Ivan | UNSPECIFIED | UNSPECIFIED
Doko, Alen | UNSPECIFIED | https://orcid.org/0000-0001-7401-3558
Gotovac, Sven | UNSPECIFIED | UNSPECIFIED
Date:2020
Journal or Publication Title:Applied Sciences
Refereed publication:Yes
Open Access:Yes
Gold Open Access:Yes
In SCOPUS:Yes
In ISI Web of Science:Yes
Volume:10
DOI:10.3390/app10124316
Publisher:Multidisciplinary Digital Publishing Institute (MDPI)
ISSN:2076-3417
Status:Published
Keywords:sentence retrieval; TF-ISF; BM25; language modeling; partial match; sequence similarity
HGF - Research field:Aeronautics, Space and Transport
HGF - Program:Aeronautics
HGF - Program Themes:fixed-wing aircraft
DLR - Research area:Aeronautics
DLR - Program:L AR - Aircraft Research
DLR - Research theme (Project):L - Simulation and Validation (old)
Location: Bremen
Institutes and Institutions:Institute for Software Technology
Institute for Software Technology > Intelligent and Distributed Systems
Deposited By: Doko, Alen
Deposited On:14 Dec 2020 09:51
Last Modified:14 Dec 2020 09:51
