Boban, Ivan and Doko, Alen and Gotovac, Sven (2020) Improving Sentence Retrieval Using Sequence Similarity. Applied Sciences, 10 (12). Multidisciplinary Digital Publishing Institute (MDPI). doi: 10.3390/app10124316. ISSN 2076-3417.
PDF - Publisher's version (published version), 787kB
Official URL: https://www.mdpi.com/2076-3417/10/12/4316
Abstract
Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) or novelty detection. Since it is similar to document retrieval but with a smaller unit of retrieval, document retrieval methods such as term frequency-inverse document frequency (TF-IDF), BM25, and language modeling-based methods are also used for sentence retrieval. The effect of partial matching of words on sentence retrieval is an issue that has not been analyzed. We think that there is substantial potential for the improvement of sentence retrieval methods if we consider this approach. We adapted TF-ISF, BM25, and language modeling-based methods to test the partial matching of terms by combining sentence retrieval with sequence similarity, which allows matching of words that are similar but not identical. All tests were conducted using data from the novelty tracks of the Text Retrieval Conference (TREC). The scope of this paper was to find out if such an approach is generally beneficial to sentence retrieval. However, we did not examine in depth how partial matching helps or hinders the finding of relevant sentences.
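The idea of combining TF-ISF with partial matching can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the sequence similarity measure, so Python's `difflib.SequenceMatcher` stands in for it, and the threshold of 0.8, the tokenizer, and the function names `similarity`, `partial_tf`, and `rank_sentences` are assumptions made for illustration. The weighting follows a commonly used TF-ISF form, with the sentence term frequency replaced by a sum of similarities at or above the threshold.

```python
import math
from difflib import SequenceMatcher

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def similarity(a, b):
    """Sequence similarity in [0, 1]; stand-in for the paper's measure (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def partial_tf(term, words, threshold):
    """Soft term frequency: sum the similarities that reach the threshold.

    An exact match contributes 1.0; a near match contributes less than 1.0.
    """
    sims = (similarity(term, w) for w in words)
    return sum(s for s in sims if s >= threshold)


def rank_sentences(query, sentences, threshold=0.8):
    """Rank sentences by a TF-ISF score that allows partial term matches."""
    tokenized = [tokenize(s) for s in sentences]
    query_terms = tokenize(query)
    n = len(sentences)
    scored = []
    for sentence, words in zip(sentences, tokenized):
        score = 0.0
        for term in set(query_terms):
            tf_s = partial_tf(term, words, threshold)
            if tf_s == 0:
                continue
            tf_q = query_terms.count(term)
            # Number of sentences that contain at least one (partial) match.
            sf = sum(1 for ws in tokenized if partial_tf(term, ws, threshold) > 0)
            isf = math.log((n + 1) / (0.5 + sf))
            score += math.log(tf_q + 1) * math.log(tf_s + 1) * isf
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    sentences = [
        "The retrieval of sentences is hard.",
        "Cats sleep all day.",
        "Dogs bark loudly.",
    ]
    for score, sentence in rank_sentences("sentence retrieving", sentences):
        print(f"{score:.3f}  {sentence}")
```

With `threshold=1.0` the sketch reduces to exact matching, so the query term "sentence" no longer matches "sentences"; lowering the threshold lets such similar but non-identical words contribute to the score.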
elib URL of the record: | https://elib.dlr.de/139266/ |
---|---|
Document type: | Journal article |
Title: | Improving Sentence Retrieval Using Sequence Similarity |
Authors: | Boban, Ivan; Doko, Alen; Gotovac, Sven |
Date: | 2020 |
Published in: | Applied Sciences |
Refereed publication: | Yes |
Open Access: | Yes |
Gold Open Access: | Yes |
In SCOPUS: | Yes |
In ISI Web of Science: | Yes |
Volume: | 10 |
DOI: | 10.3390/app10124316 |
Publisher: | Multidisciplinary Digital Publishing Institute (MDPI) |
ISSN: | 2076-3417 |
Status: | published |
Keywords: | sentence retrieval; TF-ISF; BM25; language modeling; partial match; sequence similarity |
HGF - Research field: | Aeronautics, Space and Transport |
HGF - Program: | Aeronautics |
HGF - Program topic: | Aircraft |
DLR - Focus area: | Aeronautics |
DLR - Research area: | L AR - Aircraft Research |
DLR - Sub-area (project, undertaking): | L - Simulation and Validation (old) |
Location: | Bremen |
Institutes & institutions: | Institut für Softwaretechnologie; Institut für Softwaretechnologie > Intelligente und verteilte Systeme |
Deposited by: | Doko, Alen |
Deposited on: | 14 Dec 2020 09:51 |
Last modified: | 14 Dec 2020 09:51 |