
How can voting mechanisms improve the robustness and generalizability of toponym disambiguation?

Hu, Xuke and Sun, Yeran and Kersten, Jens and Zhou, Zhiyong and Klan, Friederike and Fan, Hongchao (2023) How can voting mechanisms improve the robustness and generalizability of toponym disambiguation? International Journal of Applied Earth Observation and Geoinformation. Elsevier. doi: 10.1016/j.jag.2023.103191. ISSN 1569-8432. (Submitted)

This is the latest version of this item.

PDF - Published version (2MB)

Abstract

A vast amount of geospatial information exists in natural language texts, such as tweets and news. Extracting this information from texts is called geoparsing, which comprises two subtasks: toponym recognition and toponym disambiguation, i.e., identifying the geospatial representation of each toponym. This paper focuses on toponym disambiguation, which is addressed through toponym resolution and entity linking. Many novel approaches have recently been proposed, especially deep learning-based ones such as CamCoder, GENRE, and BLINK. In this paper, a spatial clustering-based voting approach that combines several individual approaches is proposed to improve state-of-the-art (SOTA) performance in terms of robustness and generalizability. Experiments are conducted to compare a voting ensemble with 20 recent and commonly used approaches on 12 public datasets, including several highly challenging ones (e.g., WikToR). The datasets cover six text types (tweets, historical documents, news, web pages, scientific articles, and Wikipedia articles) and contain 98,300 places across the world. Experimental results show that the voting ensemble performs best on all the datasets, achieving an average Accuracy@161km of 0.86, which demonstrates its generalizability and robustness. In addition, it drastically improves the performance of resolving fine-grained places, i.e., POIs, natural features, and traffic ways.
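
The abstract names two ideas that a small example can make concrete: combining the coordinates predicted by several approaches through a spatial vote, and scoring results with Accuracy@161km (the share of toponyms resolved within 161 km of the gold location). The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the neighbour-count voting rule (a simple stand-in for the paper's clustering step), the 161 km voting radius, and the tie-breaking by input order are all hypothetical choices made for illustration.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def vote(predictions: List[Coord], radius_km: float = 161.0) -> Coord:
    """Pick the prediction supported by the most predictions nearby.

    Assumption: each individual approach contributes one coordinate, and
    a prediction's support is the number of predictions (itself included)
    within radius_km. Ties go to the earliest prediction in the list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    best, best_votes = predictions[0], 0
    for p in predictions:
        votes = sum(1 for q in predictions if haversine_km(p, q) <= radius_km)
        if votes > best_votes:
            best, best_votes = p, votes
    return best


def accuracy_at_k(predicted: List[Coord], gold: List[Coord],
                  k_km: float = 161.0) -> float:
    """Fraction of predictions lying within k_km of the gold coordinates."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    hits = sum(1 for p, g in zip(predicted, gold) if haversine_km(p, g) <= k_km)
    return hits / len(gold)
```

For example, if three hypothetical approaches resolve "Paris" to points in or near Paris, France, and a fourth resolves it to Paris, Texas, `vote` returns one of the French points, because it has the most neighbours within the radius.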

Item URL in elib: https://elib.dlr.de/188965/
Document Type: Article
Title: How can voting mechanisms improve the robustness and generalizability of toponym disambiguation?
Authors:
Authors | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Hu, Xuke | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Sun, Yeran | University of Lincoln | UNSPECIFIED | UNSPECIFIED
Kersten, Jens | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Zhou, Zhiyong | University of Zurich | UNSPECIFIED | UNSPECIFIED
Klan, Friederike | UNSPECIFIED | https://orcid.org/0000-0002-1856-7334 | UNSPECIFIED
Fan, Hongchao | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Date: 1 February 2023
Journal or Publication Title: International Journal of Applied Earth Observation and Geoinformation
Refereed publication: Yes
Open Access: Yes
Gold Open Access: Yes
In SCOPUS: Yes
In ISI Web of Science: Yes
DOI: 10.1016/j.jag.2023.103191
Publisher: Elsevier
ISSN: 1569-8432
Status: Submitted
Keywords: Toponym disambiguation; Toponym resolution; Geocoding; Geoparsing; Entity linking; Entity disambiguation; Voting.
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program Themes: Space System Technology
DLR - Research area: Space
DLR - Program: R SY - Space System Technology
DLR - Research theme (Project): R - Environment, Health and Big Data
Location: Jena
Institutes and Institutions: Institute of Data Science > Data Acquisition and Mobilisation
Deposited By: Hu, Xuke
Deposited On: 02 Nov 2022 11:02
Last Modified: 05 Dec 2023 11:28

Available Versions of this Item

  • How can voting mechanisms improve the robustness and generalizability of toponym disambiguation? (deposited 02 Nov 2022 11:02) [Currently Displayed]
