Hamm, Andreas and Odrowski, Simon (2021) Term-Community-Based Topic Detection with Variable Resolution. Information, 12 (6). Multidisciplinary Digital Publishing Institute (MDPI). doi: 10.3390/info12060221. ISSN 2078-2489.
PDF: publisher's version (published version), 634kB
Official URL: https://www.mdpi.com/2078-2489/12/6/221
Abstract
Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models. We present in detail a method that is especially designed with the requirements of domain experts in mind. Like similar methods, it employs community detection in term co-occurrence graphs, but it is enhanced by including a resolution parameter that can be used for changing the targeted topic granularity. We also establish a term ranking and use semantic word-embedding for presenting term communities in a way that facilitates their interpretation. We demonstrate the application of our method with a widely used corpus of general news articles and show the results of detailed social-sciences expert evaluations of detected topics at various resolutions. A comparison with topics detected by Latent Dirichlet Allocation is also included. Finally, we discuss factors that influence topic interpretation.
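The general idea named in the abstract, community detection in a term co-occurrence graph with a resolution parameter, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names `cooccurrence_graph` and `greedy_communities`, and the simple greedy merging strategy are all assumptions chosen for brevity. The sketch uses the standard resolution-weighted modularity, in which merging two communities $a$ and $b$ changes modularity by $e_{ab}/m - \gamma\, d_a d_b / (2m^2)$, so a larger $\gamma$ tends to produce smaller, finer-grained term communities.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs):
    """Edge weight = number of documents in which two terms co-occur."""
    weights = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            weights[(a, b)] += 1
    return weights


def greedy_communities(weights, resolution=1.0):
    """Greedy agglomerative modularity maximisation (illustrative stand-in).

    Merging communities a and b changes modularity by
        e_ab / m - resolution * d_a * d_b / (2 * m**2)
    with e_ab the edge weight between them, d_x the summed degree of x,
    and m the total edge weight of the graph.
    """
    m = sum(weights.values())
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    community = {term: term for term in degree}  # term -> community label

    while True:
        # Recompute community degrees and inter-community edge weights.
        com_degree = Counter()
        for term, d in degree.items():
            com_degree[community[term]] += d
        between = Counter()
        for (a, b), w in weights.items():
            ca, cb = community[a], community[b]
            if ca != cb:
                between[tuple(sorted((ca, cb)))] += w

        # Pick the merge with the largest strictly positive gain.
        best_gain, best_pair = 0.0, None
        for (ca, cb), e in sorted(between.items()):
            gain = e / m - resolution * com_degree[ca] * com_degree[cb] / (2 * m * m)
            if gain > best_gain + 1e-12:
                best_gain, best_pair = gain, (ca, cb)
        if best_pair is None:
            break
        keep, drop = best_pair
        for term in community:
            if community[term] == drop:
                community[term] = keep

    groups = {}
    for term, label in community.items():
        groups.setdefault(label, set()).add(term)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy corpus: two themes joined by one bridging document.
docs = [
    "election vote parliament",
    "vote parliament minister",
    "goal match striker",
    "match striker league",
    "minister match",
]
graph = cooccurrence_graph(docs)
for gamma in (0.0, 1.0, 100.0):
    print(gamma, greedy_communities(graph, resolution=gamma))
```

With `resolution=0.0` every connected pair of communities is merged, so the result is the connected components of the graph; with a very large value no merge pays off and every term stays on its own. Intermediate values give partitions between these two extremes. A production setting would more likely use an established community detection routine than this greedy loop.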
| elib URL of the record: | https://elib.dlr.de/142499/ |
|---|---|
| Document type: | Journal article |
| Title: | Term-Community-Based Topic Detection with Variable Resolution |
| Authors: | Hamm, Andreas; Odrowski, Simon |
| Date: | 23 May 2021 |
| Published in: | Information |
| Refereed publication: | Yes |
| Open Access: | Yes |
| Gold Open Access: | Yes |
| In SCOPUS: | Yes |
| In ISI Web of Science: | Yes |
| Volume: | 12 |
| DOI: | 10.3390/info12060221 |
| Publisher: | Multidisciplinary Digital Publishing Institute (MDPI) |
| ISSN: | 2078-2489 |
| Status: | published |
| Keywords: | text mining; natural language processing; topic modeling; term ranking; community detection; corpus analysis; word embeddings |
| HGF - Research field: | no assignment |
| HGF - Program: | no assignment |
| HGF - Program topic: | no assignment |
| DLR - Focus area: | no assignment |
| DLR - Research field: | no assignment |
| DLR - Subfield (project, undertaking): | no assignment |
| Location: | Köln-Porz |
| Institutes and facilities: | Think Tank |
| Deposited by: | Hamm, Dr. Andreas |
| Deposited on: | 31 May 2021 15:21 |
| Last modified: | 23 Oct 2023 09:52 |