Cerra, Daniele and Reinartz, Peter and Datcu, Mihai (2014) Authorship analysis based on data compression. Pattern Recognition Letters, 42, pp. 79-84. Elsevier. doi: 10.1016/j.patrec.2014.01.019. ISSN 0167-8655.
PDF (arXiv preprint), 271kB
Official URL: http://www.sciencedirect.com/science/article/pii/S0167865514000336
Abstract
This paper proposes to perform authorship analysis using the Fast Compression Distance (FCD), a similarity measure based on compression with dictionaries directly extracted from the written texts. The FCD computes a similarity between two documents through an effective binary search on the intersection set between the two related dictionaries. In the reported experiments the proposed method is applied to documents which are heterogeneous in style, written in five different languages and coming from different historical periods. Results are comparable to the state of the art and outperform traditional compression-based methods.
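The dictionary-intersection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the LZW-style dictionary extraction, the normalization `(|D(x)| - |D(x) ∩ D(y)|) / |D(x)|`, and the helper names `lzw_dictionary`, `contains`, and `fcd` are all assumptions made for illustration. Only the use of a binary search over the dictionaries follows the abstract directly.

```python
from bisect import bisect_left


def lzw_dictionary(text: str) -> list[str]:
    """Collect the multi-character patterns an LZW-style pass would learn.

    Assumption: single characters are treated as already known, as in
    textbook LZW, so only longer patterns are stored.
    """
    patterns = set()
    current = ""
    for ch in text:
        candidate = current + ch
        if len(candidate) == 1 or candidate in patterns:
            current = candidate          # keep extending a known pattern
        else:
            patterns.add(candidate)      # learn the new pattern
            current = ch                 # restart from the last character
    return sorted(patterns)              # sorted so binary search works


def contains(sorted_patterns: list[str], pattern: str) -> bool:
    """Binary search for `pattern` in a sorted dictionary."""
    i = bisect_left(sorted_patterns, pattern)
    return i < len(sorted_patterns) and sorted_patterns[i] == pattern


def fcd(x: str, y: str) -> float:
    """Assumed distance: share of x's patterns that are missing from y's dictionary."""
    dx = lzw_dictionary(x)
    dy = lzw_dictionary(y)
    if not dx:
        return 0.0                       # illustrative choice for texts too short to yield patterns
    shared = sum(1 for p in dx if contains(dy, p))
    return (len(dx) - shared) / len(dx)
```

Under these assumptions a text compared with itself gives 0.0, and two texts with no patterns in common give 1.0. In an authorship setting, an unknown document would be assigned to the candidate author whose reference texts yield the smallest distance.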
| Item URL in elib: | https://elib.dlr.de/88386/ |
|---|---|
| Document Type: | Article |
| Title: | Authorship analysis based on data compression |
| Authors: | Cerra, Daniele; Reinartz, Peter; Datcu, Mihai |
| Date: | 1 June 2014 |
| Journal or Publication Title: | Pattern Recognition Letters |
| Refereed publication: | Yes |
| Open Access: | Yes |
| Gold Open Access: | No |
| In SCOPUS: | Yes |
| In ISI Web of Science: | Yes |
| Volume: | 42 |
| DOI: | 10.1016/j.patrec.2014.01.019 |
| Page Range: | pp. 79-84 |
| Publisher: | Elsevier |
| Series Name: | International Association for Pattern Recognition |
| ISSN: | 0167-8655 |
| Status: | Published |
| Keywords: | Authorship analysis; Data compression; Similarity measure |
| HGF - Research field: | Aeronautics, Space and Transport |
| HGF - Program: | Space |
| HGF - Program Themes: | Earth Observation |
| DLR - Research area: | Space |
| DLR - Program: | R EO - Earth Observation |
| DLR - Research theme (Project): | R - High-Resolution Remote Sensing Methods project (old) |
| Location: | Oberpfaffenhofen |
| Institutes and Institutions: | Remote Sensing Technology Institute > Photogrammetry and Image Analysis |
| Deposited By: | Cerra, Daniele |
| Deposited On: | 11 Mar 2014 08:54 |
| Last Modified: | 06 Nov 2023 14:19 |