Interactive Optimization of Embedding-based Text Similarity Calculations
Linnéuniversitetet. ORCID iD: 0000-0001-6150-0787
Linnéuniversitetet. ORCID iD: 0000-0001-6745-4398
Linnéuniversitetet. ORCID iD: 0000-0002-2901-935X
Linnéuniversitetet. ORCID iD: 0000-0002-1907-7820
2022 (English). In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 21, no 4, p. 335-353. Article in journal (Refereed). Published
Abstract [en]

Comparing text documents is an essential task for a variety of applications within diverse research fields, and several different methods have been developed for this purpose. However, calculating text similarity is an ambiguous and context-dependent task, so many open challenges remain. In this paper, we present a novel method for text similarity calculations based on the combination of embedding technology and ensemble methods. By using several embeddings instead of only one, we show that it is possible to achieve higher-quality similarity calculations, which in turn is a key factor for developing high-performing applications that exploit text similarity. We also provide a prototype visual analytics tool that helps the analyst find optimally performing ensembles and gain insight into the inner workings of the similarity calculations. Furthermore, we discuss the generalizability of our key ideas to fields beyond the scope of text analysis.
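
The abstract does not specify how the embeddings are combined, so the following is only a minimal sketch of one plausible reading: each embedding model yields its own cosine similarity for a document pair, and the ensemble score is a weighted mean of those values. The function names, the weighting scheme, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ensemble_similarity(vectors_a, vectors_b, weights=None):
    """Weighted mean of per-embedding cosine similarities (hypothetical scheme).

    vectors_a[i] and vectors_b[i] are the two documents' vectors under the
    i-th embedding model; different models may use different dimensions.
    """
    if len(vectors_a) != len(vectors_b) or not vectors_a:
        raise ValueError("need the same non-zero number of embeddings per document")
    if weights is None:
        weights = [1.0] * len(vectors_a)  # assumption: equal weights by default
    if len(weights) != len(vectors_a):
        raise ValueError("one weight per embedding is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    sims = [cosine_similarity(a, b) for a, b in zip(vectors_a, vectors_b)]
    return sum(w * s for w, s in zip(weights, sims)) / total


# Toy vectors standing in for two documents under two hypothetical embeddings.
doc_a = [[1.0, 0.0], [1.0, 1.0, 0.0]]
doc_b = [[1.0, 0.0], [0.0, 0.0, 1.0]]

equal_weighted = ensemble_similarity(doc_a, doc_b)
first_favoured = ensemble_similarity(doc_a, doc_b, weights=[3.0, 1.0])
```

In this toy case the two embeddings disagree completely (similarity 1.0 versus 0.0), so the ensemble score depends entirely on the weights. Choosing such weights is the kind of decision an interactive tool could support.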

Place, publisher, year, edition, pages
Sage Publications, 2022. Vol. 21, no 4, p. 335-353
Keywords [en]
Text embedding, ensemble methods, text similarity, similarity calculations, visual analytics
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-23883
DOI: 10.1177/14738716221114372
OAI: oai:DiVA.org:bth-23883
DiVA, id: diva2:1710879
Available from: 2022-11-15 Created: 2022-11-15 Last updated: 2022-11-15 Bibliographically approved

By author/editor
Witschard, Daniel; Jusufi, Ilir; Martins, Rafael Messias; Kucher, Kostiantyn; Kerren, Andreas