Visually guided extraction of prevalent topics
Linnaeus University.
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datavetenskap. ORCID iD: 0000-0001-6745-4398
Linköping University.
Linnaeus University.
2025 (English). In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 24, no. 2, pp. 179-198. Journal article (peer-reviewed). Published
Abstract [en]

Making sense of large sets of text documents is highly challenging for tasks such as obtaining a comprehensive overview or keeping up with the most important trends and topics. Even though several established methods for condensing and summarizing large text corpora exist, many of them cannot account for differences in prevalence between the identified topics, which in turn impedes quantitative analysis. In this paper, we therefore propose a novel prevalence-aware method for topic extraction, and show how it can be used to obtain important insights from two text corpora with very different content. We also implemented a prototype visual analytics tool that guides the user in the search for relevant insights and promotes trust in the results. We have verified our application through a user study, as well as through a validation run on a data set with a previously known topic structure. The results clearly show that our approach is suitable for text mining, that it can be used by non-experts, and that it offers features that make it an interesting candidate for use in several different analysis scenarios.

Place, publisher, year, edition, pages
Sage Publications, 2025. Vol. 24, no. 2, pp. 179-198
Keywords [en]
similarity calculations, text embedding, text mining, topic modeling, Visual analytics, Embeddings, Sense making, Similarity calculation, Text corpora, Text document, Text-mining, Topic extraction
HSV category
Identifiers
URN: urn:nbn:se:bth-27447
DOI: 10.1177/14738716241312400
ISI: 001408697200001
Scopus ID: 2-s2.0-105001067590
OAI: oai:DiVA.org:bth-27447
DiVA, id: diva2:1936281
Projects
Rekrytering 21
Research funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Knowledge Foundation, 20210077
Available from: 2025-02-10 Created: 2025-02-10 Last updated: 2025-09-30 Bibliographically approved

Open Access in DiVA

fulltext (4278 kB), 93 downloads
File information
File: FULLTEXT01.pdf
File size: 4278 kB
Checksum (SHA-512): b51ebc8b6dbb8c873e4bb7a4e30ba00c29ec34c5a7123e55d33341ab686d8be92b334f6b50e5e06db251e9ea2a75abbde1f3934e4ceb8548c95d300f45ecf06e
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text
Scopus

Person

Jusufi, Ilir

Total: 93 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 1601 hits