On Evaluation of Data Stream Clustering Algorithms: A Survey
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0001-7199-8080
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3128-191x
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0001-9947-1088
(English) Manuscript (preprint) (Other academic)
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:bth-25538
OAI: oai:DiVA.org:bth-25538
DiVA, id: diva2:1809370
Available from: 2023-11-03 Created: 2023-11-03 Last updated: 2023-11-06 Bibliographically approved
In thesis
1. Data Stream Mining and Analysis: Clustering Evolving Data
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Streaming data is becoming more prevalent in our society every day. With the increasing use of technologies such as the Internet of Things (IoT) and 5G networks, the number of possible data sources steadily increases. Therefore, there is a need to develop algorithms that can handle the massive amount of data we now generate.

Data mining is the area of research concerned with extracting knowledge from data and understanding its underlying structure. When we move to streaming data, and the corresponding sub-domain of data stream mining, restrictions are imposed on the algorithms that can be used. Data streams are potentially unbounded: their instances arrive rapidly, can often be processed only once or a few times, and often evolve as the data is generated over time.

This thesis explores data-driven techniques to model and analyze evolving data streams. We focus on slower data streams where incremental updates are not necessary, and the interest lies in analyzing their behavior over longer time periods. We aim to evaluate existing algorithms and techniques, and to develop novel ones, suitable for analyzing these types of data streams. We use both supervised and unsupervised learning methods to model user/system behaviors, and the methods and algorithms are evaluated on various datasets.

Specifically, we investigate regression and clustering algorithms to mine streaming data for user/system behavior patterns. We also design an algorithm capable of modeling user/system behavior in a single evolving data stream, which is easy to use and capitalizes on prior knowledge from the history of the stream. Furthermore, we design a clustering algorithm that takes advantage of multiple data streams, where each stream represents a part of the entire system, to model various aspects of the user/system behavior. Finally, we review the current state-of-the-art methods for evaluating data stream clustering algorithms and identify aspects that should be considered for the future.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024. p. 231
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 1
Keywords
Data Stream Mining, Clustering, Data Streams, Data Mining
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:bth-25539 (URN)
978-91-7295-472-4 (ISBN)
Public defence
2024-01-24, Karlskrona, 09:00 (English)
Available from: 2023-11-17 Created: 2023-11-03 Last updated: 2023-12-12 Bibliographically approved

Authority records

Nordahl, Christian; Boeva, Veselka; Grahn, Håkan