EvolveCluster: an evolutionary clustering algorithm for streaming data
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0001-7199-8080
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3128-191X
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0001-9947-1088
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science.
2022 (English). In: Evolving Systems, ISSN 1868-6478, E-ISSN 1868-6486, no. 4, p. 603-623. Article in journal (Refereed). Published
Abstract [en]

Data has become an integral part of our society in the past years, arriving faster and in larger quantities than before. Traditional clustering algorithms rely on the availability of entire datasets to model them correctly and efficiently. Such requirements are not possible in the data stream clustering scenario, where data arrives and needs to be analyzed continuously. This paper proposes a novel evolutionary clustering algorithm, entitled EvolveCluster, capable of modeling evolving data streams. We compare EvolveCluster against two other evolutionary clustering algorithms, PivotBiCluster and Split-Merge Evolutionary Clustering, by conducting experiments on three different datasets. Furthermore, we perform additional experiments on EvolveCluster to further evaluate its capabilities on clustering evolving data streams. Our results show that EvolveCluster manages to capture evolving data stream behaviors and adapts accordingly.

Place, publisher, year, edition, pages
Springer Heidelberg, 2022. no. 4, p. 603-623
Keywords [en]
Evolving data stream; Clustering; Data stream clustering
National category
Computer Science
Identifiers
URN: urn:nbn:se:bth-22395
DOI: 10.1007/s12530-021-09408-y
ISI: 000717906700001
Scopus ID: 2-s2.0-85119001929
OAI: oai:DiVA.org:bth-22395
DiVA, id: diva2:1614644
Part of project
Bigdata@BTH - Scalable resource-efficient systems for big data analytics, KK-stiftelsen
Funder
KK-stiftelsen, 20140032
Note

open access

Available from: 2021-11-26. Created: 2021-11-26. Last updated: 2023-11-03. Bibliographically approved
In thesis
1. Data Stream Mining and Analysis: Clustering Evolving Data
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Streaming data is becoming more prevalent in our society every day. With the increasing use of technologies such as the Internet of Things (IoT) and 5G networks, the number of possible data sources steadily increases. Therefore, there is a need to develop algorithms that can handle the massive amount of data we now generate.

Data mining is an area of research where data is mined to gain an understanding of data and its underlying structure. When we move to streaming data, and the corresponding sub-domain data stream mining, restrictions are imposed on the algorithms that can be used. Data streams are possibly endless, and their instances arrive rapidly, can often only be processed once or a few times, and often evolve as the data is generated over time.

This thesis explores data-driven techniques to model and analyze evolving data streams. We focus on slower data streams where incremental updates are not necessary, and the interest lies in analyzing their behavior over longer time periods. We aim to evaluate existing and develop novel algorithms and techniques suitable for analyzing these types of data streams. We use both supervised and unsupervised learning methods to model the user/system behaviors, and the methods and algorithms are evaluated on various datasets.

Specifically, we investigate regression and clustering algorithms to mine streaming data for user/system behavior patterns. We also design an algorithm capable of modeling user/system behavior in a single evolving data stream, which is easy to use and capitalizes on prior knowledge from the history of the stream. Furthermore, we design a clustering algorithm that takes advantage of multiple data streams, where each stream represents a part of the entire system, to model various aspects of the user/system behavior. Finally, we review the current state-of-the-art methods for evaluating data stream clustering algorithms and identify aspects that should be considered for the future.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024. p. 231
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 1
Keywords
Data Stream Mining, Clustering, Data Streams, Data Mining
National category
Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:bth-25539 (URN); 978-91-7295-472-4 (ISBN)
Public defence
2024-01-24, Karlskrona, 09:00 (English)
Available from: 2023-11-17. Created: 2023-11-03. Last updated: 2023-12-12. Bibliographically approved

Open Access in DiVA

fulltext (4187 kB), 405 downloads
File information
File name: FULLTEXT01.pdf
File size: 4187 kB
Checksum (SHA-512): e23a9d5ba489c28b79484337df20f6a4f377995ef741d1345d70d62d8f877866eb2d8f4bbda5800279e92f716573f42e78cd3c6268c7fd985efaa981cd1a8bb3
Type: fulltext
MIME type: application/pdf
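
The SHA-512 checksum in the file information can be used to check that a downloaded copy of the full text is intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the local path `FULLTEXT01.pdf` are assumptions made for illustration and are not part of the record.

```python
import hashlib
from pathlib import Path

# Checksum copied from the file information of this record.
EXPECTED_SHA512 = "e23a9d5ba489c28b79484337df20f6a4f377995ef741d1345d70d62d8f877866eb2d8f4bbda5800279e92f716573f42e78cd3c6268c7fd985efaa981cd1a8bb3"


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_SHA512):
    """Compare a local file's digest with the published checksum."""
    return sha512_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded full text.
    local_copy = Path("FULLTEXT01.pdf")
    if local_copy.exists():
        ok = matches_published_checksum(local_copy)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print("file not found")
```

A mismatch would suggest an incomplete download or a different version of the file than the one described here.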

Other links

Publisher's full text; Scopus

Authors

Nordahl, Christian; Boeva, Veselka; Grahn, Håkan; Netz Persson, Marie
