Data Stream Mining and Analysis: Clustering Evolving Data
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datavetenskap. ORCID iD: 0000-0001-7199-8080
2024 (English). Doctoral thesis, comprising papers (Other academic)
Abstract [en]

Streaming data is becoming more prevalent in our society every day. With the increasing use of technologies such as the Internet of Things (IoT) and 5G networks, the number of possible data sources steadily increases. Therefore, there is a need to develop algorithms that can handle the massive amount of data we now generate.

Data mining is an area of research where data is mined to gain an understanding of data and its underlying structure. When we move to streaming data, and the corresponding sub-domain data stream mining, restrictions are imposed on the algorithms that can be used. Data streams are possibly endless, and their instances arrive rapidly, can often only be processed once or a few times, and often evolve as the data is generated over time.

This thesis explores data-driven techniques to model and analyze evolving data streams. We focus on slower data streams where incremental updates are not necessary, and the interest lies in analyzing their behavior over longer time periods. We aim to evaluate existing and develop novel algorithms and techniques suitable for analyzing these types of data streams. We use both supervised and unsupervised learning methods to model the user/system behaviors, and the methods and algorithms are evaluated on various datasets.

Specifically, we investigate regression and clustering algorithms to mine streaming data for user/system behavior patterns. We also design an algorithm capable of modeling user/system behavior in a single evolving data stream, which is easy to use and capitalizes on prior knowledge from the history of the stream. Furthermore, we design a clustering algorithm that takes advantage of multiple data streams, where each stream represents a part of the entire system, to model various aspects of the user/system behavior. Finally, we review the current state-of-the-art methods for evaluating data stream clustering algorithms and identify aspects that should be considered for the future.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024, p. 231
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 1
Keywords [en]
Data Stream Mining, Clustering, Data Streams, Data Mining
National Category
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:bth-25539; ISBN: 978-91-7295-472-4 (print); OAI: oai:DiVA.org:bth-25539; DiVA, id: diva2:1809390
Public defence
2024-01-24, Karlskrona, 09:00 (English)
Opponent
Supervisor
Available from: 2023-11-17 Created: 2023-11-03 Last updated: 2023-12-12. Bibliographically approved
List of papers
1. Detection of Residents' Abnormal Behaviour by Analysing Energy Consumption of Individual Households
2017 (English). In: Proceedings of the 17th IEEE International Conference on Data Mining Workshops (ICDMW) / [ed] Gottumukkala, R; Ning, X; Dong, G; Raghavan, V; Aluru, S; Karypis, G; Miele, L; Wu, X, IEEE, 2017, p. 729-738. Conference paper, Published paper (Refereed)
Abstract [en]

As average life expectancy continuously rises, assisting the elderly population with living independently is of great importance. Detecting abnormal behaviour of the elderly living at home is one way to assist the eldercare systems with the increase of the elderly population. In this study, we perform an initial investigation to identify abnormal behaviour of household residents using energy consumption data. We conduct an experiment in two parts, the first to identify a suitable prediction algorithm to model energy consumption behaviour, and the second to detect abnormal behaviour. This approach allows for an initial step for the elderly care that has a low cost, is easily deployable, and is non-intrusive.

Place, publisher, year, edition, pages
IEEE, 2017
Series
International Conference on Data Mining Workshops, ISSN 2375-9232
Keywords
Energy consumption, Predictive models, Smart meters, Correlation, Senior citizens
National Category
Identifiers
urn:nbn:se:bth-15565 (URN); 10.1109/ICDMW.2017.101 (DOI); 000425845700096 (); 978-1-5386-3800-2 (ISBN)
Conference
IEEE International Conference on Data Mining series (ICDM), New Orleans
Funder
Knowledge Foundation, 20140032
Available from: 2018-01-11 Created: 2018-01-11 Last updated: 2023-11-03. Bibliographically approved
2. Profiling of household residents’ electricity consumption behavior using clustering analysis
2019 (English). In: Lect. Notes Comput. Sci., Springer Verlag, 2019, p. 779-786. Conference paper, Published paper (Refereed)
Abstract [en]

In this study we apply clustering techniques for analyzing and understanding households’ electricity consumption data. The knowledge extracted by this analysis is used to create a model of normal electricity consumption behavior for each particular household. Initially, the household’s electricity consumption data are partitioned into a number of clusters with similar daily electricity consumption profiles. The centroids of the generated clusters can be considered as representative signatures of a household’s electricity consumption behavior. The proposed approach is evaluated by conducting a number of experiments on electricity consumption data of ten selected households. The obtained results show that the proposed approach is suitable for data organizing and understanding, and can be applied for modeling electricity consumption behavior on a household level. © Springer Nature Switzerland AG 2019.
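The partitioning step described in this abstract can be illustrated with a minimal sketch. It assumes plain k-means (Lloyd's algorithm) with the first k profiles as initial centroids; the abstract does not name the clustering technique actually used, and the function name `kmeans` and the four-value daily profiles are hypothetical.

```python
from math import dist


def kmeans(profiles, k, iters=20):
    """Plain Lloyd's k-means; the first k profiles serve as initial centroids."""
    centroids = [list(p) for p in profiles[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each daily profile to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in profiles:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, clusters


# Hypothetical data: daily profiles (night, morning, afternoon, evening) in kWh.
days = [
    [0.2, 1.0, 0.5, 1.5],
    [0.2, 0.4, 0.4, 2.5],
    [0.3, 1.1, 0.6, 1.4],
    [0.1, 0.5, 0.3, 2.6],
    [0.2, 0.9, 0.5, 1.6],
]
signatures, clusters = kmeans(days, k=2)
for signature, members in zip(signatures, clusters):
    print(len(members), [round(v, 2) for v in signature])
```

Each returned centroid plays the role of a representative signature: here one signature captures days with a morning peak and the other days dominated by evening consumption.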

Place, publisher, year, edition, pages
Springer Verlag, 2019
Series
Lecture Notes in Computer Science ; 11540
Keywords
Ambient Assisted Living, Non-intrusive remote monitoring, Assisted living, Clustering analysis, Clustering techniques, Electricity-consumption, Household level, Number of clusters, Remote monitoring, Electric power utilization
National Category
Identifiers
urn:nbn:se:bth-18593 (URN); 10.1007/978-3-030-22750-0_78 (DOI); 000589285300076 (); 2-s2.0-85068459816 (Scopus ID); 9783030227494 (ISBN)
Conference
International Conference on Computational Science, ICCS, Faro, Algarve, 12-14 June 2019
Available from: 2019-09-09 Created: 2019-09-09 Last updated: 2023-11-03. Bibliographically approved
3. Monitoring Household Electricity Consumption Behaviour for Mining Changes
2019 (English). Conference paper, Oral presentation only (Refereed)
Abstract [en]

In this paper, we present ongoing work on using a household electricity consumption behavior model for recognizing changes in sleep patterns. The work is inspired by recent studies in neuroscience revealing an association between dementia and sleep disorders and, more particularly, supporting the hypothesis that insomnia may be a predictor for dementia in older adults. Our approach initially creates a clustering model of normal electricity consumption behavior of the household by using historical data. Then we build a new clustering model on a new set of electricity consumption data collected over a predefined time period and compare the existing model with the newly built electricity consumption behavior model. If a discrepancy between the two clustering models is discovered, a further analysis of the current electricity consumption behavior is conducted in order to investigate whether this discrepancy is associated with alterations in the resident’s sleep patterns. The approach is studied and initially evaluated on electricity consumption data collected from a single randomly selected anonymous household. The obtained results show that our approach is robust in mining changes in the resident’s daily routines by monitoring and analyzing their electricity consumption behavior model.
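The model comparison step can be sketched as follows. The abstract does not say how the two clustering models are compared, so this is only one possible reading: it assumes each model is summarised by its centroids and flags a discrepancy when a new centroid has no close counterpart in the historical model. The function names, the threshold, and the data are all hypothetical.

```python
from math import dist


def model_discrepancy(old_centroids, new_centroids):
    """For each new centroid, the distance to its closest old centroid."""
    return [min(dist(new, old) for old in old_centroids) for new in new_centroids]


def changed(old_centroids, new_centroids, threshold):
    """Flag a change when any new centroid lacks a close match in the old model."""
    return max(model_discrepancy(old_centroids, new_centroids)) > threshold


# Hypothetical centroids: mean consumption (kWh) at night, morning, evening.
historical = [[0.2, 1.0, 1.5], [0.2, 0.4, 2.5]]
recent_same = [[0.25, 1.0, 1.5], [0.2, 0.45, 2.5]]
recent_shifted = [[1.2, 1.0, 1.5], [0.2, 0.4, 2.5]]  # higher night-time usage

print(changed(historical, recent_same, threshold=0.5))     # False
print(changed(historical, recent_shifted, threshold=0.5))  # True
```

A flagged discrepancy would only be the trigger for the further analysis the abstract mentions; on its own it says nothing about sleep patterns.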

National Category
Identifiers
urn:nbn:se:bth-18651 (URN)
Conference
3rd International Workshop on Aging, Rehabilitation and Independent Assisted Living (ARIAL), International Joint Conference on Artificial Intelligence (IJCAI), August 10-16, 2019, Macao, China.
Projects
Scalable resource-efficient systems for big data analytics
Note

open access

Available from: 2019-09-12 Created: 2019-09-12 Last updated: 2023-11-03. Bibliographically approved
4. Modeling Evolving User Behavior via Sequential Clustering
2020 (English). In: Communications in Computer and Information Science / [ed] Cellier P., Driessens K., Springer, 2020, Vol. 1168 CCIS, p. 12-20. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we address the problem of modeling the evolution of clusters over time by applying sequential clustering. We propose a sequential partitioning algorithm that can be applied for grouping distinct snapshots of streaming data so that a clustering model is built on each data snapshot. The algorithm is initialized by a clustering solution built on available historical data. Then a new clustering solution is generated on each data snapshot by applying a partitioning algorithm seeded with the centroids of the clustering model obtained at the previous time interval. At each step the algorithm also conducts model adapting operations in order to reflect the evolution in the clustering structure. In that way, it can deal with both incremental and dynamic aspects of modeling evolving behavior problems. In addition, the proposed approach is able to trace back evolution through the detection of clusters' transitions, such as splits and merges. We have illustrated and initially evaluated our ideas on household electricity consumption data. The results have shown that the proposed sequential clustering algorithm is robust in modeling evolving behavior, being able to mine changes and update the model accordingly.
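The seeding idea in this abstract can be sketched in a few lines. The sketch assumes k-means as the partitioning algorithm and covers only the seeding of each snapshot with the previous centroids; the model adapting operations and the detection of splits and merges are not shown. The function names and the two-dimensional snapshots are hypothetical.

```python
from math import dist


def seeded_kmeans(points, seeds, iters=10):
    """Lloyd's iterations started from the given seed centroids."""
    centroids = [list(s) for s in seeds]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[j].append(p)
        # Keep the previous centroid when a cluster receives no points.
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids


def sequential_clustering(initial_centroids, snapshots):
    """Cluster each snapshot, seeding with the previous snapshot's centroids."""
    history = [initial_centroids]
    for snapshot in snapshots:
        history.append(seeded_kmeans(snapshot, history[-1]))
    return history


# Hypothetical 2-D snapshots in which the second cluster drifts to the right.
initial = [[0.0, 0.0], [5.0, 5.0]]
snapshots = [
    [[0.1, 0.0], [-0.1, 0.0], [5.5, 5.0], [6.5, 5.0]],
    [[0.0, 0.1], [0.0, -0.1], [7.0, 5.0], [8.0, 5.0]],
]
history = sequential_clustering(initial, snapshots)
for centroids in history:
    print(centroids)
```

Because each snapshot starts from the previous centroids, cluster indices stay aligned across time steps, which is what makes it possible to follow a drifting cluster from one snapshot to the next.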

Place, publisher, year, edition, pages
Springer, 2020
Series
Communications in Computer and Information Science, ISSN 1865-0929
Keywords
Behavior modeling, Clustering evolution, Data mining, Household electricity consumption data, Sequential clustering
National Category
Identifiers
urn:nbn:se:bth-18666 (URN); 10.1007/978-3-030-43887-6_2 (DOI); 000718590300002 (); 9783030438869 (ISBN)
Conference
19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD, Würzburg, Germany, 16-20 September 2019
Note

open access

Available from: 2019-09-18 Created: 2019-09-18 Last updated: 2024-01-01. Bibliographically approved
5. EvolveCluster: an evolutionary clustering algorithm for streaming data
2022 (English). In: Evolving Systems, ISSN 1868-6478, E-ISSN 1868-6486, no. 4, p. 603-623. Article in journal (Refereed) Published
Abstract [en]

Data has become an integral part of our society in the past years, arriving faster and in larger quantities than before. Traditional clustering algorithms rely on the availability of entire datasets to model them correctly and efficiently. Such requirements are not possible in the data stream clustering scenario, where data arrives and needs to be analyzed continuously. This paper proposes a novel evolutionary clustering algorithm, entitled EvolveCluster, capable of modeling evolving data streams. We compare EvolveCluster against two other evolutionary clustering algorithms, PivotBiCluster and Split-Merge Evolutionary Clustering, by conducting experiments on three different datasets. Furthermore, we perform additional experiments on EvolveCluster to further evaluate its capabilities on clustering evolving data streams. Our results show that EvolveCluster manages to capture evolving data stream behaviors and adapts accordingly.

Place, publisher, year, edition, pages
Springer Heidelberg, 2022
Keywords
Evolving data stream; Clustering; Data stream clustering
National Category
Identifiers
urn:nbn:se:bth-22395 (URN); 10.1007/s12530-021-09408-y (DOI); 000717906700001 (); 2-s2.0-85119001929 (Scopus ID)
Funder
Knowledge Foundation, 20140032
Note

open access

Available from: 2021-11-26 Created: 2021-11-26 Last updated: 2023-11-03. Bibliographically approved
6. MultiStream EvolveCluster
2023 (English). In: The 36th Canadian Conference on Artificial Intelligence, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

This paper proposes a novel multi-stream clustering algorithm, MultiStream EvolveCluster (MS-EC), that can be used for continuous and distributed monitoring and analysis of evolving time series phenomena. It can maintain evolving clustering solutions separately for each stream/view and consensus clustering solutions reflecting evolving interrelations among the streams. Each stream behavior can be analyzed by different clustering techniques using a distance measure and data granularity that is specially selected for it. The properties of the MultiStream EvolveCluster algorithm are studied and evaluated with respect to different consensus clustering techniques, distance measures, and cluster evaluation measures in synthetic and real-world smart building datasets. Our evaluation results show a stable algorithm performance in synthetic data scenarios. In the case of real-world data, the algorithm behavior demonstrates sensitivity to the individual streams’ data quality and the used consensus clustering technique.

Keywords
evolve clustering, data stream mining, consensus clustering
National Category
Identifiers
urn:nbn:se:bth-25534 (URN)
Conference
The 36th Canadian Conference on Artificial Intelligence, Montreal, 5-9 June 2023
Available from: 2023-11-01 Created: 2023-11-01 Last updated: 2023-11-03. Bibliographically approved
7. On Evaluation of Data Stream Clustering Algorithms: A Survey
(English). Manuscript (preprint) (Other academic)
National Category
Research subject
Computer Science
Identifiers
urn:nbn:se:bth-25538 (URN)
Available from: 2023-11-03 Created: 2023-11-03 Last updated: 2023-11-06. Bibliographically approved

Open Access in DiVA

fulltext (51625 kB), 201 downloads
File information
File name: FULLTEXT03.pdf. File size: 51625 kB. Checksum SHA-512:
3affa7860aadbca59587191b186ea1f58a287b1967f75325fa50f0c654556162721f9cdfa83eab5d25f994ff67d4dc2d13a1f95a419ada1f5563df60e85b938c
Type: fulltext. Mimetype: application/pdf

Person

Nordahl, Christian

Total: 236 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 1364 hits