Modeling Evolving User Behavior via Sequential Clustering
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3128-191x
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0001-7199-8080
2020 (English). In: Communications in Computer and Information Science / [ed] Cellier P., Driessens K., Springer, 2020, Vol. 1168 CCIS, p. 12-20. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we address the problem of modeling the evolution of clusters over time by applying sequential clustering. We propose a sequential partitioning algorithm that groups distinct snapshots of streaming data so that a clustering model is built on each data snapshot. The algorithm is initialized with a clustering solution built on available historical data. A new clustering solution is then generated on each data snapshot by applying a partitioning algorithm seeded with the centroids of the clustering model obtained at the previous time interval. At each step the algorithm also performs model-adapting operations to reflect the evolution of the clustering structure. In this way, it deals with both the incremental and the dynamic aspects of modeling evolving behavior. In addition, the proposed approach can trace the evolution back through the detection of cluster transitions, such as splits and merges. We have illustrated and initially evaluated our ideas on household electricity consumption data. The results show that the proposed sequential clustering algorithm is robust when modeling evolving behavior, as it is able to mine changes and update the model accordingly.
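
The seeding step described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the abstract: k-means (Lloyd's algorithm) stands in for the unnamed partitioning algorithm, Euclidean distance is used, and dropping centroids that attract no points is only a simplified placeholder for the model-adapting operations. The function names `seeded_kmeans` and `sequential_clustering` are hypothetical and do not come from the paper.

```python
import numpy as np

def seeded_kmeans(X, seeds, n_iter=100):
    """Lloyd's k-means started from the given seed centroids (assumed stand-in).

    Seeds that attract no points are dropped, a simplified placeholder
    for the paper's model-adapting operations.
    """
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(seeds, dtype=float).copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids only for clusters that own at least one point.
        kept = np.unique(labels)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in kept])
        if new_centroids.shape == centroids.shape and np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Assign each point to its nearest centroid in the returned model.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

def sequential_clustering(history, snapshots, init_seeds):
    """Cluster the historical data, then each snapshot seeded by the previous model."""
    centroids, _ = seeded_kmeans(history, init_seeds)
    models = [centroids]
    for snapshot in snapshots:
        # The previous interval's centroids seed the clustering of this snapshot.
        centroids, _ = seeded_kmeans(snapshot, centroids)
        models.append(centroids)
    return models

# Toy data: two groups that drift, after which the second group disappears.
history = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                    [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
snapshot_1 = history + 1.0
snapshot_2 = np.array([[2, 2], [2, 3], [3, 2], [3, 3]], dtype=float)

models = sequential_clustering(history, [snapshot_1, snapshot_2],
                               init_seeds=[[0, 0], [10, 10]])
for t, model in enumerate(models):
    print(t, model.tolist())
```

Because each snapshot starts from the previous centroids, cluster identities tend to carry over between intervals, which is what makes it possible to compare consecutive models. In this toy run the last model has a single centroid because no points remain near the second one.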

Place, publisher, year, edition, pages
Springer, 2020. Vol. 1168 CCIS, p. 12-20
Series
Communications in Computer and Information Science, ISSN 1865-0929
Keywords [en]
Behavior modeling, Clustering evolution, Data mining, Household electricity consumption data, Sequential clustering
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-18666; DOI: 10.1007/978-3-030-43887-6_2; ISI: 000718590300002; ISBN: 9783030438869 (print); OAI: oai:DiVA.org:bth-18666; DiVA, id: diva2:1352327
Conference
19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD, Würzburg, Germany, 16-20 September 2019
Note

open access

Available from: 2019-09-18. Created: 2019-09-18. Last updated: 2024-01-01. Bibliographically approved
In thesis
1. Data-Driven Techniques for Modeling and Analysis of User Behavior
2019 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Our society is becoming more digitalized every day. We can now gather data from individual users at a higher resolution than ever. With the increased amount of data at the individual user level, we can analyze their behavior. This is of interest in many domains, for example to service providers who want to improve their service for their customers. If they know how their service is used, they have more insight into how they can improve it. However, it also introduces additional difficulties. At the level of the individual user, irregularities in otherwise regular behavior make it harder to model normal behavior.

In this thesis, we explore data-driven techniques to model and analyze user behaviors. We aim to evaluate existing as well as develop novel technologies to identify approaches that are suitable for use on an individual user level. We use both supervised and unsupervised learning methods to model the user behavior and evaluate the approaches on real-world electricity consumption data.

Firstly, we analyze household electricity consumption data and investigate the use of regression to model the household's behavior. We identify consumption trends, examine how data granularity affects modeling, and show that regression is a viable approach to modeling user behavior. Secondly, we use clustering analysis to profile individual households in terms of their electricity consumption. We compare two dissimilarity measures, study how they affect the clustering analysis, and investigate how the produced clustering solutions differ. Thirdly, we propose a sequential clustering algorithm to model evolving user behavior. We evaluate the proposed algorithm on electricity consumption data and show how the produced model can be used to identify and trace changes in the user's behavior. The algorithm is robust to evolving behaviors and handles both dynamic and incremental aspects of streaming data.
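
One way to picture how changes between two consecutive clustering models might be traced is the following Python sketch. It is not the thesis's method: it assumes the same objects (for example, the same households) carry a cluster label under both the previous and the current model, and it flags a split or a merge from the overlap counts using a hypothetical `threshold` share. The function name `detect_transitions` is likewise an assumption made for illustration.

```python
import numpy as np

def detect_transitions(old_labels, new_labels, threshold=0.3):
    """Flag candidate splits and merges from label overlap (illustrative only).

    A split: an old cluster sends at least `threshold` of its members
    to two or more new clusters. A merge: a new cluster receives at least
    `threshold` of its members from two or more old clusters.
    """
    old_labels = np.asarray(old_labels)
    new_labels = np.asarray(new_labels)
    old_ids = np.unique(old_labels)
    new_ids = np.unique(new_labels)
    # counts[i, j] = objects that were in old cluster i and are in new cluster j.
    counts = np.array([[np.sum((old_labels == o) & (new_labels == n))
                        for n in new_ids] for o in old_ids])

    splits = {}
    for i, o in enumerate(old_ids):
        share = counts[i] / counts[i].sum()
        targets = [int(n) for n, s in zip(new_ids, share) if s >= threshold]
        if len(targets) >= 2:
            splits[int(o)] = targets

    merges = {}
    for j, n in enumerate(new_ids):
        share = counts[:, j] / counts[:, j].sum()
        sources = [int(o) for o, s in zip(old_ids, share) if s >= threshold]
        if len(sources) >= 2:
            merges[int(n)] = sources
    return splits, merges

# Toy labels for eight objects under the previous and the current model.
old = [0, 0, 0, 0, 1, 1, 2, 2]
new = [0, 0, 1, 1, 2, 2, 2, 2]
splits, merges = detect_transitions(old, new)
print("splits:", splits)
print("merges:", merges)
```

In this toy example old cluster 0 is divided evenly between new clusters 0 and 1, and new cluster 2 draws equally from old clusters 1 and 2. The threshold controls how much overlap counts as a transition and would need tuning on real data.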

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2019
Series
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 15
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18667 (URN); 978-91-7295-391-8 (ISBN)
Available from: 2019-11-15. Created: 2019-09-18. Last updated: 2019-12-18. Bibliographically approved
2. Data Stream Mining and Analysis: Clustering Evolving Data
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Streaming data is becoming more prevalent in our society every day. With the increasing use of technologies such as the Internet of Things (IoT) and 5G networks, the number of possible data sources steadily increases. Therefore, there is a need to develop algorithms that can handle the massive amount of data we now generate.

Data mining is an area of research where data is mined to gain an understanding of data and its underlying structure. When we move to streaming data, and the corresponding sub-domain data stream mining, restrictions are imposed on the algorithms that can be used. Data streams are possibly endless, and their instances arrive rapidly, can often only be processed once or a few times, and often evolve as the data is generated over time.

This thesis explores data-driven techniques to model and analyze evolving data streams. We focus on slower data streams where incremental updates are not necessary, and the interest lies in analyzing their behavior over longer time periods. We aim to evaluate existing algorithms and techniques, and to develop novel ones, suitable for analyzing these types of data streams. We use both supervised and unsupervised learning methods to model the user/system behaviors, and the methods and algorithms are evaluated on various datasets.

Specifically, we investigate regression and clustering algorithms to mine streaming data for user/system behavior patterns. We also design an algorithm capable of modeling user/system behavior in a single evolving data stream, which is easy to use and capitalizes on prior knowledge from the history of the stream. Furthermore, we design a clustering algorithm that takes advantage of multiple data streams, where each stream represents a part of the entire system, to model various aspects of the user/system behavior. Finally, we review the current state-of-the-art methods for evaluating data stream clustering algorithms and identify aspects that should be considered for the future.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024. p. 231
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 1
Keywords
Data Stream Mining, Clustering, Data Streams, Data Mining
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:bth-25539 (URN); 978-91-7295-472-4 (ISBN)
Public defence
2024-01-24, Karlskrona, 09:00 (English)
Available from: 2023-11-17. Created: 2023-11-03. Last updated: 2023-12-12. Bibliographically approved

Authors
Boeva, Veselka; Nordahl, Christian