Clustering Techniques for Mining and Analysis of Evolving Data
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3371-5347
2021 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The amount of data being generated is on the rise, driven by growing demand in fields such as IoT and smart monitoring applications. Data generated by such systems has several distinct characteristics: continuous generation, evolution over time, a multi-source nature, and heterogeneity. In addition, the real-world data generated in these fields is largely unlabelled. Clustering is an unsupervised learning technique used to group, analyze and interpret unlabelled data. Conventional clustering algorithms are not suitable for data with these characteristics because of memory and computational constraints, their inability to handle concept drift, and the distributed location of the data. Therefore, novel clustering approaches capable of analyzing and interpreting evolving and/or multi-source streaming data are needed.

The thesis focuses on building evolutionary clustering algorithms for data that evolves over time. We initially proposed an evolutionary clustering approach, entitled Split-Merge Clustering (Paper I), capable of continuously updating the generated clustering solution in the presence of new data. As the work progressed, new challenges were studied and addressed. Namely, the Split-Merge Clustering algorithm was enhanced in Paper II with new capabilities to deal with the challenges of multi-view data applications. Multi-view (or multi-source) data presents the studied phenomenon or system from different perspectives (views), and can reveal interesting knowledge that is not visible when only one view is considered and analyzed. This motivated us to continue in this direction by designing two other novel multi-view data stream clustering algorithms. The algorithm proposed in Paper III improves the performance and interpretability of the algorithm proposed in Paper II. Paper IV introduces a minimum spanning tree based multi-view clustering algorithm capable of transferring knowledge between consecutive data chunks; it is also enriched with a post-clustering pattern-labeling procedure.

The proposed and studied evolutionary clustering algorithms are evaluated on various data sets. The obtained results demonstrate the robustness of the algorithms for modeling, analyzing, and mining evolving data streams. They are able to adequately adapt single-view and multi-view clustering models by continuously integrating newly arriving data.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2021.
Series
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 2021:09
Keywords [en]
Clustering analysis, Concept drift, Evolutionary clustering, Machine learning, Streaming data
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:bth-22262
ISBN: 978-91-7295-432-8 (print)
OAI: oai:DiVA.org:bth-22262
DiVA, id: diva2:1607671
Presentation
2021-12-13, J1630, Blekinge Tekniska Högskola SE-371 79, Karlskrona, 13:00 (English)
Opponent
Supervisors
Available from: 2021-11-02 Created: 2021-11-01 Last updated: 2021-11-19. Bibliographically approved
List of papers
1. Bipartite Split-Merge Evolutionary Clustering
2019 (English). In: Lect. Notes Comput. Sci., Springer, 2019, p. 204-223. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a split-merge framework for evolutionary clustering. The proposed clustering technique, entitled Split-Merge Evolutionary Clustering, is intended to be more robust in concept drift scenarios: at each step it considers a portion of the data, derives clusters from it, and then uses those clusters to update the existing clustering solution. The framework is built around the idea of modelling two clustering solutions as a bipartite graph. This graph guides the update of the existing clustering solution: some clusters are merged with ones from the newly constructed clustering, while others are transformed by splitting their elements among several new clusters. We have evaluated and compared the discussed evolutionary clustering technique with two other state-of-the-art algorithms: a bipartite correlation clustering (PivotBiCluster) and an incremental evolving clustering (Dynamic split-and-merge). © Springer Nature Switzerland AG 2019.
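
The split-or-merge decision described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how edges of the bipartite graph are chosen, so the rule used here (each element of an existing cluster is linked to the new cluster with the nearest centroid) and the function names `centroid`, `bipartite_edges` and `split_merge_update` are assumptions made for illustration only.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def bipartite_edges(old_clusters, new_clusters):
    """Link old clusters to new clusters.

    ASSUMPTION: an edge (old i, new j) exists when at least one element of
    old cluster i is closest to the centroid of new cluster j.
    Returns {old index: {new index: [elements of old cluster sent there]}}.
    """
    new_centroids = [centroid(c) for c in new_clusters]
    edges = {}
    for i, cluster in enumerate(old_clusters):
        targets = {}
        for p in cluster:
            j = min(range(len(new_centroids)),
                    key=lambda k: dist(p, new_centroids[k]))
            targets.setdefault(j, []).append(p)
        edges[i] = targets
    return edges


def split_merge_update(old_clusters, new_clusters):
    """Update the clustering: an old cluster with one edge is merged into
    that new cluster; one with several edges is split among them."""
    edges = bipartite_edges(old_clusters, new_clusters)
    updated = [list(c) for c in new_clusters]
    actions = []
    for i, targets in edges.items():
        actions.append("merge" if len(targets) == 1 else "split")
        for j, pts in targets.items():
            updated[j].extend(pts)
    return updated, actions


# Hypothetical toy data: two existing clusters, three clusters from the new chunk.
old = [[(0, 0), (0, 1)], [(5, 5), (9, 9)]]
new = [[(0, 0.5)], [(5, 6)], [(9, 8)]]
updated, actions = split_merge_update(old, new)
```

In this toy run the first existing cluster falls entirely within one new cluster and is merged, while the second straddles two new clusters and is split between them.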

Place, publisher, year, edition, pages
Springer, 2019
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349
Keywords
Bipartite clustering, Data mining, Dynamic clustering, Evolutionary clustering, Split-merge framework, Unsupervised learning, Artificial intelligence, Bipartite correlation clustering, Clustering solutions, Clustering techniques, State-of-the-art algorithms, Cluster analysis
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-19127 (URN); 10.1007/978-3-030-37494-5_11 (DOI); 000722592200011 (); 2-s2.0-85077496461 (Scopus ID); 9783030374938 (ISBN)
Conference
11th International Conference on Agents and Artificial Intelligence, ICAART; Prague; Czech Republic; 19 February 2019 through 21 February
Available from: 2020-01-23 Created: 2020-01-23 Last updated: 2024-04-09. Bibliographically approved
2. Split-merge evolutionary clustering for multi-view streaming data
2020 (English). In: Procedia Computer Science / [ed] Cristani M., Toro C., Zanni-Merk C., Howlett R.J., Jain L.C., Elsevier, 2020, Vol. 176, p. 460-469. Conference paper, Published paper (Refereed)
Abstract [en]

In this study, we propose a new multi-view stream clustering approach, called MV Split-Merge Clustering. The proposed approach is an extension of an existing split-merge evolutionary clustering algorithm (entitled Split-Merge Clustering) to multi-view data applications. The extended version can be used to integrate data from multiple views in a streaming manner and discover cluster structure for each data chunk. The MV Split-Merge Clustering can be applied for grouping distinct chunks of multi-view streaming data so that a global integrated clustering model is built on each data chunk. At each time window, an updated clustering solution (local model) is initially produced on each view of the current data chunk by applying the Split-Merge Clustering algorithm. Formal Concept Analysis is then used in order to integrate information from the multiple views (local clustering models) and generate a global model (formal concept lattice) that reveals the correlations among the clusters of the local models. The proposed MV Split-Merge Clustering has been initially evaluated on a publicly available data set. Our results show that the approach is able to identify a clustering structure and relationships among the different views comparable to those produced in a batch scenario. © 2020 The Authors. Published by Elsevier B.V.
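
To make the integration step concrete, the sketch below shows how Formal Concept Analysis can turn per-view cluster labels into a set of formal concepts (the nodes of a concept lattice). It is a minimal illustration under stated assumptions, not the paper's code: the encoding of each attribute as a `view:label` string, the function names `build_context` and `formal_concepts`, and the toy labels are all hypothetical. Intents are enumerated by closing the object intents under intersection, a standard but naive method suited only to small contexts.

```python
def build_context(view_labels):
    """Build a formal context from local clusterings.

    view_labels: {view name: {object id: cluster label}}
    Returns {object id: frozenset of "view:label" attributes}.
    """
    context = {}
    for view, labels in view_labels.items():
        for obj, label in labels.items():
            context.setdefault(obj, set()).add(f"{view}:{label}")
    return {obj: frozenset(attrs) for obj, attrs in context.items()}


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    Every concept intent is either the full attribute set or an
    intersection of object intents, so we close under intersection.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {attrs & intent for intent in intents}
    concepts = []
    for intent in intents:
        # Extent: every object that has all attributes of the intent.
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    # Largest extents first, purely for readable output.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Hypothetical local models: cluster labels of four objects in two views.
views = {
    "A": {"x1": 0, "x2": 0, "x3": 1, "x4": 1},
    "B": {"x1": 0, "x2": 0, "x3": 0, "x4": 1},
}
concepts = formal_concepts(build_context(views))
```

Each concept pairs a group of objects with the view-level clusters they all share; a concept whose intent contains one cluster from each view indicates that those clusters co-occur across views.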

Place, publisher, year, edition, pages
Elsevier, 2020
Series
Procedia Computer Science, E-ISSN 1877-0509
Keywords
Clustering algorithms, Data stream mining, Evolutionary clustering, Multi-View clustering, Online learning, Cluster analysis, Evolutionary algorithms, Formal concept analysis, Information analysis, Knowledge based systems, Cluster structure, Clustering model, Clustering solutions, Extended versions, Formal concept lattices, Multi-view datum, Stream clustering
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-20632 (URN); 10.1016/j.procs.2020.08.048 (DOI); 2-s2.0-85093357055 (Scopus ID)
Conference
24th KES International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES 2020, Virtual Online, 16 September 2020 through 18 September 2020
Funder
Knowledge Foundation, 20140032
Note

open access

Available from: 2020-11-02 Created: 2020-11-02 Last updated: 2024-04-09. Bibliographically approved
3. A Multi-view Clustering Approach for Analysis of Streaming Data
2021 (English). In: IFIP Advances in Information and Communication Technology / [ed] Maglogiannis I., Macintyre J., Iliadis L., Springer Science and Business Media Deutschland GmbH, 2021, p. 169-183. Conference paper, Published paper (Refereed)
Abstract [en]

Data available today in smart monitoring applications such as smart buildings, machine health monitoring, and smart healthcare is not centralized and is usually supplied by a number of different devices (sensors, mobile devices and edge nodes). As a result, the data is heterogeneous and provides different perspectives (views) on the studied phenomenon. This makes the monitoring task very challenging, requiring machine learning and data mining models that are not only able to continuously integrate and analyze multi-view streaming data, but are also capable of adapting to concept drift in newly arriving data. This study presents a multi-view clustering approach that can be applied for monitoring and analysis of streaming data scenarios. The approach allows for parallel monitoring of the individual view clustering models and mining of view correlations in the integrated (global) clustering models. The global model built at each data chunk is a formal concept lattice generated by a formal context consisting of closed patterns representing the most typical correlations among the views. The proposed approach is evaluated on two different data sets. The obtained results demonstrate that it is suitable for modelling and monitoring multi-view streaming phenomena by providing means for continuous analysis and pattern mining. © 2021, IFIP International Federation for Information Processing.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2021
Series
IFIP Advances in Information and Communication Technology, ISSN 1868-4238 ; 627
Keywords
Closed patterns, Formal concept analysis, Multi-instance learning, Multi-view clustering, Streaming data, Artificial intelligence, Data mining, Intelligent buildings, mHealth, Monitoring, Continuous analysis, Data mining models, Formal concept lattices, Machine health monitoring, Monitoring and analysis, Monitoring tasks, Smart monitoring, Cluster analysis
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-22023 (URN); 10.1007/978-3-030-79150-6_14 (DOI); 2-s2.0-85111810320 (Scopus ID); 9783030791490 (ISBN)
Conference
17th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2021, Virtual, Online, 25 June 2021 - 27 June 2021
Funder
Knowledge Foundation, 20140032
Available from: 2021-08-20 Created: 2021-08-20 Last updated: 2024-04-09. Bibliographically approved
4. A Graph-based Multi-view Clustering Approach for Continuous Pattern Mining
2022 (English). In: Recent Advancements in Multi-View Data Analytics / [ed] Witold Pedrycz and Shyi-Ming Chen, Springer Science+Business Media B.V., 2022, p. 201-237. Chapter in book (Refereed)
Abstract [en]

Today’s smart monitoring applications need machine learning models and data mining algorithms that are capable of analysing and mining the temporal component of data streams. These models and algorithms also ought to take into account the multi-source nature of the sensor data by being able to conduct multi-view analysis. In this study, we address these challenges by introducing a novel multi-view data stream clustering approach, entitled MST-MVS clustering, that can be applied in different smart monitoring applications for continuous pattern mining and data labelling. This proposed approach is based on the Minimum Spanning Tree (MST) clustering algorithm. This algorithm is applied for parallel building of local clustering models on different views in each chunk of data. The MST-MVS clustering transfers knowledge learnt in the current data chunk to the next chunk in the form of artificial nodes used by the MST clustering algorithm. These artificial nodes are identified by analyzing multi-view patterns extracted at each data chunk in the form of an integrated (global) clustering model. We further show how the extracted patterns can be used for post-labelling of the chunk’s data by introducing a dedicated labelling technique, entitled Pattern-labelling. We study and evaluate the MST-MVS clustering algorithm under different experimental scenarios on synthetic and real-world data. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
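
For readers unfamiliar with MST-based clustering, the following minimal sketch shows the generic technique on a single view: build a minimum spanning tree over the points (Kruskal's algorithm with union-find), remove the heaviest edges, and treat the remaining connected components as clusters. It is an assumption-laden illustration, not the MST-MVS algorithm itself: the abstract does not specify the cut criterion, so cutting the k-1 heaviest edges is a stand-in, the artificial nodes that carry knowledge between chunks are omitted, and the names `mst_edges` and `mst_clusters` are hypothetical.

```python
from itertools import combinations
from math import dist


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def mst_edges(points):
    """Kruskal's algorithm on the complete Euclidean graph.

    Returns the MST edges as (weight, i, j), in ascending weight order.
    """
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    parent = list(range(len(points)))
    tree = []
    for w, i, j in edges:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[ri] = rj
            tree.append((w, i, j))
    return tree


def mst_clusters(points, k):
    """Cut the k-1 heaviest MST edges (ASSUMED criterion) and label the
    resulting connected components 0, 1, 2, ... in order of appearance."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    kept = mst_edges(points)[: n - k]  # n-1 tree edges minus the k-1 heaviest
    parent = list(range(n))
    for _, i, j in kept:
        parent[find(parent, i)] = find(parent, j)
    labels = {}
    return [labels.setdefault(find(parent, i), len(labels)) for i in range(n)]


# Hypothetical toy chunk: two tight groups and one outlier.
chunk = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
labels = mst_clusters(chunk, k=3)
```

In a multi-view setting this routine would be run once per view on each data chunk, yielding the local models that are later integrated.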

Place, publisher, year, edition, pages
Springer Science+Business Media B.V., 2022
Series
Studies in Big Data, ISSN 2197-6503, E-ISSN 2197-6511 ; 106
Keywords
data stream, clustering analysis, pattern mining, minimum spanning tree
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-22261 (URN); 10.1007/978-3-030-95239-6_8 (DOI); 2-s2.0-85130970889 (Scopus ID); 978-3-030-95239-6 (ISBN)
Available from: 2021-11-01 Created: 2021-11-01 Last updated: 2024-08-16. Bibliographically approved

Open Access in DiVA

fulltext (6105 kB), 417 downloads
File information
File name: FULLTEXT02.pdf; File size: 6105 kB; Checksum SHA-512:
a37871c97902feb8a8a0c767bdbbb515ae227bc5fed7b767dfbf7ef8edd3db2fff4bedafecb0ae1f231d103de9ddd22dd5450ede35ab5d7fb5de34b08ab791dd
Type: fulltext; Mimetype: application/pdf

Authority records

Devagiri, Vishnu Manasa
