Multi-view Clustering Analyses for District Heating Substations
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-3010-8798
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3128-191x
NODA Intelligent Systems AB, SWE.
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0001-9947-1088
2020 (English). In: DATA 2020 - Proceedings of the 9th International Conference on Data Science, Technology and Applications / [ed] Hammoudi S., Quix C., Bernardino J., SciTePress, 2020, p. 158-168. Conference paper, Published paper (Refereed)
Abstract [en]

In this study, we propose a multi-view clustering approach for mining and analysing multi-view network datasets. The proposed approach is applied and evaluated on a real-world scenario for monitoring and analysing district heating (DH) network conditions and identifying substations with sub-optimal behaviour. Initially, the geographical locations of the substations are used to build an approximate graph representation of the DH network. Two different analyses can then be applied in this context: step-wise and parallel-wise multi-view clustering. The step-wise analysis sequentially considers and analyses substations with respect to a few different views. At each step, a new clustering solution is built on top of the one generated by the previously considered view, which organizes the substations in a hierarchical structure that can be used for multi-view comparisons. The parallel-wise analysis, on the other hand, provides the opportunity to analyse substations with regard to two different views in parallel. This analysis aims to represent and identify the relationships between substations by organizing them in a bipartite graph and analysing the substations’ distribution with respect to each view. The proposed data analysis and visualization approach gives domain experts the means to analyse DH network performance. In addition, it facilitates the identification of substations with deviating operational behaviour through comparative analysis with their closely located neighbours.
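The difference between the two analyses can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy 1-D `gap_cluster` routine stands in for whatever clustering algorithm the paper uses, and the view names (`supply_temp`, `flow`), substation ids, and gap thresholds are hypothetical values chosen only for illustration.

```python
def gap_cluster(values, max_gap):
    """Toy 1-D clustering (a stand-in, not the paper's algorithm).

    values: dict {substation_id: float}. Items are sorted by value and a new
    cluster starts whenever the gap to the previous item exceeds max_gap.
    Returns a list of clusters, each a list of substation ids.
    """
    clusters = []
    for sid in sorted(values, key=values.get):
        if clusters and values[sid] - values[clusters[-1][-1]] <= max_gap:
            clusters[-1].append(sid)
        else:
            clusters.append([sid])
    return clusters


def stepwise(views, max_gaps):
    """Step-wise analysis: each view re-clusters inside the groups
    produced by the previously considered view, giving a hierarchy.

    Returns one list of groups per step (level of the hierarchy).
    """
    groups = [sorted(views[0])]  # start with all substations in one group
    levels = []
    for view, gap in zip(views, max_gaps):
        next_groups = []
        for group in groups:
            subset = {sid: view[sid] for sid in group}
            next_groups.extend(gap_cluster(subset, gap))
        groups = next_groups
        levels.append(groups)
    return levels


def parallel(view_a, view_b, gap_a, gap_b):
    """Parallel-wise analysis: cluster two views independently and link
    their clusters in a bipartite graph. An edge (i, j) exists when cluster i
    of view A and cluster j of view B share at least one substation.
    """
    clusters_a = gap_cluster(view_a, gap_a)
    clusters_b = gap_cluster(view_b, gap_b)
    edges = {}
    for i, a in enumerate(clusters_a):
        for j, b in enumerate(clusters_b):
            shared = sorted(set(a) & set(b))
            if shared:
                edges[(i, j)] = shared
    return edges


# Hypothetical measurements for four substations in two views.
supply_temp = {"s1": 80.0, "s2": 81.0, "s3": 95.0, "s4": 96.0}
flow = {"s1": 1.0, "s2": 5.0, "s3": 2.0, "s4": 2.2}

levels = stepwise([supply_temp, flow], [2.0, 1.5])
edges = parallel(supply_temp, flow, 2.0, 1.5)
```

In this toy data, the step-wise run first groups the substations by `supply_temp` and then splits the first group on `flow`, while the parallel-wise run shows that one `supply_temp` cluster is spread over two `flow` clusters, which is the kind of distribution the bipartite view is meant to expose.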

Place, publisher, year, edition, pages
SciTePress, 2020. p. 158-168
Keywords [en]
Data Mining, Multi-view Clustering, Multi-layer Clustering, Time Series, District Heating Substation
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-20452
DOI: 10.5220/0009780001580168
OAI: oai:DiVA.org:bth-20452
DiVA, id: diva2:1469602
Conference
9th International Conference on Data Science, Technology and Applications, DATA 2020, Virtual, Online; France, 7 July 2020 through 9 July 2020
Part of project
Bigdata@BTH - Scalable resource-efficient systems for big data analytics, Knowledge Foundation
Funder
Knowledge Foundation, 20140032
Note

open access

Available from: 2020-09-22 Created: 2020-09-22 Last updated: 2025-09-30 Bibliographically approved
In thesis
1. Data Mining Approaches for Outlier Detection Analysis
2020 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Outlier detection is studied and applied in many domains. Outliers arise for different reasons, such as fraudulent activities, structural defects, health problems, and mechanical issues. The detection of outliers is a challenging task that can reveal system faults and fraud, and can save people's lives. Outlier detection techniques are often domain-specific. The main challenge in outlier detection relates to modelling the normal behaviour in order to identify abnormalities. The choice of model is important, i.e., an unsuitable data model can lead to poor results. This requires a good understanding and interpretation of the data, the constraints, and the requirements of the domain problem. Outlier detection is largely an unsupervised problem because labeled data is often unavailable and expensive to obtain.

In this thesis, we study and apply a combination of both machine learning and data mining techniques to build data-driven and domain-oriented outlier detection models. We focus on three real-world application domains: maritime surveillance, district heating, and online media and sequence datasets. We show the importance of data preprocessing as well as feature selection in building suitable methods for data modelling. We take advantage of both supervised and unsupervised techniques to create hybrid methods. 

More specifically, we propose a rule-based anomaly detection system using open data for the maritime surveillance domain. We exploit sequential pattern mining for identifying contextual and collective outliers in online media data. We propose a minimum spanning tree clustering technique for detecting groups of outliers in online media and sequence data. We develop a few higher order mining approaches for identifying manual changes and deviating behaviours in the heating systems at the building level. The proposed approaches are shown to be capable of explaining the underlying properties of the detected outliers. This can help domain experts narrow down the scope of analysis and understand the reasons for such anomalous behaviours. We also investigate the reproducibility of the proposed models in similar application domains.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2020. p. 251
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 9
Keywords
outlier detection, data modelling, machine learning, clustering analysis, data stream mining
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:bth-20454 (URN)
9789172954090 (ISBN)
Public defence
2020-12-01, J1630, Karlskrona, 13:00 (English)
Funder
Knowledge Foundation, 20140032
Available from: 2020-10-16 Created: 2020-10-12 Last updated: 2025-09-30 Bibliographically approved

Open Access in DiVA

Multi-view Clustering Analyses for District Heating Substations (474 kB), 413 downloads
File information
File name: FULLTEXT01.pdf
File size: 474 kB
Checksum (SHA-512): ff4322620319d0cb99b1cd7c32b01b4c45afa242bd77b94a5845decda842054f4d6fcc213f5cddb9d767b6cc48f4c76f655645f773f3d83c259b04aa7460d591
Type: fulltext
Mimetype: application/pdf


Authority records

Abghari, Shahrooz; Boeva, Veselka; Grahn, Håkan
