A Domain Adaptation Technique through Cluster Boundary Integration
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3371-5347
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0003-3128-191X
Blekinge Tekniska Högskola, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-3010-8798
2025 (English). In: Evolving Systems, ISSN 1868-6478, E-ISSN 1868-6486, Vol. 16, no. 1, article id 14. Article in journal (Refereed). Published
Abstract [en]

Many machine learning models deployed on smart or edge devices experience a drop in performance when data from new domains arrives. This paper proposes a novel unsupervised domain adaptation algorithm, DIBCA++, to deal with such situations. The algorithm uses only the clusters' mean, standard deviation, and size, which keeps its storage and computation requirements modest. The study also presents the explainability aspect of the algorithm. DIBCA++ is compared with its predecessor, DIBCA, and its applicability and performance are studied and evaluated in two real-world scenarios. The first addresses the Global Navigation Satellite System activation problem from the smart logistics domain; the second is a human activity recognition task, which identifies the different activities a person performs. Both scenarios involve time series data, so DIBCA++ also contributes towards addressing the current gap regarding domain adaptation solutions for time series data. Based on the experimental results, DIBCA++ improves on DIBCA: it performed better in all human activity recognition experiments and in 82.5% of the experimental scenarios in the smart logistics use case. The results also showcase the need for, and benefit of, personalizing the models using DIBCA++, along with the ability to transfer new knowledge between domains, leading to improved performance. The adapted source and target models performed better in 70% and 80% of cases, respectively, in an experimental scenario conducted on smart logistics.
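The abstract notes that only each cluster's mean, standard deviation, and size are stored. The sketch below is not the DIBCA++ procedure itself. It only illustrates, under simplifying assumptions, why those three statistics can be enough to combine two clusters without revisiting the raw data. The names ClusterSummary and merge_summaries are hypothetical, the data is one-dimensional, and the standard deviation is assumed to be the population (not sample) standard deviation.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSummary:
    """Hypothetical compact description of a 1-D cluster."""
    size: int     # number of points in the cluster
    mean: float   # arithmetic mean of the points
    std: float    # population standard deviation of the points


def merge_summaries(a: ClusterSummary, b: ClusterSummary) -> ClusterSummary:
    """Combine two summaries into the summary of the union of their points."""
    n = a.size + b.size
    mean = (a.size * a.mean + b.size * b.mean) / n
    # For each cluster, E[x^2] = std^2 + mean^2; pool those second moments.
    second_moment = (
        a.size * (a.std ** 2 + a.mean ** 2)
        + b.size * (b.std ** 2 + b.mean ** 2)
    ) / n
    variance = max(second_moment - mean ** 2, 0.0)  # guard against rounding
    return ClusterSummary(n, mean, math.sqrt(variance))


# Illustrative data only: summarise two small groups, then merge the summaries.
source_points = [1.0, 2.0, 3.0]
target_points = [10.0, 12.0]
source = ClusterSummary(len(source_points),
                        statistics.fmean(source_points),
                        statistics.pstdev(source_points))
target = ClusterSummary(len(target_points),
                        statistics.fmean(target_points),
                        statistics.pstdev(target_points))
merged = merge_summaries(source, target)
```

The merged summary matches what would be computed from all five points directly, so the raw points never need to be kept. For multi-dimensional data, the same formulas could be applied per feature.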

Place, publisher, year, edition, pages
Springer Nature, 2025. Vol. 16, no. 1, article id 14
Keywords [en]
Cluster integration, Clustering techniques, Domain adaptation
National Category
Identifiers
URN: urn:nbn:se:bth-26090
DOI: 10.1007/s12530-024-09635-z
ISI: 001363397000001
Scopus ID: 2-s2.0-85210317128
OAI: oai:DiVA.org:bth-26090
DiVA, id: diva2:1850083
Part of project
HINTS – Intelligenta verkligheter med människan i centrum
Funder
Knowledge Foundation, 20220068
Available from: 2024-04-09. Created: 2024-04-09. Last updated: 2025-09-30. Bibliographically approved
In thesis
1. Mining Evolving and Heterogeneous Data: Cluster-based Analysis Techniques
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

A large amount of data is generated in fields such as IoT and smart monitoring applications, raising demand for suitable data analysis and mining techniques. Data produced by such systems have many distinct characteristics, such as continuous generation, evolving nature, multi-source origin, and heterogeneity, and are usually not annotated. Clustering is an unsupervised learning technique used to group and analyze unlabeled data. Conventional clustering algorithms are unsuitable for data with these characteristics because of memory and computational constraints and their inability to handle the heterogeneous and evolving nature of the data. Therefore, novel clustering approaches are needed to analyze and interpret such challenging data.

This thesis focuses on building and studying advanced clustering algorithms that can address the main challenges of today's real-world data: its evolving and heterogeneous nature. An evolving clustering approach capable of continuously updating the generated clustering solution in the presence of new data is initially proposed and later extended to address the challenges of multi-view data applications. Multi-view or multi-source data presents the studied phenomenon or system from different perspectives (views) and can reveal interesting knowledge that is invisible when only one view is considered and analyzed. This has motivated us to continue exploring data from different perspectives in several other studies of this thesis. Domain shift is another common problem when data is obtained from various devices or locations, leading to a drop in the performance of machine learning models if they are not adapted to the current domain (device, location, etc.). The thesis explores the domain adaptation problem in a resource-constrained way using cluster integration techniques. A new hybrid clustering technique for analyzing heterogeneous data is also proposed. It produces homogeneous groups, facilitating continuous monitoring and fault detection.

The algorithms and techniques proposed in this thesis are evaluated on various data sets, including real-world data from industrial partners in domains like smart building systems, smart logistics, and performance monitoring of industrial assets. The obtained results demonstrate the robustness of the algorithms for modeling, analyzing, and mining evolving data streams and/or heterogeneous data. The algorithms can adequately adapt single-view and multi-view clustering models by continuously integrating newly arriving data.
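As a rough illustration of what continuously integrating newly arriving data into a clustering model can look like, the sketch below keeps a running size, mean, and spread per cluster and updates them one point at a time with Welford's online algorithm. It is not the thesis's method. The names RunningCluster and assign are hypothetical, the data is one-dimensional, and the nearest-mean assignment rule is an assumption made for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningCluster:
    """Hypothetical 1-D cluster updated incrementally (Welford's algorithm)."""
    size: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        """Fold one new point into the summary without storing it."""
        self.size += 1
        delta = x - self.mean
        self.mean += delta / self.size
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        """Population standard deviation of the points seen so far."""
        return math.sqrt(self.m2 / self.size) if self.size else 0.0


def assign(clusters: list[RunningCluster], x: float) -> int:
    """Add x to the cluster with the nearest mean and return its index."""
    idx = min(range(len(clusters)), key=lambda i: abs(clusters[i].mean - x))
    clusters[idx].add(x)
    return idx


# Illustrative stream: two seeded clusters absorb points as they arrive.
clusters = [RunningCluster(), RunningCluster()]
clusters[0].add(0.0)
clusters[1].add(10.0)
for value in [0.5, 9.0, 11.0, -0.5]:
    assign(clusters, value)
```

Each update costs constant time and memory per cluster, which is the property that makes this style of summary attractive for streaming data.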

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2024
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2024:06
Keywords
Domain Adaptation, Evolving Clustering, Heterogeneous Data, Multi-View Clustering, Streaming Data
National Category
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:bth-26098
ISBN: 978-91-7295-479-3
Public defence
2024-05-22, J1630, Campus Gräsvik, Karlskrona, 09:00 (English)
Opponent
Supervisors
Available from: 2024-04-10. Created: 2024-04-09. Last updated: 2025-09-30. Bibliographically approved

Open Access in DiVA

fulltext (1839 kB), 63 downloads
File information
File name: FULLTEXT01.pdf. File size: 1839 kB. Checksum (SHA-512):
ce6b86eccaa598a93f59b5fc170c76597a7a9f6bc77711d23d1e3b40f453c32d772b3cdaac95db33c30053d9361e0a0e410843d8dace713faf35e7cd823877c9
Type: fulltext. Mimetype: application/pdf

Authors

Devagiri, Vishnu Manasa; Boeva, Veselka; Abghari, Shahrooz
