Boeva, Veselka, Professor
ORCID iD: orcid.org/0000-0003-3128-191x
Publications (10 of 13)
Boeva, V., Angelova, M. & Tsiporkova, E. (2019). A split-merge evolutionary clustering algorithm. In: ICAART 2019 - Proceedings of the 11th International Conference on Agents and Artificial Intelligence. Paper presented at 11th International Conference on Agents and Artificial Intelligence, ICAART; Prague, 19 February 2019 through 21 February 2019 (pp. 337-346). SciTePress, Vol. 2
2019 (English). In: ICAART 2019 - Proceedings of the 11th International Conference on Agents and Artificial Intelligence, SciTePress, 2019, Vol. 2, p. 337-346. Conference paper, Published paper (Refereed)
Abstract [en]

In this article we propose a bipartite correlation clustering technique that can be used to adapt the existing clustering solution to a clustering of newly collected data elements. The proposed technique is supposed to provide the flexibility to compute clusters on a new portion of data collected over a defined time period and to update the existing clustering solution with the newly computed one. Such an updated clustering should better reflect the current characteristics of the data by being able to examine clusters occurring in the considered time period and eventually capture interesting trends in the area. For example, some clusters will be updated by merging with ones from the newly constructed clustering while others will be transformed by splitting their elements among several new clusters. The proposed clustering algorithm, entitled Split-Merge Evolutionary Clustering, is evaluated and compared to another bipartite correlation clustering technique (PivotBiCluster) on two different case studies: expertise retrieval and patient profiling in healthcare. Copyright © 2019 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.

Place, publisher, year, edition, pages
SciTePress, 2019
Keywords
Bipartite Clustering, Data Mining, Evolutionary Clustering, PubMed Data, Unsupervised Learning, Artificial intelligence, Cluster analysis, Evolutionary algorithms, Bipartite correlation clustering, Case-studies, Clustering solutions, Current characteristic, Data elements, Clustering algorithms
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17896 (URN), 2-s2.0-85064827857 (Scopus ID), 9789897583506 (ISBN)
Conference
11th International Conference on Agents and Artificial Intelligence, ICAART; Prague, 19 February 2019 through 21 February 2019
Available from: 2019-05-21. Created: 2019-05-21. Last updated: 2019-05-21. Bibliographically approved.
Angelova, M., Vishnu Manasa, D., Boeva, V., Linde, P. & Lavesson, N. (2019). An Expertise Recommender System based on Data from an Institutional Repository (DiVA). In: Leslie Chan, Pierre Mounier (Ed.), Connecting the Knowledge Common from Projects to sustainable Infrastructure: The 22nd International conference on Electronic Publishing - Revised Selected Papers (pp. 135-149). OpenEdition Press
2019 (English). In: Connecting the Knowledge Common from Projects to sustainable Infrastructure: The 22nd International conference on Electronic Publishing - Revised Selected Papers / [ed] Leslie Chan, Pierre Mounier, OpenEdition Press, 2019, p. 135-149. Chapter in book (Refereed)
Abstract [en]

Finding experts in academia is an important practical problem, e.g. recruiting reviewers for reviewing conference, journal or project submissions, partner matching for research proposals, finding relevant M. Sc. or Ph. D. supervisors etc. In this work, we discuss an expertise recommender system that is built on data extracted from the Blekinge Institute of Technology (BTH) instance of the institutional repository system DiVA (Digital Scientific Archive). DiVA is a publication and archiving platform for research publications and student essays used by 46 publicly funded universities and authorities in Sweden and the rest of the Nordic countries (www.diva-portal.org). The DiVA classification system is based on the three-level classification system of the Swedish Higher Education Authority (UKÄ) and Statistics Sweden (SCB). Using the classification terms associated with student M. Sc. and B. Sc. theses published in the DiVA platform, we have developed a prototype system which can be used to identify and recommend subject thesis supervisors in academia.

Place, publisher, year, edition, pages
OpenEdition Press, 2019
Keywords
Text mining, Recommender system, Institutional repository, Ontology
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:bth-18095 (URN), 979-1-0365-3801-8 (ISBN), 979-1-0365-3802-5 (ISBN)
Note

open access

Available from: 2019-06-18. Created: 2019-06-18. Last updated: 2019-06-19. Bibliographically approved.
Lundberg, L., Lennerstad, H., Boeva, V. & García Martín, E. (2019). Handling non-linear relations in support vector machines through hyperplane folding. In: ACM International Conference Proceeding Series. Paper presented at 11th International Conference on Machine Learning and Computing, ICMLC 2019; Zhuhai; China; 22 February 2019 through 24 February (pp. 137-141). Association for Computing Machinery
2019 (English). In: ACM International Conference Proceeding Series, Association for Computing Machinery, 2019, p. 137-141. Conference paper, Published paper (Refereed)
Abstract [en]

We present a new method, called hyperplane folding, that increases the margin in Support Vector Machines (SVMs). Based on the location of the support vectors, the method splits the dataset into two parts, rotates one part of the dataset and then merges the two parts again. This procedure increases the margin as long as the margin is smaller than half of the shortest distance between any pair of data points from the two different classes. We provide an algorithm for the general case with n-dimensional data points. A small experiment with three folding iterations on 3-dimensional data points with non-linear relations shows that the margin does indeed increase and that the accuracy improves with a larger margin. The method can use any standard SVM implementation plus some basic manipulation of the data points, i.e., splitting, rotating and merging. Hyperplane folding also increases the interpretability of the data. © 2019 Association for Computing Machinery.

Place, publisher, year, edition, pages
Association for Computing Machinery, 2019
Keywords
Hyperplane folding, Hyperplane hinging, Non-linear relations, Piecewise linear classification, Support vector machines, Geometry, Piecewise linear techniques, Vectors, Different class, Interpretability, Nonlinear relations, Piecewise linear, Support vector, Support vector machine (SVMs)
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18039 (URN), 10.1145/3318299.3318319 (DOI), 2-s2.0-85066460409 (Scopus ID)
Conference
11th International Conference on Machine Learning and Computing, ICMLC 2019; Zhuhai; China; 22 February 2019 through 24 February
Available from: 2019-06-14. Created: 2019-06-14. Last updated: 2019-06-17. Bibliographically approved.
García Martín, E., Lavesson, N., Grahn, H., Casalicchio, E. & Boeva, V. (2019). How to Measure Energy Consumption in Machine Learning Algorithms. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): ECMLPKDD 2018: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshops. Lecture Notes in Computer Science. Springer, Cham. Paper presented at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2018; Dublin; Ireland; 10 September 2018 through 14 September 2018 (pp. 243-255). Vol. 11329
2019 (English). In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): ECMLPKDD 2018: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshops. Lecture Notes in Computer Science. Springer, Cham, 2019, Vol. 11329, p. 243-255. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning algorithms are responsible for a significant amount of computations. These computations are increasing with the advancements in different machine learning fields. For example, fields such as deep learning require algorithms to run for weeks, consuming vast amounts of energy. While there is a trend in optimizing machine learning algorithms for performance and energy consumption, there is still little knowledge on how to estimate an algorithm’s energy consumption. Currently, a straightforward cross-platform approach to estimate energy consumption for different types of algorithms does not exist. For that reason, well-known researchers in computer architecture have published extensive works on approaches to estimate the energy consumption. This study presents a survey of methods to estimate energy consumption, and maps them to specific machine learning scenarios. Finally, we illustrate our mapping suggestions with a case study, where we measure energy consumption in a big data stream mining scenario. Our ultimate goal is to bridge the current gap that exists to estimate energy consumption in machine learning scenarios.

Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 11329
Keywords
Computer architecture, Energy efficiency, Green computing, Machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17209 (URN), 10.1007/978-3-030-13453-2_20 (DOI), 9783030134525 (ISBN)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2018; Dublin; Ireland; 10 September 2018 through 14 September 2018
Funder
Knowledge Foundation, 20140032
Available from: 2018-11-01. Created: 2018-11-01. Last updated: 2019-04-18. Bibliographically approved.
Angelova, M., Vishnu Manasa, D., Boeva, V., Linde, P. & Lavesson, N. (2018). An Expertise Recommender System Based on Data from an Institutional Repository (DiVA). In: Proceedings of the 22nd edition of the International Conference on ELectronic PUBlishing. Paper presented at 22nd edition of the International Conference on ELectronic PUBlishing - Connecting the Knowledge Commons: From Projects to Sustainable Infrastructure, Toronto.
2018 (English). In: Proceedings of the 22nd edition of the International Conference on ELectronic PUBlishing, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

Finding experts in academia is an important practical problem, e.g. recruiting reviewers for reviewing conference, journal or project submissions, partner matching for research proposals, finding relevant M. Sc. or Ph. D. supervisors etc. In this work, we discuss an expertise recommender system that is built on data extracted from the Blekinge Institute of Technology (BTH) instance of the institutional repository system DiVA (Digital Scientific Archive). DiVA is a publication and archiving platform for research publications and student essays used by 46 publicly funded universities and authorities in Sweden and the rest of the Nordic countries (www.diva-portal.org). The DiVA classification system is based on the three-level classification system of the Swedish Higher Education Authority (UKÄ) and Statistics Sweden (SCB). Using the classification terms associated with student M. Sc. and B. Sc. theses published in the DiVA platform, we have developed a prototype system which can be used to identify and recommend subject thesis supervisors in academia.

Keywords
Text mining, Recommender system, Institutional repository, Ontology
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:bth-16660 (URN), 10.4000/proceedings.elpub.2018.17 (DOI)
Conference
22nd edition of the International Conference on ELectronic PUBlishing - Connecting the Knowledge Commons: From Projects to Sustainable Infrastructure, Toronto
Note

open access

Available from: 2018-06-29. Created: 2018-06-29. Last updated: 2019-06-18. Bibliographically approved.
Boeva, V., Lundberg, L., Angelova, M. & Kohstall, J. (2018). Cluster Validation Measures for Label Noise Filtering. In: JardimGoncalves, R; Mendonca, JP; Jotsov, V; Marques, M; Martins, J; Bierwolf, R (Ed.), 9th International Conference on Intelligent Systems 2018: Theory, Research and Innovation in Applications, IS 2018 - Proceedings. Paper presented at 9th International Conference on Intelligent Systems, IS 2018; Funchal - Madeira; Portugal; 25 September 2018 through 27 (pp. 109-116). Institute of Electrical and Electronics Engineers Inc.
2018 (English). In: 9th International Conference on Intelligent Systems 2018: Theory, Research and Innovation in Applications, IS 2018 - Proceedings / [ed] JardimGoncalves, R; Mendonca, JP; Jotsov, V; Marques, M; Martins, J; Bierwolf, R, Institute of Electrical and Electronics Engineers Inc., 2018, p. 109-116. Conference paper, Published paper (Refereed)
Abstract [en]

Cluster validation measures are designed to find the partitioning that best fits the underlying data. In this paper, we show that these well-known and scientifically proven validation measures can also be used in a different context, i.e., for filtering mislabeled instances or class outliers prior to training in supervised learning problems. A technique, entitled CVI-based Outlier Filtering, is proposed in which mislabeled instances are identified and eliminated from the training set, and a classification hypothesis is then built from the set of remaining instances. The proposed approach assigns each instance several cluster validation scores representing its potential of being an outlier with respect to the clustering properties the used validation measures assess. We examine CVI-based Outlier Filtering and compare it against the LOF detection method on ten data sets from the UCI data repository using five well-known learning algorithms and three different cluster validation indices. In addition, we study two approaches for filtering mislabeled instances: local and global. Our results show that for most learning algorithms and data sets, the proposed CVI-based outlier filtering algorithm outperforms the baseline method (LOF). The greatest increase in classification accuracy has been achieved by combining at least two of the used cluster validation indices and global filtering of mislabeled instances. © 2018 IEEE.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2018
Keywords
Class noise, Classification, Cluster validation measures, Label noise, Classification (of information), Intelligent systems, Learning algorithms, Statistics, Classification accuracy, Cluster validation, Clustering properties, Data repositories, Detection methods, Filtering algorithm, Learning problem, Clustering algorithms
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18023 (URN), 10.1109/IS.2018.8710495 (DOI), 000469337900017 (), 2-s2.0-85065973083 (Scopus ID), 9781538670972 (ISBN)
Conference
9th International Conference on Intelligent Systems, IS 2018; Funchal - Madeira; Portugal; 25 September 2018 through 27
Available from: 2019-06-14. Created: 2019-06-14. Last updated: 2019-07-01. Bibliographically approved.
Boeva, V., Angelova, M., Lavesson, N., Rosander, O. & Tsiporkova, E. (2018). Evolutionary clustering techniques for expertise mining scenarios. In: van den Herik J., Rocha A.P. (Ed.), ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, Volume 2. Paper presented at 10th International Conference on Agents and Artificial Intelligence, ICAART, Funchal, Madeira (pp. 523-530). SciTePress, Vol. 2
2018 (English). In: ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, Volume 2 / [ed] van den Herik J., Rocha A.P., SciTePress, 2018, Vol. 2, p. 523-530. Conference paper, Published paper (Refereed)
Abstract [en]

The problem addressed in this article concerns the development of evolutionary clustering techniques that can be applied to adapt the existing clustering solution to a clustering of newly collected data elements. We are interested in clustering approaches that are specially suited for adapting clustering solutions in the expertise retrieval domain. This interest is inspired by practical applications such as expertise retrieval systems where the information available in the system database is periodically updated by extracting new data. The experts available in the system database are usually partitioned into a number of disjoint subject categories. It is becoming impractical to re-cluster this large volume of available information. Therefore, the objective is to update the existing expert partitioning by the clustering produced on the newly extracted experts. Three different evolutionary clustering techniques are considered to be suitable for this scenario. The proposed techniques are initially evaluated by applying the algorithms on data extracted from the PubMed repository. Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.

Place, publisher, year, edition, pages
SciTePress, 2018
Keywords
Data Mining, Expert Finding, Health Science, Knowledge Management, Natural Language Processing, Artificial intelligence, Cluster analysis, Natural language processing systems, Search engines, Clustering approach, Clustering solutions, Data elements, Evolutionary clustering, Retrieval systems, System database
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-16224 (URN), 2-s2.0-85046663632 (Scopus ID), 9789897582752 (ISBN)
Conference
10th International Conference on Agents and Artificial Intelligence, ICAART, Funchal, Madeira
Available from: 2018-05-24. Created: 2018-05-24. Last updated: 2018-05-24. Bibliographically approved.
García Martín, E., Lavesson, N., Grahn, H., Casalicchio, E. & Boeva, V. (2018). Hoeffding Trees with nmin adaptation. In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018). Paper presented at 5th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA), 1–4 October 2018, Turin (pp. 70-79). IEEE
2018 (English). In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018), IEEE, 2018, p. 70-79. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning software accounts for a significant amount of energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case of data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beginning of the execution. We have observed that having fixed parameters leads to unnecessary computations, thus making the algorithm energy inefficient. In this paper we present the nmin adaptation method for Hoeffding trees. This method adapts the value of the nmin parameter, which significantly affects the energy consumption of the algorithm. The method reduces unnecessary computations and memory accesses, thus reducing the energy, while the accuracy is only marginally affected. We experimentally compared VFDT (Very Fast Decision Tree, the first Hoeffding tree algorithm) and CVFDT (Concept-adapting VFDT) with the VFDT-nmin (VFDT with nmin adaptation). The results show that VFDT-nmin consumes up to 27% less energy than the standard VFDT, and up to 92% less energy than CVFDT, trading off a few percent of accuracy in a few datasets.

Place, publisher, year, edition, pages
IEEE, 2018
Series
Proceedings of the International Conference on Data Science and Advanced Analytics, ISSN 2472-1573
Keywords
data stream mining; green artificial intelligence; energy efficiency; hoeffding trees; energy aware machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17207 (URN), 10.1109/DSAA.2018.00017 (DOI), 000459238600008 (), 978-1-5386-5090-5 (ISBN)
Conference
5th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA), 1–4 October 2018, Turin
Funder
Knowledge Foundation, 20140032
Available from: 2018-11-01. Created: 2018-11-01. Last updated: 2019-04-05. Bibliographically approved.
Nordahl, C., Grahn, H., Persson, M. & Boeva, V. (2018). Organizing, Visualizing and Understanding Households Electricity Consumption Data through Clustering Analysis. In: Organizing, Visualizing and Understanding Households Electricity Consumption Data through Clustering Analysis. Paper presented at 2ND WORKSHOP ON AI FOR AGING, REHABILITATION AND INDEPENDENT ASSISTED LIVING (ARIAL) @IJCAI'18, Stockholm. https://sites.google.com/view/arial2018/accepted-papersprogram
2018 (English). In: Organizing, Visualizing and Understanding Households Electricity Consumption Data through Clustering Analysis, https://sites.google.com/view/arial2018/accepted-papersprogram, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a cluster analysis approach for organizing, visualizing and understanding households’ electricity consumption data. We initially partition the consumption data into a number of clusters with similar daily electricity consumption profiles. The centroids of each cluster can be seen as representative signatures of a household’s electricity consumption behaviors. We evaluate the proposed approach by conducting a number of experiments on electricity consumption data of ten selected households. Our results show that the approach is suitable for data analysis, understanding and creating electricity consumption behavior models.

Place, publisher, year, edition, pages
https://sites.google.com/view/arial2018/accepted-papersprogram, 2018
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:bth-17439 (URN)
Conference
2ND WORKSHOP ON AI FOR AGING, REHABILITATION AND INDEPENDENT ASSISTED LIVING (ARIAL) @IJCAI'18, Stockholm
Projects
BigData@BTH
Available from: 2018-12-19. Created: 2018-12-19. Last updated: 2019-01-16. Bibliographically approved.
Boeva, V., Lundberg, L., Kota, S. M. H. & Sköld, L. (2017). Analysis of Organizational Structure through Cluster Validation Techniques: Evaluation of email communications at an organizational level. In: Gottumukkala, R; Ning, X; Dong, G; Raghavan, V; Aluru, S; Karypis, G; Miele, L; Wu, X (Ed.), 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2017). Paper presented at 17th IEEE International Conference on Data Mining (ICDMW), NOV 18-21, 2017, New Orleans, LA (pp. 170-176). IEEE
2017 (English). In: 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2017) / [ed] Gottumukkala, R; Ning, X; Dong, G; Raghavan, V; Aluru, S; Karypis, G; Miele, L; Wu, X, IEEE, 2017, p. 170-176. Conference paper, Published paper (Refereed)
Abstract [en]

In this work, we report an ongoing study that aims to apply cluster validation measures for analyzing email communications at an organizational level of a company. This analysis can be used to evaluate the company structure and to produce further recommendations for structural improvements. Our initial evaluations, based on data in the form of email logs and organizational structure for a large European telecommunication company, show that cluster validation techniques can be useful tools for assessing the organizational structure using objective analysis of internal email communications, and for simulating and studying different reorganization scenarios.

Place, publisher, year, edition, pages
IEEE, 2017
Series
International Conference on Data Mining Workshops, ISSN 2375-9232
Keywords
cluster validation measures, data analysis, human capital management, internal communication, organizational structure
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15992 (URN), 10.1109/ICDMW.2017.28 (DOI), 000425845700022 (), 978-1-5386-3800-2 (ISBN)
Conference
17th IEEE International Conference on Data Mining (ICDMW), NOV 18-21, 2017, New Orleans, LA
Available from: 2018-03-23. Created: 2018-03-23. Last updated: 2018-03-23. Bibliographically approved.