Lundberg, Lars
Publications (10 of 159)
Kohstall, J., Boeva, V., Lundberg, L. & Angelova, M. (2020). Ensembles of Cluster Validation Indices for Label Noise Filtering. In: Studies in Computational Intelligence, Vol. 864 (pp. 71-98). Springer
2020 (English) In: Studies in Computational Intelligence, Springer, 2020, Vol. 864, p. 71-98. Chapter in book (Refereed)
Abstract [en]

Cluster validation measures are designed to find the partitioning that best fits the underlying data. In this study, we show that these measures can be used for identifying mislabeled instances or class outliers prior to training in supervised learning problems. We introduce an ensemble technique, entitled CVI-based Outlier Filtering, which identifies and eliminates mislabeled instances from the training set, and then builds a classification hypothesis from the set of remaining instances. Our approach assigns to each instance in the training set several cluster validation scores representing its potential of being a class outlier with respect to the clustering properties that the validation measures assess. In this respect, the proposed approach may be referred to as a multi-criteria outlier filtering measure. In this work, we specifically study and evaluate value-based ensembles of cluster validation indices. The added value of this approach in comparison to the logical and rank-based ensemble solutions is discussed and further demonstrated. © 2020, Springer Nature Switzerland AG.
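
The idea of scoring each training instance with a cluster validation index can be illustrated with a minimal sketch. It uses a single index, the per-instance silhouette value with class labels treated as clusters, and a fixed threshold; the function names, the threshold of 0.0 and the toy data are assumptions made for illustration, and the ensemble of several indices described in the abstract is not reproduced here.

```python
import math


def silhouette_per_instance(points, labels):
    """Return one silhouette value per instance, treating class labels as clusters."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Collect distances from p to every other instance, grouped by label.
        dists = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(other, []).append(math.dist(p, q))
        means = {k: sum(v) / len(v) for k, v in dists.items()}
        a = means.get(lab)  # mean distance to the instance's own class
        others = [m for k, m in means.items() if k != lab]
        if a is None or not others:
            scores.append(0.0)  # singleton class or only one class present
            continue
        b = min(others)  # mean distance to the nearest other class
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def filter_label_noise(points, labels, threshold=0.0):
    """Keep the indices whose silhouette value is at least `threshold` (hypothetical cut-off)."""
    scores = silhouette_per_instance(points, labels)
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return keep, scores


# Toy data: the last point lies inside class "a" but carries label "b".
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (0.05, 0.05)]
labels = ["a", "a", "a", "b", "b", "b", "b"]
keep, scores = filter_label_noise(points, labels)
print(keep)
```

A value-based ensemble in the sense of the abstract would combine the scores of several such indices per instance before thresholding; how the combination is done is not stated in this record.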

Place, publisher, year, edition, pages
Springer, 2020 Edition: 864
Series
Studies in Computational Intelligence, ISSN 1860949X ; 864
National Category
Computer Sciences Information Systems
Identifiers
urn:nbn:se:bth-19346 (URN); 10.1007/978-3-030-38704-4_4 (DOI); 2-s2.0-85081586798 (Scopus ID)
Projects
Scalable resource efficient systems for big data analytics
Funder
Knowledge Foundation, 20140032
Available from: 2020-03-30 Created: 2020-03-30 Last updated: 2020-04-30 Bibliographically approved
Shirinbab, S., Lundberg, L. & Casalicchio, E. (2020). Performance evaluation of containers and virtual machines when running Cassandra workload concurrently. Concurrency and Computation, Article ID e5693.
2020 (English) In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, article id e5693. Article in journal (Refereed), Epub ahead of print
Abstract [en]

NoSQL distributed databases are often used as Big Data platforms. To provide efficient resource sharing and cost effectiveness, such distributed databases typically run concurrently on a virtualized infrastructure that could be implemented using hypervisor-based virtualization or container-based virtualization. Hypervisor-based virtualization is a mature technology but imposes overhead on CPU, networking, and disk. Container-based virtualization has recently become more popular because it shares the operating system resources and simplifies the deployment of applications. This article presents a performance comparison between multiple instances of VMware VMs and Docker containers running concurrently. Our workload models a real-world Big Data Apache Cassandra application from Ericsson. As a baseline, we evaluated the performance of Cassandra when running on the nonvirtualized physical infrastructure. Our study shows that Docker has lower overhead compared with VMware; the performance on the container-based infrastructure was as good as on the nonvirtualized infrastructure. Our performance evaluations also show that running multiple instances of a Cassandra database concurrently affected the performance of read and write operations differently; for both VMware and Docker, the maximum number of read operations was reduced when we ran several instances concurrently, whereas the maximum number of write operations increased when we ran instances concurrently.

Place, publisher, year, edition, pages
Wiley-Blackwell, 2020
Keywords
Cassandra; cloud computing; containers; performance evaluation; virtual machine
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-19270 (URN); 10.1002/cpe.5693 (DOI); 000513133800001 ()
Note

open access

Available from: 2020-03-05 Created: 2020-03-05 Last updated: 2020-03-05 Bibliographically approved
Niyizamwiyitira, C. & Lundberg, L. (2019). A utilization-based schedulability test of real-time systems running on a multiprocessor virtual machine. Computer journal, 62(6), 884-904, Article ID bxy005.
2019 (English) In: Computer journal, ISSN 0010-4620, E-ISSN 1460-2067, Vol. 62, no 6, p. 884-904, article id bxy005. Article in journal (Refereed), Published
Abstract [en]

We consider a real-time application that executes in a VM with multiple virtual cores. Tasks are scheduled globally using fixed-priority scheduling. In order to avoid Dhall's effect, we classify tasks into two priority classes: heavy and light. Heavy tasks have higher priority than light tasks. For light tasks we use rate monotonic priority assignment. We propose a utilization-based schedulability test. If the task set is schedulable, we provide an assignment of priorities to tasks. The input to the test is the task set, the number of cores in the VM, and the period, deadline and blocking time for the VM. We evaluate how jitter, when scheduling VMs on the hypervisor level, affects the schedulability of the real-time tasks running in the VM. The schedulability of the real-time tasks in the VM decreases when the hypervisor jitter increases, but on the other hand the schedulability on the hypervisor level increases if we allow more jitter, i.e. there is a trade-off. Our results make it possible to evaluate this trade-off and make informed decisions when selecting scheduling parameters on the hypervisor level. Simulations show that the priority assignment used by our algorithm schedules more task sets than using rate monotonic priority assignment. © 2019 The British Computer Society. All rights reserved.
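
The two-class priority assignment mentioned in the abstract can be sketched as follows. This is only an illustration of the heavy/light split with rate monotonic ordering for light tasks; the `Task` type, the `heavy_threshold` value of 0.5 and the ordering among heavy tasks (by decreasing utilization) are assumptions, since the record does not state them, and the schedulability test itself is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    wcet: float    # worst-case execution time
    period: float  # activation period

    @property
    def utilization(self) -> float:
        return self.wcet / self.period


def assign_priorities(tasks, heavy_threshold):
    """Return tasks from highest to lowest priority.

    Heavy tasks (utilization above the hypothetical threshold) come first,
    so a high-utilization task with a long period is not starved by many
    short-period tasks, which is the situation behind Dhall's effect.
    Light tasks follow in rate monotonic order (shorter period first).
    """
    heavy = [t for t in tasks if t.utilization > heavy_threshold]
    light = [t for t in tasks if t.utilization <= heavy_threshold]
    heavy.sort(key=lambda t: -t.utilization)  # assumed ordering among heavy tasks
    light.sort(key=lambda t: t.period)        # rate monotonic
    return heavy + light


tasks = [Task("t1", 1, 10), Task("t2", 9, 12), Task("t3", 2, 5), Task("t4", 1, 4)]
order = [t.name for t in assign_priorities(tasks, heavy_threshold=0.5)]
print(order)
```

Under plain rate monotonic assignment, `t2` would receive the lowest priority because it has the longest period; placing it in the heavy class moves it to the front.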

Place, publisher, year, edition, pages
Oxford University Press, 2019
Keywords
Global fixed priority scheduling, Hard real-time scheduling, Multiprocessor utilization based schedulability test, Virtual multiprocessor scheduling, VM deadline, Economic and social effects, Interactive computer systems, Jitter, Multiprocessing systems, Scheduling, Scheduling algorithms, Virtual machine, Fixed priority scheduling, Hard real-time, Multi processor scheduling, Schedulability test, Real time systems
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18589 (URN); 10.1093/comjnl/bxz005 (DOI); 000482395200007 (); 2-s2.0-85068528659 (Scopus ID)
Funder
Knowledge Foundation, 20140032
Note

open access

Available from: 2019-09-09 Created: 2019-09-09 Last updated: 2019-09-20 Bibliographically approved
Josyula, S. P., Törnquist Krasemann, J. & Lundberg, L. (2019). Exploring the Potential of GPU Computing in Train Rescheduling. In: Proceedings of the 8th International Conference on Railway Operations Modelling and Analysis, Norrköping, 2019. Paper presented at 8th International Conference on Railway Operations Modelling and Analysis.
2019 (English) In: Proceedings of the 8th International Conference on Railway Operations Modelling and Analysis, Norrköping, 2019. Conference paper, Published paper (Refereed)
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18510 (URN)
Conference
8th International Conference on Railway Operations Modelling and Analysis
Available from: 2019-08-02 Created: 2019-08-02 Last updated: 2019-08-15 Bibliographically approved
Sidorova, Y., Rosander, O., Sköld, L., Grahn, H. & Lundberg, L. (2019). Finding a healthy equilibrium of geo-demographic segments for a telecom business: Who are malicious hot-spotters?. In: George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain (Ed.), Machine Learning Paradigms: Advances in Data Analytics (pp. 187-196). Springer Science and Business Media Deutschland GmbH
2019 (English) In: Machine Learning Paradigms: Advances in Data Analytics / [ed] George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain, Springer Science and Business Media Deutschland GmbH, 2019, p. 187-196. Chapter in book (Refereed)
Abstract [en]

In the telecommunication business, a major investment goes into the infrastructure and its maintenance, while business revenues are proportional to how big, good, and well-balanced the customer base is. In our previous work we presented a data-driven analytic strategy, based on combinatorial optimization and analysis of historical mobility, designed to quantify the desirability of different geo-demographic segments, and several segments were recommended for a partial reduction. Within a segment, clients are different. In order to enable intelligent reduction, we introduce the term infrastructure-stressing client and, using the proposed method, we reveal the list of the IDs of such clients. We have also developed a visualization tool to allow for manual checks: it shows how the client moved through a sequence of hot spots and was repeatedly served by critically loaded antennas. The code and the footprint matrix are available on SourceForge. © 2019, Springer International Publishing AG, part of Springer Nature.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2019
Series
Intelligent Systems Reference Library, ISSN 1868-4394 ; 149
Keywords
Business intelligence, Combinatorial optimization, Fuzzy logic, Geo-demographic segments, Mobility data, MOSAIC
National Category
Telecommunications Business Administration Computer Sciences
Identifiers
urn:nbn:se:bth-16885 (URN); 10.1007/978-3-319-94030-4_8 (DOI); 2-s2.0-85049522294 (Scopus ID); 978-3-319-94029-8 (ISBN)
Available from: 2018-08-20 Created: 2018-08-20 Last updated: 2019-10-21 Bibliographically approved
Lundberg, L., Lennerstad, H., Boeva, V. & García Martín, E. (2019). Handling non-linear relations in support vector machines through hyperplane folding. In: ACM International Conference Proceeding Series. Paper presented at 11th International Conference on Machine Learning and Computing, ICMLC 2019; Zhuhai; China; 22 February 2019 through 24 February (pp. 137-141). Association for Computing Machinery
2019 (English) In: ACM International Conference Proceeding Series, Association for Computing Machinery, 2019, p. 137-141. Conference paper, Published paper (Refereed)
Abstract [en]

We present a new method, called hyperplane folding, that increases the margin in Support Vector Machines (SVMs). Based on the location of the support vectors, the method splits the dataset into two parts, rotates one part of the dataset and then merges the two parts again. This procedure increases the margin as long as the margin is smaller than half of the shortest distance between any pair of data points from the two different classes. We provide an algorithm for the general case with n-dimensional data points. A small experiment with three folding iterations on 3-dimensional data points with non-linear relations shows that the margin does indeed increase and that the accuracy improves with a larger margin. The method can use any standard SVM implementation plus some basic manipulation of the data points, i.e., splitting, rotating and merging. Hyperplane folding also increases the interpretability of the data. © 2019 Association for Computing Machinery.
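
The stopping condition stated in the abstract, that folding can increase the margin only while it is smaller than half of the shortest distance between points of the two classes, can be computed directly. The sketch below evaluates only that bound; the function name and the toy points are assumptions, and the folding procedure itself (splitting, rotating and merging) is not implemented here.

```python
import math
from itertools import product


def max_margin_bound(class_a, class_b):
    """Half of the shortest Euclidean distance between any pair of points
    taken from the two different classes."""
    return 0.5 * min(math.dist(p, q) for p, q in product(class_a, class_b))


class_a = [(0.0, 0.0), (1.0, 0.0)]
class_b = [(4.0, 0.0), (5.0, 3.0)]
bound = max_margin_bound(class_a, class_b)
print(bound)  # closest inter-class pair is (1, 0) and (4, 0), at distance 3
```

Comparing the margin reported by an SVM implementation against this bound indicates how much room further folding iterations could still have on a given dataset.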

Place, publisher, year, edition, pages
Association for Computing Machinery, 2019
Keywords
Hyperplane folding, Hyperplane hinging, Non-linear relations, Piecewise linear classification, Support vector machines, Geometry, Piecewise linear techniques, Vectors, Different class, Interpretability, Nonlinear relations, Piecewise linear, Support vector, Support vector machine (SVMs)
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18039 (URN); 10.1145/3318299.3318319 (DOI); 000477981500023 (); 2-s2.0-85066460409 (Scopus ID)
Conference
11th International Conference on Machine Learning and Computing, ICMLC 2019; Zhuhai; China; 22 February 2019 through 24 February
Available from: 2019-06-14 Created: 2019-06-14 Last updated: 2019-09-10 Bibliographically approved
Sidorova, Y., Sköld, L., Rosander, O. & Lundberg, L. (2019). Optimizing utilization in cellular radio networks using mobility data. Optimization and Engineering, 20(1), 37-64
2019 (English) In: Optimization and Engineering, ISSN 1389-4420, E-ISSN 1573-2924, Vol. 20, no 1, p. 37-64. Article in journal (Refereed), Published
Abstract [en]

The main resource for any telecom operator is the physical radio cell network. We present two related methods for optimizing utilization in radio networks: Tetris optimization and selective cell expansion. Tetris optimization tries to find the mix of users from different market segments that provides the most even load in the network. Selective cell expansion identifies hotspot cells, expands the capacity of these radio cells, and calculates how many subscribers the radio network can handle after the expansions. Both methods are based on linear programming and use mobility data, i.e., data defining where different categories of subscribers tend to be during different times of the week. Based on real-world mobility data from a region in Sweden, we show that Tetris optimization based on six user segments made it possible to increase the number of subscribers by 58% without upgrading the physical infrastructure. The same data show that by selectively expanding less than 6% of the cells we are able to increase the number of subscribers by more than a factor of three without overloading the network. We also investigate the best way to combine Tetris optimization and selective cell expansion. © 2018 The Author(s)

Place, publisher, year, edition, pages
Springer New York LLC, 2019
Keywords
Cellular radio network, Linear programming, Mobility data, Optimization, Cytology, Radio, Cell expansion, Cellular radio networks, Hot-spot cells, Market segment, Mobility datum, Number of subscribers, Radio networks, Telecom operators, Cells
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-16334 (URN); 10.1007/s11081-018-9387-4 (DOI); 000457786900002 (); 2-s2.0-85047379164 (Scopus ID)
Available from: 2018-06-07 Created: 2018-06-07 Last updated: 2019-02-21 Bibliographically approved
Sidorova, Y., Sköld, L., Lennerstad, H. & Lundberg, L. (2019). The Use of Fuzzy Logic in Creating a Visual Data Summary of a Telecom Operator’s Customer Base. In: Communications in Computer and Information Science. Paper presented at 1st International Conference on Intelligent Technologies and Applications, INTAP 2018; Bahawalpur; Pakistan; 23 October 2018 through 25 October 2018 (pp. 301-312). Springer Verlag, 932
2019 (English) In: Communications in Computer and Information Science, Springer Verlag, 2019, Vol. 932, p. 301-312. Conference paper, Published paper (Refereed)
Abstract [en]

As pointed out by Zadeh, the mission of fuzzy logic in the era of big data is to create a relevant summary of huge amounts of data and facilitate decision-making. In this study, elements of fuzzy set theory are used to create a visual summary of telecom data, which gives a comprehensive idea concerning the desirability of boosting an operator’s presence in different neighborhoods and regions. The data used for validation cover historical mobility in a region of Sweden during a week. Fuzzy logic allows us to model inherently relative characteristics, such as “a tall man” or “a beautiful woman”, and importantly it also defines “anchors”, the situations (characterized by the value of the membership function for the characteristic) under which the relative notion receives a unique crisp interpretation. We propose color coding of the membership value for relative notions such as “the desirability of boosting operator’s presence in the neighborhood” and “how well the operator is doing in the region”. The corresponding regions on the map (e.g., postcode zones or larger groupings) are colored in different shades passing from green (1) through yellow (0.5) to red (0). The color hues pass a clear intuitive message, making the summary easy to grasp. © 2019, Springer Nature Singapore Pte Ltd.
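
The color coding described in the abstract can be sketched as a mapping from a membership value to an RGB triple. The anchors green (1), yellow (0.5) and red (0) come from the abstract; the piecewise linear interpolation between them and the function name `membership_to_rgb` are assumptions made for illustration.

```python
def membership_to_rgb(m):
    """Map a fuzzy membership value in [0, 1] to an (R, G, B) triple.

    Anchors: 0 -> red, 0.5 -> yellow, 1 -> green.
    Values in between are interpolated linearly (an assumption).
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("membership value must lie in [0, 1]")
    if m <= 0.5:
        # Red to yellow: the green channel rises from 0 to 255.
        return (255, round(255 * m / 0.5), 0)
    # Yellow to green: the red channel falls from 255 to 0.
    return (round(255 * (1.0 - m) / 0.5), 255, 0)


for value in (0.0, 0.5, 1.0):
    print(value, membership_to_rgb(value))
```

Each map region (for example a postcode zone) would then be filled with the color returned for its membership value.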

Place, publisher, year, edition, pages
Springer Verlag, 2019
Series
Communications in Computer and Information Science, ISSN 1865-0929 ; 932
Keywords
Call Detail Records, Color, Fuzzy membership function, Mobility data, Computer circuits, Decision making, Fuzzy set theory, Membership functions, Color coding, Customerbase, Membership values, Mobility datum, Telecom operators, Use of fuzzy logic, Fuzzy logic
National Category
Telecommunications
Identifiers
urn:nbn:se:bth-17870 (URN); 10.1007/978-981-13-6052-7_26 (DOI); 000465006200026 (); 9789811360510 (ISBN)
Conference
1st International Conference on Intelligent Technologies and Applications, INTAP 2018; Bahawalpur; Pakistan; 23 October 2018 through 25 October 2018
Available from: 2019-05-02 Created: 2019-05-02 Last updated: 2019-05-03 Bibliographically approved
Josyula, S. P., Törnquist Krasemann, J. & Lundberg, L. (2018). A parallel algorithm for train rescheduling. Transportation Research Part C: Emerging Technologies, 95, 545-569
2018 (English) In: Transportation Research Part C: Emerging Technologies, ISSN 0968-090X, E-ISSN 1879-2359, Vol. 95, p. 545-569. Article in journal (Refereed), Published
Abstract [en]

One of the crucial factors in achieving high punctuality in railway traffic systems is the ability to effectively reschedule the trains when disturbances occur. The railway traffic rescheduling problem is a complex task to solve both from a practical and a computational perspective. Problems of practically relevant sizes typically have a very large search space, making them time-consuming to solve even for state-of-the-art optimization solvers. Though competitive algorithmic approaches are a widespread topic of research, not much research has been done to explore the opportunities and challenges in parallelizing them. This paper presents a parallel algorithm to efficiently solve the real-time railway rescheduling problem on a multi-core parallel architecture. We devised (1) an effective way to represent the solution space as a binary tree and (2) a novel sequential heuristic algorithm based on a depth-first search (DFS) strategy that quickly traverses the tree. Based on that, we designed a parallel algorithm for a multi-core architecture, which proved to be 10.5 times faster than the sequential algorithm even when run on a single processing core. When executed on a parallel machine with 8 cores, the speed further increased by a factor of 4.68 and every disturbance scenario in the considered case study was solved within 6 s. We conclude that for the problem under consideration, though a sequential DFS approach is fast in several disturbance scenarios, it is notably slower in many other disturbance scenarios. The parallel DFS approach that combines a DFS with simultaneous breadth-wise tree exploration, while being much faster on average, is also consistently fast across all scenarios.
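
The two speed figures in the abstract can be combined with a short calculation. The sketch assumes that the factor of 4.68 is measured relative to the parallel algorithm running on one core, which is how the abstract reads; the combined speedup and the parallel efficiency below are derived values, not figures reported in this record.

```python
# Figures taken from the abstract.
algorithmic_gain = 10.5  # parallel algorithm on one core vs. sequential algorithm
core_scaling = 4.68      # further gain when moving from 1 core to 8 cores
cores = 8

# Derived values (assumption: the two factors multiply).
combined_speedup = algorithmic_gain * core_scaling
parallel_efficiency = core_scaling / cores

print(f"combined speedup over the sequential algorithm: {combined_speedup:.2f}x")
print(f"parallel efficiency on {cores} cores: {parallel_efficiency:.1%}")
```

Under this reading, most of the overall gain comes from the changed search strategy rather than from the additional cores.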

Place, publisher, year, edition, pages
Elsevier, 2018
Keywords
Railway traffic; Rescheduling; Parallel depth-first search; Optimization
National Category
Engineering and Technology
Identifiers
urn:nbn:se:bth-16868 (URN); 10.1016/j.trc.2018.07.003 (DOI); 000447112500032 ()
Projects
TRANS-FORM
Funder
Swedish Research Council Formas
Note

open access

Available from: 2018-08-15 Created: 2018-08-15 Last updated: 2019-08-15
Vishnubhotla, S. D., Mendes, E. & Lundberg, L. (2018). An insight into the capabilities of professionals and teams in agile software development: A systematic literature review. In: PROCEEDINGS OF 2018 7TH INTERNATIONAL CONFERENCE ON SOFTWARE AND COMPUTER APPLICATIONS (ICSCA 2018). Paper presented at 7th International Conference on Software and Computer Applications, ICSCA 2018, Kuantan, Malaysia (pp. 10-19). Association for Computing Machinery
2018 (English) In: PROCEEDINGS OF 2018 7TH INTERNATIONAL CONFERENCE ON SOFTWARE AND COMPUTER APPLICATIONS (ICSCA 2018), Association for Computing Machinery, 2018, p. 10-19. Conference paper, Published paper (Refereed)
Abstract [en]

Background: Previous studies investigated key characteristics of software engineers and factors influencing the performance of individuals, productivity of teams and project success within agile software development (ASD). They aided in the active investigation of human aspects in ASD. However, capability measurement and prediction with respect to the agile workforce, owing to its importance, is an area that needs more attention. Objective: The objective of this paper is to present the state of the art relating to capability measurement of software engineers and teams working in ASD projects. Method: We carried out a systematic literature review (SLR) focused on identifying attributes used for measuring and predicting the capabilities of individual software engineers and teams. Results: Evidence from 16 studies showed attributes that can measure capabilities of engineers and teams, and also attributes that can be used as capability predictors. Further, different instruments used to measure those attributes were presented. Conclusions: The SLR presented a wide list of attributes that were grouped into various categories. This information can be used by project managers as, for example, a checklist to consider when allocating software engineers to teams and in turn teams to a project. Further, this study indicated the necessity for an investigation into capability prediction models. © 2018 Association for Computing Machinery.

Place, publisher, year, edition, pages
Association for Computing Machinery, 2018
Keywords
Agile software development, Capability measurement, Capability prediction, Competence, Individual capability, Systematic literature review, Team capability, Application programs, Engineers, Forecasting, Human resource management, Software design
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-16644 (URN); 10.1145/3185089.3185096 (DOI); 000461243700003 (); 2-s2.0-85048487301 (Scopus ID); 9781450354141 (ISBN)
Conference
7th International Conference on Software and Computer Applications, ICSCA 2018, Kuantan, Malaysia
Available from: 2018-06-28 Created: 2018-06-28 Last updated: 2019-10-16 Bibliographically approved