Publications (10 of 101)
Sidorova, Y., Rosander, O., Sköld, L., Grahn, H. & Lundberg, L. (2019). Finding a healthy equilibrium of geo-demographic segments for a telecom business: Who are malicious hot-spotters?. In: George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain (Ed.), Machine Learning Paradigms: Advances in Data Analytics (pp. 187-196). Springer Science and Business Media Deutschland GmbH
2019 (English). In: Machine Learning Paradigms: Advances in Data Analytics / [ed] George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain, Springer Science and Business Media Deutschland GmbH, 2019, p. 187-196. Chapter in book (Refereed)
Abstract [en]

In the telecommunication business, a major investment goes into the infrastructure and its maintenance, while business revenues are proportional to how big, good, and well-balanced the customer base is. In our previous work we presented a data-driven analytic strategy, based on combinatorial optimization and analysis of historical mobility, designed to quantify the desirability of different geo-demographic segments, and several segments were recommended for a partial reduction. Within a segment, clients are different. In order to enable intelligent reduction, we introduce the term infrastructure-stressing client and, using the proposed method, we reveal the list of the IDs of such clients. We have also developed a visualization tool to allow for manual checks: it shows how the client moved through a sequence of hot spots and was repeatedly served by critically loaded antennas. The code and the footprint matrix are available on SourceForge. © 2019, Springer International Publishing AG, part of Springer Nature.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2019
Series
Intelligent Systems Reference Library, ISSN 1868-4394 ; 149
Keywords
Business intelligence, Combinatorial optimization, Fuzzy logic, Geo-demographic segments, Mobility data, MOSAIC
National Category
Telecommunications Business Administration Computer Sciences
Identifiers
urn:nbn:se:bth-16885 (URN); 10.1007/978-3-319-94030-4_8 (DOI); 2-s2.0-85049522294 (Scopus ID); 978-3-319-94029-8 (ISBN)
Available from: 2018-08-20 Created: 2018-08-20 Last updated: 2018-08-20. Bibliographically approved
Westphal, F., Lavesson, N. & Grahn, H. (2018). Document Image Binarization Using Recurrent Neural Networks. In: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018. Paper presented at 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), Vienna (pp. 263-268).
2018 (English). In: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018, 2018, p. 263-268. Conference paper, Published paper (Refereed)
Abstract [en]

In the context of document image analysis, image binarization is an important preprocessing step for other document analysis algorithms, but also relevant on its own by improving the readability of images of historical documents. While historical document image binarization is challenging due to common image degradations, such as bleedthrough, faded ink or stains, achieving good binarization performance in a timely manner is a worthwhile goal to facilitate efficient information extraction from historical documents. In this paper, we propose a recurrent neural network based algorithm using Grid Long Short-Term Memory cells for image binarization, as well as a pseudo F-Measure based weighted loss function. We evaluate the binarization and execution performance of our algorithm for different choices of footprint size, scale factor and loss function. Our experiments show a significant trade-off between binarization time and quality for different footprint sizes. However, we see no statistically significant difference when using different scale factors and only limited differences for different loss functions. Lastly, we compare the binarization performance of our approach with the best performing algorithm in the 2016 handwritten document image binarization contest and show that both algorithms perform equally well.

Keywords
image binarization, recurrent neural networks, Grid LSTM, historical documents, Text analysis, Labeling, Recurrent neural networks, Heuristic algorithms, Training, Degradation, Ink
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:bth-16749 (URN); 10.1109/DAS.2018.71 (DOI); 978-1-5386-3346-5 (ISBN)
Conference
2018 13th IAPR International Workshop on Document Analysis Systems (DAS), Vienna
Funder
Knowledge Foundation, 20140032
Available from: 2018-07-06 Created: 2018-07-06 Last updated: 2018-08-27. Bibliographically approved
Westphal, F., Grahn, H. & Lavesson, N. (2018). Efficient document image binarization using heterogeneous computing and parameter tuning. International Journal on Document Analysis and Recognition, 21(1-2), 41-58
2018 (English). In: International Journal on Document Analysis and Recognition, ISSN 1433-2833, E-ISSN 1433-2825, Vol. 21, no 1-2, p. 41-58. Article in journal (Refereed). Published
Abstract [en]

In the context of historical document analysis, image binarization is an important first step, which separates foreground from background despite common image degradations, such as faded ink, stains, or bleed-through. Fast binarization has great significance when analyzing vast archives of document images, since even small inefficiencies can quickly accumulate to years of wasted execution time. Therefore, efficient binarization is especially relevant to companies and government institutions that want to analyze their large collections of document images. The main challenge is to speed up the execution performance without affecting the binarization performance. We modify a state-of-the-art binarization algorithm and achieve on average a 3.5 times faster execution performance by correctly mapping this algorithm to a heterogeneous platform, consisting of a CPU and a GPU. Our proposed parameter tuning algorithm additionally improves the execution time for parameter tuning by a factor of 1.7, compared to previous parameter tuning algorithms. We see that for the chosen algorithm, machine learning-based parameter tuning improves the execution performance more than heterogeneous computing, when comparing absolute execution times. © 2018 The Author(s)

Place, publisher, year, edition, pages
Springer Verlag, 2018
Keywords
Automatic parameter tuning, Heterogeneous computing, Historical documents, Image binarization, Bins, History, Image analysis, Learning systems, Document image binarization, Government institutions, Heterogeneous platforms, Parameter tuning algorithm, Parameter estimation
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15891 (URN); 10.1007/s10032-017-0293-7 (DOI); 000433193500003 (ISI); 2-s2.0-85041228615 (Scopus ID)
Available from: 2018-02-15 Created: 2018-02-15 Last updated: 2018-08-27. Bibliographically approved
García Martín, E., Lavesson, N., Grahn, H., Casalicchio, E. & Boeva, V. (2018). Hoeffding Trees with nmin adaptation. In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018). Paper presented at IEEE 5th International Conference on Data Science and Advanced Analytics, 1–4 October 2018, Turin (pp. 70-79). IEEE
2018 (English). In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018), IEEE, 2018, p. 70-79. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning software accounts for a significant amount of the energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case for data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beginning of the execution. We have observed that having fixed parameters leads to unnecessary computations, thus making the algorithm energy inefficient. In this paper we present the nmin adaptation method for Hoeffding trees. This method adapts the value of the nmin parameter, which significantly affects the energy consumption of the algorithm. The method reduces unnecessary computations and memory accesses, thus reducing the energy, while the accuracy is only marginally affected. We experimentally compared VFDT (Very Fast Decision Tree, the first Hoeffding tree algorithm) and CVFDT (Concept-adapting VFDT) with VFDT-nmin (VFDT with nmin adaptation). The results show that VFDT-nmin consumes up to 27% less energy than the standard VFDT, and up to 92% less energy than CVFDT, trading off a few percent of accuracy in a few datasets.
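As background on why nmin matters: in a Hoeffding tree, a leaf re-evaluates its split candidates only after every nmin new instances, and it splits once the gap between the two best candidates exceeds the Hoeffding bound ε = sqrt(R² ln(1/δ) / (2n)). The Python sketch below is illustrative only and is not the paper's adaptation rule. The function names `hoeffding_bound` and `instances_needed` are hypothetical, and the sample values for R, δ and the gap are assumptions chosen for demonstration. It shows how one might estimate the number of instances a leaf needs before a split can succeed, which is the kind of quantity that makes checking more often than necessary wasteful.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon after n observations.

    value_range: range R of the split metric (assumed 1.0 in the demo below).
    delta: allowed probability that the chosen split is not the best one.
    n: number of instances observed at the leaf.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def instances_needed(value_range, delta, gap):
    """Smallest n for which the Hoeffding bound is at most `gap`.

    `gap` stands for the observed difference between the two best split
    candidates; the result is obtained by solving epsilon <= gap for n.
    """
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * gap ** 2))


if __name__ == "__main__":
    R, delta = 1.0, 1e-7  # assumed demo values, not taken from the paper
    for n in (200, 1000, 5000):
        print(n, hoeffding_bound(R, delta, n))
    # With a small gap between candidates, checking every 200 instances
    # would evaluate splits many times before one can possibly succeed.
    print(instances_needed(R, delta, gap=0.05))
```

A fixed, small nmin triggers many split evaluations that cannot yet pass the bound; estimating the required number of instances from the current gap is one plausible way to avoid them.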

Place, publisher, year, edition, pages
IEEE, 2018
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17207 (URN); 10.1109/DSAA.2018.00017 (DOI)
Conference
IEEE 5th International Conference on Data Science and Advanced Analytics, 1–4 October 2018, Turin
Funder
Knowledge Foundation, 20140032
Available from: 2018-11-01 Created: 2018-11-01 Last updated: 2018-11-01. Bibliographically approved
García Martín, E., Lavesson, N., Grahn, H., Casalicchio, E. & Boeva, V. (2018). How to Measure Energy Consumption in Machine Learning Algorithms. In: Green Data Mining, International Workshop on Energy Efficient Data Mining and Knowledge Discovery: ECMLPKDD 2018: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshops. Lecture Notes in Computer Science. Springer, Cham. Paper presented at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Dublin.
2018 (English). In: Green Data Mining, International Workshop on Energy Efficient Data Mining and Knowledge Discovery: ECMLPKDD 2018: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshops. Lecture Notes in Computer Science. Springer, Cham, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning algorithms are responsible for a significant amount of computation. These computations are increasing with the advancements in different machine learning fields. For example, fields such as deep learning require algorithms to run for weeks, consuming vast amounts of energy. While there is a trend towards optimizing machine learning algorithms for performance and energy consumption, there is still little knowledge on how to estimate an algorithm’s energy consumption. Currently, a straightforward cross-platform approach to estimate energy consumption for different types of algorithms does not exist. For that reason, well-known researchers in computer architecture have published extensive works on approaches to estimate the energy consumption. This study presents a survey of methods to estimate energy consumption, and maps them to specific machine learning scenarios. Finally, we illustrate our mapping suggestions with a case study, where we measure energy consumption in a big data stream mining scenario. Our ultimate goal is to bridge the current gap that exists in estimating energy consumption in machine learning scenarios.

National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17209 (URN)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Dublin
Funder
Knowledge Foundation, 20140032
Available from: 2018-11-01 Created: 2018-11-01 Last updated: 2018-11-01. Bibliographically approved
Nordahl, C., Grahn, H., Persson, M. & Boeva, V. (2018). Organizing, Visualizing and Understanding Households Electricity Consumption Data through Clustering Analysis. In: Organizing, Visualizing and Understanding Households Electricity Consumption Data through Clustering Analysis. Paper presented at 2ND WORKSHOP ON AI FOR AGING, REHABILITATION AND INDEPENDENT ASSISTED LIVING (ARIAL) @IJCAI'18, Stockholm. https://sites.google.com/view/arial2018/accepted-papersprogram
2018 (English). In: Organizing, Visualizing and Understanding Households Electricity Consumption Data through Clustering Analysis, https://sites.google.com/view/arial2018/accepted-papersprogram, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a cluster analysis approach for organizing, visualizing and understanding households’ electricity consumption data. We initially partition the consumption data into a number of clusters with similar daily electricity consumption profiles. The centroid of each cluster can be seen as a representative signature of a household’s electricity consumption behavior. We evaluate the proposed approach by conducting a number of experiments on electricity consumption data of ten selected households. Our results show that the approach is suitable for data analysis, and for understanding and creating electricity consumption behavior models.

Place, publisher, year, edition, pages
https://sites.google.com/view/arial2018/accepted-papersprogram, 2018
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:bth-17439 (URN)
Conference
2ND WORKSHOP ON AI FOR AGING, REHABILITATION AND INDEPENDENT ASSISTED LIVING (ARIAL) @IJCAI'18, Stockholm
Projects
BigData@BTH
Available from: 2018-12-19 Created: 2018-12-19 Last updated: 2019-01-16. Bibliographically approved
Sidorova, Y., Lundberg, L., Rosander, O., Grahn, H. & Skold, L. (2018). Recommendations for marketing campaigns in telecommunication business based on the footprint analysis: Who is a good client?. In: 2017 8th International Conference on Information, Intelligence, Systems and Applications, IISA 2017. Paper presented at 8th International Conference on Information, Intelligence, Systems and Applications, IISA 2017, Larnaca (pp. 1-6). Institute of Electrical and Electronics Engineers Inc.
2018 (English). In: 2017 8th International Conference on Information, Intelligence, Systems and Applications, IISA 2017, Institute of Electrical and Electronics Engineers Inc., 2018, p. 1-6. Conference paper, Published paper (Refereed)
Abstract [en]

A major investment made by a telecom operator goes into the infrastructure and its maintenance, while business revenues depend on how efficiently it is exploited. We present a data-driven analytic strategy based on combinatorial optimization and analysis of historical data. The data cover historical mobility in one region of Sweden during a week. Applying the proposed method in a case study, we have identified the optimal combination of geodemographic segments in the customer base, developed a functionality to assess the potential of a planned marketing campaign, and investigated how many and which segments to target for customer base growth. A comprehensible summary of the conclusions is created via execution of the queries with a fuzzy logic component. © 2017 IEEE.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2018
Keywords
business intelligence, combinatorial optimization, fuzzy logic, geodemographic segments, mobility data, MOSAIC, Commerce, Competitive intelligence, Computer circuits, Investments, Footprint analysis, Historical data, Marketing campaign, Mobility datum, Optimal combination, Telecom operators, Marketing
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-16539 (URN); 10.1109/IISA.2017.8316396 (DOI); 2-s2.0-85047937690 (Scopus ID); 9781538637319 (ISBN)
Conference
8th International Conference on Information, Intelligence, Systems and Applications, IISA 2017, Larnaca
Available from: 2018-06-18 Created: 2018-06-18 Last updated: 2018-06-18. Bibliographically approved
Martinsen, J. K., Grahn, H. & Isberg, A. (2017). Combining thread-level speculation and just-in-time compilation in Google’s V8 JavaScript engine. Concurrency and Computation, 29(1), Article ID e3826.
2017 (English). In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 29, no 1, article id e3826. Article in journal (Refereed). Published
Abstract [en]

Summary: Thread-level speculation can be used to take advantage of multicore architectures for JavaScript in web applications. We extend previous studies with these main contributions: we implement thread-level speculation in the state-of-the-art just-in-time-enabled JavaScript engine V8 and make the measurements in the Chromium web browser, both from Google, instead of using an interpreted JavaScript engine. We evaluate the thread-level speculation and just-in-time compilation combination on 15 very popular web applications, 20 HTML5 demos from the JS1K competition, and 4 Google Maps use cases. The performance is evaluated on two, four, and eight cores. The results clearly show that it is possible to successfully combine thread-level speculation and just-in-time compilation. This makes it possible to take advantage of multicore architectures for web applications while hiding the details of parallel programming from the programmer. Further, our results show an average speedup for the thread-level speculation and just-in-time compilation combination by a factor of almost 3 on four cores and over 4 on eight cores, without changing any of the JavaScript source code.

Place, publisher, year, edition, pages
Wiley Online Library, 2017
Keywords
Computer architecture; Computer programming; Engines; High level languages; Just in time production; Parallel programming; Software architecture; World Wide Web, Javascript; Just in time; Just-in-time compilation; Multicore architectures; Source codes; State of the art; Thread level speculation; WEB application, Multicore programming
National Category
Computer Engineering Computer Sciences
Identifiers
urn:nbn:se:bth-13219 (URN); 10.1002/cpe.3826 (DOI); 000390562700002 (ISI); 2-s2.0-84966359864 (Scopus ID)
Available from: 2016-10-03 Created: 2016-10-03 Last updated: 2018-02-02. Bibliographically approved
García Martín, E., Lavesson, N. & Grahn, H. (2017). Energy Efficiency Analysis of the Very Fast Decision Tree Algorithm. In: Rokia Missaoui, Talel Abdessalem, Matthieu Latapy (Ed.), Trends in Social Network Analysis: Information Propagation, User Behavior Modeling, Forecasting, and Vulnerability Assessment (pp. 229-252). Cham, Switzerland: Springer
2017 (English). In: Trends in Social Network Analysis: Information Propagation, User Behavior Modeling, Forecasting, and Vulnerability Assessment / [ed] Rokia Missaoui, Talel Abdessalem, Matthieu Latapy, Cham, Switzerland: Springer, 2017, p. 229-252. Chapter in book (Refereed)
Abstract [en]

Data mining algorithms are usually designed to optimize a trade-off between predictive accuracy and computational efficiency. This paper introduces energy consumption and energy efficiency as important factors to consider during data mining algorithm analysis and evaluation. We conducted an experiment to illustrate how energy consumption and accuracy are affected when varying the parameters of the Very Fast Decision Tree (VFDT) algorithm. These results are compared with a theoretical analysis of the algorithm, indicating that energy consumption is affected by the parameter design and that it can be reduced significantly while maintaining accuracy.

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2017
Series
Lecture Notes in Social Networks, ISSN 2190-5428
Keywords
Energy efficiency, Green computing, Very Fast Decision Tree, Big Data
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15489 (URN); 10.1007/978-3-319-53420-6_10 (DOI); 978-3-319-53419-0 (ISBN); 978-3-319-53420-6 (ISBN)
Funder
Knowledge Foundation, 20140032
Available from: 2017-11-14 Created: 2017-11-14 Last updated: 2018-02-02. Bibliographically approved
García Martín, E., Lavesson, N. & Grahn, H. (2017). Identification of Energy Hotspots: A Case Study of the Very Fast Decision Tree. In: Au M., Castiglione A., Choo KK., Palmieri F., Li KC. (Ed.), GPC 2017: Green, Pervasive, and Cloud Computing. Paper presented at International Conference on Green, Pervasive and Cloud Computing (GPC), Cetara, Amalfi Coast, Italy (pp. 267-281). Cham, Switzerland: Springer, 10232
2017 (English). In: GPC 2017: Green, Pervasive, and Cloud Computing / [ed] Au M., Castiglione A., Choo KK., Palmieri F., Li KC., Cham, Switzerland: Springer, 2017, Vol. 10232, p. 267-281. Conference paper, Published paper (Refereed)
Abstract [en]

Large-scale data centers account for a significant share of the energy consumption in many countries. Machine learning technology requires intensive workloads and thus drives requirements for lots of power and cooling capacity in data centers. It is time to explore green machine learning. The aim of this paper is to profile a machine learning algorithm with respect to its energy consumption and to determine the causes behind this consumption. The first scalable machine learning algorithm able to handle large volumes of streaming data is the Very Fast Decision Tree (VFDT), which outputs competitive results in comparison to algorithms that analyze data from static datasets. Our objectives are to: (i) establish a methodology that profiles the energy consumption of decision trees at the function level, (ii) apply this methodology in an experiment to obtain the energy consumption of the VFDT, (iii) conduct a fine-grained analysis of the functions that consume most of the energy, providing an understanding of that consumption, and (iv) analyze how different parameter settings can significantly reduce the energy consumption. The results show that by addressing the most energy-intensive part of the VFDT, the energy consumption can be reduced by up to 74.3%.

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords
Machine learning, Big data, Very Fast Decision Tree, Green machine learning, Data mining, Data stream mining
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15490 (URN); 10.1007/978-3-319-57186-7_21 (DOI); 000426275000021 (ISI); 978-3-319-57185-0 (ISBN); 978-3-319-57186-7 (ISBN)
Conference
International Conference on Green, Pervasive and Cloud Computing (GPC), Cetara, Amalfi Coast, Italy
Funder
Knowledge Foundation, 20140032
Available from: 2017-11-14 Created: 2017-11-14 Last updated: 2018-03-23. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-9947-1088