Energy Efficiency in Machine Learning: Approaches to Sustainable Data Stream Mining
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science.
2020 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Energy efficiency in machine learning explores how to build machine learning algorithms and models with low computational and power requirements. Although energy consumption is starting to gain interest in the field of machine learning, the majority of solutions still focus on obtaining the highest predictive accuracy, without a clear focus on sustainability.

This thesis explores green machine learning, which builds on green computing and computer architecture to design sustainable and energy efficient machine learning algorithms. In particular, we investigate how to design machine learning algorithms that automatically learn from streaming data in an energy efficient manner.

We first illustrate how energy can be measured in the context of machine learning, in the form of a literature review and a procedure to create theoretical energy models. We use this knowledge to analyze the energy footprint of Hoeffding trees, presenting an energy model that maps the number of computations and memory accesses to the main functionalities of the algorithm. We also analyze the hardware events correlated with the execution of the algorithm, its functions, and its hyperparameters.
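A theoretical energy model of this kind can be sketched as follows. The per-operation energy constants and the per-functionality counts below are illustrative placeholders, not the model or measurements from the thesis.

```python
# Illustrative sketch of a theoretical energy model for a streaming
# decision tree: total energy is estimated from counted computations
# and memory accesses per functionality. The per-operation energy
# costs below are placeholder values, not measured constants.

E_COMP = 1.0e-9   # assumed energy per arithmetic operation (joules)
E_MEM = 5.0e-9    # assumed energy per memory access (joules)

def energy_estimate(functions):
    """functions: dict mapping a functionality name to
    (n_computations, n_memory_accesses) counts."""
    total = 0.0
    for name, (n_comp, n_mem) in functions.items():
        total += n_comp * E_COMP + n_mem * E_MEM
    return total

# Hypothetical per-functionality counts for processing one instance:
profile = {
    "update_statistics": (120, 40),
    "check_split": (300, 90),
    "traverse_tree": (25, 25),
}
print(energy_estimate(profile))  # estimated joules for one instance
```

Such a model makes the relative cost of each functionality explicit, which is what allows targeted optimization of the most expensive ones.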

The final contribution of the thesis is showcased by two novel extensions of Hoeffding tree algorithms, the Hoeffding tree with nmin adaptation and the Green Accelerated Hoeffding Tree. These solutions reduce energy consumption by twenty and thirty percent, respectively, with minimal effect on accuracy. This is achieved by setting an individual splitting criterion for each branch of the decision tree, spending more energy on the fast-growing branches and saving energy on the rest.

This thesis shows the importance of evaluating energy consumption when designing machine learning algorithms, demonstrating that we can design more energy efficient algorithms and still achieve competitive accuracy.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2020, p. 267
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 2
Keywords [en]
machine learning, energy efficiency, data stream mining, green machine learning, edge computing
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-18986
ISBN: 978-91-7295-396-3 (print)
OAI: oai:DiVA.org:bth-18986
DiVA, id: diva2:1374831
Public defence
2020-01-31, J1650, Blekinge Institute of Technology, Karlskrona, 13:15 (English)
Opponent
Supervisors
Funder
Knowledge Foundation, 20140032
Available from: 2019-12-03 Created: 2019-12-03 Last updated: 2020-01-23. Bibliographically approved.
List of papers
1. Estimation of energy consumption in machine learning
2019 (English) In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 134, p. 75-88. Article in journal (Refereed) Published
Abstract [en]

Energy consumption has been widely studied in the computer architecture field for decades. While the adoption of energy as a metric in machine learning is emerging, the majority of research is still primarily focused on obtaining high levels of accuracy without any computational constraint. We believe that one reason for this lack of interest is researchers' unfamiliarity with approaches to evaluating energy consumption. To address this challenge, we present a review of the different approaches to estimating energy consumption in general, and in machine learning applications in particular. Our goal is to provide useful guidelines to the machine learning community, giving them the fundamental knowledge to use and build specific energy estimation methods for machine learning algorithms. We also present the latest software tools that provide energy estimates, together with two use cases that enhance the study of energy consumption in machine learning.

Place, publisher, year, edition, pages
Academic Press, 2019
Keywords
Deep learning, Energy consumption, Green AI, High performance computing, Machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18650 (URN)
10.1016/j.jpdc.2019.07.007 (DOI)
000489358200007 ()
Note

Funding text

Eva García-Martín and Håkan Grahn work under the research project "Scalable resource-efficient systems for big data analytics", funded by the Knowledge Foundation (grant: 20140032) in Sweden. Crefeda Faviola Rodrigues and Graham Riley are funded under the European FP7-INFRASTRUCTURES-2012-1 call (grant: 312979) and part-funded by ARM Ltd., UK under a Ph.D. Studentship Agreement.

Eva García-Martín is a Ph.D. student in Machine Learning at Blekinge Institute of Technology, Sweden. She works under the project "Scalable resource-efficient systems for big data analytics", funded by the Knowledge Foundation, and is advised by Niklas Lavesson and Håkan Grahn. The main focus of her thesis is making machine learning algorithms more energy efficient. In particular, she has studied the energy consumption patterns of streaming algorithms and proposed new algorithm extensions that reduce their energy consumption. Personal website: https://egarciamartin.github.io/.

Crefeda Faviola Rodrigues is a Ph.D. student in the Advanced Processor Technology (APT) group at The University of Manchester, supervised by Mr. Graham Riley and Dr. Mikel Lujan. Her research is part-funded by ARM and the IS-ENES2 Project. Her research topic is "Efficient execution of Convolutional Neural Networks on low power heterogeneous systems". The main focus of her thesis is to enable energy efficiency in deep learning algorithms, such as Convolutional Neural Networks (ConvNets), on embedded platforms like the Jetson TX1 and Snapdragon 820. Personal website: https://personalpages.manchester.ac.uk/staff/crefeda.rodrigues/.

Graham Riley is a Lecturer in the School of Computer Science at the University of Manchester and holds a part-time position in the Scientific Computing Department (SCD) at STFC, Daresbury. His research is application-driven, and much of it has been undertaken in collaboration with computational scientists in application areas such as Earth System Modeling (including the U.K. Met Office) and, previously, computational chemistry and biology. His aim is to apply his experience in high performance computing and software engineering for (principally) scientific computing to new application domains. He is also interested in techniques and tools to support flexible coupled modeling in scientific computing, and in performance modeling techniques for large-scale heterogeneous HPC systems, where energy efficiency is increasingly key. Personal website: http://www.manchester.ac.uk/research/graham.riley/.

Håkan Grahn has been a professor of computer engineering since 2007. He received an M.Sc. degree in Computer Science and Engineering in 1990 and a Ph.D. degree in Computer Engineering in 1995, both from Lund University. His main interests are computer architecture, multicore systems, GPU computing, parallel programming, image processing, and machine learning/data mining. He has published more than 100 papers on these subjects. During 1999–2002 he was head of the Department of Software Engineering and Computer Science, and during 2011–2013 he was Dean of Research at Blekinge Institute of Technology. Currently he is project leader for BigData@BTH – "Scalable resource-efficient systems for big data analytics", a research profile funded by the Knowledge Foundation during 2014–2020. Personal website: https://www.bth.se/eng/staff/hakan-grahn-hgr/.

Available from: 2019-09-11 Created: 2019-09-11 Last updated: 2019-12-03. Bibliographically approved.
2. Energy Efficiency Analysis of the Very Fast Decision Tree Algorithm
2017 (English) In: Trends in Social Network Analysis: Information Propagation, User Behavior Modeling, Forecasting, and Vulnerability Assessment / [ed] Rokia Missaoui, Talel Abdessalem, Matthieu Latapy, Cham, Switzerland: Springer, 2017, p. 229-252. Chapter in book (Refereed)
Abstract [en]

Data mining algorithms are usually designed to optimize a trade-off between predictive accuracy and computational efficiency. This paper introduces energy consumption and energy efficiency as important factors to consider during data mining algorithm analysis and evaluation. We conducted an experiment to illustrate how energy consumption and accuracy are affected when varying the parameters of the Very Fast Decision Tree (VFDT) algorithm. These results are compared with a theoretical analysis of the algorithm, indicating that energy consumption is affected by the parameter settings and that it can be reduced significantly while maintaining accuracy.
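Several of the VFDT parameters studied here act through the Hoeffding bound, which governs when a leaf is split. A minimal sketch of that standard formula, with illustrative parameter values:

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon: with probability 1 - delta, the true mean
    of a random variable with the given range lies within epsilon of the
    mean observed over n samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

# The VFDT splits when the gain difference between the two best attributes
# exceeds epsilon. More samples (or a larger delta) shrink epsilon, so the
# tree splits sooner; checking less often saves computation at the cost of
# delayed splits.
eps_few = hoeffding_bound(value_range=1.0, delta=1e-7, n=200)
eps_many = hoeffding_bound(value_range=1.0, delta=1e-7, n=5000)
assert eps_many < eps_few  # more samples tighten the bound
```

This is why parameters such as the split confidence and the grace period between split checks trade energy against how quickly the tree grows.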

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2017
Series
Lecture Notes in Social Networks, ISSN 2190-5428
Keywords
Energy efficiency, Green computing, Very Fast Decision Tree, Big Data
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15489 (URN)
10.1007/978-3-319-53420-6_10 (DOI)
978-3-319-53419-0 (ISBN)
978-3-319-53420-6 (ISBN)
Funder
Knowledge Foundation, 20140032
Available from: 2017-11-14 Created: 2017-11-14 Last updated: 2019-12-03. Bibliographically approved.
3. Identification of Energy Hotspots: A Case Study of the Very Fast Decision Tree
2017 (English) In: GPC 2017: Green, Pervasive, and Cloud Computing / [ed] Au M., Castiglione A., Choo KK., Palmieri F., Li KC., Cham, Switzerland: Springer, 2017, Vol. 10232, p. 267-281. Conference paper, Published paper (Refereed)
Abstract [en]

Large-scale data centers account for a significant share of the energy consumption in many countries. Machine learning technology requires intensive workloads and thus drives demand for substantial power and cooling capacity in data centers. It is time to explore green machine learning. The aim of this paper is to profile a machine learning algorithm with respect to its energy consumption and to determine the causes behind this consumption. The first scalable machine learning algorithm able to handle large volumes of streaming data is the Very Fast Decision Tree (VFDT), which produces competitive results in comparison to algorithms that analyze data from static datasets. Our objectives are to: (i) establish a methodology that profiles the energy consumption of decision trees at the function level, (ii) apply this methodology in an experiment to obtain the energy consumption of the VFDT, (iii) conduct a fine-grained analysis of the functions that consume most of the energy, providing an understanding of that consumption, and (iv) analyze how different parameter settings can significantly reduce the energy consumption. The results show that by addressing the most energy-intensive part of the VFDT, the energy consumption can be reduced by up to 74.3%.
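Function-level attribution of this kind can be approximated in software. The sketch below uses wall-clock time per function multiplied by an assumed average power draw as a crude energy proxy; it is an illustrative simplification, not the measurement methodology used in the paper, and the power constant and function name are placeholders.

```python
import time
from collections import defaultdict

ASSUMED_POWER_WATTS = 35.0  # placeholder average package power draw
energy_by_function = defaultdict(float)

def profile_energy(func):
    """Accumulate an energy proxy (seconds x assumed watts) per function."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            energy_by_function[func.__name__] += elapsed * ASSUMED_POWER_WATTS
    return wrapper

@profile_energy
def check_split(stats):
    # stand-in for a split-evaluation hotspot in a VFDT-like algorithm
    return sum(x * x for x in stats)

for _ in range(1000):
    check_split(range(100))

print(energy_by_function["check_split"])  # accumulated proxy, in joules
```

Real energy measurements would instead come from hardware counters (e.g. Intel RAPL) sampled around each function, but the per-function accounting structure is the same.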

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords
Machine learning, Big data, Very Fast Decision Tree, Green machine learning, Data mining, Data stream mining
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15490 (URN)
10.1007/978-3-319-57186-7_21 (DOI)
000426275000021 ()
978-3-319-57185-0 (ISBN)
978-3-319-57186-7 (ISBN)
Conference
International Conference on Green, Pervasive and Cloud Computing (GPC), Cetara, Amalfi Coast, Italy
Funder
Knowledge Foundation, 20140032
Available from: 2017-11-14 Created: 2017-11-14 Last updated: 2019-12-03. Bibliographically approved.
4. Energy-Aware Very Fast Decision Tree
(English) In: Journal of Data Science and Analytics, ISSN 2364-415X. Article in journal (Refereed) Accepted
Abstract [en]

Recently, machine learning researchers have been designing algorithms that can run on embedded and mobile devices, which introduces additional constraints compared to traditional algorithm design approaches. One of these constraints is energy consumption, which directly translates to battery capacity for these devices. Streaming algorithms, such as the Very Fast Decision Tree (VFDT), are designed to run on such devices due to their high processing speed and low memory requirements. However, they have not been designed with a focus on energy efficiency. This paper addresses this challenge by presenting the nmin adaptation method, which reduces the energy consumption of the VFDT algorithm with only minor effects on accuracy. nmin adaptation allows the algorithm to grow faster in those branches where there is more confidence to create a split, and delays the split on the less confident branches. This removes unnecessary computations related to checking for splits while maintaining similar levels of accuracy. We have conducted extensive experiments on 29 public datasets, showing that the VFDT with nmin adaptation consumes up to 31% less energy than the original VFDT, and up to 96% less energy than the CVFDT (the VFDT adapted for concept drift scenarios), trading off up to 1.7 percent of accuracy.
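The mechanism described above can be sketched in simplified form. The adaptive rule below (skip split checks between grace periods, and enlarge a leaf's grace period when a check is inconclusive) illustrates the idea of per-branch adaptation; the growth factor and class layout are illustrative choices, not the exact nmin adaptation formula from the paper.

```python
import math

def hoeffding_bound(r, delta, n):
    # standard Hoeffding bound used by VFDT-style split tests
    return math.sqrt(r * r * math.log(1.0 / delta) / (2.0 * n))

class Leaf:
    def __init__(self, nmin=200):
        self.nmin = nmin   # per-leaf grace period (instances between split checks)
        self.n_seen = 0

    def maybe_check_split(self, best_gain, second_gain, delta=1e-7, r=1.0):
        """Called once per instance; the costly split evaluation runs
        only every nmin instances."""
        self.n_seen += 1
        if self.n_seen % self.nmin != 0:
            return False  # skip the energy-costly split evaluation
        eps = hoeffding_bound(r, delta, self.n_seen)
        if best_gain - second_gain > eps:
            return True   # confident: split this branch now
        # Inconclusive: enlarge this leaf's grace period so fewer split
        # checks are wasted on a slow-growing branch (illustrative factor).
        self.nmin = int(self.nmin * 1.5)
        return False
```

Because each leaf carries its own nmin, confident branches keep a short grace period and grow quickly, while undecided branches check for splits less and less often, which is where the energy savings come from.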

National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-19150 (URN)
Funder
Knowledge Foundation, 20140032
Available from: 2020-01-23 Created: 2020-01-23 Last updated: 2020-01-23
5. Energy Modeling of Hoeffding Tree Ensembles
(English) In: Intelligent Data Analysis, ISSN 1088-467X, E-ISSN 1571-4128. Article in journal (Refereed) Accepted
Abstract [en]

Energy consumption reduction has been an increasing trend in machine learning over the past few years due to its socio-ecological importance. In new challenging areas such as edge computing, energy consumption and predictive accuracy are key variables during algorithm design and implementation. State-of-the-art ensemble stream mining algorithms are able to create highly accurate predictions at a substantial energy cost. This paper introduces the nmin adaptation method to ensembles of Hoeffding tree algorithms, to further reduce their energy consumption without sacrificing accuracy. We also present extensive theoretical energy models of such algorithms, detailing their energy patterns and how nmin adaptation affects their energy consumption. We have evaluated the energy efficiency and accuracy of the nmin adaptation method on five different ensembles of Hoeffding trees on 11 publicly available datasets. The results show that we are able to reduce the energy consumption significantly, by 21% on average, while affecting accuracy by less than one percent on average.

National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-19151 (URN)
Funder
Knowledge Foundation, 20140032
Available from: 2020-01-23 Created: 2020-01-23 Last updated: 2020-01-23
6. Green Accelerated Hoeffding Tree
(English) Manuscript (preprint) (Other academic)
Abstract [en]

For the past several years, the main concern in machine learning has been to create highly accurate models, without considering the high computational requirements involved. Stream mining algorithms are able to produce highly accurate models in real time, without strong computational demands. This is the case for the Hoeffding tree algorithm. Recent extensions to this algorithm, such as the Extremely Fast Decision Tree (EFDT), focus on increasing predictive accuracy, but at the cost of higher energy consumption. This paper presents the Green Accelerated Hoeffding Tree (GAHT) algorithm, which achieves the same levels of accuracy as the latest EFDT while reducing its energy consumption by 27 percent.

Keywords
Data stream mining, Hoeffding trees, Green machine learning, Energy efficiency
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-19152 (URN)
Funder
Knowledge Foundation, 20140032
Available from: 2020-01-23 Created: 2020-01-23 Last updated: 2020-01-23

