Hoeffding Trees with nmin adaptation
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0003-4973-9255
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering.
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0001-9947-1088
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0002-3118-5058
2018 (English). In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018), IEEE, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning software accounts for a significant amount of the energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case for data stream mining algorithms. Although these algorithms are adaptive to the incoming data, their parameters are fixed from the beginning of the execution. We have observed that having fixed parameters leads to unnecessary computations, thus making the algorithm energy inefficient. In this paper we present the nmin adaptation method for Hoeffding trees. This method adapts the value of the nmin parameter, which significantly affects the energy consumption of the algorithm. The method reduces unnecessary computations and memory accesses, thus reducing the energy consumption, while the accuracy is only marginally affected. We experimentally compared VFDT (Very Fast Decision Tree, the first Hoeffding tree algorithm) and CVFDT (Concept-adapting VFDT) with VFDT-nmin (VFDT with nmin adaptation). The results show that VFDT-nmin consumes up to 27% less energy than the standard VFDT, and up to 92% less energy than CVFDT, trading off a few percent of accuracy in a few datasets.
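
As an illustration of why nmin matters, the following minimal sketch uses the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), to estimate how many instances a leaf must see before a split test can succeed. It is one plausible reading of the adaptation idea, not the authors' implementation. The function names and the parameters `value_range` (R), `delta`, `gain_gap` (difference between the two best split scores), and `tau` (tie threshold) are assumptions introduced here for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def adapted_nmin(value_range: float, delta: float,
                 gain_gap: float, tau: float) -> int:
    """Estimate the smallest n at which a split test could succeed.

    Assumption: a leaf splits once epsilon <= gain_gap, or once
    epsilon <= tau (tie). Checking for a split before epsilon has
    shrunk to max(gain_gap, tau) is therefore wasted work.
    """
    target = max(gain_gap, tau)
    # Solve epsilon(n) <= target for n and round up.
    return math.ceil(value_range ** 2 * math.log(1.0 / delta)
                     / (2.0 * target ** 2))


if __name__ == "__main__":
    # Hypothetical values: two classes (R = 1 for information gain),
    # confidence 1 - 1e-7, observed gap 0.1, tie threshold 0.05.
    n = adapted_nmin(value_range=1.0, delta=1e-7, gain_gap=0.1, tau=0.05)
    print(n, hoeffding_bound(1.0, 1e-7, n))
```

With a fixed nmin (for example 200), the split test would be re-evaluated several times before the bound is tight enough to pass. Waiting for the estimated number of instances skips those evaluations, which is the kind of unnecessary computation the abstract refers to.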

Place, publisher, year, edition, pages
IEEE, 2018.
Series
Proceedings of the International Conference on Data Science and Advanced Analytics, ISSN 2472-1573
Keywords [en]
data stream mining; green artificial intelligence; energy efficiency; hoeffding trees; energy aware machine learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-15493
DOI: 10.1109/DSAA.2018.00017
ISI: 000459238600008
ISBN: 9781538650905 (print)
OAI: oai:DiVA.org:bth-15493
DiVA, id: diva2:1156971
Conference
5th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA), 1–4 October 2018, Turin
Part of project
Bigdata@BTH - Scalable resource-efficient systems for big data analytics, Knowledge Foundation
Funder
Knowledge Foundation, 20140032
Available from: 2017-11-14 Created: 2017-11-14 Last updated: 2021-12-22 Bibliographically approved
In thesis
1. Extraction and Energy Efficient Processing of Streaming Data
2017 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Interest in machine learning algorithms is increasing, in parallel with the advancements in hardware and software required to mine large-scale datasets. Machine learning algorithms account for a significant amount of the energy consumed in data centers, which impacts global energy consumption. However, machine learning algorithms are optimized towards predictive performance and scalability. Algorithms with low energy consumption are necessary for embedded systems and other resource-constrained devices, and desirable for platforms that require many computations, such as data centers. Data stream mining investigates how to process potentially infinite streams of data without the need to store all the data. This ability is particularly useful for companies that generate data at a high rate, such as social networks.

This thesis investigates algorithms in the data stream mining domain from an energy efficiency perspective. The thesis comprises two parts. The first part explores how to extract and analyze data from Twitter, with a pilot study that investigates a correlation between hashtags and followers. The second and main part investigates how energy is consumed and optimized in an online learning algorithm suitable for data stream mining tasks.

The second part of the thesis focuses on analyzing, understanding, and reformulating the Very Fast Decision Tree (VFDT) algorithm, the original Hoeffding tree algorithm, into an energy efficient version. It presents three key contributions. First, it shows how energy varies in the VFDT from a high-level view by tuning different parameters. Second, it presents a methodology to identify energy bottlenecks in machine learning algorithms, by portraying the functions of the VFDT that consume the largest amount of energy. Third, it introduces dynamic parameter adaptation for Hoeffding trees, a method to dynamically adapt the parameters of Hoeffding trees to reduce their energy consumption. The results show an average energy reduction of 23% on the VFDT algorithm.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2017
Series
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 3
Keywords
machine learning, green computing, data mining, data stream mining, green machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15532 (URN)
Presentation
2017-12-18, J1640, Blekinge Tekniska Högskola, 371 79, Karlskrona, 13:00 (English)
Projects
Scalable resource-efficient systems for big data analytics
Funder
Knowledge Foundation, 20140032
Available from: 2017-11-22 Created: 2017-11-22 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

García Martín, Eva; Lavesson, Niklas; Grahn, Håkan; Casalicchio, Emiliano; Boeva, Veselka