Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Hoeffding Trees with nmin adaptation
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik.ORCID-id: 0000-0003-4973-9255
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik.
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik.ORCID-id: 0000-0001-9947-1088
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för datalogi och datorsystemteknik.ORCID-id: 0000-0002-3118-5058
Visa övriga samt affilieringar
(Engelska)Manuskript (preprint) (Övrigt vetenskapligt)
Abstract [en]

Machine learning software accounts for a significant amount of energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case of data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beginning of the execution, which lead to energy hotspots. We present dynamic parameter adaptation for data stream mining algorithms to trade-off energy efficiency against accuracy during runtime. To validate this approach, we introduce the nmin adaptation method to improve parameter adaptation in Hoeffding trees. This method dynamically adapts the number of instances needed to make a split (nmin) and thereby reduces the overall energy consumption. We created an experiment to compare the Very Fast Decision Tree algorithm (VFDT, original Hoeffding tree algorithm) with nmin adaptation and the standard VFDT. The results show that VFDT with nmin adaptation consumes up to 89% less energy than the standard VFDT, trading off a few percent of accuracy. Our approach can be used to trade off energy consumption with predictive and computational performance in the strive towards resource-aware machine learning. 

Nyckelord [en]
Hoeffding trees, data stream mining, green computing, green machine learning, energy efficiency
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
URN: urn:nbn:se:bth-15493OAI: oai:DiVA.org:bth-15493DiVA, id: diva2:1156971
Forskningsfinansiär
KK-stiftelsen, 20140032Tillgänglig från: 2017-11-14 Skapad: 2017-11-14 Senast uppdaterad: 2018-02-02Bibliografiskt granskad
Ingår i avhandling
1. Extraction and Energy Efficient Processing of Streaming Data
Öppna denna publikation i ny flik eller fönster >>Extraction and Energy Efficient Processing of Streaming Data
2017 (Engelska)Licentiatavhandling, sammanläggning (Övrigt vetenskapligt)
Abstract [en]

The interest in machine learning algorithms is increasing, in parallel with the advancements in hardware and software required to mine large-scale datasets. Machine learning algorithms account for a significant amount of energy consumed in data centers, which impacts the global energy consumption. However, machine learning algorithms are optimized towards predictive performance and scalability. Algorithms with low energy consumption are necessary for embedded systems and other resource constrained devices; and desirable for platforms that require many computations, such as data centers. Data stream mining investigates how to process potentially infinite streams of data without the need to store all the data. This ability is particularly useful for companies that are generating data at a high rate, such as social networks.

This thesis investigates algorithms in the data stream mining domain from an energy efficiency perspective. The thesis comprises of two parts. The first part explores how to extract and analyze data from Twitter, with a pilot study that investigates a correlation between hashtags and followers. The second and main part investigates how energy is consumed and optimized in an online learning algorithm, suitable for data stream mining tasks.

The second part of the thesis focuses on analyzing, understanding, and reformulating the Very Fast Decision Tree (VFDT) algorithm, the original Hoeffding tree algorithm, into an energy efficient version. It presents three key contributions. First, it shows how energy varies in the VFDT from a high-level view by tuning different parameters. Second, it presents a methodology to identify energy bottlenecks in machine learning algorithms, by portraying the functions of the VFDT that consume the largest amount of energy. Third, it introduces dynamic parameter adaptation for Hoeffding trees, a method to dynamically adapt the parameters of Hoeffding trees to reduce their energy consumption. The results show an average energy reduction of 23% on the VFDT algorithm.

Ort, förlag, år, upplaga, sidor
Karlskrona: Blekinge Tekniska Högskola, 2017
Serie
Blekinge Institute of Technology Licentiate Dissertation Series, ISSN 1650-2140 ; 3
Nyckelord
machine learning, green computing, data mining, data stream mining, green machine learning
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:bth-15532 (URN)
Presentation
2017-12-18, J1640, Blekinge Tekniska Högskola, 371 79, Karlskrona, 13:00 (Engelska)
Opponent
Handledare
Projekt
Scalable resource-efficient systems for big data analytics
Forskningsfinansiär
KK-stiftelsen, 20140032
Tillgänglig från: 2017-11-22 Skapad: 2017-11-22 Senast uppdaterad: 2018-01-13Bibliografiskt granskad

Open Access i DiVA

Fulltext saknas i DiVA

Personposter BETA

García Martín, EvaLavesson, NiklasGrahn, HåkanCasalicchio, EmilianoBoeva, Veselka

Sök vidare i DiVA

Av författaren/redaktören
García Martín, EvaLavesson, NiklasGrahn, HåkanCasalicchio, EmilianoBoeva, Veselka
Av organisationen
Institutionen för datalogi och datorsystemteknik
Datavetenskap (datalogi)

Sök vidare utanför DiVA

GoogleGoogle Scholar

urn-nbn

Altmetricpoäng

urn-nbn
Totalt: 285 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf