Publications (10 of 107)
Westphal, F., Lavesson, N. & Grahn, H. (2019). A Case for Guided Machine Learning. In: Andreas Holzinger, Peter Kieseberg, A Min Tjoa and Edgar Weippl (Ed.), Machine Learning and Knowledge Extraction. Paper presented at International Cross-Domain Conference for Machine Learning and Knowledge Extraction, Canterbury 26-29 August (pp. 353-361). Springer
2019 (English). In: Machine Learning and Knowledge Extraction / [ed] Andreas Holzinger, Peter Kieseberg, A Min Tjoa and Edgar Weippl, Springer, 2019, p. 353-361. Conference paper, Published paper (Refereed)
Abstract [en]

Involving humans in the learning process of a machine learning algorithm can have many advantages ranging from establishing trust into a particular model to added personalization capabilities to reducing labeling efforts. While these approaches are commonly summarized under the term interactive machine learning (iML), no unambiguous definition of iML exists to clearly define this area of research. In this position paper, we discuss the shortcomings of current definitions of iML and propose and define the term guided machine learning (gML) as an alternative.

Place, publisher, year, edition, pages
Springer, 2019
Keywords
guided machine learning, interactive machine learning, human-in-the-loop, definition
National Category
Human Computer Interaction; Computer Sciences
Identifiers
urn:nbn:se:bth-18708 (URN); 10.1007/978-3-030-29726-8_22 (DOI); 978-3-030-29726-8 (ISBN)
Conference
International Cross-Domain Conference for Machine Learning and Knowledge Extraction, Canterbury 26-29 August
Projects
Scalable resource-efficient systems for big data analytics
Funder
Knowledge Foundation, 20140032
Available from: 2019-09-27 Created: 2019-09-27 Last updated: 2019-10-09. Bibliographically approved
Kusetogullari, H., Yavariabdi, A., Cheddad, A., Grahn, H. & Hall, J. (2019). ARDIS: A Swedish Historical Handwritten Digit Dataset. Neural computing & applications (Print)
2019 (English). In: Neural computing & applications (Print), ISSN 0941-0643, E-ISSN 1433-3058. Article in journal (Refereed) Epub ahead of print
Abstract [en]

This paper introduces a new image-based handwritten historical digit dataset named ARDIS (Arkiv Digital Sweden). The images in ARDIS dataset are extracted from 15,000 Swedish church records which were written by different priests with various handwriting styles in the nineteenth and twentieth centuries. The constructed dataset consists of three single digit datasets and one digit strings dataset. The digit strings dataset includes 10,000 samples in Red-Green-Blue (RGB) color space, whereas, the other datasets contain 7,600 single digit images in different color spaces. An extensive analysis of machine learning methods on several digit datasets is examined. Additionally, correlation between ARDIS and existing digit datasets Modified National Institute of Standards and Technology (MNIST) and United States Postal Service (USPS) is investigated. Experimental results show that machine learning algorithms, including deep learning methods, provide low recognition accuracy as they face difficulties when trained on existing datasets and tested on ARDIS dataset. Accordingly, Convolutional Neural Network (CNN) trained on MNIST and USPS and tested on ARDIS provide the highest accuracies 58.80% and 35.44%, respectively. Consequently, the results reveal that machine learning methods trained on existing datasets can have difficulties to recognize digits effectively on our dataset which proves that ARDIS dataset has unique characteristics. This dataset is publicly available for the research community to further advance handwritten digit recognition algorithms.

Place, publisher, year, edition, pages
Springer Nature Switzerland, 2019
Keywords
Handwritten digit recognition, ARDIS dataset, Machine learning methods, Benchmark
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:bth-17741 (URN); 10.1007/s00521-019-04163-3 (DOI)
Funder
Knowledge Foundation, 20140032
Available from: 2019-03-27 Created: 2019-03-27 Last updated: 2019-05-02. Bibliographically approved
García Martín, E., Rodrigues, C. F., Riley, G. & Grahn, H. (2019). Estimation of energy consumption in machine learning. Journal of Parallel and Distributed Computing, 75-88
2019 (English). In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, p. 75-88. Article in journal (Refereed) Published
Abstract [en]

Energy consumption has been widely studied in the computer architecture field for decades. While the adoption of energy as a metric in machine learning is emerging, the majority of research is still primarily focused on obtaining high levels of accuracy without any computational constraint. We believe that one of the reasons for this lack of interest is due to their lack of familiarity with approaches to evaluate energy consumption. To address this challenge, we present a review of the different approaches to estimate energy consumption in general and machine learning applications in particular. Our goal is to provide useful guidelines to the machine learning community giving them the fundamental knowledge to use and build specific energy estimation methods for machine learning algorithms. We also present the latest software tools that give energy estimation values, together with two use cases that enhance the study of energy consumption in machine learning.

Place, publisher, year, edition, pages
Academic Press, 2019
Keywords
Deep learning, Energy consumption, Green AI, High performance computing, Machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18650 (URN)10.1016/j.jpdc.2019.07.007 (DOI)
Note

Funding text

Eva García-Martín and Håkan Grahn work under the research project “Scalable resource-efficient systems for big data analytics” funded by the Knowledge Foundation (grant: 20140032) in Sweden. Crefeda Faviola Rodrigues and Graham Riley are funded under the European FP7-INFRASTRUCTURES-2012-1 call (grant: 312979) and part-funded by ARM Ltd., UK under a Ph.D. Studentship Agreement.

Eva García-Martín is a Ph.D. student in Machine Learning at Blekinge Institute of Technology, in Sweden. She is working under the project Scalable resource-efficient systems for big data analytics funded by the Knowledge Foundation, advised by Niklas Lavesson and Håkan Grahn. The main focus of her thesis is on making machine learning algorithms more energy efficient. In particular, she has studied the energy consumption patterns of streaming algorithms, and then proposed new algorithm extensions that reduce their energy consumption. Personal website: https://egarciamartin.github.io/.

Crefeda Faviola Rodrigues is a Ph.D. student in the Advanced Processor Technology (APT) group at The University of Manchester, supervised by Mr. Graham Riley and Dr. Mikel Lujan. Her research is part funded by ARM and the IS-ENES2 Project. Her research topic is “Efficient execution of Convolutional Neural Networks on low power heterogeneous systems”. The main focus of her thesis is to enable energy efficiency in deep learning algorithms such as Convolutional Neural Networks or ConvNets on embedded platforms like the Jetson TX1 and Snapdragon 820. Personal website: https://personalpages.manchester.ac.uk/staff/crefeda.rodrigues/.

Graham Riley is a Lecturer in the School of Computer Science at the University of Manchester and holds a part-time position in the Scientific Computing Department (SCD) at STFC, Daresbury. His research is application-driven and much of it has been undertaken in collaboration with computational scientists in application areas such as Earth System Modeling (including the U.K. Met Office) and, previously, computational chemistry and biology. His aim is to apply his experience in high performance computing and software engineering for (principally) scientific computing to new application domains. He is also interested in techniques and tools to support flexible coupled modeling in scientific computing and in performance modeling techniques for large-scale heterogeneous HPC systems, where energy efficiency is increasingly key. Personal website: http://www.manchester.ac.uk/research/graham.riley/.

Håkan Grahn has been professor of computer engineering since 2007. He received a M.Sc. degree in Computer Science and Engineering in 1990 and a Ph.D. degree in Computer Engineering in 1995, both from Lund University. His main interests are computer architecture, multicore systems, GPU computing, parallel programming, image processing, and machine learning/data mining. He has published more than 100 papers on these subjects. During 1999–2002 he was head of department for the Dept. of software engineering and computer science, and during 2011–2013, he was Dean of research at Blekinge Institute of Technology. Currently he is project leader for BigData@BTH – “Scalable resource-efficient systems for big data analytics”, a research profile funded by the Knowledge Foundation during 2014–2020. Personal website: https://www.bth.se/eng/staff/hakan-grahn-hgr/.

Available from: 2019-09-11 Created: 2019-09-11 Last updated: 2019-09-11. Bibliographically approved
Sidorova, Y., Rosander, O., Sköld, L., Grahn, H. & Lundberg, L. (2019). Finding a healthy equilibrium of geo-demographic segments for a telecom business: Who are malicious hot-spotters?. In: George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain (Ed.), Machine Learning Paradigms: Advances in Data Analytics (pp. 187-196). Springer Science and Business Media Deutschland GmbH
2019 (English). In: Machine Learning Paradigms: Advances in Data Analytics / [ed] George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain, Springer Science and Business Media Deutschland GmbH, 2019, p. 187-196. Chapter in book (Refereed)
Abstract [en]

In telecommunication business, a major investment goes into the infrastructure and its maintenance, while business revenues are proportional to how big, good, and well-balanced the customer base is. In our previous work we presented a data-driven analytic strategy based on combinatorial optimization and analysis of the historical mobility designed to quantify the desirability of different geo-demographic segments, and several segments were recommended for a partial reduction. Within a segment, clients are different. In order to enable intelligent reduction, we introduce the term infrastructure-stressing client and, using the proposed method, we reveal the list of the IDs of such clients. We also have developed a visualization tool to allow for manual checks: it shows how the client moved through a sequence of hot spots and was repeatedly served by critically loaded antennas. The code and the footprint matrix are available on the SourceForge. © 2019, Springer International Publishing AG, part of Springer Nature.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2019
Series
Intelligent Systems Reference Library, ISSN 1868-4394 ; 149
Keywords
Business intelligence, Combinatorial optimization, Fuzzy logic, Geo-demographic segments, Mobility data, MOSAIC
National Category
Telecommunications; Business Administration; Computer Sciences
Identifiers
urn:nbn:se:bth-16885 (URN); 10.1007/978-3-319-94030-4_8 (DOI); 2-s2.0-85049522294 (Scopus ID); 978-3-319-94029-8 (ISBN)
Available from: 2018-08-20 Created: 2018-08-20 Last updated: 2018-08-20. Bibliographically approved
García Martín, E., Lavesson, N., Grahn, H., Casalicchio, E. & Boeva, V. (2019). How to Measure Energy Consumption in Machine Learning Algorithms. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): ECML PKDD 2018: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshops. Paper presented at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2018; Dublin; Ireland; 10 September 2018 through 14 September 2018 (pp. 243-255). Springer, Cham, Vol. 11329
2019 (English). In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): ECML PKDD 2018: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshops. Springer, Cham, 2019, Vol. 11329, p. 243-255. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning algorithms are responsible for a significant amount of computations. These computations are increasing with the advancements in different machine learning fields. For example, fields such as deep learning require algorithms to run during weeks consuming vast amounts of energy. While there is a trend in optimizing machine learning algorithms for performance and energy consumption, still there is little knowledge on how to estimate an algorithm’s energy consumption. Currently, a straightforward cross-platform approach to estimate energy consumption for different types of algorithms does not exist. For that reason, well-known researchers in computer architecture have published extensive works on approaches to estimate the energy consumption. This study presents a survey of methods to estimate energy consumption, and maps them to specific machine learning scenarios. Finally, we illustrate our mapping suggestions with a case study, where we measure energy consumption in a big data stream mining scenario. Our ultimate goal is to bridge the current gap that exists to estimate energy consumption in machine learning scenarios.

Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 11329
Keywords
Computer architecture, Energy efficiency, Green computing, Machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17209 (URN); 10.1007/978-3-030-13453-2_20 (DOI); 9783030134525 (ISBN)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2018; Dublin; Ireland; 10 September 2018 through 14 September 2018
Funder
Knowledge Foundation, 20140032
Available from: 2018-11-01 Created: 2018-11-01 Last updated: 2019-04-18. Bibliographically approved
Nordahl, C., Boeva, V., Grahn, H. & Netz Persson, M. (2019). Monitoring Household Electricity Consumption Behaviour for Mining Changes. Paper presented at 3rd International Workshop on Aging, Rehabilitation and Independent Assisted Living (ARIAL), International Joint Conference on Artificial Intelligence (IJCAI), August 10-16, 2019, Macao, China.
2019 (English). Conference paper, Oral presentation only (Refereed)
Abstract [en]

In this paper, we present an ongoing work on using a household electricity consumption behavior model for recognizing changes in sleep patterns. The work is inspired by recent studies in neuroscience revealing an association between dementia and sleep disorders and more particularly, supporting the hypothesis that insomnia may be a predictor for dementia in older adults. Our approach initially creates a clustering model of normal electricity consumption behavior of the household by using historical data. Then we build a new clustering model on a new set of electricity consumption data collected over a predefined time period and compare the existing model with the built new electricity consumption behavior model. If a discrepancy between the two clustering models is discovered a further analysis of the current electricity consumption behavior is conducted in order to investigate whether this discrepancy is associated with alterations in the resident’s sleep patterns. The approach is studied and initially evaluated on electricity consumption data collected from a single randomly selected anonymous household. The obtained results show that our approach is robust to mining changes in the resident daily routines by monitoring and analyzing their electricity consumption behavior model.

National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18651 (URN)
Conference
3rd International Workshop on Aging, Rehabilitation and Independent Assisted Living (ARIAL), International Joint Conference on Artificial Intelligence (IJCAI), August 10-16, 2019, Macao, China.
Projects
Scalable resource-efficient systems for big data analytics
Available from: 2019-09-12 Created: 2019-09-12 Last updated: 2019-10-17. Bibliographically approved
Nordahl, C., Boeva, V., Grahn, H. & Netz Persson, M. (2019). Profiling of household residents’ electricity consumption behavior using clustering analysis. In: Lect. Notes Comput. Sci. Paper presented at International Conference on Computational Science, ICCS, Faro, Algarve, 12 June 2019 through 14 June 2019 (pp. 779-786). Springer Verlag
2019 (English). In: Lect. Notes Comput. Sci., Springer Verlag, 2019, p. 779-786. Conference paper, Published paper (Refereed)
Abstract [en]

In this study we apply clustering techniques for analyzing and understanding households’ electricity consumption data. The knowledge extracted by this analysis is used to create a model of normal electricity consumption behavior for each particular household. Initially, the household’s electricity consumption data are partitioned into a number of clusters with similar daily electricity consumption profiles. The centroids of the generated clusters can be considered as representative signatures of a household’s electricity consumption behavior. The proposed approach is evaluated by conducting a number of experiments on electricity consumption data of ten selected households. The obtained results show that the proposed approach is suitable for data organizing and understanding, and can be applied for modeling electricity consumption behavior on a household level. © Springer Nature Switzerland AG 2019.

Place, publisher, year, edition, pages
Springer Verlag, 2019
Series
Lecture Notes in Computer Science ; 11540
Keywords
Ambient Assisted Living, Non-intrusive remote monitoring, Assisted living, Clustering analysis, Clustering techniques, Electricity-consumption, Household level, Number of clusters, Remote monitoring, Electric power utilization
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18593 (URN); 10.1007/978-3-030-22750-0_78 (DOI); 2-s2.0-85068459816 (Scopus ID); 9783030227494 (ISBN)
Conference
International Conference on Computational Science, ICCS, Faro, Algarve, 12 June 2019 through 14 June 2019
Available from: 2019-09-09 Created: 2019-09-09 Last updated: 2019-10-17. Bibliographically approved
Westphal, F., Lavesson, N. & Grahn, H. (2018). Document Image Binarization Using Recurrent Neural Networks. In: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018. Paper presented at 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), Vienna (pp. 263-268). IEEE
2018 (English). In: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018, IEEE, 2018, p. 263-268. Conference paper, Published paper (Refereed)
Abstract [en]

In the context of document image analysis, image binarization is an important preprocessing step for other document analysis algorithms, but also relevant on its own by improving the readability of images of historical documents. While historical document image binarization is challenging due to common image degradations, such as bleedthrough, faded ink or stains, achieving good binarization performance in a timely manner is a worthwhile goal to facilitate efficient information extraction from historical documents. In this paper, we propose a recurrent neural network based algorithm using Grid Long Short-Term Memory cells for image binarization, as well as a pseudo F-Measure based weighted loss function. We evaluate the binarization and execution performance of our algorithm for different choices of footprint size, scale factor and loss function. Our experiments show a significant trade-off between binarization time and quality for different footprint sizes. However, we see no statistically significant difference when using different scale factors and only limited differences for different loss functions. Lastly, we compare the binarization performance of our approach with the best performing algorithm in the 2016 handwritten document image binarization contest and show that both algorithms perform equally well.

Place, publisher, year, edition, pages
IEEE, 2018
Keywords
image binarization, recurrent neural networks, Grid LSTM, historical documents, Text analysis, Labeling, Recurrent neural networks, Heuristic algorithms, Training, Degradation, Ink
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:bth-16749 (URN); 10.1109/DAS.2018.71 (DOI); 000467070300045 (ISI); 978-1-5386-3346-5 (ISBN)
Conference
2018 13th IAPR International Workshop on Document Analysis Systems (DAS), Vienna
Funder
Knowledge Foundation, 20140032
Available from: 2018-07-06 Created: 2018-07-06 Last updated: 2019-06-28. Bibliographically approved
Westphal, F., Grahn, H. & Lavesson, N. (2018). Efficient document image binarization using heterogeneous computing and parameter tuning. International Journal on Document Analysis and Recognition, 21(1-2), 41-58
2018 (English). In: International Journal on Document Analysis and Recognition, ISSN 1433-2833, E-ISSN 1433-2825, Vol. 21, no 1-2, p. 41-58. Article in journal (Refereed) Published
Abstract [en]

In the context of historical document analysis, image binarization is a first important step, which separates foreground from background, despite common image degradations, such as faded ink, stains, or bleed-through. Fast binarization has great significance when analyzing vast archives of document images, since even small inefficiencies can quickly accumulate to years of wasted execution time. Therefore, efficient binarization is especially relevant to companies and government institutions, who want to analyze their large collections of document images. The main challenge with this is to speed up the execution performance without affecting the binarization performance. We modify a state-of-the-art binarization algorithm and achieve on average a 3.5 times faster execution performance by correctly mapping this algorithm to a heterogeneous platform, consisting of a CPU and a GPU. Our proposed parameter tuning algorithm additionally improves the execution time for parameter tuning by a factor of 1.7, compared to previous parameter tuning algorithms. We see that for the chosen algorithm, machine learning-based parameter tuning improves the execution performance more than heterogeneous computing, when comparing absolute execution times. © 2018 The Author(s)

Place, publisher, year, edition, pages
Springer Verlag, 2018
Keywords
Automatic parameter tuning, Heterogeneous computing, Historical documents, Image binarization, Bins, History, Image analysis, Learning systems, Document image binarization, Government institutions, Heterogeneous platforms, Parameter tuning algorithm, Parameter estimation
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-15891 (URN); 10.1007/s10032-017-0293-7 (DOI); 000433193500003 (ISI); 2-s2.0-85041228615 (Scopus ID)
Available from: 2018-02-15 Created: 2018-02-15 Last updated: 2018-08-27. Bibliographically approved
García Martín, E., Lavesson, N., Grahn, H., Casalicchio, E. & Boeva, V. (2018). Hoeffding Trees with nmin adaptation. In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018). Paper presented at 5th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA), 1–4 October 2018, Turin (pp. 70-79). IEEE
2018 (English). In: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018), IEEE, 2018, p. 70-79. Conference paper, Published paper (Refereed)
Abstract [en]

Machine learning software accounts for a significant amount of energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case of data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beginning of the execution. We have observed that having fixed parameters lead to unnecessary computations, thus making the algorithm energy inefficient. In this paper we present the nmin adaptation method for Hoeffding trees. This method adapts the value of the nmin parameter, which significantly affects the energy consumption of the algorithm. The method reduces unnecessary computations and memory accesses, thus reducing the energy, while the accuracy is only marginally affected. We experimentally compared VFDT (Very Fast Decision Tree, the first Hoeffding tree algorithm) and CVFDT (Concept-adapting VFDT) with the VFDT-nmin (VFDT with nmin adaptation). The results show that VFDT-nmin consumes up to 27% less energy than the standard VFDT, and up to 92% less energy than CVFDT, trading off a few percent of accuracy in a few datasets.

Place, publisher, year, edition, pages
IEEE, 2018
Series
Proceedings of the International Conference on Data Science and Advanced Analytics, ISSN 2472-1573
Keywords
data stream mining; green artificial intelligence; energy efficiency; hoeffding trees; energy aware machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-17207 (URN); 10.1109/DSAA.2018.00017 (DOI); 000459238600008 (ISI); 978-1-5386-5090-5 (ISBN)
Conference
5th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA), 1–4 October 2018, Turin
Funder
Knowledge Foundation, 20140032
Available from: 2018-11-01 Created: 2018-11-01 Last updated: 2019-04-05. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-9947-1088