Performance Aspects of Databases and Virtualized Real-time Applications
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering.
2018 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Context: High computing system performance depends on the interaction between software and hardware layers in modern computer systems. Two strong trends that affect different layers in computer systems are that single processors have been more or less completely replaced by multiprocessors, which are often organized into clusters, and the virtualization of resources. The performance evaluation of different software on such physical and virtualized resources is the focus of this thesis.

Objectives: The objectives of this thesis are to evaluate the performance of SQL and NoSQL database management systems, namely Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB, and of a soft real-time application, namely the voice-driven web. Scheduling algorithms for resource allocation for hard real-time applications on virtual processors are also investigated.

Methods: Experiments are used to measure the performance of SQL and NoSQL database management systems on a cluster. They are also used to develop a prototype and to predict the processor performance of the voice-driven web on multiprocessors. Theoretical methods are used to model and design algorithms that schedule real-time applications on a virtual processor machine. Simulation is used to quantify the performance implications of certain parameter values in our theoretical results and to compare expected performance with theoretical bounds in our schedulability tests.

Results: The performance of Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB is evaluated in terms of write and read throughput and latency in cluster computing. For read throughput, all database systems scale horizontally as the number of cluster nodes increases; however, only Cassandra and CouchDB exhibit scalability for data writing. The overall evaluation shows that Cassandra has the most scalable write throughput as the number of nodes increases, with relatively low latency, whereas PostgreSQL has the lowest write latency and MongoDB has the lowest read latency.

The architectural tradeoffs of the voice-driven web show that the voice engine should be installed on the server rather than on the mobile device, and performance evaluations show that the speech engine scales with the number of cores in the multiprocessor, with and without hyperthreading.

The thesis presents scheduling techniques for real-time applications that run in virtual machines which time-share the processor. Using these techniques, each virtual machine's period and execution time can be defined so that real-time applications meet their deadlines. Simulation results show the impact of the length of different VM periods with respect to overhead. The tradeoffs between resource consumption and period length are also given. Furthermore, a utilization-based test for scheduling real-time applications on a virtual multiprocessor is presented. This test determines whether a task set is schedulable. If the task set is schedulable, the algorithm provides the priority for each task. This algorithm avoids Dhall's effect, which may cause task sets with even very low utilization to miss deadlines.
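Dhall's effect, mentioned above, can be illustrated with the textbook task set usually used to explain it. The sketch below is not taken from the thesis; the function name `dhall_example` and the values of `m` and `eps` are assumptions chosen for illustration. It assumes global rate monotonic scheduling on `m` identical processors with all tasks released at time 0.

```python
from fractions import Fraction


def dhall_example(m, eps):
    """Textbook Dhall's-effect task set on m identical processors.

    m light tasks: execution time 2*eps, period (= deadline) 1.
    1 heavy task:  execution time 1,     period (= deadline) 1 + eps.
    All tasks are released together at time 0.
    """
    light_c, light_t = 2 * eps, Fraction(1)
    heavy_c, heavy_t = Fraction(1), 1 + eps

    # Rate monotonic: the shorter period gets the higher priority, so the
    # m light tasks occupy all m processors during [0, 2*eps] and the
    # heavy task can only start once they have finished.
    heavy_start = light_c
    heavy_finish = heavy_start + heavy_c

    total_utilization = m * light_c / light_t + heavy_c / heavy_t
    return {
        "utilization_per_processor": total_utilization / m,
        "heavy_finish": heavy_finish,
        "heavy_deadline": heavy_t,
        "deadline_missed": heavy_finish > heavy_t,
    }


# Illustrative values only: 4 processors, eps = 1/100.
result = dhall_example(m=4, eps=Fraction(1, 100))
```

With these illustrative values the heavy task misses its deadline even though the per-processor utilization is well below one half, which is the behaviour a priority assignment has to avoid.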

Conclusions: The thesis presents a performance evaluation of read and write throughput and latency for SQL and NoSQL database management systems in cluster computing. The thesis quantifies the tradeoffs of voice-driven web architectures and the performance scalability of the speech engine with respect to the number of cores of the multiprocessor. Furthermore, this thesis proposes scheduling algorithms for real-time applications with hard deadlines on virtual processors, either as a single-core processor or as a multicore processor.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2018, p. 228
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 02
Keywords [en]
SQL and NoSQL database, Bigdata management systems, Structured and non-structured Database Evaluation, Voice-driven web, Multicore performance prediction, Hard real-time Scheduling, Virtual Multiprocessor Scheduling
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-15758
ISBN: 978-91-7295-348-2 (print)
OAI: oai:DiVA.org:bth-15758
DiVA, id: diva2:1173854
Public defence
2018-09-21, J1650, Blekinge Tekniska Högskola, 371 79 Karlskrona, Karlskrona, 13:00 (English)
Funder
Sida - Swedish International Development Cooperation Agency
Available from: 2018-01-15 Created: 2018-01-14 Last updated: 2019-09-13 Bibliographically approved
List of papers
1. Performance Evaluation of SQL and NoSQL Database Management Systems in a Cluster
2017 (English) In: International Journal of Database Management Systems, ISSN 0975-5705, Vol. 9, no 6, p. 1-24. Article in journal (Refereed) Published
Abstract [en]

In this study, we evaluate the performance of SQL and NoSQL database management systems, namely Cassandra, CouchDB, MongoDB, PostgreSQL, and RethinkDB. We use a cluster of four nodes to run the database systems, with external load generators. The evaluation is conducted using data from Telenor Sverige, a telecommunication company that operates in Sweden. The experiments are conducted using three datasets of different sizes. The write throughput and latency as well as the read throughput and latency are evaluated for four queries, namely distance query, k-nearest neighbour query, range query, and region query. For write operations, Cassandra has the highest throughput when multiple nodes are used, whereas PostgreSQL has the lowest latency and the highest throughput for a single node. For read operations, MongoDB has the lowest latency for all queries. However, Cassandra has the highest throughput for reads. The throughput decreases as the dataset size increases for both write and read, for both sequential and random order access. However, this decrease is more significant for random read and write. In this study, we also present the experience we had with these different database management systems, including setup and configuration complexity.

Place, publisher, year, edition, pages
Academy and Industry Research Collaboration Center (AIRCC), 2017
Keywords
Trajectory queries, cluster computing, SQL database
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:bth-15754 (URN)
10.5121/ijdms.2017.9601 (DOI)
Funder
Knowledge Foundation, 20140032
Available from: 2018-01-14 Created: 2018-01-14 Last updated: 2024-04-10 Bibliographically approved
2. Evaluation of Voice-driven Web Application Architecture
2012 (English) Conference paper, Published paper (Refereed) Published
Abstract [en]

This paper quantifies the implications and trade-offs of three different architectures for voice-driven web applications; the architectures are implemented as prototypes. The prototypes differ from each other in how they process output speech: using recordings, server-based Text To Speech (TTS), or client-based TTS. A typical application used in this paper is the most dynamic weather information source, which is presented as web feeds or Really Simple Syndication (RSS) feeds. The evaluated quality attributes are performance, maintainability, and development effort. The empirical results show that each system architecture has a different quality profile; for instance, one architecture has the lowest development time but the highest maintainability cost, and another has the lowest bandwidth requirements but the highest development cost. Finally, suggestions are given for the optimal choice of system architecture according to the quality requirements of the final system.

Place, publisher, year, edition, pages
Sorrento: IEEE, 2012
Keywords
Voice based web, IVR application, Web voice quality attributes, Voice driven web evaluation, voice driven architecture
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-6949 (URN)
10.1109/SITIS.2012.86 (DOI)
000315360300079 ()
oai:bth.se:forskinfoDB947E19E0BF5EC4C1257B9B0026A943 (Local ID)
978-0-7695-4911-8 (ISBN)
oai:bth.se:forskinfoDB947E19E0BF5EC4C1257B9B0026A943 (Archive number)
oai:bth.se:forskinfoDB947E19E0BF5EC4C1257B9B0026A943 (OAI)
Conference
8th International Conference on Signal Image Technology and Internet Based Systems (SITIS)
Available from: 2013-07-01 Created: 2013-07-01 Last updated: 2018-01-15 Bibliographically approved
3. Performance evaluation and prediction of open source speech engine on multicore processors
2013 (English) Conference paper, Published paper (Refereed)
Abstract [en]

This paper quantifies the performance of the core part of the voice-driven web using a free and open source speech engine. The speech engine, which is highly computationally demanding, consists of Automatic Speech Recognition (ASR) and Text To Speech (TTS). Two open source programs, Sphinx-4 and FreeTTS-1.2.2, are used for ASR and TTS respectively. These two programs are executed on two different multicore processors, with 4 hyperthreaded cores and 8 cores respectively. The response time with respect to the load variance and the number of cores is measured and predicted using a linear regression model. The results show that the response time is linear with respect to the input length; this property can be used to directly predict the response time for any input length. Moreover, although the response time and the speedup increase as the number of cores increases, the regression coefficients and the number of threads reveal that ASR benefits from multicore. The speedup factor for ASR is 1.56 for 8 cores. For FreeTTS, however, the program is sequential, so the speedup from the program itself is insignificant; there is about a 1.43 speedup for 8 cores, which comes from the system's contribution. Our findings show that the results for multicore processors do not generalize to hyperthreading. This paper presents an investigation that is useful for educators, researchers, and application developers in the domain of voice-based applications.
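The abstract describes predicting response time from input length with a linear regression model and reporting speedup across cores. A minimal sketch of that kind of calculation follows. It is not the paper's code: the function names `fit_line` and `speedup` are hypothetical, and every number in it is invented for illustration rather than measured.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def speedup(time_one_core, time_n_cores):
    """Speedup as the ratio of single-core time to n-core time."""
    return time_one_core / time_n_cores


# Hypothetical measurements (NOT from the paper): input length in words
# versus response time in seconds.
input_lengths = [10, 20, 30, 40, 50]
response_times = [1.2, 2.1, 3.0, 3.9, 4.8]

intercept, slope = fit_line(input_lengths, response_times)

# Once the line is fitted, the response time for an unseen input length
# is predicted directly from the two coefficients.
predicted_60 = intercept + slope * 60
```

Because the relationship is assumed linear, two fitted coefficients are enough to extrapolate to input lengths that were never measured, which is the property the abstract highlights.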

Place, publisher, year, edition, pages
Luxembourg: ACM, 2013
Keywords
linear regression, multicore performance, open source, performance prediction, speech recognition, text to speech, voice driven web
National Category
Software Engineering
Identifiers
urn:nbn:se:bth-6735 (URN)
10.1145/2536146.2536184 (DOI)
9781450320047 (ISBN)
Conference
5th International Conference on Management of Emergent Digital EcoSystems (MEDES), Luxembourg
Available from: 2014-04-14 Created: 2014-04-14 Last updated: 2018-01-15 Bibliographically approved
4. Period assignment in real-time scheduling of multiple virtual machines
2015 (English) In: Proceedings of the 7th International Conference on Management of computational and collective intElligence in Digital EcoSystems, Association for Computing Machinery (ACM), 2015, p. 180-187. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2015
Keywords
Virtualization; Real-time scheduling; Hard deadlines; Virtual Machine scheduling; VM period assignment
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-11801 (URN)
10.1145/2857218.2857262 (DOI)
Conference
7th International Conference on Management of computational and collective intElligence in Digital EcoSystems, Sao Paulo, Brazil
Available from: 2016-04-12 Created: 2016-04-12 Last updated: 2021-05-05 Bibliographically approved
5. Real-time scheduling of multiple virtual machines
2017 (English) In: International journal of Computers and their applications, Vol. 24, no 3, p. 91-109. Article in journal (Refereed) Published
Abstract [en]

The use of virtualized systems is growing, and one would like to benefit from this kind of system also for real-time applications with hard deadlines. There are two levels of scheduling in real-time applications executing in a virtualized environment: traditional real-time scheduling of the tasks in the real-time application inside a Virtual Machine (VM), and scheduling of different VMs on the hypervisor level. Traditional real-time scheduling uses methods based on periods, deadlines and worst-case execution times of the real-time tasks. In order to apply the existing theory also to virtualized environments, we must obtain periods and (worst-case) execution times for VMs containing real-time applications. In this paper, we describe a technique for calculating periods, execution times and utilization for VMs containing real-time applications with hard deadlines. We show that when we look at all VMs that share a physical processor, we are able to use longer (better) periods. Alternatively, if the periods are the same, we are able to use a smaller amount of the processor resource for the VMs, and more tasks become schedulable compared to when we look at each VM in isolation. We also introduce an overhead model that makes it possible to find VM periods that minimize the processor utilization.

Keywords
Real-time virtual machine; real-time scheduling; hard deadlines; VM overhead; VM period
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:bth-15755 (URN)
Funder
Knowledge Foundation, 20140032
Available from: 2018-01-14 Created: 2018-01-14 Last updated: 2018-01-19 Bibliographically approved
6. A utilization-based schedulability test of real-time systems running on a multiprocessor virtual machine
2019 (English) In: Computer journal, ISSN 0010-4620, E-ISSN 1460-2067, Vol. 62, no 6, p. 884-904, article id bxy005. Article in journal (Refereed) Published
Abstract [en]

We consider a real-time application that executes in a VM with multiple virtual cores. Tasks are scheduled globally using fixed-priority scheduling. In order to avoid Dhall's effect, we classify tasks into two priority classes: heavy and light. Heavy tasks have higher priority than light tasks. For light tasks we use rate monotonic priority assignment. We propose a utilization-based schedulability test. If the task set is schedulable, we provide an assignment of priorities to tasks. The input to the test is the task set, the number of cores in the VM, and the period, deadline and blocking time for the VM. We evaluate how jitter, when scheduling VMs on the hypervisor level, affects the schedulability of the real-time tasks running in the VM. The schedulability of the real-time tasks in the VM decreases when the hypervisor jitter increases, but on the other hand the schedulability on the hypervisor level increases if we allow more jitter, i.e. there is a trade-off. Our results make it possible to evaluate this trade-off and make informed decisions when selecting scheduling parameters on the hypervisor level. Simulations show that the priority assignment used by our algorithm schedules more task sets than rate monotonic priority assignment. © 2019 The British Computer Society. All rights reserved.
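To show what a utilization-based schedulability test looks like in its simplest form, the sketch below implements the classic Liu and Layland bound for rate monotonic scheduling on a single processor. This is a deliberately simpler, well-known test and not the multiprocessor VM test proposed in the paper, whose exact bound is not given in the abstract. The function names and the example task sets are assumptions made for illustration.

```python
def liu_layland_bound(n):
    """Utilization bound n * (2**(1/n) - 1) for n tasks under RM on one processor."""
    return n * (2 ** (1.0 / n) - 1)


def rm_priorities_if_schedulable(tasks):
    """Apply the sufficient Liu and Layland test to a task set.

    tasks: list of (name, wcet, period) with deadlines equal to periods.
    Returns task names ordered from highest to lowest priority if the
    test passes, otherwise None.
    """
    utilization = sum(wcet / period for _, wcet, period in tasks)
    if utilization > liu_layland_bound(len(tasks)):
        # The bound is sufficient, not necessary: failing it does not
        # prove the task set is unschedulable.
        return None
    # Rate monotonic: the shorter the period, the higher the priority.
    return [name for name, _, _ in sorted(tasks, key=lambda t: t[2])]


# Hypothetical task sets, invented for illustration.
accepted = rm_priorities_if_schedulable([("a", 1, 4), ("c", 2, 10), ("b", 1, 5)])
rejected = rm_priorities_if_schedulable([("x", 3, 4), ("y", 2, 5)])
```

Like the test described in the abstract, this one takes a task set as input and, when it accepts the set, returns a priority assignment; the paper's test additionally takes the number of cores and the VM's period, deadline and blocking time into account.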

Place, publisher, year, edition, pages
Oxford University Press, 2019
Keywords
Global fixed priority scheduling, Hard real-time scheduling, Multiprocessor utilization based schedulability test, Virtual multiprocessor scheduling, VM deadline, Economic and social effects, Interactive computer systems, Jitter, Multiprocessing systems, Scheduling, Scheduling algorithms, Virtual machine, Fixed priority scheduling, Hard real-time, Multi processor scheduling, Schedulability test, Real time systems
National Category
Computer Sciences
Identifiers
urn:nbn:se:bth-18589 (URN)
10.1093/comjnl/bxz005 (DOI)
000482395200007 ()
2-s2.0-85068528659 (Scopus ID)
Funder
Knowledge Foundation, 20140032
Note

open access

Available from: 2019-09-09 Created: 2019-09-09 Last updated: 2019-09-20 Bibliographically approved

Open Access in DiVA

fulltext (38322 kB), 395 downloads
File information
File name FULLTEXT02.pdf
File size 38322 kB
Checksum SHA-512
ec552ac50a1aac1465df353e1e8810572da7d66a21f11aef904da4e6be19c2d36eea27006a2e8f578136ac9402ed419229769d71790de2bcb9e17bc44a9b8db3
Type fulltext
Mimetype application/pdf

Authority records

Niyizamwiyitira, Christine
