Performance Evaluation of Cassandra in a Virtualized Environment
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Context. Apache Cassandra is an open-source, scalable NoSQL database that distributes data across many commodity servers. It avoids a single point of failure by replicating the data and storing the copies in different locations. Cassandra uses a ring design rather than the traditional master-slave design.
Virtualization is the technique by which the physical resources of a machine are divided among, and used by, multiple virtual machines. It is the fundamental technology that allows cloud computing to share resources among users.
Objectives. This research observes the effects of virtualization on Cassandra by comparing a virtual machine arrangement with a physical machine arrangement, and identifies the overhead caused by virtualization.
Methods. An experiment is conducted to identify the effects of virtualization on Cassandra relative to physical machines. Cassandra runs on physical machines with Ubuntu 14.04 LTS arranged in a multi-node cluster. Results are obtained by executing mixed, read-only, and write-only operations with the Cassandra stress tool on the data populated in this cluster. This procedure is repeated for 100% and 66% workloads. The same procedure is then repeated on a cluster of virtual machines, and the results are compared.
Results. Virtualization overhead is identified in terms of CPU utilization, and the effects of virtualization on Cassandra are identified in terms of disk utilization, throughput, and latency.
Conclusions. The overhead caused by virtualization is observed, and its effect on the performance of Cassandra is identified. The virtualization overhead is related to the change in Cassandra's performance.
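The procedure in the Methods paragraph can be pictured with a minimal sketch. It only builds argument lists for the three `cassandra-stress` operation types and computes a relative difference between a physical and a virtual measurement. The function names, the 1:1 mixed ratio, the operation count, and the node address are assumptions for illustration and are not taken from the thesis. The sketch does not model the 100% and 66% workload levels, because the abstract does not say how they were set.

```python
from typing import List

# Operation types named in the abstract: write only, read only, mixed.
OPERATIONS = ("write", "read", "mixed")


def stress_command(operation: str, n_ops: int, node: str) -> List[str]:
    """Build an argument list for one cassandra-stress run (hypothetical helper)."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    cmd = ["cassandra-stress", operation]
    if operation == "mixed":
        # Assumed 1:1 write/read ratio; the abstract does not state the ratio used.
        cmd.append("ratio(write=1,read=1)")
    cmd += [f"n={n_ops}", "-node", node]
    return cmd


def relative_change(physical: float, virtual: float) -> float:
    """Percentage change of a metric on virtual machines relative to physical ones."""
    if physical == 0:
        raise ValueError("physical measurement must be non-zero")
    return (virtual - physical) * 100.0 / physical


if __name__ == "__main__":
    # Placeholder node address and operation count, chosen only for the example.
    for op in OPERATIONS:
        print(" ".join(stress_command(op, 100000, "10.0.0.1")))
```

Passing each list to `subprocess.run` on a machine with Cassandra installed would start the corresponding run. Applying `relative_change` to the same metric (for example throughput) from both clusters gives one way to express the comparison the abstract describes.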
Place, publisher, year, edition, pages
2017.
Keywords [en]
Cassandra, Virtualization, NoSQL databases
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:bth-14032; OAI: oai:DiVA.org:bth-14032; DiVA, id: diva2:1083541
Subject / course
DV2566 Master's Thesis (120 credits) in Computer Science
Educational program
DVAXA Master of Science Programme in Computer Science
Presentation
2017-01-24, 15:00 (English)
Supervisors
Examiners
2017-03-23, 2017-03-21, 2017-03-23. Bibliographically approved