Distributed databases for Multi Mediation: Scalability, Availability & Performance
2015 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Context: Multi Mediation is the process of collecting data from networks and network elements, pre-processing this data, and distributing it to various systems such as Big Data analysis, Billing Systems, Network Monitoring Systems, and Service Assurance. With the growing demand for networks and the emergence of new services, the volume of data collected from networks is growing. This data needs to be organized efficiently, which can be done using databases. Although RDBMSs offer scale-up solutions to handle voluminous data and concurrent requests, this approach is expensive, so alternatives such as distributed databases are attractive. A suitable distributed database for Multi Mediation needs to be investigated.
Objectives: In this research we analyze two distributed databases, MySQL Cluster 7.4.4 and Apache Cassandra 2.0.13, in terms of performance, scalability and availability. The inter-relations between these three properties are also analyzed. Performance, scalability and availability are quantified, and measurements are made in the context of a Multi Mediation system.
Methods: The research uses both qualitative and quantitative methods. A qualitative study is made to select the databases for evaluation. A benchmarking harness application is designed to quantitatively evaluate the performance of the distributed databases in the context of Multi Mediation. Several experiments are designed and performed on the database cluster using the benchmarking harness.
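The kind of measurement such a benchmarking harness takes can be illustrated with a minimal sketch. This is not the harness used in the thesis: the function names (`run_benchmark`, `fake_insert`), the thread and operation counts, and the in-memory dictionary standing in for a database INSERT are all assumptions made for illustration. It only shows how average response time and average throughput could be derived from several concurrent client threads.

```python
import threading
import time


def run_benchmark(operation, num_threads, ops_per_thread):
    """Run `operation` from several client threads.

    Returns (average response time in seconds, throughput in ops/second).
    Hypothetical helper, not taken from the thesis.
    """
    latencies = []
    lock = threading.Lock()

    def client(thread_id):
        local = []
        for i in range(ops_per_thread):
            start = time.perf_counter()
            operation(thread_id, i)
            local.append(time.perf_counter() - start)
        # Merge per-thread timings once, to keep locking out of the timed loop.
        with lock:
            latencies.extend(local)

    threads = [threading.Thread(target=client, args=(t,))
               for t in range(num_threads)]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - wall_start

    avg_response_time = sum(latencies) / len(latencies)
    throughput = len(latencies) / elapsed
    return avg_response_time, throughput


# Stand-in for a database INSERT (assumption: a real harness would call
# a MySQL Cluster or Cassandra client here instead).
store = {}
store_lock = threading.Lock()


def fake_insert(thread_id, i):
    with store_lock:
        store[(thread_id, i)] = b"record"


avg_rt, tput = run_benchmark(fake_insert, num_threads=4, ops_per_thread=1000)
print(f"average response time: {avg_rt:.9f} s, throughput: {tput:.0f} ops/s")
```

Repeating such a run while varying the number of client threads gives one data point per scenario, which is the shape of result the abstract reports.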
Results: The results collected include the average response time and average throughput of the distributed databases in various scenarios. The average throughput and average INSERT response time results favor Apache Cassandra in its low availability configuration. The average SELECT response time of MySQL Cluster is better than that of Apache Cassandra for larger numbers of client threads, in both high availability and low availability configurations.
Conclusions: Although Apache Cassandra outperforms MySQL Cluster, support for transactions and ACID compliance must not be overlooked when selecting a database. Apart from contextual benchmarks, factors such as organizational choices, development costs and resource utilization are more influential in the selection of a database within an organization. There is still a need for further evaluation of distributed databases.
Place, publisher, year, edition, pages
2015. p. 52
Keywords [en]
Availability, Benchmarking, Distributed databases, Performance, Scalability.
National Category
Communication Systems Telecommunications Computer Systems
Identifiers
URN: urn:nbn:se:bth-1018
OAI: oai:DiVA.org:bth-1018
DiVA, id: diva2:825934
External cooperation
Ericsson
Subject / course
ET2580 Master's Thesis (120 credits) in Electrical Engineering with emphasis on Telecommunication Systems
Educational program
ETATE Master of Science Programme in Electrical Engineering with emphasis on Telecommunication Systems
Presentation
2015-06-02, J3208 “Claude Shannon”, BTH, Karlskrona, 13:04 (Swedish)
Supervisors
Examiners
Note
I am indebted to my advisor, Prof. Lars Lundberg, for his valuable ideas, which helped in the completion of this work. In fact, he guided me through every crucial stage of this research work.
I sincerely thank Prof. Markus Fiedler & Prof. Kurt Tutschku for their endless support during the work.
I am grateful to Neeraj Garg, Sourab, Saket & Kulbir at Ericsson for providing me with the necessary equipment and helping me financially during my work.
To my family members and friends, who in one way or another shared their support: thank you.
Above all I would like to thank the Supreme Personality of Godhead, the author of everything.
2015-06-30 2015-06-24 2015-07-16 Bibliographically approved