Compactions in Apache Cassandra: Performance Analysis of Compaction Strategies in Apache Cassandra
Blekinge Institute of Technology (Blekinge Tekniska Högskola), Faculty of Computing, Department of Communication Systems.
2016 (English). Independent thesis, advanced level (degree of Master, two years), 20 credits / 30 HE credits. Student thesis (degree project)
Abstract [en]

Context: Global communication systems are growing rapidly, generating a wide range of data. Telecom operators, which generate large amounts of data, need to manage that data efficiently. As database management technology has advanced, NoSQL databases have grown remarkably in the 21st century. Apache Cassandra is an advanced NoSQL database system that is popular for handling semi-structured and unstructured Big Data. Cassandra consolidates its stored data using different compaction strategies. This research analyzes the performance of different compaction strategies in different use cases for the default Cassandra stress model. The analysis can suggest better usage of compaction strategies in Cassandra for a write-heavy workload.

Objectives: In this study, we investigate appropriate metrics for evaluating the performance of compaction strategies. We provide a detailed analysis of the Size Tiered Compaction Strategy, Date Tiered Compaction Strategy, and Leveled Compaction Strategy for a write-heavy (90/10) workload, using the default Cassandra stress tool.

Methods: A detailed literature review was conducted to study NoSQL databases and the workings of the different compaction strategies in Apache Cassandra. The performance metrics were selected based on this literature review and on the opinions of the supervisors and Ericsson’s Apache Cassandra team. Two tools were developed to collect the values of the selected metrics. The first was written in Jython to collect the Cassandra metrics, and the second was written in Python to collect the operating system metrics. The graphs were generated in Microsoft Excel from the values produced by the scripts.
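
As a rough illustration of the second tool described above, the following sketch shows one way a Python script could turn operating system readings into a table that a spreadsheet such as Microsoft Excel can plot. It is not the tool from the thesis: the choice of memory as the metric, the Linux-specific `/proc/meminfo` input format, and the function names `parse_meminfo`, `used_memory_kb`, and `samples_to_csv` are all assumptions made for illustration.

```python
import csv
import io


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values.

    Assumption: lines look like "MemTotal:  16384 kB" (Linux format).
    """
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip anything that is not a "key: value" line
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_memory_kb(meminfo):
    """Hypothetical metric: total memory minus memory reported as available."""
    return meminfo["MemTotal"] - meminfo["MemAvailable"]


def samples_to_csv(samples):
    """Render (timestamp_s, used_kb) pairs as CSV text for a spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["timestamp_s", "used_kb"])
    for timestamp_s, used_kb in samples:
        writer.writerow([timestamp_s, used_kb])
    return buf.getvalue()
```

On a Linux host, one sample could be taken with `used_memory_kb(parse_meminfo(open("/proc/meminfo").read()))` at a fixed interval while the stress test runs, and the collected pairs passed to `samples_to_csv`.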

Results: The Date Tiered Compaction Strategy and the Size Tiered Compaction Strategy showed broadly similar behaviour during the stress tests. The Leveled Compaction Strategy showed remarkable results that affected system performance, compared with the date tiered and size tiered compaction strategies. The Date Tiered Compaction Strategy does not perform well for the default Cassandra stress model. The Size Tiered Compaction Strategy can be preferred for the default Cassandra stress model, but is not a suitable choice for big data.

Conclusions: Based on a detailed analysis and logical comparison of the metrics, we conclude that the Leveled Compaction Strategy performs better than the size tiered and date tiered compaction strategies for a write-heavy (90/10) workload under the default Cassandra stress model.

Place, publisher, year, edition, pages
2016, p. 30
Keywords [en]
Apache Cassandra, Compaction Strategies, Default Cassandra Stress model, Performance, NoSQL Database
National category
Telecommunications
Identifiers
URN: urn:nbn:se:bth-12885; OAI: oai:DiVA.org:bth-12885; DiVA id: diva2:948190
External cooperation
Ericsson, Karlskrona
Subject / course
ET2580 Master's Thesis in Electrical Engineering with emphasis on Telecommunication Systems
Educational programme
ETATE Master's Programme in Electrical Engineering with emphasis on Telecommunication
Presentation
2016-05-31, J1640, Valhallavagen, 37141, Karlskrona, 11:15 (English)
Supervisors
Examiners
Available from: 2016-08-08. Created: 2016-07-08. Last updated: 2025-09-30. Bibliographically approved.

Open Access in DiVA

Full text (1181 kB), 1725 downloads
File information
File name: FULLTEXT02.pdf; File size: 1181 kB; Checksum (SHA-512):
67f867bd7cd4687ed52794bf8a7cf9bc44a3b893413f9ba70980ad4dcc9900de268d0979de9d7f8b3a944e73085b55c7521f25547d23df07fa039ab8e5920e67
Type: fulltext; MIME type: application/pdf

Total: 1735 downloads
The number of downloads is the sum of downloads for all full texts. It may include, for example, earlier versions that are no longer available.

Total: 2316 hits