Endre søk
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Compactions in Apache Cassandra: Performance Analysis of Compaction Strategies in Apache Cassandra
Blekinge Tekniska Högskola, Fakulteten för datavetenskaper, Institutionen för kommunikationssystem.
2016 (engelsk)Independent thesis Advanced level (degree of Master (Two Years)), 20 poäng / 30 hpOppgave
Abstract [en]

Context: The global communication system is in a tremendous growth, leading to wide range of data generation. The Telecom operators in various Telecom Industries, that generate large amount of data has a need to manage these data efficiently. As the technology involved in the database management systems is increasing, there is a remarkable growth of NoSQL databases in the 20th century. Apache Cassandra is an advanced NoSQL database system, which is popular for handling semi-structured and unstructured format of Big Data. Cassandra has an effective way of compressing data by using different compaction strategies. This research is focused on analyzing the performances of different compaction strategies in different use cases for default Cassandra stress model. The analysis can suggest better usage of compaction strategies in Cassandra, for a write heavy workload.

Objectives: In this study, we investigate the appropriate performance metrics to evaluate the performance of compaction strategies. We provide the detailed analysis of Size Tiered Compaction Strategy, Date Tiered Compaction Strategy, and Leveled Compaction Strategy for a write heavy (90/10) work load, using default cassandra stress tool.

Methods: A detailed literature research has been conducted to study the NoSQL databases, and the working of different compaction strategies in Apache Cassandra. The performances metrics are considered by the understanding of the literature research conducted, and considering the opinions of supervisors and Ericsson’s Apache Cassandra team. Two different tools were developed for collecting the performances of the considered metrics. The first tool was developed using Jython scripting language to collect the cassandra metrics, and the second tool was developed using python scripting language to collect the Operating System metrics. The graphs have been generated in Microsoft Excel, using the values obtained from the scripts.

Results: Date Tiered Compaction Strategy and Size Tiered Compaction strategy showed more or less similar behaviour during the stress tests conducted. Level Tiered Compaction strategy has showed some remarkable results that effected the system performance, as compared to date tiered compaction and size tiered compaction strategies. Date tiered compaction strategy does not perform well for default cassandra stress model. Size tiered compaction can be preferred for default cassandra stress model, but not considerable for big data.

Conclusions: With a detailed analysis and logical comparison of metrics, we finally conclude that Level Tiered Compaction Strategy performs better for a write heavy (90/10) workload while using default cassandra stress model, as compared to size tiered compaction and date tiered compaction strategies.

sted, utgiver, år, opplag, sider
2016. , s. 30
Emneord [en]
Apache Cassandra, Compaction Strategies, Default Cassandra Stress model, Performance, NoSQL Database
HSV kategori
Identifikatorer
URN: urn:nbn:se:bth-12885OAI: oai:DiVA.org:bth-12885DiVA, id: diva2:948190
Eksternt samarbeid
Ericsson, Karlskrona
Fag / kurs
ET2580 Master's Thesis (120 credits) in Electrical Engineering with emphasis on Telecommunication Systems
Utdanningsprogram
ETATE Master of Science Programme in Electrical Engineering with emphasis on Telecommunication Systems
Presentation
2016-05-31, J1640, Valhallavagen, 37141, Karlskrona, 11:15 (engelsk)
Veileder
Examiner
Tilgjengelig fra: 2016-08-08 Laget: 2016-07-08 Sist oppdatert: 2025-09-30bibliografisk kontrollert

Open Access i DiVA

fulltext(1181 kB)1725 nedlastinger
Filinformasjon
Fil FULLTEXT02.pdfFilstørrelse 1181 kBChecksum SHA-512
67f867bd7cd4687ed52794bf8a7cf9bc44a3b893413f9ba70980ad4dcc9900de268d0979de9d7f8b3a944e73085b55c7521f25547d23df07fa039ab8e5920e67
Type fulltextMimetype application/pdf

Av organisasjonen

Søk utenfor DiVA

GoogleGoogle Scholar
Totalt: 1735 nedlastinger
Antall nedlastinger er summen av alle nedlastinger av alle fulltekster. Det kan for eksempel være tidligere versjoner som er ikke lenger tilgjengelige

urn-nbn

Altmetric

urn-nbn
Totalt: 2316 treff
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf