Unlocking Software Potential Through App Store Reviews
2025 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Background: The software development industry is advancing rapidly, driven by the increasing demand for responsive and innovative applications. Within this landscape, gaming apps stand out due to their complexity and dynamic user expectations, requiring frequent updates and innovative features to maintain user engagement. App store reviews, a largely untapped resource, offer invaluable insights into user experiences and expectations, particularly in the gaming domain. While significant research has focused on descriptive analytics and sentiment analysis, a critical gap exists in systematically extracting and categorizing specific requirement types from unstructured review data for gaming apps.
Objectives: The aim of this research is to uncover actionable insights from user reviews of gaming apps to enhance the software development process. The objectives are to predict the sentiment (positive, negative, or neutral) of review comments using NLP models, namely TextBlob, BERT, and a Large Language Model (LLM); to categorize the comments into distinct categories to facilitate targeted software improvements; and to evaluate the performance of these models to determine their accuracy and effectiveness in sentiment prediction and feedback categorization.
Methods: The stated objectives are pursued through a multi-step approach using natural language processing (NLP) and machine learning techniques. First, app store review data is collected and preprocessed to clean and structure the text for analysis. Sentiment prediction models, namely TextBlob, BERT, and a Large Language Model (LLM), are then applied to classify the reviews as positive, negative, or neutral. For feedback categorization, Semantic Clustering is used to classify reviews into distinct categories. Evaluation metrics such as accuracy, precision, recall, and F1-score are used to assess and compare the performance of TextBlob, BERT, and the LLM. The insights derived from this analysis are validated through cross-validation and tested for their applicability in guiding software development and enhancement efforts.
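The evaluation metrics named above can be illustrated with a minimal sketch for three-class sentiment labels. The function name `evaluate`, the label strings, and the use of macro-averaging are assumptions made for illustration; the abstract does not state how the per-class scores were averaged.

```python
from collections import Counter

# Label names are an assumption; the abstract only lists the three sentiment classes.
LABELS = ("positive", "negative", "neutral")


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        per_class.append((precision, recall, f1))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true) if y_true else 0.0,
        "macro_precision": sum(m[0] for m in per_class) / n,
        "macro_recall": sum(m[1] for m in per_class) / n,
        "macro_f1": sum(m[2] for m in per_class) / n,
    }


if __name__ == "__main__":
    truth = ["positive", "negative", "neutral", "positive"]
    preds = ["positive", "negative", "positive", "positive"]
    print(evaluate(truth, preds))
```

Macro-averaging weights each sentiment class equally, which matters when neutral reviews are rare; a weighted average would be an equally plausible choice.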
Results: The implemented models show varying levels of accuracy in sentiment prediction. The TextBlob model, a lightweight, lexicon-based approach, achieved an accuracy of 87%, showing its utility for basic sentiment analysis but limited contextual understanding. The BERT model, leveraging its deep contextual embeddings, improved accuracy to 90%, reflecting its ability to handle complex language nuances and provide more precise classifications. The Large Language Model (LLM) outperformed both, achieving the highest accuracy of 93% and demonstrating its advanced capability in capturing semantic relationships and nuanced text patterns. These results highlight the effectiveness of advanced NLP techniques, with the LLM emerging as the most reliable tool for extracting actionable insights from unstructured app store review data. For categorization, the Semantic Clustering method classifies reviews by their content using top keywords, providing valuable insight to developers.
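The idea of describing each review category by its top keywords can be sketched as follows. This is only an illustration under stated assumptions: the cluster assignments are taken as given (the clustering step itself is not shown), and the helper `top_keywords`, the stopword list, and the sample reviews are hypothetical rather than taken from the thesis.

```python
import re
from collections import Counter, defaultdict

# A tiny illustrative stopword list (assumption, not from the thesis).
STOPWORDS = frozenset({
    "the", "a", "an", "and", "is", "it", "to", "of", "i", "this", "my",
    "in", "on", "after", "every", "too", "are",
})


def top_keywords(reviews, cluster_ids, k=3):
    """Return the k most frequent non-stopword tokens for each cluster id."""
    counts = defaultdict(Counter)
    for text, cid in zip(reviews, cluster_ids):
        tokens = re.findall(r"[a-z']+", text.lower())
        counts[cid].update(t for t in tokens if t not in STOPWORDS)
    return {cid: [word for word, _ in c.most_common(k)]
            for cid, c in counts.items()}


if __name__ == "__main__":
    sample_reviews = [
        "The game crashes after every update",
        "Crashes on launch, game unusable",
        "Too many ads",
        "Ads are everywhere, ads ads",
    ]
    sample_clusters = [0, 0, 1, 1]  # hypothetical cluster assignments
    print(top_keywords(sample_reviews, sample_clusters, k=2))
```

Raw term frequency is the simplest possible scoring; a TF-IDF style weighting would down-rank words that are common to every cluster.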
Conclusions: This research highlights the potential of NLP and machine learning techniques in deriving actionable insights from unstructured app store review data. By employing TextBlob, BERT, and Large Language Model (LLM), the study successfully predicted review sentiments and categorized feedback into several distinct categories using Semantic Clustering. The results demonstrated that while TextBlob provides a lightweight and moderately accurate solution, BERT and LLM significantly enhance performance, with LLM achieving the highest accuracy of 93%. These findings underscore the effectiveness of advanced NLP models in bridging the gap between user feedback and software development processes, enabling a more user-centered approach to application design and improvement. The methodology and insights from this study can be extended to other domains requiring unstructured text analysis, paving the way for broader applications in data-driven decision-making.
Place, publisher, year, edition, pages
2025. p. 56
Keywords [en]
App Store Reviews, Software Development, User feedback, TextBlob, BERT, Sentiment Analysis, Natural Language Processing, Large Language Model.
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-28211
OAI: oai:DiVA.org:bth-28211
DiVA, id: diva2:1977662
Subject / course
PA2534 Master's Thesis (120 credits) in Software Engineering
Educational program
PAADA Master Qualification Plan in Software Engineering 120,0 hp
Supervisors
Examiners
2025-06-30 2025-06-26 2025-09-30 Bibliographically approved