Extracting Customer Sentiments from Email Support Tickets: A case for email support ticket prioritisation
2019 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Background
Daily, companies generate enormous amounts of customer support tickets which are grouped and placed in specialised queues, based on some characteristics, from where they are resolved by the customer support personnel (CSP) on a first-in-first-out basis. Given that these tickets require different levels of urgency, a logical next step to improving the effectiveness of the CSPs is to prioritise the tickets based on business policies. Among the several heuristics that can be used in prioritising tickets is sentiment polarity.
Objectives
This study investigates how machine learning methods and natural language techniques can be leveraged to automatically predict the sentiment polarity of customer support tickets using.
Methods
Using a formal experiment, the study examines how well Support Vector Machine (SVM), Naive Bayes (NB) and Logistic Regression (LR) based sentiment polarity prediction models built for the product and movie reviews, can be used to make sentiment predictions on email support tickets. Due to the limited size of annotated email support tickets, Valence Aware Dictionary and sEntiment Reasoner (VADER) and cluster ensemble - using k-means, affinity propagation and spectral clustering, is investigated for making sentiment polarity prediction.
Results
Compared to NB and LR, SVM performs better, scoring an average f1-score of .71 whereas NB scores least with a .62 f1-score. SVM, combined with the presence vector, outperformed the frequency and TF-IDF vectors with an f1-score of .73 while NB records an f1-score of .63. Given an average f1-score of .23, the models transferred from the movie and product reviews performed inadequately even when compared with a dummy classifier with an f1-score average of .55. Finally, the cluster ensemble method outperformed VADER with an f1-score of .61 and .53 respectively.
Conclusions
Given the results, SVM, combined with a presence vector of bigrams and trigrams is a candidate solution for extracting sentiments from email support tickets. Additionally, transferring sentiment models from the movie and product reviews domain to the email support tickets is not possible. Finally, given that there exists a limited dataset for conducting sentiment analysis studies in the Swedish and the customer support context, a cluster ensemble is recommended as a sample selection method for generating annotated data.
Place, publisher, year, edition, pages
2019. , p. 62
Keywords [en]
Machine Learning, Natural Language Processing, Sentiment Analysis, Cluster Ensemble, VADER, Customer support
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:bth-18853OAI: oai:DiVA.org:bth-18853DiVA, id: diva2:1375069
External cooperation
Telenor AB, Swden, Karlskrona
Subject / course
DV2572 Master´s Thesis in Computer Science
Educational program
DVACS Master of Science Programme in Computer Science
Presentation
2019-09-26, 07:01 (English)
Supervisors
Examiners
2019-12-042019-12-042019-12-04Bibliographically approved