E-mail classification with machine learning and word embeddings for improved customer support
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-8929-7220
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. ORCID iD: 0000-0002-9316-4842
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. student.
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science. student.
2021 (English). In: Neural Computing & Applications, ISSN 0941-0643, E-ISSN 1433-3058, Vol. 33, no 6, p. 1881-1902
Article in journal (Refereed). Published
Abstract [en]

Classifying e-mails into distinct labels can have a great impact on customer support. By using machine learning to label e-mails, the system can set up queues containing e-mails of a specific category. This enables support personnel to handle requests more quickly and easily by selecting a queue that matches their expertise. This study aims to improve a manually defined rule-based algorithm, currently implemented at a large telecom company, by using machine learning. The proposed model should have a higher F1-score and classification rate. Integrating with or migrating from a manually defined rule-based model to a machine learning model should also reduce administrative and maintenance work and make the model more flexible. Using the frameworks TensorFlow, Scikit-learn and Gensim, the authors conduct a number of experiments to test the performance of several common machine learning algorithms, text representations, and word embeddings, and to investigate how they work together. A long short-term memory (LSTM) network showed the best classification performance, with an F1-score of 0.91. The authors conclude that long short-term memory networks outperform non-sequential models such as support vector machines and AdaBoost when predicting labels for e-mails. Further, the study presents a Web-based interface that was implemented around the LSTM network, which can classify e-mails into 33 different labels. © 2020, The Author(s).

Place, publisher, year, edition, pages
Springer, 2021. Vol. 33, no 6, p. 1881-1902
Keywords [en]
E-mail classification, Long short-term memory, Machine learning, Natural language processing, Adaptive boosting, Brain, Electronic mail, Embeddings, Learning systems, Multimedia systems, Support vector machines, Classification performance, Classification rates, Email classification, Machine learning models, Rule based algorithms, Rule-based models, Text representation, Web-based interface
National Category
Language Technology (Computational Linguistics); Computer Sciences
Identifiers
URN: urn:nbn:se:bth-20122
DOI: 10.1007/s00521-020-05058-4
ISI: 000541326700002
Scopus ID: 2-s2.0-85086707161
OAI: oai:DiVA.org:bth-20122
DiVA, id: diva2:1452273
Note

Open access

Available from: 2020-07-06. Created: 2020-07-06. Last updated: 2022-05-04. Bibliographically approved

Open Access in DiVA

fulltext (1201 kB), 231 downloads
File information
File name: FULLTEXT01.pdf
File size: 1201 kB
Checksum SHA-512:
d1327c08f481f075e34aec845362648b3f3615578afabaff5ee3a050b5a895d202a7e836f10d12bcaf2931b8c1c71675aa2d324964f27a923f9dae6818fa1678
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text; Scopus

Authority records

Borg, Anton; Boldt, Martin

Total: 231 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 400 hits