Predicting the Helpfulness of Online Product Reviews
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering.
2021 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Review helpfulness prediction has attracted growing attention from researchers, who have proposed various solutions using Machine Learning (ML) techniques. Most studies predict helpfulness using online reviews from Amazon, where each review is accompanied by information indicating how many people found it helpful. This research aims to analyze the complete process of modelling review helpfulness from several perspectives. Experiments compare different methods for representing the review text and analyze the importance of data sampling for regression relative to using non-sampled datasets. Additionally, a set of review, review meta-data and product features is evaluated on its ability to capture the helpfulness of reviews. The experiments use two Amazon product review datasets and two of the most widely used machine-learning algorithms, Linear Regression and Convolutional Neural Network (CNN). The experiments empirically demonstrate that the choice of representation of the textual data has an impact on performance, with tf-idf and word2vec obtaining the lowest Mean Squared Error (MSE) values. The importance of data sampling is also evident from the experiments: the imbalanced ratios in the unsampled dataset negatively affected the performance of both models, biasing predictions in favor of the majority group of high ratios in the dataset. Lastly, the findings suggest that review features such as unigrams of the review text and title, the length of the review text in words, and the polarity of the title, along with rating as a review meta-data feature, are the most influential features for determining the helpfulness of reviews.
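The quantities named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the thesis's implementation: the helpfulness target is assumed here to be the ratio of helpful votes to total votes, the tf-idf weighting is the plain `tf * log(N / df)` variant, and the function names and toy reviews are hypothetical. The MSE is computed against a mean-prediction baseline rather than a trained Linear Regression or CNN model.

```python
import math
from collections import Counter


def helpfulness_ratio(helpful_votes, total_votes):
    """Assumed target: fraction of voters who found the review helpful."""
    return helpful_votes / total_votes


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical): review text and (helpful votes, total votes).
reviews = ["great phone great battery", "battery died fast", "great value"]
votes = [(8, 10), (1, 4), (3, 3)]

targets = [helpfulness_ratio(h, n) for h, n in votes]
vectors = tfidf(reviews)

# Baseline that always predicts the mean helpfulness ratio.
baseline = [sum(targets) / len(targets)] * len(targets)
print(mean_squared_error(targets, baseline))
```

A regression model would take the tf-idf vectors as input features and be scored with the same MSE function, so lower values than the baseline indicate that the text representation carries useful signal.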

Place, publisher, year, edition, pages
2021. p. 27
Keywords [en]
Review helpfulness prediction, product reviews, machine learning, data sampling, regression
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-22144
OAI: oai:DiVA.org:bth-22144
DiVA, id: diva2:1595730
Subject / course
PA1445 Kandidatkurs i Programvaruteknik
Educational program
PAGPT Software Engineering
Supervisors
Examiners
Available from: 2021-09-27. Created: 2021-09-20. Last updated: 2021-09-27. Bibliographically approved.

Open Access in DiVA

Predicting the Helpfulness of Online Product Reviews (1349 kB), 785 downloads
File information
File name: FULLTEXT02.pdf
File size: 1349 kB
Checksum (SHA-512): b39024840c36870648e0eb147288c206a39b70d686dda221295a2208a7613e87be895ae66bfbfdef22b2f28b017ad4934591b310ec9f2a501e32733b2ee32e23
Type: fulltext
Mimetype: application/pdf

Total: 785 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 387 hits