Improving Support-vector machines with Hyperplane folding
2019 (English) Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Background.
Hyperplane folding was introduced by Lars Lundberg et al. It increased the margin but suffered from a flaw, referred to as over-rotation in this thesis.
The aim of this thesis is to introduce a new technique that does not over-rotate data points. This novel technique is referred to as Rubber Band folding in the thesis. The following research questions are addressed: 1) Does Rubber Band folding increase classification accuracy? 2) Does Rubber Band folding increase the margin? 3) How does Rubber Band folding affect execution time?
Rubber Band folding was implemented and its results were compared to those of Hyperplane folding and the Support-vector machine. For research questions 1 and 2, the comparison was done by applying stratified ten-fold cross-validation to four data sets. Four folds were applied for both Hyperplane folding and Rubber Band folding, as more folds can lead to over-fitting. Research question 3 used 15 folds in order to reveal trends, since it is not affected by over-fitting. One data set, on BMI, was artificially made for the initial Hyperplane folding paper. Another data set labeled patients as with or without a liver disorder. Another data set predicted whether patients have benign or malignant cancer cells. Finally, a data set predicted whether a hepatitis patient is alive within five years.
Results.
Rubber Band folding achieved a higher classification accuracy than Hyperplane folding in all data sets. Rubber Band folding increased the classification accuracy in the BMI and cancer data sets, while its accuracy decreased in the liver and hepatitis data sets. Hyperplane folding's accuracy decreased in all data sets.
Both Rubber Band folding and Hyperplane folding increase the margin for all data sets tested. Rubber Band folding achieved a higher margin than Hyperplane folding in the BMI and liver data sets. Execution time for both the classification of data points and the training of the classifier increases linearly per fold. Rubber Band folding has slower growth in classification time than Hyperplane folding.
Rubber Band folding can increase the classification accuracy, although the exact cases in which it does so are unknown. It is, however, believed to happen when the data is not linearly separable.
Rubber Band folding increases the margin. Compared to Hyperplane folding, Rubber Band folding achieves a higher increase in margin in some cases, while in others Hyperplane folding achieves a higher margin.
Both Hyperplane folding and Rubber Band folding increase training time and classification time linearly. The difference in training time between Hyperplane folding and Rubber Band folding was negligible, while Rubber Band folding's increase in classification time was lower. This was attributed to Rubber Band folding rotating fewer points after 15 folds.
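The stratified ten-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which the abstract does not describe; the function name `stratified_k_fold`, its parameters, and the toy labels are hypothetical. It only shows the general idea: each class is spread evenly across the folds, so every test fold keeps roughly the class proportions of the full data set.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k stratified folds.

    Hypothetical helper for illustration only: it splits indices, it does
    not train or evaluate any classifier.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices round-robin over the folds.
    # The running offset keeps the overall fold sizes balanced.
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Toy labels (assumed for illustration): 60 samples of class 0, 40 of class 1.
labels = [0] * 60 + [1] * 40
splits = list(stratified_k_fold(labels, k=10))
```

In a comparison like the one described, each of the ten `(train, test)` pairs would be used to train a classifier on `train` and measure accuracy on `test`, with the ten accuracies averaged afterwards.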
Place, publisher, year, edition, pages
2019. p. 67
Keywords [en]
Support-Vector Machines, Dimensionality reduction, Rubber Band folding, Hyperplane folding, Machine learning
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:bth-18305
OAI: oai:DiVA.org:bth-18305
DiVA, id: diva2:1333257
Subject / course
Degree Project in Master of Science in Engineering 30,0 hp
Educational program
PAACI Master of Science in Game and Software Engineering
Supervisors
Examiners
2019-07-01 2019-07-01 2022-05-12 Bibliographically approved