Unsupervised Machine Learning: An Investigation of Clustering Algorithms on a Small Dataset
Blekinge Institute of Technology, Faculty of Computing, Department of Software Engineering.
2018 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Context: With the rising popularity of machine learning, examining its shortcomings helps show where it can usefully be applied. Is it possible to apply clustering to a small dataset?

Objectives: This thesis consists of a literature study, a survey, and an experiment. It investigates how two unsupervised machine learning algorithms, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and K-means, perform on a dataset gathered from a survey.

Methods: We conducted a survey to see statistically what most people chose, then applied clustering to the survey data to check whether the clusters show the same patterns as the statistical results.

Results: It was possible to identify patterns with clustering algorithms on a small dataset. The literature study shows examples in which both algorithms have been used successfully.

Conclusions: It is possible to see patterns using DBSCAN and K-means on a small dataset. The size of the dataset is not the only aspect to consider; feature and parameter selection are also important, since the algorithms need to be tuned and customized to the data.
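To illustrate the point about parameter selection, here is a minimal sketch of both algorithms in plain Python, run on a tiny made-up dataset. Nothing here comes from the thesis: the `points` values, the farthest-first initialisation in `kmeans`, and the chosen `k`, `eps`, and `min_pts` values are all assumptions picked to keep the example small and deterministic.

```python
import math

# Hypothetical toy data (not the thesis survey): two tight groups and one outlier.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (5, 20)]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-first initialisation, chosen here only for determinism.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def dbscan(points, eps, min_pts):
    """Returns one label per point: a cluster index, or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


print(kmeans(points, k=3)[0])               # [0, 0, 0, 0, 2, 2, 2, 2, 1]
print(dbscan(points, eps=1.5, min_pts=3))   # [0, 0, 0, 0, 1, 1, 1, 1, -1]
print(dbscan(points, eps=0.5, min_pts=3))   # every point is labelled -1 (noise)
```

On this toy data, K-means needs the number of clusters up front and gives the outlier a cluster of its own, while DBSCAN marks it as noise. Shrinking `eps` from 1.5 to 0.5 makes DBSCAN label every point as noise, which is the kind of parameter sensitivity the conclusions refer to.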

Place, publisher, year, edition, pages
2018. p. 39
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-16300
OAI: oai:DiVA.org:bth-16300
DiVA, id: diva2:1213516
Subject / course
PA1445 Kandidatkurs i Programvaruteknik
Educational program
PAGIP International Software Engineering; PAGPT Software Engineering
Available from: 2018-06-05. Created: 2018-06-04. Last updated: 2018-06-05. Bibliographically approved.

Open Access in DiVA

BTH2018Forsberg (1189 kB), 258 downloads
File information
File name: FULLTEXT01.pdf
File size: 1189 kB
Checksum (SHA-512): 533184e829c2fa80f8c43bd7fff86b4dbd7c064e29b7e140f410a622a87f782eb48b582acb5186fb4f7e329c052f4b1f76d759baa9684604c6c66a100dd3ccce
Type: fulltext
Mimetype: application/pdf

By author/editor
Forsberg, Fredrik; Alvarez Gonzalez, Pierre