Investigating Metrics that are Good Predictors of Human Oracle Costs: An Experiment
2017 (English) Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Context. Human oracle cost is the cost of manually judging whether the output produced for a given test input is correct. Because this evaluation is performed by humans, the cost is significant and is a concern in the field of software test data generation. This study assesses metrics that might predict human oracle cost.
Objectives. The major objective of this study is to address human oracle cost. To this end, the study identifies metrics that are good predictors of human oracle cost and that can further help to solve the oracle problem. Suitable metrics identified from the literature are applied to the test input to see whether they help in predicting the correctness of the output for the given test input.
Methods. Initially, a literature review was conducted to find metrics that are relevant to test data. The review also looked for code metrics that can be applied to test data. Two pilot experiments were conducted before the actual experiment. To accomplish the research objectives, an experiment was conducted at BTH with master's students as the sample population. Group interviews were then conducted to check whether the participants perceived any additional metrics that might impact the correctness of the output. The data obtained from the experiment and the interviews were analyzed using a linear regression model in the SPSS suite. To analyze the accuracy vs. metric data, a linear discriminant model in the SPSS suite was used.
Results. Our literature review resulted in 4 metrics that are suitable for our study. As our test input is HTML, we took HTML depth, size, compression size, and number of tags as our metrics. From the group interviews, another 4 metrics were drawn: number of lines of code, and the numbers of <div>, anchor <a>, and paragraph <p> tags, each as an individual metric. The linear regression model, which analyses time vs. metric data, shows significant results, but with multicollinearity affecting the result, there was no variance among the considered metrics. The results of our study are therefore reported after adjusting for multicollinearity. In addition, a linear discriminant model, which analyses accuracy vs. metric data, was used to identify the metrics that influence accuracy. The results of our study show that the metrics correlate positively with time and accuracy.
Conclusions. For the time vs. metric data, once multicollinearity is adjusted for by applying the step-wise regression reduction technique, program size, compression size, and number of <div> tags influence the time taken by the sample population. For the accuracy vs. metric data, the number of <div> tags and the number of lines of code influence the accuracy of the sample population.
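The eight metrics named in the results can be illustrated with a minimal Python sketch. This is not the thesis's tooling: the function name `html_metrics`, the use of zlib for "compression size", and the definition of depth as the maximum nesting of open elements are all assumptions made for illustration.

```python
import zlib
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not add nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _MetricParser(HTMLParser):
    """Counts start tags and tracks the deepest nesting level seen."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.tag_counts = {}

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] = self.tag_counts.get(tag, 0) + 1
        if tag not in VOID_TAGS:
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def html_metrics(html):
    """Return the candidate metrics for one HTML test input (hypothetical helper)."""
    parser = _MetricParser()
    parser.feed(html)
    parser.close()
    raw = html.encode("utf-8")
    return {
        "size_bytes": len(raw),
        # Assumption: "compression size" is taken as the zlib-compressed length.
        "compressed_bytes": len(zlib.compress(raw)),
        "depth": parser.max_depth,
        "num_tags": sum(parser.tag_counts.values()),
        "lines": len(html.splitlines()),
        "div_tags": parser.tag_counts.get("div", 0),
        "a_tags": parser.tag_counts.get("a", 0),
        "p_tags": parser.tag_counts.get("p", 0),
    }


sample = ("<html><body>\n"
          "<div><p>Hi <a href='#'>x</a></p></div>\n"
          "<div><br></div>\n"
          "</body></html>")
print(html_metrics(sample))
```

The depth count assumes every non-void element is explicitly closed; HTML that omits optional end tags (for example an unclosed <p>) would be reported as deeper than it is.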
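To see why metrics such as size and compression size can be collinear, the following Python sketch computes a Pearson correlation and the corresponding variance inflation factor (VIF) for two predictors. It is not the SPSS step-wise procedure used in the thesis; VIF is a common multicollinearity diagnostic chosen here as an assumption, and the HTML documents are synthetic, generated in the code purely for illustration.

```python
import math
import zlib


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def vif_two_predictors(r):
    """VIF = 1 / (1 - R^2); with exactly two predictors, R^2 equals r^2."""
    r_squared = r * r
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)


# Synthetic documents of increasing length (illustrative only, not thesis data).
docs = ["<div>" + "".join(f"<p>item {i}</p>" for i in range(n)) + "</div>"
        for n in (5, 10, 20, 40, 80)]
sizes = [len(d.encode("utf-8")) for d in docs]
compressed = [len(zlib.compress(d.encode("utf-8"))) for d in docs]

r = pearson(sizes, compressed)
print("correlation:", r, "VIF:", vif_two_predictors(r))
```

A VIF far above 1 signals that one predictor is largely explained by the other, which is the situation that variable-selection techniques such as step-wise regression are meant to handle.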
Place, publisher, year, edition, pages
2017, p. 120
Keywords [en]
Test data generation, comprehensibility of test data, software test data metrics, software code metrics, multiple regression analysis, linear discriminant analysis.
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-14080
OAI: oai:DiVA.org:bth-14080
DiVA, id: diva2:1086742
Subject / course
PA2534 Master's Thesis (120 credits) in Software Engineering
Educational program
PAAXA Master of Science Programme in Software Engineering
2017-04-04, 2017-04-04, 2018-01-13. Bibliographically approved