Evaluation of Test Data Generation Techniques for String Inputs
2017 (English) Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Context. Effective generation of test data is regarded as very important in software testing. However, mature and effective techniques for generating string test data have seldom been explored, because strings are more complex and flexible in form than other data types.
Objectives. To address this problem, this study investigates the strengths and limitations of existing string test data generation techniques, in order to support future work on an effective technique for generating string test data. This main goal was pursued through two objectives. The first was to investigate existing techniques for string test data generation and to identify the criteria and Classes-Under-Test (CUTs) used to evaluate string test generation ability. The second was to assess representative techniques by comparing their effectiveness and efficiency.
Methods. For the first objective, we used a systematic mapping study to collect data about existing techniques, criteria, and CUTs. For the second objective, we conducted a comparison study of representative techniques selected from the results of the systematic mapping study. The data from the comparison study were analysed quantitatively using statistical methods.
Results. The existing techniques, criteria, and CUTs related to string test generation were identified. A multidimensional categorisation was proposed to classify existing string test data generation techniques. From this categorisation, we selected representative techniques from the search-based, symbolic execution, and random generation methods. The corresponding automated test generation tools that implement these techniques, namely EvoSuite, Symbolic PathFinder (SPF), and Randoop, were then assessed by comparing their effectiveness and efficiency when applied to 21 CUTs.
Conclusions. We concluded that: the search-based method has the highest effectiveness and efficiency of the three selected methods; the random generation method has low efficiency but a high fault-detecting ability for some specific CUTs; and the symbolic execution method as implemented by SPF cannot currently support string test generation well, possibly because of an incomplete string constraint solver or string generator.
Place, publisher, year, edition, pages
2017. p. 91
Keywords [en]
string, test data generation, mapping study, comparison.
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:bth-14798
OAI: oai:DiVA.org:bth-14798
DiVA, id: diva2:1118138
Subject / course
PA2534 Master's Thesis (120 credits) in Software Engineering
Educational program
PAAXA Master of Science Programme in Software Engineering
2017-06-30 2017-06-29 2018-01-13 Bibliographically approved