Academia.eduAcademia.edu

Software Testing Using Genetic Algorithms

2010, Advances in Computer Science and Engineering: Texts

This paper presents a set of methods that uses a genetic algorithm for automatic test-data generation in software testing. For several years researchers have proposed several methods for generating test data which had different drawbacks. In this paper, we have presented various Genetic Algorithm (GA) based test methods which will be having different parameters to automate the structural-oriented test data generation on the basis of internal program structure. The factors discovered are used in evaluating the fitness function of Genetic algorithm for selecting the best possible Test method. These methods take the test populations as an input and then evaluate the test cases for that program. This integration will help in improving the overall performance of genetic algorithm in search space exploration and exploitation fields with better convergence rate.

International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 SOFTWARE TESTING USING GENETIC ALGORITHMS Akshat Sharma1, Rishon Patani1 and Ashish Aggarwal1 1 School of Computer Science and Engineering, VIT University, Vellore, Tamil Nadu, India ABSTRAC T This paper presents a set of methods that uses a genetic algorithm for automatic test-data generation in software testing. For several years researchers have proposed several methods for generating test data which had different drawbacks. In this paper, we have presented various Genetic Algorithm (GA) based test methods which will be having different parameters to automate the structural-oriented test data generation on the basis of internal program structure. The factors discovered are used in evaluating the fitness function of Genetic algorithm for selecting the best possible Test method. These methods take the test populations as an input and then evaluate the test cases for that program. This integration will help in improving the overall performance of genetic algorithm in search space exploration and exploitation fields with better convergence rate. KEYWORDS Genetic algorithm, Fitness function, Test data. 1. INTRODUCTION Software testing is a process in which the runtime quality and quantity of a software is tested to maximum limits. The basic test of software is done in the environment for which it is has been designed. It’s run through is checked for correct and efficient outputs. Also software testing is done in foreign environments also so as to explore about the possibilities of scalability [8, 12]. Each software is tested under various strategized environments. The testing done is expected to produce the correct results under the assumptions of specific functions, but at no time all the defects can be identified. Instead, it furnishes a comparison that compares the state and behavior of the product —principles or mechanisms by which users might be able to recognize the problem. A test case is generally the data which acts as input for the software testing to done. Its consists of unique identifier, requirement references from a software specification, a series of steps to follow, events, preconditions, input, output, expected result, and actual result.[17] It is also coined as an expected output to the tested environment. This can be as pragmatic as “for condition your derived result is b”, whereas other test cases described the input scenario in detailed analysis and showed results that were expected [2, 9, 16]. Evolutionary Testing uses a kind of meta-heuristic search technique, the Genetic Algorithm (GA), to convert the task of test case generation into an optimal problem [4, 13, 27]. Evolutionary DOI:10.5121/ijcses.2016.7203 21 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 testing is used to search for optimal test parameter combinations that satisfy a predefined test criterion. This test criterion is represented by using a “cost function” that measures how well each of the automatically generated optimization parameters are satisfying the given test criterion [7, 22]. In our paper a study of different types of genetic algorithms is done. Different algorithms have been run on different tools and analyzed for their performance. All these algorithms follow the same basis of evolutionary testing but have different cost functions. On running these cost functions on different tools, observations on how these functions respond are made [1, 3, 5, 11]. 2. INTRODUCTION TO GENETIC ALGORITHM Genetic algorithms are one of the best ways to solve a set of problems for which little information is given. Genetic algorithms are a very general algorithm and so they will work well in any search space [1, 25, 30, 33]. All you need to know is what you need the solution to be able to do well, and a genetic algorithm will be able to create a high quality solution. Genetic algorithms use the principles of selection and evolution to produce solution for various complex problems. Genetic algorithms tends to thrive in an environment in which there is a very large set of candidate solutions and in which the search space is not favorable and has many hills and valleys[12,15,16]. True, genetic algorithms can do well in any environment, but they might be greatly outclassed by more situation specific algorithms in the simpler search spaces. Therefore you must keep in mind that genetic algorithms are not always the best choice in random scenarios. Sometimes they might take quite a while to run and are therefore not always feasible for real time use. They are, however, one of the most powerful methods with which high quality solutions are created quickly to a problem [4,8,21]. There are few basic methodology/Terminology that will be used while implementing genetic algorithm like: Individual – Possible solutions Population - Set of all individuals Search Space - All possible solutions to the specified problem Chromosome – Blueprint for an individual Trait - Possible aspect of an individual entity. Allele - Possible settings for a trait Locus - The position of a gene on the chromosome Genome - Collection of all chromosomes for an individual entity. 3. BACKGROUND Genetic algorithms use the following three operations on its population. 3.1 Selection: A selection process is applied to determine a way in which individuals are chosen for mating from a population based on their fitness. Fitness is defined as a characteristic and capability of an individual to survive and reproduce in an environment. Selection generates the new population from the old one, thus starting a new generation. The fitness value of each chromosome in present 22 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 generation is determined by an appropriate evaluation. Thus, the fitness value is used to select a set of better chromosomes from the population for the next generation.[5,6]. 3.2 Crossover: After the selection process, the crossover operation is applied to the chromosomes selected from the population. Crossover involves swapping of sequence of bits or genes in the string between two individuals [8, 10]. This process of swapping is carried out and repeated each time with different parent individuals until the next generation has optimum individuals. Figure 3.2: Uniform Crossover 3.3 Mutation: After the crossover process, the mutate operation is applied to a randomly select subset of the population. Mutation leads to an alteration of chromosomes in small new ways to introduce good traits. The main aim of mutation is to bring diversity in population.[1,5] Figure 3.3: Mutation (Bit Inversion) Factors essential in a fitness function are: • • • Likelihood. Close to Boundary Value. Branch Coverage. It has been proven that GAs required less CPU time in reaching a global solution in software testing [13]. 3.4 Need for Genetic Algorithms in Software Testing: Drawbacks of manual testing: [7, 12] Speed of operation is limited as it is carried out by humans. High investment in terms of cost, time. Limited availability of resources. 23 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 Redundancy in test cases. Inefficient and inaccurate test checking. Pros of using genetic algorithms in software testing: Parallelism is a important characteristic of genetic testing [11,19]. Less likely to get stuck in extreme ends of a code during testing since it operates in a search space. With the same encoding, only fitness function needs to be changed according to the problem. Figure 4.1: Test Case Generation in Software Testing Using GA 4. GENETIC ALGORITHM WORKING The genetic algorithm is an evolutionary approach to computing, which has the ability to determine appropriate approx. solutions to optimization problems. The basic process adopted by Genetic algorithms typically involves creating an initial set of random solutions (population) and evaluating them [2, 5, 9, 12]. Followed by a process of selection, the better solutions are identified (parents) and are then used to generate new solutions (children). These values can be used to replace other lesser members of the population. This new population (generation), is then reevaluated and the process for generating new values continues, reproducing new generations until either a final solution is determined or some other criterion for determination of result is reached[18,22,32]. Genetic Algorithm borrows its terms from the biological world. For example, Genetic algorithm uses different representations for potential solutions which are referred to as a chromosome and the operators that are used to generate new child solutions are such as crossover and mutation are derived from nature. In their generic and most basic form, Genetic Algorithms were used mainly for single objective search and optimization algorithms. Common to most Genetic algorithms is the use of a chromosome, genetic operators, a selection mechanism and an evaluation mechanism [23, 27]. 24 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 In our case, all we have to do is determine the parameters we need in order to build an efficient function and bundle them into a binary string, this will be the definition of our chromosome. Therefore, each chromosome will fully describe a Function. A common approach when working with Genetic Algorithm is to start by making a 'population' of random chromosomes (Test Variables) perhaps a 100. You may remember earlier that we said we can test each one individuals and score it, or to use the correct terminology, 'evaluate its fitness’ (function). This calculated Fitness function will help us to evaluate the efficiency of the method used [11, 27, 29].And further to increase the efficiency of the test result we can change the input parameters and obtain values for different number of population. 5. APPLICATIONS OF GENETIC ALGORITHM Genetic Algorithm have been used for solving complex problems (such as NPC and NP-hard), for machine learning and is also used for evolving simple test programs. They are a very effective way of quickly finding a reasonable solution to a complex problem. Genetic algorithms are most efficient and effective in a search space for which little is known. Then again, genetic algorithms can be used to produce solutions to problems working only in the test environment and deviates once you try to use them in the real world [17, 24]. So when put simply, genetic algorithm can be used to create solutions for problems that are not very easy to calculate and analyze. 25 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 Figure 5.0: Flow chart of the workflow of Genetic Algorithm used for Test Case Generation n Software Testing. 6. IMPLEMENTATION OF GA IN SOFTWARE TESTING 6.1 Test case generation using GA in Ruby Algorithm: Start with randomly generated test cases from the population. Calculate the fitness f(x) of each pair of test cases (chromosome x) in the population. Repeat the following steps until a n child test cases have been generated. 26 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 a. Select a pair of parent test cases from the current population where the probability of selection is an increasing function of fitness. Selection is done “with replacement,” meaning that the same pair of test case can be selected more than once to become a parent. i.e. (Selection process is carried out) b. With the crossover probability Pc, cross over the pair at a randomly chosen point to form two child cases or off springs. If no crossover takes place, form two test cases that are exact copies of their respective parent cases. c. Mutate the two child cases with mutation probability Pm, and place the resulting pair of test cases in the new population. If n is odd, one new population member can be discarded at random. Replace the current test cases with the new test cases. 6.1.1 Population size = 50 Number of generations = 500 Crossover rate=0.7 Mutation rate = 0.001 Ruby Implementation with First Case: Generation Average fitness Max fitness 1 31.88 33 2 32.88 34 3 32.67 34 4 32.75 35 5 32.96 34 10 34.17 38 25 37.29 42 50 41.21 44 100 39.67 47 150 44.33 49 200 46.33 47 250 47.63 49 500 750 1000 50.23 51.79 52 50 51 51 Table 1: Average fitness value with mutation rate = 0.001 6.1.2 Population size = 50 Number of generations = 500 Crossover rate=0.7 Mutation rate = 0.01 27 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 Ruby Implementation with Second Case: Generation Average Max 1 32.25 32 2 32.13 32 3 33.92 33 4 32.42 33 5 33.79 33 10 34.71 34 25 38.42 36 50 39.08 37 100 37.42 36 150 35.54 40 200 38.79 40 250 41.83 39 500 42.18 43 Table 2: Average fitness value with mutation rate = 0.01 Mutation rate has a great impact on the average on the average fitness of genetic algorithms during testing. Smaller the rate, better the fitness function value will be. Below is a graph which represents the average fitness overtime for different mutation rates. 28 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 6.2 Genetic Algorithm Implementation in C++ Pseudo-code for genetic algorithm: choose initial_population: evaluate individual_fitness function determine population’s_average fitness_function Repeat select best_case individuals to reproduce; mate_pairs at random; apply crossover_operator; apply mutation_operator; evaluate Individual fitness; determine population's average fitness; The second step consists for generating data consists of the outer loop, which will generate the possible test cases remaining. To account for the possibility of unfeasible test requirements, which includes branches and statement values, the loop will produces iterations until it satisfies the test results for the given population values. The algorithm will produce values which will be applied for the crossover and mutation operator. Then the fitness function for individual values are generated and the population’s individual average fitness functions in calculated. In the final step, the algorithm will assign the combined values of the test cases and find at least one individual desired fitness function values until enough test generations have been passed. 6.2.1 Population size = 50 Number of generations = 250 Crossover rate=0.7 Mutation rate = 0.001 Table 3: Best fitness value generation Iteration Best fitness Average fitness Standard Deviation 1 16.77 11.59 5.54 2 20.39 15.09 3.26 3 20.39 15.71 3.27 4 22.99 15.84 3.79 5 22.99 16.03 3.89 10 23.24 17.65 3.46 25 25.12 20.87 3.61 29 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 50 25.89 21.42 4.32 100 26.27 20.88 5.33 150 26.77 23.28 3.77 200 26.77 20.38 6.91 250 26.78 22.41 4.42 500 26.98 22.68 5.54 Graph: Number of generations (x-axis) vs. average fitness (y-axis) 6.3 Genetic Algorithm Implementation using Matlab In this work both the genetic algorithm and the random testing method were compared and detailed analysis of the best fitness has been evaluated. In order to compare Genetic algorithm and a pure random method, 150 test cases were generated and tested by both methods [22]. From this we can see that the average response time of test cases created by genetic algorithm is much efficient than that of the random method. Case 1 shows the plot of the best and mean score of the population at every generation. The second plot function is stopping criteria, which plots the percentage of stopping criteria satisfied. Figure 1 30 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 Results generated for Case 1: The number of generations: 124 The number of function evaluations: 6250 The best function value found: -186. Case 2 shows a better detailed analysis of the best and the mean fitness function by changing the test cases to 20 instead of 10. This iteration generates better results than the previous iteration and by further iterating the test cases the result is obtained. Figure 2 Results generated for Case 2: The number of generations: 68 The number of function evaluations: 650 The best function value found: -176.997 Case 3 Figure 3 Results generated for final Case 3: The number of generations was: 101 The number of function evaluations was: 5100 The best function value found was: -186.627 31 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 Genetic algorithm for test case generation using Mat lab was implemented. It can be found out that for a population of 50 and maximum generations 500, the best fitness function is carried out until maximum generation is reached or until a certain value after which no significant change occurs. The iterative generation is stopped itself after such a condition hence providing us with a further optimization and avoid redundant solutions. 7. CONCLUSIONS In this paper, we analyzed how evolutionary techniques such as GAs helped in software testing. The results show how software testing using Genetic Algorithms becomes efficient even with increasing number of test cases. In Random Testing Methods, since data points do not have a dependence with time, it becomes inefficient as code becomes complex. Thus, to increase the efficiency and process time of Software testing, Genetic Algorithms are being used and provide us a means of an automatic test case generator. The evolutionary generation of test cases can be applied and proves to be efficient and cost effective than Random Testing. ACKNOWLEDGEMENT We would like to acknowledge the immense support of our Professor Dr Manjula R without whom it was not possible to write the paper. Her guidance formed the base of the paper and her expertise comments greatly enhanced the manuscript. We would also like thank our university (VIT University, Vellore, Tamil Nadu, India) which has always encouraged the students to go in the field of research and develop an interest towards exploration and innovation. REFERENCES [1] Goldberg, D.E, “Genetic Algorithms: in Search, Optimization & Machine Learning,” Addison Wesley, MA. 1989. [2] Horgan, J., London, S., and Lyu, M., “Achieving Software Quality with Testing Coverage Measures”, IEEE Computer, Vol. 27 No.9 pp. 60-69, 1994. [3] Berndt, D.J., Fisher, J., Johnson, L., Pinglikar, J., and Watkins, A., “Breeding Software Test Cases with Genetic Algorithms,” In Proceedings of the Thirty-Sixth Hawaii International Conference on System Sciences (HICSS-36), Hawaii, January 2003. [4] Mark Last, Shay Eyal1, and Abraham Kandel, “Effective Black-Box Testing with Genetic Algorithms,” IBM conference. [5] Lin, J.C. and Yeh, P.L, “Using Genetic Algorithms for Test Case Generation in Path Testing,” In Proceedings of the 9th Asian Test Symposium (ATS’00). Taipei, Taiwan, December 4-6, 2000. [6] André Baresel, Harmen Sthamer and Michael Schmidt, “fitness function design to improve evolutionary structural testing,” proceedings of the genetic and evolutionary computation conference, 2002. [7] Christoph C. Michael, Gary E. McGraw, Michael A. Schatz, and Curtis C. Walton, “Genetic Algorithms for Dynamic Test Data Generation,” Proceedings of the 1997 International Conference on Automated Software Engineering (ASE'97) (formerly: KBSE) 0-8186-7961-1/97 © 1997 IEEE. [8] Somerville, I., “Soft ware engineering,” 7th Ed. Addison-Wesley, [9] Aditya P mathur,”Foundation of Software Testing”, 1st edition Pearson Education 2008. [10] Alander, J.T., Mantere, T., and Turunen, P, “Genetic Algorithm Based Software Testing,” http://citeseer.ist.psu.edu/40769.html, 1997. [11] Nashat Mansour, Miran Salame,” Data Generation for Path Testing”, Software Quality Journal, 12, 121–136, 2004,Kluwer Academic Publishers. 32 International Journal of Computer Science & Engineering Survey (IJCSES) Vol.7, No.2, April 2016 [12] Praveen Ranjan Srivastava et al, “Generation of test data using Meta heuristic approach” IEEE TENCON (19-21 NOV 2008), India available in IEEEXPLORE. [13] Wegener, J., Baresel, A., and Sthamer, H, “Suitability of Evolutionary Algorithms for Evolutionary Testing,” In Proceedings of the 26th Annual International Computer Software and Applications Conference, Oxford, England, August 26-29, 2002. [14] Berndt, D.J. and Watkins A, “Investigating the Performance of Genetic Algorithm-Based. Software Test Case Generation,” In Proceedings of the Eighth IEEE International Symposium on High Assurance Systems Engineering (HASE'04), pp. 261-262, University of South Florida, March 25-26, 2004. [15] B. Korel. Automated software test data generation. IEEE Transactions on Software Engineering, 16(8), August 1990. [16] Bo Zhang, Chen Wang, “Automatic generation of test data for path testing by adaptive genetic simulated annealing algorithm”, IEEE, 2011, pp. 38 – 42. [17] Chartchai Doungsa et. al., “An automatic test data generation from UML state diagram using genetic algorithm”,http://eastwest.inf.brad.ac.uk/document/publication/DoungsaardSKIMA.pdf. [18] D.J Berndt, A. Watkins, “High volume software testing using genetic algorithms”, Proceedings of the 38t h International Conference on system sciences (9), IEEE, 2005, pp. 1- 9. [19] Francisca Emanuelle et. al., “Using Genetic algorithms for test plans for functional testing”, 44th ACM SE proceeding, 2006, pp. 140 - 145. [20] Goldberg, D.E, Genetic Algorithms: in search, optimization and machine learning, Addison Wesley, M.A, 1989. [21] Girgis, “Automatic test generation for data flow testing using a genetic algorithm”, Journal of computer science, 11 (6), 2005, pp. 898 – 915. [22] Giuseppe A. et. al., “Testing Web –applications: The State of Art and Future Trends”.Information and Software Technology. Elsevier, 2006, pp. 1172-1186. [23] Jin- Cherng Lin, Pu- Lin Yeh, “Automatic test data generation for path testing using Gas”, International journal of information sciences. Elsevier, 2000, pp. 47- 64. [24] Jose Carlos et. al., “A strategy for evaluating feasible and unfeasible test cases for the evolutionary testing of objectoriented software”, AST’ 08. ACM, 2008, http://www.cs.bham.ac.uk/~wbl/biblio/cache/http___jcbri beiro.googlepages.com_ast12-ribeiro.pdf, Accessed on 6.11.2012. [25] Liang You, YanSheng Lu, “A genetic algorithm for the time – aware regression testing reduction problem”, International conference on natural computation, IEEE, 2012, pp. 596 – 599. [26] McMinn, “Search based software test generation: A survey”, Software testing, Verification and reliability 14 (2), 2004, pp. 105-156. [27] Mark Last et. al., “Effective black-box testing with genetic algorithms”, Lecture notes in computer science, Springer, 2006, pp. 134 -148. [28] Maha alzabidi et. al., “Automatic software structural testing by using evolutionary algorithms for test data generations”, International Journal of Computer science and Network Security 9 (4), 2009, pp. 390 – 395. [29] Velur Rajappa et. al., “Efficient software test case generation Using genetic algorithm based graph theory” International conference on emerging trends in Engineering and Technology, IEEE, 2008, pp. 298 - 303. [30] Xuan Peng, Lu Lu, “A new approach for session - based test case generation by GA”. IEEE, 2011, pp. 91- 96. [31] Peter M. Kruse et. al., “A Highly Configurable test systems for evolutionary black box testing of embedded systems” GECCO. ACM, 2009, pp.1545 – 1551. [32] Ruilian zhao, shanshan lv, “Neural network based test cases generation using genetic algorithm” 13th IEEE international symposium on Pacific Rim dependable computing. IEEE, 2007, pp.97 - 100. [33] Robert M .Patton et. al. “A genetic algorithm approach to focused software usage testing” Annals of software engineering,http://www.cs.ucf.edu/~ecl/papers/03.rmpatto n.pdf. 33