Publications

 

This page is maintained by Yuanyuan Zhang, CREST, Department of Computer Science, King's College, London, UK.

Email: yuanyuan.zhang [AT] kcl.ac.uk

Last updated: 2 November 2009

 

Click on any column header to sort

QuickSearch:   Number of matching entries: 0.

Search Settings

    AuthorTitleYearJournal/ProceedingsReftypeDOI/URL
    Arcuri, Andrea On Search Based Software Evolution 2009 Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09), pp. 39-42  inproceedings DOI  
    Abstract: Writing software is a difficult and expensive task. Its automation is hence very valuable. Search algorithms have been successfully used to tackle many software engineering problems. Unfortunately, for some problems the traditional techniques have been of only limited scope, and search algorithms have not been used yet. We hence propose a novel framework that is based on a co-evolution of programs and test cases to tackle these difficult problems.This framework can be used to tackle software engineering tasks such as Automatic Refinement, Fault Correction,Improving Non-functional Criteria and Reverse Engineering.While the programs evolve to accomplish one of these tasks, test cases are co-evolved at the the same time to find new faults in the evolving programs.
    BibTeX:
    @inproceedings{Arcuri09,
      author = {Andrea Arcuri},
      title = {On Search Based Software Evolution},
      booktitle = {Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      pages = {39-42},
      doi = {http://dx.doi.org/10.1109/SSBSE.2009.12}
    }
    
    Arcuri, Andrea Full Theoretical Runtime Analysis of Alternating Variable Method on the Triangle Classification Problem 2009 Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09), pp. 113-121  inproceedings DOI  
    Abstract: Runtime Analysis is a type of theoretical investigation that aims to determine, via rigorous mathematical proofs,the time a search algorithm needs to find an optimal solution.This type of investigation is useful to understand why a search algorithm could be successful, and it gives insight of how search algorithms work. In previous work,we proved the runtimes of different search algorithms on the test data generation for the Triangle Classification (TC)problem. We theoretically proved that Alternating Variable Method (AVM) has the best performance on the coverage of the most difficult branch in our empirical study. In this paper,we prove that the runtime of AVM on all the branches of TC is O((log n)2). That is necessary and sufficient to prove that AVM has a better runtime on TC compared to the other search algorithms we previously analysed. The theorems in this paper are useful for future analyses. In fact, to state theta search algorithm has worse runtime compared to AVM, it will be just sufficient to prove that its lower bound is higher than ¦¸((log n)2) on the coverage of at least one branch of TC.
    BibTeX:
    @inproceedings{Arcuri09b,
      author = {Andrea Arcuri},
      title = {Full Theoretical Runtime Analysis of Alternating Variable Method on the Triangle Classification Problem},
      booktitle = {Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      pages = {113-121},
      doi = {http://dx.doi.org/10.1109/SSBSE.2009.16}
    }
    
    Arcuri, Andrea Insight Knowledge in Search Based Software Testing 2009 Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09), pp. 1649-1656  inproceedings DOI  
    Abstract: Software testing can be re-formulated as a search problem, hence search algorithms (e.g., Genetic Algorithms) can be used to tackle it. Most of the research so far has been of empirical nature, in which novel proposed techniques have been validated on software testing benchmarks. However, only little attention has been spent to understand why meta-heuristics can be effective in software testing. This insight knowledge could be used to design novel more successful techniques. Recent theoretical work has tried to fill this gap, but it is very complex to carry out. This has limited its scope so far to only small problems. In this paper, we want to get insight knowledge on a difficult software testing problem. We combine together an empirical and theoretical analysis, and we exploit the benefits of both.
    BibTeX:
    @inproceedings{Arcuri09c,
      author = {Andrea Arcuri},
      title = {Insight Knowledge in Search Based Software Testing},
      booktitle = {Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09)},
      publisher = {ACM},
      year = {2009},
      pages = {1649-1656},
      doi = {http://dx.doi.org/10.1145/1569901.1570122}
    }
    
    Arcuri, Andrea Longer is Better: On the Role of Test Sequence Length in Software Testing 2009 (CSR-09-03)  techreport URL 
    Abstract: In the presence of an internal state, often it is required a sequence of function calls to test software. In fact, to cover a particular branch of the code, a sequence of previous function

    calls might be required to put the internal state in the appropriate configuration. Internal states are not only present in object-oriented software, but also in procedural software (e.g.,

    static variables in C programs). In literature, there are many techniques to test this type of software. However, to our best knowledge, the properties related to choosing the length of

    these sequences have received only little attention in literature. In this paper, we analyse the role that the length plays in software testing. We show that on ¡°difficult¡± software testing

    benchmarks longer test sequences make their testing trivial. Hence, we argue that the choice of the length of the test sequences is very important in software testing.

    BibTeX:
    @techreport{Arcuri09e,
      author = {Andrea Arcuri},
      title = {Longer is Better: On the Role of Test Sequence Length in Software Testing},
      year = {2009},
      number = {CSR-09-03},
      url = {http://www.cs.bham.ac.uk/~axa/papers/paper_tech09b.pdf}
    }
    
    Arcuri, Andrea Theoretical Analysis of Local Search in Software Testing 2009 Proceedings of the 5th Symposium on Stochastic Algorithms, Foundations and Applications (SAGA '09)  inproceedings  
    BibTeX:
    @inproceedings{Arcuri09f,
      author = {Andrea Arcuri},
      title = {Theoretical Analysis of Local Search in Software Testing},
      booktitle = {Proceedings of the 5th Symposium on Stochastic Algorithms, Foundations and Applications (SAGA '09)},
      year = {2009},
      note = {to appear}
    }
    
    Arcuri, Andrea Evolutionary Repair of Faulty Software 2009 (CSR-09-02)  techreport URL 
    Abstract: Testing and fault localization are very expensive software engineering tasks that have been tried to be automated. Although many successful techniques have been designed, the actual change of the code for fixing the discovered faults is still a human-only task. Even in the ideal case in which automated tools could tell us exactly where the location of a fault is, it is not always trivial how to fix the code. In this paper we analyse the possibility of automating the complex task of fixing faults. We propose to model this task as a search problem, and hence to use for example evolutionary algorithms to solve it. We then discuss the potential of this approach and how its current limits can be addressed in the future. This task is extremely challenging and mainly unexplored in literature. Hence, this paper only covers an initial investigation and gives directions for future work. A research prototype called JAFF and a case study are presented to give first validation of this approach.
    BibTeX:
    @techreport{Arcurid09d,
      author = {Andrea Arcuri},
      title = {Evolutionary Repair of Faulty Software},
      year = {2009},
      number = {CSR-09-02},
      url = {http://www.cs.bham.ac.uk/~axa/papers/paper_tech09a.pdf}
    }
    
    Arcuri, Andrea On the Automation of Fixing Software Bugs 2008 Proceedings of the Doctoral Symposium of the IEEE International Conference on Software Engineering (ICSE '08), pp. 1003-1006  inproceedings DOI  
    Abstract: Software Testing can take up to half of the resources of the development of new software. Although there has been a lot of work on automating the testing phase, fixing a bug after its presence has been discovered is still a duty of the programmers. Techniques to help the software developers for locating bugs exist though, and they take name of Automated Debugging. However, to our best knowledge, there has been only little attempt in the past to completely automate the actual changing of the software for fixing the bugs. Therefore, in this paper we propose an evolutionary approach to automate the task of fixing bugs. The basic idea is to evolve the programs (e.g., by using Genetic Programming) with a fitness function that is based on how many unit tests they are able to pass. If a formal specification of the buggy software is given, more sophisticated fitness functions can be designed. Moreover, by using the formal specification as an oracle, we can generate as many unit tests as we want. Hence, a co-evolution between programs and unit tests might take place to give even better results. It is important to know that, to fix the bugs in a program with this novel approach, a user needs only to provide either a formal specification or a set of unit tests. No other information is required.
    BibTeX:
    @inproceedings{Arcuri08,
      author = {Andrea Arcuri},
      title = {On the Automation of Fixing Software Bugs},
      booktitle = {Proceedings of the Doctoral Symposium of the IEEE International Conference on Software Engineering (ICSE '08)},
      publisher = {ACM},
      year = {2008},
      pages = {1003-1006},
      doi = {http://dx.doi.org/10.1145/1370175.1370223}
    }
    
    Arcuri, Andrea and Lehre, P.K. and Yao, X. Theoretical Runtime Analysis in Search Based Software Engineering 2009 (CSR-09-04)  techreport URL 
    Abstract: Search algorithms have been used to tackle software engineering problems with promising results. Although the field has attracted a lot of attention recently, it still lacks a theoretical foundation. It has been empirically shown that search algorithms are successful in some software engineering tasks, but we need to understand why and when they are successful. The long term goal is to get insight of how search algorithms work on software engineering problems, so we can exploit this knowledge to design more efficient algorithms. Runtime Analysis is a type of theoretical investigation that aims to determine, via rigorous mathematical proofs, the time a search algorithm needs to find an optimal solution. Runtime analysis has previously been carried out on traditional combinatorial problems. In this paper, we advocate that runtime analysis would be helpful in search based software engineering as well. We give the first runtime analysis of search heuristics in the software testing domain.
    BibTeX:
    @techreport{ArcuriLY09,
      author = {Andrea Arcuri and P.K. Lehre and X. Yao},
      title = {Theoretical Runtime Analysis in Search Based Software Engineering},
      year = {2009},
      number = {CSR-09-04},
      url = {ftp://ftp.cs.bham.ac.uk/pub/tech-reports/2009/CSR-09-04.pdf}
    }
    
    Arcuri, Andrea and Lehre, Per Kristian and Yao, Xin Theoretical Runtime Analyses of Search Algorithms on the Test Data Generation for the Triangle Classification Problem 2008 Proceedings of 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008, pp. 161-169 (Best Paper Award)  inproceedings URL 
    Abstract: Software Testing plays an important role in the life cycle of software development. Because software testing is very costly and tedious, many techniques have been proposed to automate it. One technique that has achieved good results is the use of Search Algorithms. Because most previous work on search algorithms has been of an empirical nature, there is a need for theoretical results that confirm the feasibility of search algorithms applied to software testing. Such theoretical results might shed light on the limitations and benefits of search algorithms applied in this context. In this paper, we formally analyse the expected runtime of three different search algorithms on the problem of Test Data Generation for an instance of the Triangle Classification program. The search algorithms that we analyse are Random Search, Hill Climbing and Alternating VariableMethod. We believe that this is a necessary first step that will lead and help the Software Engineering community to better understand the role of Search Based Techniques applied to software testing.
    BibTeX:
    @inproceedings{ArcuriLY08,
      author = {Andrea Arcuri and Per Kristian Lehre and Xin Yao},
      title = {Theoretical Runtime Analyses of Search Algorithms on the Test Data Generation for the Triangle Classification Problem},
      booktitle = {Proceedings of 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008},
      publisher = {IEEE Computer Society},
      year = {2008},
      pages = {161-169 (Best Paper Award)},
      note = {Best Paper Award},
      url = {http://www.dcs.shef.ac.uk/~phil/sbtest/arcuri.pdf}
    }
    
    Arcuri, Andrea and White, David Robert and Clark, John and Yao, Xin Multi-Objective Improvement of Software using Co-Evolution and Smart Seeding 2008 Proceedings of the 7th International Conference on Simulated Evolution And Learning (SEAL '08), pp. 61-70  inproceedings DOI  
    Abstract: Optimising non-functional properties of software is an important part of the implementation process. One such property is execution time, and compilers target a reduction in execution time using a variety of optimisation techniques. Compiler optimisation is not always able to produce semantically equivalent alternatives that improve execution times, even if such alternatives are known to exist. Often, this is due to the local nature of such optimisations. In this paper we present a novel framework for optimising existing software using a hybrid of evolutionary optimisation techniques. Given as input the implementation of a program or function, we use Genetic Programming to evolve a new semantically equivalent version, optimised to reduce execution time subject to a given probability distribution of inputs. We employ a co-evolved population of test cases to encourage the preservation of the program's semantics, and exploit the original program through seeding of the population in order to focus the search. We carry out experiments to identify the important factors in maximising efficiency gains. Although in this work we have optimised execution time, other non-functional criteria could be optimised in a similar manner.
    BibTeX:
    @inproceedings{ArcuriWCY08,
      author = {Andrea Arcuri and David Robert White and John Clark and Xin Yao},
      title = {Multi-Objective Improvement of Software using Co-Evolution and Smart Seeding},
      booktitle = {Proceedings of the 7th International Conference on Simulated Evolution And Learning (SEAL '08)},
      publisher = {Springer},
      year = {2008},
      pages = {61-70},
      doi = {http://dx.doi.org/10.1007/978-3-540-89694-4_7}
    }
    
    Arcuri, Andrea and Yao, Xin A Novel Co-evolutionary Approach to Automatic Software Bug Fixing 2008 Proceedings of the IEEE Congress on Evolutionary Computation (CEC '08), pp. 162-168  inproceedings DOI  
    Abstract: Many tasks in Software Engineering are very expensive, and that has led the investigation to how to automate them. In particular, Software Testing can take up to half of the resources of the development of new software. Although there has been a lot of work on automating the testing phase, fixing a bug after its presence has been discovered is still a duty of the programmers. In this paper we propose an evolutionary approach to automate the task of fixing bugs. This novel evolutionary approach is based on Co-evolution, in which programs and test cases co-evolve, influencing each other with the aim of fixing the bugs of the programs. This competitive co-evolution is similar to what happens in nature for predators and prey. The user needs only to provide a buggy program and a formal specification of it. No other information is required.

    Hence, the approach may work for any implementable software. We show some preliminary experiments in which bugs in an implementation of a sorting algorithm are automatically fixed.

    BibTeX:
    @inproceedings{ArcuriY08,
      author = {Andrea Arcuri and Xin Yao},
      title = {A Novel Co-evolutionary Approach to Automatic Software Bug Fixing},
      booktitle = {Proceedings of the IEEE Congress on Evolutionary Computation (CEC '08)},
      publisher = {IEEE Computer Society},
      year = {2008},
      pages = {162-168},
      doi = {http://dx.doi.org/10.1109/CEC.2008.4630793}
    }
    
    Arcuri, Andrea and Yao, Xin Search Based Software Testing of Object-Oriented Containers 2008 Information Sciences
    Vol. 178(15), pp. 3075-3095 
    article DOI  
    Abstract: Automatic software testing tools are still far from ideal for real world Object-Oriented (OO) Software. The use of nature inspired search algorithms for this problem has been investigated recently. Testing complex data structures (e.g., containers) is very challenging since testing software with simple states is already hard. Because containers are used in almost every type of software, their reliability is of utmost importance. Hence, this paper focuses on the difficulties of testing container classes with nature inspired search algorithms. We will first describe how input data can be automatically generated for testing Java Containers. Input space reductions and a novel testability transformation are presented to aid the search algorithms. Different search algorithms are then considered and studied in order to understand when and why a search algorithm is effective for a testing problem. In our experiments, these nature inspired search algorithms seem to give better results than the traditional techniques described in literature. Besides, the problem of minimising the length of the test sequences is also addressed. Finally, some open research questions are given.
    BibTeX:
    @article{ArcuriY08b,
      author = {Andrea Arcuri and Xin Yao},
      title = {Search Based Software Testing of Object-Oriented Containers},
      journal = {Information Sciences},
      year = {2008},
      volume = {178},
      number = {15},
      pages = {3075-3095},
      doi = {http://dx.doi.org/10.1016/j.ins.2007.11.024}
    }
    
    Arcuri, Andrea and Yao, Xin Search Based Testing of Containers for Object-Oriented Software 2007 (CSR-07-3)  techreport URL 
    Abstract: Generating test data for Object-Oriented (OO) software is a hard task. Little work has been done on the subject, and a lot of open problems still need to be investigated. In this paper we focus on container classes. They are used in almost every type of software, hence their reliability is of utmost importance. We present novel techniques to generate test data for container classes in an automatic way. A new representation with novel search operators is described and tested. A way to reduce the search space for OO software is presented. This is achieved by dynamically eliminating the functions that cannot give any further help from the search. Besides, the problem of applying the branch distances of disjunctions and conjunctions to OO software is solved. Finally, Hill Climbing, Genetic Algorithms and Memetic Algorithms are used and compared. Our empirical case study shows that our Memetic Algorithm outperforms the other algorithms.
    BibTeX:
    @techreport{ArcuriY07,
      author = {Andrea Arcuri and Xin Yao},
      title = {Search Based Testing of Containers for Object-Oriented Software},
      year = {2007},
      number = {CSR-07-3},
      url = {ftp://ftp.cs.bham.ac.uk/pub/tech-reports/2007/CSR-07-3.pdf}
    }
    
    Arcuri, Andrea and Yao, Xin A Memetic Algorithm for Test Data Generation of Object-Oriented Software 2007 Proceedings of the 2007 IEEE Congress on Evolutionary Computation (CEC), pp. 2048-2055  inproceedings DOI  
    Abstract: Generating test data for Object-Oriented (OO) software is a hard task. Little work has been done on the subject, and a lot of open problems still need to be investigated. In this paper we focus on container classes. They are used in almost every type of software, hence their reliability is of utmost importance. We present novel techniques to generate test data for container classes in an automatic way. A new representation with novel search operators is described and tested. A way to reduce the search space for OO software is presented. This is achieved by dynamically eliminating the functions that cannot give any further help from the search. Besides, the problem of applying the branch distances of disjunctions and conjunctions to OO software is solved. Finally, Hill Climbing, Genetic Algorithms and Memetic Algorithms are used and compared. Our empirical case study shows that our Memetic Algorithm outperforms the other algorithms.
    BibTeX:
    @inproceedings{ArcuriY07b,
      author = {Andrea Arcuri and Xin Yao},
      title = {A Memetic Algorithm for Test Data Generation of Object-Oriented Software},
      booktitle = {Proceedings of the 2007 IEEE Congress on Evolutionary Computation (CEC)},
      publisher = {IEEE},
      year = {2007},
      pages = {2048-2055},
      doi = {http://dx.doi.org/10.1109/CEC.2007.4424725}
    }
    
    Arcuri, Andrea and Yao, Xin On Test Data Generation of Object-Oriented Software 2007 Testing: Academic and Industrial Conference, Practice and Research Techniques (TAIC PART), pp. 72-76  inproceedings DOI  
    Abstract: Nowadays, Object-Oriented (OO) languages are widely used in the development of many different kinds of applications. However, testing those applications is still very expensive and time-consuming for the software community. The automation of this task would therefore be highly desirable. Although automatic testing of procedural software has been studied in depth for many years, comparatively little work has been done about OO software. Different techniques exist. However, the most promising one is probably to model the task as a Search Problem.

    This paper explains why automatic testing of OO software is more difficult than procedural software. These difficulties provide strong challenges to the Natural Computation and Software Engineering communities. A brief review of the literature of the subject follows. The issues of the current State-of-Art of the field are then outlined. Finally, some open research problems are discussed.

    BibTeX:
    @inproceedings{ArcuriY07c,
      author = {Andrea Arcuri and Xin Yao},
      title = {On Test Data Generation of Object-Oriented Software},
      booktitle = {Testing: Academic and Industrial Conference, Practice and Research Techniques (TAIC PART)},
      publisher = {IEEE Computer Society},
      year = {2007},
      pages = {72-76},
      doi = {http://dx.doi.org/10.1109/TAICPART.2007.4344101}
    }
    
    Arcuri, Andrea and Yao, Xin Coevolving Programs and Unit Tests from their Specification 2007 Proceedings of the twenty-second IEEE/ACM International Conference on Automated Software Engineering (ASE '07), pp. 397-400  inproceedings DOI  
    Abstract: Writing a formal specification before implementing a program helps to find problems with the system requirements. The requirements might be for example incomplete and ambiguous. Fixing these types of errors is very difficult and expensive during the implementation phase of the software development cycle. Although writing a formal specification is usually easier than implementing the actual code, writing a specification requires time, and often it is preferred, instead, to use this time on the implementation.

    In this paper we introduce for the first time a framework that might evolve any possible generic program from its specification. We use the Genetic Programming to evolve the programs, and at the same time we exploit the specifications to coevolve sets of unit tests. Programs are rewarded on how many tests they do not fail, whereas the unit tests are rewarded on how many programs they make fail. We present and analyse four different problems on which this novel technique is successfully applied.

    BibTeX:
    @inproceedings{ArcuriY07d,
      author = {Andrea Arcuri and Xin Yao},
      title = {Coevolving Programs and Unit Tests from their Specification},
      booktitle = {Proceedings of the twenty-second IEEE/ACM International Conference on Automated Software Engineering (ASE '07)},
      publisher = {ACM},
      year = {2007},
      pages = {397-400},
      doi = {http://dx.doi.org/10.1145/1321631.1321693}
    }
    
    Baker, Paul and Harman, Mark and Steinhöfel, Kathleen and Skaliotis, Alexandros Search Based Approaches to Component Selection and Prioritization for the Next Release Problem 2006 Proceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM '06), pp. 176-185  inproceedings DOI  
    Abstract: This paper addresses the problem of determining the next set of releases in the course of software evolution. It formulates both ranking and selection of candidate software components as a series of feature subset selection problems to which search based software engineering can be applied. The approach is automated using greedy and simulated annealing algorithms and evaluated using a set of software components from the component base of a large telecommunications organisation. The results are compared to those obtained by a panel of (human) experts. The results show that the two automated approaches convincingly outperform the expert judgment approach.
    BibTeX:
    @inproceedings{BakerHSS06,
      author = {Paul Baker and Mark Harman and Kathleen Steinhöfel and Alexandros Skaliotis},
      title = {Search Based Approaches to Component Selection and Prioritization for the Next Release Problem},
      booktitle = {Proceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM '06)},
      publisher = {IEEE Computer Society},
      year = {2006},
      pages = {176-185},
      doi = {http://doi.ieeecomputersociety.org/10.1109/ICSM.2006.56}
    }
    
    Bate, Iain and Emberson, Paul Incorporating Scenarios And Heuristics To Improve Flexibility In Real-Time Embedded Systems 2006 Proceedings of the 12th IEEE Real-Time And Embedded Technology And Applications Symposium (RTAS '06), pp. 221-230  inproceedings DOI  
    Abstract: Flexibility, the ability to adapt to change, is important for real-time systems. As in any type of system, changes arise from maintenance, enhancements and upgrades. These changes are only feasible if timing requirements imposed by the real-time nature of the system can still be met. A flexible design will allow tasks to be added without impinging on other tasks, causing them to miss deadlines. The design space for these systems consists of many configurations describing how tasks and messages are allocated to hardware and scheduled on a hardware platform. Heuristic search is a well recognised strategy for solving allocation and scheduling problems but most research is limited to finding any valid solution for a current set of requirements. The technique proposed here incorporates scenario based analysis into heuristic search strategies where the ability of a solution to meet a scenario is included as another heuristic for the changeability of a system. This allows future requirements to be taken into account when choosing a solution so that future changes can be accommodated with minimal alterations to the existing system.
    BibTeX:
    @inproceedings{BateE06,
      author = {Iain Bate and Paul Emberson},
      title = {Incorporating Scenarios And Heuristics To Improve Flexibility In Real-Time Embedded Systems},
      booktitle = {Proceedings of the 12th IEEE Real-Time And Embedded Technology And Applications Symposium (RTAS '06)},
      publisher = {IEEE},
      year = {2006},
      pages = {221-230},
      doi = {http://dx.doi.org/10.1109/RTAS.2006.21}
    }
    
    Becerra, R. Landa and Sagarna, Ramón and Yao, Xin An Evaluation of Differential Evolution in Software Test Data Generation 2009 Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09), pp. 2850-2857  inproceedings DOI URL 
    Abstract: One of the main tasks software testing involves is the generation of the test inputs to be used during the test. Due to its expensive cost, the automation of this task has become one of the key issues in the area. Recently, this generation has been explicitly formulated as the resolution of a set of constrained optimisation problems. Differential Evolution (DE) is a population based evolutionary algorithm which has been successfully applied in a number of domains, including constrained optimisation. We present a test data generator employing DE to solve each of the constrained optimisation problems, and empirically evaluate its performance for several DE models. With the aim of comparing this technique with other approaches, we extend the experiments to the Breeder Genetic Algorithm and face it to DE, and compare different test data generators in the literature with the DE approach. The results present DE as a promising solution technique for this real-world problem.
    BibTeX:
    @inproceedings{BecerraSY09,
      author = {R. Landa Becerra and Ramón Sagarna and Xin Yao},
      title = {An Evaluation of Differential Evolution in Software Test Data Generation},
      booktitle = {Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)},
      publisher = {IEEE},
      year = {2009},
      pages = {2850-2857},
      url = {http://www2.computer.org/portal/web/csdl/doi/10.1109/CEC.2009.4983300},
      doi = {http://dx.doi.org/10.1109/CEC.2009.4983300}
    }
    
    Binkley, David W. and Harman, Mark and Lakhotia, Kiran FlagRemover: A Testability Transformation for Transforming Loop Assigned Flags 2009 ACM Transactions on Software Engineering and Methodology
    Vol. 2(3), pp. 110-146 
    article URL 
    Abstract: Search-Based Testing is a widely studied technique for automatically generating test inputs, with the aim of reducing the cost of software engineering activities that rely upon testing. However, search-based approaches degenerate to random testing in the presence of flag variables, because flags create spikes and plateaux in the fitness landscape. Both these features are known to denote hard optimization problems for all search-based optimization techniques. Several authors have studied flag removal transformations and fitness function refinements to address the issue of flags, but the problem of loop-assigned flags remains unsolved. This paper introduces a testability transformation along with a tool that transforms programs with loop-assigned flags into flag-free equivalents, so that existing search-based test data generation approaches can successfully be applied. The paper presents the results of an empirical study that demonstrates the effectiveness and efficiency of the testability transformation on programs including those made up of open source and industrial production code, as well as test data generation problems specifically created to denote hard optimization problems.
    BibTeX:
    @article{BinkleyHL09,
      author = {David W. Binkley and Mark Harman and Kiran Lakhotia},
      title = {FlagRemover: A Testability Transformation for Transforming Loop Assigned Flags},
      journal = {ACM Transactions on Software Engineering and Methodology},
      year = {2009},
      volume = {2},
      number = {3},
      pages = {110-146},
      url = {http://www.dcs.kcl.ac.uk/staff/mark/tosem-issta04-jv.pdf}
    }
    
    Chen, Tianshi and Lehre, Per Kristian and Tang, Ke and Yao, Xin When Is an Estimation of Distribution Algorithm Better than an Evolutionary Algorithm? 2009 Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09), pp. 1470-1477  inproceedings URL 
    Abstract: Despite the wide-spread popularity of estimation of distribution algorithms (EDAs), there has been no theoretical proof that there exist optimisation problems where EDAs

    perform significantly better than traditional evolutionary algorithms. Here, it is proved rigorously that on a problem called SUBSTRING, a simple EDA called univariate marginal distribution algorithm (UMDA) is efficient, whereas the (1+1) EA is highly inefficient. Such studies are essential in gaining insight into fundamental research issues, i.e., what problem characteristics make an EDA or EA efficient, under what conditions an EDA is expected to outperform an EA, and what key factors are in an EDA that make it efficient or inefficient.

    BibTeX:
    @inproceedings{ChenLTY09,
      author = {Tianshi Chen and Per Kristian Lehre and Ke Tang and Xin Yao},
      title = {When Is an Estimation of Distribution Algorithm Better than an Evolutionary Algorithm?},
      booktitle = {Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)},
      publisher = {IEEE},
      year = {2009},
      pages = {1470-1477},
      url = {http://nical.ustc.edu.cn/uploads/cec2009/P684.pdf}
    }
    
    Di Penta, Massimiliano and Harman, Mark and Antoniol, Giuliano and Qureshi, Fahim The Effect of Communication Overhead on Software Maintenance Project Staffing: a Search-Based Approach 2007 Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM '07), pp. 315-324  inproceedings DOI  
    Abstract: Brooks' milestone 'Mythical Man Month' established the observation that there is no simple conversion between people and time in large scale software projects. Communication and training overheads yield a subtle and variable relationship between the person-months required for a project and the number of people needed to complete the task within a given timeframe. This paper formalises several instantiations of Brooks' law and uses these to construct project schedule and staffing instances - using a search-based project staffing and scheduling approach - on data from two large real world maintenance projects. The results reveal the impact of different formulations of Brooks' law on project completion time and on staff distribution across teams, and the influence of other factors such as the presence of dependencies between work packages on the effect of communication overhead.
    BibTeX:
    @inproceedings{DiPentaHAQ07,
      author = {Massimiliano Di Penta and Mark Harman and Giuliano Antoniol and Fahim Qureshi},
      title = {The Effect of Communication Overhead on Software Maintenance Project Staffing: a Search-Based Approach},
      booktitle = {Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM '07)},
      publisher = {IEEE},
      year = {2007},
      pages = {315-324},
      doi = {http://dx.doi.org/10.1109/ICSM.2007.4362644}
    }
    
    Durillo, Juan J. and Zhang, Yuanyuan and Alba, Enrique and Nebro, Antonio J. A Study of the Multi-Objective Next Release Problem 2009 Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09), pp. 49-58  inproceedings DOI  
    Abstract: One of the first issues which has to be taken into account by software companies is to determine what should be included in the next release of their products, in such a way that the highest possible number of customers get satisfied while this entails a minimum cost for the company. This problem is known as the Next Release Problem (NRP). Since minimizing the total cost of including new features into a software package and maximizing the total satisfaction of customers are contradictory objectives, the problem has a multi-objective nature. In this work we study the NRP problem from the multi-objective point of view, paying attention to the quality of the obtained solutions, the number of solutions, the range of solutions covered by these fronts, and the number of optimal solutions obtained.Also, we evaluate the performance of two state-of-the-art multi-objective metaheuristics for solving NRP: NSGA-II and MOCell. The obtained results show that MOCell outperforms NSGA-II in terms of the range of solutions covered, while this latter is able of obtaining better solutions than MOCell in large instances. Furthermore, we have observed that the optimal solutions found are composed of a high percentage of low-cost requirements and, also, the requirements that produce most satisfaction on the customers.
    BibTeX:
    @inproceedings{DurilloZAN09,
      author = {Juan J. Durillo and Yuanyuan Zhang and Enrique Alba and Antonio J. Nebro},
      title = {A Study of the Multi-Objective Next Release Problem},
      booktitle = {Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      pages = {49-58},
      doi = {http://dx.doi.org/10.1109/SSBSE.2009.21}
    }
    
    Emberson, Paul and Bate, Iain Stressing Search with Scenarios for Flexible Solutions to Real-Time Task Allocation Problems To appear IEEE Transactions on Software Engineering  article DOI  
    Abstract: One of the most important properties of a good software engineering process and of the design of the software it produces is robustness to changing requirements. Scenario-based analysis is a popular method for improving the flexibility of software architectures. This paper demonstrates a search-based technique for automating scenario-based analysis in the software architecture deployment view. Specifically, a novel parallel simulated annealing search algorithm is applied to the real-time task allocation problem to find baseline solutions which require a minimal number of changes in order to meet the requirements of potential upgrade scenarios. Another simulated annealing based search is used for finding a solution which is similar to an existing baseline when new requirements arise. Solutions generated using a variety of scenarios are judged by how well they respond to different system requirements changes. The evaluation is performed on a set of problems with a controlled set of different characteristics.
    BibTeX:
    @article{EmbersonB,
      author = {Paul Emberson and Iain Bate},
      title = {Stressing Search with Scenarios for Flexible Solutions to Real-Time Task Allocation Problems},
      journal = {IEEE Transactions on Software Engineering},
      publisher = {IEEE Computer Society},
      year = {To appear},
      doi = {http://doi.ieeecomputersociety.org/10.1109/TSE.2009.58}
    }
    
    Emberson, Paul and Bate, Iain Extending a Task Allocation Algorithm for Graceful Degradation of Real-Time Distributed Embedded Systems 2008 Proceedings of the 2008 Real-Time Systems Symposium (RTSS '08), pp. 270-279  inproceedings DOI  
    Abstract: Previous research which has considered task allocation and fault-tolerance together has concentrated on constructingschedules which accommodate a fixed number of redundanttasks. Often, all faults are treated as being equallysevere. There is little work which combines task allocationwith architectural level fault-tolerance issues such as thenumber of replicas to use and how they should be configured,both of which are tackled by this work. An acceptedmethod for assessing the impact of a combination of faults is to build a system utility model which can be used to assess how the system degrades when components fail. The keychallenge addressed here is how to design objective functions based on a utility model which can be incorporatedinto a search algorithm in order to optimise fault-toleranceproperties. Other issues such as how to extend the localsearch neighbourhood and balance objectives with schedulability constraints are also discussed.
    BibTeX:
    @inproceedings{EmbersonB08,
      author = {Paul Emberson and Iain Bate},
      title = {Extending a Task Allocation Algorithm for Graceful Degradation of Real-Time Distributed Embedded Systems},
      booktitle = {Proceedings of the 2008 Real-Time Systems Symposium (RTSS '08)},
      publisher = {IEEE Computer Society},
      year = {2008},
      pages = {270-279},
      doi = {http://dx.doi.org/10.1109/RTSS.2008.24}
    }
    
    Emberson, Paul and Bate, Iain Minimising Task Migration and Priority Changes In Mode Transitions 2007 Proceedings of the 13th IEEE Real-Time And Embedded Technology And Applications Symposium (RTAS '07), pp. 158-167  inproceedings DOI  
    Abstract: Handling mode changes is one of the most complex and important problems for real-time systems designers. The challenge is to move a system from running one set of software to another while still achieving the quality of service guarantees necessary. There has been previous work which concentrated on how to perform scheduling and timing analysis of mode changes. However, a common theme of all this research is that if the system's schedule and allocation is chosen to minimise the set of differences between modes then the mode transition problem can be performed more easily and quickly. This paper investigates how this can be achieved.
    BibTeX:
    @inproceedings{EmbersonB07,
      author = {Paul Emberson and Iain Bate},
      title = {Minimising Task Migration and Priority Changes In Mode Transitions},
      booktitle = {Proceedings of the 13th IEEE Real-Time And Embedded Technology And Applications Symposium (RTAS '07)},
      publisher = {IEEE Computer Society},
      year = {2007},
      pages = {158-167},
      doi = {http://dx.doi.org/10.1109/RTAS.2007.17}
    }
    
    Finkelstein, Anthony and Harman, Mark and Mansouri, S. Afshin and Ren, Jian and Zhang, Yuanyuan A Search based Approach to Fairness Analysis in Requirement Assignments to Aid Negotiation, Mediation and Decision Making 2009 Requirements Engineering Journal (RE '08 Special Issue)  article DOI  
    Abstract: This paper uses a multi-objective optimisation approach to support investigation of the trade-offs in various notions of fairness between multiple customers. Results are presented to validate the approach using two real-world data sets and also using data sets created specifically to stress test the approach. Simple graphical techniques are used to visualize the solution space. The paper also reports on experiments to determine the most suitable algorithm for this problem, comparing the results of the NSGA-II algorithms, a widely used multi objective evolutionary algorithm, and the Two-Archive evolutionary algorithm, a recently proposed alternative.
    BibTeX:
    @article{FinkelsteinHMRZ09,
      author = {Anthony Finkelstein and Mark Harman and S. Afshin Mansouri and Jian Ren and Yuanyuan Zhang},
      title = {A Search based Approach to Fairness Analysis in Requirement Assignments to Aid Negotiation, Mediation and Decision Making},
      journal = {Requirements Engineering Journal (RE '08 Special Issue)},
      year = {2009},
      doi = {http://dx.doi.org/10.1007/s00766-009-0075-y}
    }
    
    Finkelstein, Anthony and Harman, Mark and Mansouri, S. Afshin and Ren, Jian and Zhang, Yuanyuan ``Fairness Analysis" in Requirements Assignments 2008 Proceedings of the 16th IEEE International Requirements Engineering Conference (RE '08), pp. 115-124  inproceedings DOI  
    Abstract: Requirements engineering for multiple customers, each of whom have competing and often conflicting priorities, raises issues of negotiation, mediation and conflict resolution. This paper uses a multi-objective optimisation approach to support investigation of the trade-offs in various notions of fairness between multiple customers. Results are presented to validate the approach using two real-world data sets and also using data sets created specifically to stress test the approach. Simple graphical techniques are used to visualize the solution space.
    BibTeX:
    @inproceedings{FinkelsteinHMRZ08,
      author = {Anthony Finkelstein and Mark Harman and S. Afshin Mansouri and Jian Ren and Yuanyuan Zhang},
      title = {``Fairness Analysis" in Requirements Assignments},
      booktitle = {Proceedings of the 16th IEEE International Requirements Engineering Conference (RE '08)},
      publisher = {IEEE Computer Society},
      year = {2008},
      pages = {115-124},
      doi = {http://dx.doi.org/10.1109/RE.2008.61}
    }
    
    Ghani, Kamran and Clark, John A. Widening the Goal Posts: Program Stretching to Aid Search Based Software Testing 2009 Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)  inproceedings DOI  
    Abstract: Search based software testing has emerged in recent years as an important research area within automated software test data generation. The general approach of couching the satisfaction of test goals as numerical optimisation problems has been applied to a variety of problems such as satisfying structural coverage criteria, specification falsification, exception generation,
    BibTeX:
    @inproceedings{GhaniC09,
      author = {Kamran Ghani and John A. Clark},
      title = {Widening the Goal Posts: Program Stretching to Aid Search Based Software Testing},
      booktitle = {Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      doi = {http://dx.doi.org/10.1109/SSBSE.2009.26}
    }
    
    Ghani, Kamran and Clark, John A. Automatic Test Data Generation for Multiple Condition and MCDC Coverage 2009 Proceedings of the 4th International Conference on Software Engineering Advances (ICSEA '09)  inproceedings  
    BibTeX:
    @inproceedings{GhaniC09b,
      author = {Kamran Ghani and John A. Clark},
      title = {Automatic Test Data Generation for Multiple Condition and MCDC Coverage},
      booktitle = {Proceedings of the 4th International Conference on Software Engineering Advances (ICSEA '09)},
      year = {2009},
      note = {to appear}
    }
    
    Ghani, Kamran and Clark, John A. Strengthening Inferred Specification using Search Based Testing 2008 Proceedings of 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008, pp. 187-194  inproceedings DOI  
    Abstract: Software specification is an important element of the software development process. However, in most cases the specifications are out-of-date or even missing. One solution for this kind of problem is to use some process that infers the specification automatically. Work by Ernst et al [9, 22] has shown how specifications can be generated using program execution traces. These approaches are dependent on the test suites used to produce the traces, which may lead to unreliable specifications being inferred. Such specification inference is highly useful, however. In this paper we show how search based testing techniques can challenge and identify erroneous elements of such inferred specifications. This leads to a much tighter (accurate) inferred specifications. Thus, specification inference and search based test data generation are shown to be complementary.
    BibTeX:
    @inproceedings{GhaniC08,
      author = {Kamran Ghani and John A. Clark},
      title = {Strengthening Inferred Specification using Search Based Testing},
      booktitle = {Proceedings of 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008},
      publisher = {IEEE},
      year = {2008},
      pages = {187-194},
      doi = {http://dx.doi.org/10.1109/ICSTW.2008.39}
    }
    
    Ghani, Kamran and Clark, John A. and Zhan, Yuan Comparing Algorithms for Search-Based Test Data Generation of Matlab Simulink Models 2009 Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)  inproceedings URL 
    Abstract: Search Based Software Engineering (SBSE) is an evolving field where meta-heuristic techniques are applied to solve many software engineering problems. One area of SBSE, where considerable research is underway, is software testing. We see much application of meta-heuristics search techniques for generating input test data. But most of the work in this area is concentrated on test data generation from source code. We see very little application of such techniques to testing from other sources such as requirement and design models. Zhan and Clark applied such techniques to generate test data for Simulink models. This paper extends the work of Zhan and Clark by investigating the application of Genetic Algorithms (GAs) to Simulink models and then statistically compares the results to the existing work, which is mainly based on Simulated Annealing (SA).
    BibTeX:
    @inproceedings{GhaniCZ09,
      author = {Kamran Ghani and John A. Clark and Yuan Zhan},
      title = {Comparing Algorithms for Search-Based Test Data Generation of Matlab Simulink Models},
      booktitle = {Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)},
      publisher = {IEEE},
      year = {2009},
      url = {http://www2.computer.org/portal/web/csdl/doi/10.1109/CEC.2009.4983313}
    }
    
    Gold, Nicolas and Harman, Mark and Li, Zheng and Mahdavi, Kiarash Allowing Overlapping Boundaries in Source Code using a Search Based Approach to Concept Binding 2006 Proceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM '06), pp. 310-319  inproceedings DOI  
    Abstract: One approach to supporting program comprehension involves binding concepts to source code. Previously proposed approaches to concept binding have enforced non-overlapping boundaries. However, real-world programs may contain overlapping concepts. This paper presents techniques to allow boundary overlap in the binding of concepts to source code. In order to allow boundaries to overlap, the concept binding problem is reformulated as a search problem. It is shown that the search space of overlapping concept bindings is exponentially large, indicating the suitability of sampling-based search algorithms. Hill climbing and genetic algorithms are introduced for sampling the space. The paper reports on experiments that apply these algorithms to 21 COBOL II programs taken from the commercial financial services sector. The results show that the genetic algorithm produces significantly better solutions than both the hill climber and random search.
    BibTeX:
    @inproceedings{GoldHLM06,
      author = {Nicolas Gold and Mark Harman and Zheng Li and Kiarash Mahdavi},
      title = {Allowing Overlapping Boundaries in Source Code using a Search Based Approach to Concept Binding},
      booktitle = {Proceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM '06)},
      publisher = {IEEE Computer Society},
      year = {2006},
      pages = {310-319},
      doi = {http://dx.doi.org/10.1109/ICSM.2006.10}
    }
    
    Gueorguiev, Stefan and Harman, Mark and Antoniol, Giuliano Software Project Planning for Robustness and Completion Time in the Presence of Uncertainty using Multi Objective Search Based Software Engineering 2009 Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09), pp. 1673-1680 (Best Paper Award)  inproceedings DOI  
    Abstract: All large-scale projects contain a degree of risk and uncertainty. Software projects are particularly vulnerable to overruns, due to the this uncertainty and the inherent difficulty of software project cost estimation. In this paper we introduce a search based approach to software project robustness. The approach is to formulate this problem as a multi objective Search Based Software Engineering problem, in which robustness and completion time are treated as two competing objectives. The paper presents the results of the application of this new approach to four large real-world software projects, using two different models of uncertainty.
    BibTeX:
    @inproceedings{GueorguievHA09,
      author = {Stefan Gueorguiev and Mark Harman and Giuliano Antoniol},
      title = {Software Project Planning for Robustness and Completion Time in the Presence of Uncertainty using Multi Objective Search Based Software Engineering},
      booktitle = {Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09)},
      publisher = {ACM},
      year = {2009},
      pages = {1673-1680 (Best Paper Award)},
      doi = {http://dx.doi.org/10.1145/1569901.1570125}
    }
    
    Guo, Qiang and Hierons, Robert M. and Harman, Mark and Derderian, Karnig Heuristics for Fault Diagnosis when Testing from Finite State Machines 2007 Software Testing, Verification & Reliability
    Vol. 17(1), pp. 41-57 
    article DOI  
    Abstract: When testing from finite state machines, a failure observed in the implementation under test (IUT) is called a symptom. A symptom could have been caused by an earlier state transfer failure. Transitions that may be used to explain the observed symptoms are called diagnosing candidates. Finding strategies to generate an optimal set of diagnosing candidates that could effectively identify faults in the IUT is of great value in reducing the cost of system development and testing. This paper investigates fault diagnosis when testing from finite state machines and proposes heuristics for fault isolation and identification. The proposed heuristics attempt to lead to a symptom being observed in some shorter test sequences, which helps to reduce the cost of fault isolation and identification. The complexity of the proposed method is analysed. A case study is presented, which shows how the proposed approach assists in fault diagnosis.
    BibTeX:
    @article{GuoHHD07,
      author = {Qiang Guo and Robert M. Hierons and Mark Harman and Karnig Derderian},
      title = {Heuristics for Fault Diagnosis when Testing from Finite State Machines},
      journal = {Software Testing, Verification & Reliability},
      year = {2007},
      volume = {17},
      number = {1},
      pages = {41-57},
      doi = {http://dx.doi.org/10.1002/stvr.v17:1}
    }
    
    Harman, Mark Testability Transformation for Search-Based Testing 2008 Keynote of the 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008  inproceedings  
    Abstract: Testability transformation combines automated test data generation with traditional program transformation. This talk introduces the concept of testability transformation, to address structural impediments to automated test data generation. A testability transformation alters the program's structure, creating an equivalent version of the program for which test data generation is easier. Because the structure of the original program is transformed, the test adequacy criterion may also need to be transformed, to preserve the meaning of the original adequacy criterion.

    This observation motivates the introduction of a theoretical foundation for testability transformation, in which a testability transformation extends traditional program transformation to include the test adequacy criterion together with the program to be transformed. The theory is illustrated with two example applications of testability transformation for search-based testing.

    Interestingly testability transformations are disposed of once they are used - testability transformation is a means to an end, rather than an end in itself. This is a departure from previous work on transformation, in which the transformed program replaces the original. Even more radical, the transformations need not be meaning preserving in the conventional sense of functional equivalence - the sine qua non of transformation for the past thirty years.

    BibTeX:
    @inproceedings{Harman08,
      author = {Mark Harman},
      title = {Testability Transformation for Search-Based Testing},
      booktitle = {Keynote of the 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008},
      year = {2008}
    }
    
    Harman, Mark The Current State and Future of Search Based Software Engineering 2007 Proceedings of International Conference on Software Engineering / Future of Software Engineering 2007 (ICSE/FOSE '07), pp. 342-357  inproceedings DOI  
    Abstract: This paper describes work on the application of optimization techniques in software engineering. These optimization techniques come from the operations research and metaheuristic computation research communities. The paper briefly reviews widely used optimization techniques and the key ingredients required for their successful application to software engineering, providing an overview of existing results in eight software engineering application domains. The paper also describes the benefits that are likely to accrue from the growing body of work in this area and provides a set of open problems, challenges and areas for future work.
    BibTeX:
    @inproceedings{Harman07,
      author = {Mark Harman},
      title = {The Current State and Future of Search Based Software Engineering},
      booktitle = {Proceedings of International Conference on Software Engineering / Future of Software Engineering 2007 (ICSE/FOSE '07)},
      publisher = {IEEE Computer Society},
      year = {2007},
      pages = {342-357},
      doi = {http://dx.doi.org/10.1109/FOSE.2007.29}
    }
    
    Harman, Mark Search Based Software Engineering for Program Comprehension 2007 Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC '07), pp. 3-13  inproceedings DOI  
    Abstract: Search Based Software Engineering (SBSE) is an approach to software engineering in which search based optimization algorithms are used to identify optimal or near optimal solutions and to yield insight. SBSE techniques can cater for multiple, possibly competing objectives and/or constraints and applications where the potential solution space is large and complex. Such situations are common in software engineering, leading to an increasing interest in SBSE. This paper provides a brief overview of SBSE, explaining some of the ways in which it has already been applied to program - comprehension related activities. The paper also outlines some possible future applications of and challenges for the further application of SBSE to Program Comprehension.
    BibTeX:
    @inproceedings{Harman07b,
      author = {Mark Harman},
      title = {Search Based Software Engineering for Program Comprehension},
      booktitle = {Proceedings of the 15th IEEE International Conference on Program Comprehension (ICPC '07)},
      publisher = {IEEE},
      year = {2007},
      pages = {3-13},
      doi = {http://dx.doi.org/10.1109/ICPC.2007.35}
    }
    
    Harman, Mark and Hassoun, Youssef and Lakhotia, Kiran and McMinn, Phil and Wegener, Joachim The Impact of Input Domain Reduction on Search-based Test Data Generation 2007 Proceedings of the the 6th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, pp. 155-164  inproceedings DOI  
    Abstract: There has recently been a great deal of interest in search-based test data generation, with many local and global search algorithms being proposed. However, to date, there has been no investigation of the relationship between the size of the input domain (the search space) and performance of search-based algorithms. Static analysis can be used to remove irrelevant variables for a given test data generation problem, thereby reducing the search space size. This paper studies the effect of this domain reduction, presenting results from the application of local and global search algorithms to real world examples. This provides evidence to support the claim that domain reduction has implications for practical search-based test data generation.
    BibTeX:
    @inproceedings{HarmanHLMW07,
      author = {Mark Harman and Youssef Hassoun and Kiran Lakhotia and Phil McMinn and Joachim Wegener},
      title = {The Impact of Input Domain Reduction on Search-based Test Data Generation},
      booktitle = {Proceedings of the the 6th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering},
      publisher = {ACM},
      year = {2007},
      pages = {155-164},
      doi = {http://dx.doi.org/10.1145/1287624.1287647}
    }
    
    Harman, Mark and Islam, Fayezin and Xie, Tao and Wappler, Stefan Automated Test Data Generation for Aspect-Oriented Programs 2009 Proceedings of the 8th International Conference on Aspect-Oriented Software Development (AOSD '09), pp. 185-196  inproceedings DOI  
    Abstract: Despite the upsurge of interest in the Aspect-Oriented Programming (AOP) paradigm, there remain few results on test data generation techniques for AOP. Furthermore, there is no work on search-based optimization for test data generation, an approach that has been shown to be successful in other programming paradigms.

    In this paper, we introduce a search-based optimization approach to automated test data generation for structural coverage of AOP systems. We present the results of an empirical study that demonstrates the effectiveness of the approach. We also introduce a domain reduction approach for AOP testing and show that this approach not only reduces test effort, but also increases test effectiveness. This finding is significant, because similar studies for non-AOP programming paradigms show no such improvement in effectiveness, merely a reduction in effort. We also present the results of an empirical study of the reduction in test effort achieved by focusing specifically on branches inside aspects.

    BibTeX:
    @inproceedings{HarmanIXW09,
      author = {Mark Harman and Fayezin Islam and Tao Xie and Stefan Wappler},
      title = {Automated Test Data Generation for Aspect-Oriented Programs},
      booktitle = {Proceedings of the 8th International Conference on Aspect-Oriented Software Development (AOSD '09)},
      publisher = {ACM},
      year = {2009},
      pages = {185-196},
      doi = {http://dx.doi.org/10.1145/1509239.1509264}
    }
    
    Harman, Mark and Krinke, Jens and Ren, Jian and Yoo, Shin Search Based Data Sensitivity Analysis Applied to Requirement Engineering 2009 Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09), pp. 1681-1688  inproceedings DOI  
    Abstract: Software engineering is plagued by problems associated with unreliable cost estimates. This paper introduces an approach to sensitivity analysis for requirements engineering. It uses Search-Based Software Engineering to aid the decision maker to explore sensitivity of the cost estimates of requirements for the Next Release Problem (NRP). The paper presents both single- and multi-objective formulation of NRP with empirical sensitivity analysis on synthetic and real-world data. The results show strong correlation between the level of inaccuracy and the impact on the selection of requirements, as well as between the cost of requirements and the impact, which is as intuitively expected. However, there also exist a few sensitive exceptions to these trends; the paper uses a heat-map style visualisation to reveal these exceptions which require careful consideration. The paper also shows that such unusually sensitivity patterns occur in real-world data and how the proposed approach clearly identifies them.
    BibTeX:
    @inproceedings{HarmanKRY09,
      author = {Mark Harman and Jens Krinke and Jian Ren and Shin Yoo},
      title = {Search Based Data Sensitivity Analysis Applied to Requirement Engineering},
      booktitle = {Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09)},
      publisher = {ACM},
      year = {2009},
      pages = {1681-1688},
      doi = {http://dx.doi.org/10.1145/1569901.1570126}
    }
    
    Harman, Mark and Lakhotia, Kiran and McMinn, Phil A Multi-Objective Approach to Search-based Test Data Generation 2007 Proceedings of the 9th annual Conference on Genetic and Evolutionary Computation (GECCO '07), pp. 1098-1105  inproceedings DOI  
    Abstract: There has been a considerable body of work on search-based test data generation for branch coverage. However, hitherto, there has been no work on multi-objective branch coverage. In many scenarios a single-objective formulation is unrealistic; testers will want to find test sets that meet several objectives simultaneously in order to maximize the value obtained from the inherently expensive process of running the test cases and examining the output they produce. This paper introduces multi-objective branch coverage. The paper presents results from a case study of the twin objectives of branch coverage and dynamic memory consumption for both real and synthetic programs. Several multi-objective evolutionary algorithms are applied. The results show that multi-objective evolutionary algorithms are suitable for this problem, and illustrates the way in which a Pareto optimal search can yield insights into the trade-offs between the two simultaneous objectives.
    BibTeX:
    @inproceedings{HarmanLM07,
      author = {Mark Harman and Kiran Lakhotia and Phil McMinn},
      title = {A Multi-Objective Approach to Search-based Test Data Generation},
      booktitle = {Proceedings of the 9th annual Conference on Genetic and Evolutionary Computation (GECCO '07)},
      publisher = {ACM},
      year = {2007},
      pages = {1098-1105},
      doi = {http://dx.doi.org/10.1145/1276958.1277175}
    }
    
    Harman, Mark and Mansouri, S. Afshin and Zhang, Yuanyuan Search Based Software Engineering: A Comprehensive Analysis and Review of Trends Techniques and Applications 2009 (TR-09-03)  techreport URL 
    Abstract: In the past five years there has been a dramatic increase in work on Search Based Software Engineering (SBSE), an approach to software engineering in which search based optimisation algorithms are used to address problems in Software Engineering. SBSE has been applied to problems throughout the Software Engineering lifecycle, from requirements and project planning to maintenance and re-engineering. The approach is attractive because it offers a suite of adaptive automated and semi-automated solutions in situations typified by large complex problem spaces with multiple competing and conflicting objectives.

    This paper provides a review and classification of literature on SBSE. The paper identifies research trends and relationships between the techniques applied and the applications to which they have been applied and highlights gaps in the literature and avenues for further research.

    BibTeX:
    @techreport{HarmanMZ09,
      author = {Mark Harman and S. Afshin Mansouri and Yuanyuan Zhang},
      title = {Search Based Software Engineering: A Comprehensive Analysis and Review of Trends Techniques and Applications},
      year = {2009},
      number = {TR-09-03},
      url = {http://www.dcs.kcl.ac.uk/technical-reports/papers/TR-09-03.pdf}
    }
    
    Harman, Mark and McMinn, Phil A Theoretical and Empirical Study of Search Based Testing: Local, Global and Hybrid Search IEEE Transactions on Software Engineering  article  
    BibTeX:
    @article{HarmanM,
      author = {Mark Harman and Phil McMinn},
      title = {A Theoretical and Empirical Study of Search Based Testing: Local, Global and Hybrid Search},
      journal = {IEEE Transactions on Software Engineering},
      note = {to appear}
    }
    
    Harman, Mark and McMinn, Phil A Theoretical & Empirical Analysis of Evolutionary Testing and Hill Climbing for Structural Test Data Generation 2007 Proceedings of the International Symposium on Software Testing and Analysis (ISSTA '07), pp. 73-83  inproceedings DOI  
    Abstract: Evolutionary testing has been widely studied as a technique for automating the process of test case generation. However, to date, there has been no theoretical examination of when and why it works. Furthermore, the empirical evidence for the effectiveness of evolutionary testing consists largely of small scale laboratory studies. This paper presents a first theoretical analysis of the scenarios in which evolutionary algorithms are suitable for structural test case generation. The theory is backed up by an empirical study that considers real world programs, the search spaces of which are several orders of magnitude larger than those previously considered.
    BibTeX:
    @inproceedings{HarmanM07,
      author = {Mark Harman and Phil McMinn},
      title = {A Theoretical & Empirical Analysis of Evolutionary Testing and Hill Climbing for Structural Test Data Generation},
      booktitle = {Proceedings of the International Symposium on Software Testing and Analysis (ISSTA '07)},
      publisher = {ACM},
      year = {2007},
      pages = {73-83},
      doi = {http://dx.doi.org/10.1145/1273463.1273475}
    }
    
    Harman, Mark and Skaliotis, Alexandros and Steinhöfel, Kathleen Search-based Approaches to the Component Selection and Prioritization Problem 2006 Proceedings of the 8th annual Conference on Genetic and Evolutionary Computation (GECCO '06), pp. 1951-1952  inproceedings DOI  
    Abstract: This poster paper addresses the problem of choosing sets of software components to combine in component-based software engineering. It formulates both ranking and selection problems as feature subset selection problems to which search based software engineering can be applied. We will consider selection and ranking of elements from a set of software components from the component base of a large telecommunications organisation.
    BibTeX:
    @inproceedings{HarmanSS06,
      author = {Mark Harman and Alexandros Skaliotis and Kathleen Steinhöfel},
      title = {Search-based Approaches to the Component Selection and Prioritization Problem},
      booktitle = {Proceedings of the 8th annual Conference on Genetic and Evolutionary Computation (GECCO '06)},
      publisher = {ACM},
      year = {2006},
      pages = {1951-1952},
      doi = {http://dx.doi.org/10.1145/1143997.1144321}
    }
    
    Harman, Mark and Tratt, Laurence Pareto Optimal Search Based Refactoring at the Design Level 2007 Proceedings of the 9th annual Conference on Genetic and Evolutionary Computation (GECCO '07), pp. 1106-1113  inproceedings DOI  
    Abstract: Refactoring aims to improve the quality of a software systems' structure, which tends to degrade as the system evolves. While manually determining useful refactorings can be challenging, search based techniques can automatically discover useful refactorings. Current search based refactoring approaches require metrics to be combined in a complex fashion, and producea single sequence of refactorings. In this paper we show how Pareto optimality can improve search based refactoring, making the combination of metrics easier, and aiding the presentation of multiple sequences of optimal refactorings to users.
    BibTeX:
    @inproceedings{HarmanT07,
      author = {Mark Harman and Laurence Tratt},
      title = {Pareto Optimal Search Based Refactoring at the Design Level},
      booktitle = {Proceedings of the 9th annual Conference on Genetic and Evolutionary Computation (GECCO '07)},
      publisher = {ACM},
      year = {2007},
      pages = {1106-1113},
      doi = {http://dx.doi.org/10.1145/1276958.1277176}
    }
    
    Jia, Yue and Harman, Mark Higher Order Mutation Testing 2009 Information and Software Technology
    Vol. 51(10), pp. 1379-1393 
    article DOI  
    Abstract: This paper introduces a new paradigm for Mutation Testing, which we call Higher Order Mutation Testing (HOM Testing). Traditional Mutation Testing considers only first order mutants, created by the injection of a single fault. Often these first order mutants denote trivial faults that are easily killed. Higher order mutants are created by the insertion of two or more faults. The paper introduces the concept of a subsuming HOM; one that is harder to kill than the first order mutants from which it is constructed. By

    definition, subsuming HOMs denote subtle fault combinations. The paper reports the results of an empirical study of HOM Testing using ten programs, including several non trivial real-world subjects for which test suites are available.

    BibTeX:
    @article{JiaH09,
      author = {Yue Jia and Mark Harman},
      title = {Higher Order Mutation Testing},
      journal = {Information and Software Technology},
      year = {2009},
      volume = {51},
      number = {10},
      pages = {1379-1393},
      doi = {http://dx.doi.org/10.1016/j.infsof.2009.04.016}
    }
    
    Jia, Yue and Harman, Mark Constructing Subtle Faults Using Higher Order Mutation Testing 2008 Proceedings of the 8th International Working Conference on Source Code Analysis and Manipulation (SCAM '08), pp. 249-258 (Best Paper Award)  inproceedings DOI  
    Abstract: Traditional mutation testing considers only first order mutants, created by the injection of a single fault. Often these first order mutants denote trivial faults that are easily killed. This paper investigates Higher Order Mutants (HOMs). It introduces the concept of a subsuming HOM; one that is harder to kill than the first order mutants from which it is constructed. By definition, subsuming HOMs denote subtle fault combinations. The paper reports the results of an empirical study into subsuming HOMs, using six benchmark programs. This is the largest study of mutation testing to date. To overcome the exponential explosion in the number of mutants considered, the paper introduces a search based approach to the identification of subsuming HOMs. Results are presented for a greedy algorithm, a genetic algorithm and a hill climbing algorithm.
    BibTeX:
    @inproceedings{JiaH08,
      author = {Yue Jia and Mark Harman},
      title = {Constructing Subtle Faults Using Higher Order Mutation Testing},
      booktitle = {Proceedings of the 8th International Working Conference on Source Code Analysis and Manipulation (SCAM '08)},
      publisher = {IEEE},
      year = {2008},
      pages = {249-258 (Best Paper Award)},
      doi = {http://dx.doi.org/10.1109/SCAM.2008.36}
    }
    
    Jia, Yue and Harman, Mark MILU : A Customizable, Runtime-Optimized Higher Order Mutation Testing Tool for the Full C Language 2008 Proceedings of the 3rd Testing: Academic and Industrial Conference - Practice and Research Techniques (TAIC PART '08), pp. 94-98  inproceedings DOI  
    Abstract: This paper introduces MILU, a C mutation testing tool designed for both first order and higher order mutation testing. All previous mutation testing tools apply all possible mutation operators to the program under test. By contrast, MILU allows customization of the set of mutation operators to be applied. To reduce runtime cost, MILU uses a novel 'test harness' technique to embed mutants and their associated test sets into a single-invocation procedure.
    BibTeX:
    @inproceedings{JiaH08b,
      author = {Yue Jia and Mark Harman},
      title = {MILU : A Customizable, Runtime-Optimized Higher Order Mutation Testing Tool for the Full C Language},
      booktitle = {Proceedings of the 3rd Testing: Academic and Industrial Conference - Practice and Research Techniques (TAIC PART '08)},
      publisher = {IEEE},
      year = {2008},
      pages = {94-98},
      doi = {http://dx.doi.org/10.1109/TAIC-PART.2008.18}
    }
    
    Jiang, Tao and Gold, Nicolas and Harman, Mark and Li, Zheng Locating Dependence Structures using Search-based Slicing 2007 Information and Software Technology
    Vol. 50(12), pp. 1189-1209 
    article DOI  
    Abstract: This paper introduces an approach to locating dependence structures in a program by searching the space of the powerset of the set of all possible program slices. The paper formulates this problem as a search-based software engineering problem. To evaluate the approach, the paper introduces an instance of a search-based slicing problem concerned with locating sets of slices that decompose a program into a set of covering slices that minimize inter-slice overlap. The paper reports the result of an empirical study of algorithm performance and result-similarity for Hill Climbing, Genetic, Random Search and Greedy Algorithms applied to a set of 12 C programs.
    BibTeX:
    @article{JiangGHL07,
      author = {Tao Jiang and Nicolas Gold and Mark Harman and Zheng Li},
      title = {Locating Dependence Structures using Search-based Slicing},
      journal = {Information and Software Technology},
      year = {2007},
      volume = {50},
      number = {12},
      pages = {1189-1209},
      doi = {http://dx.doi.org/10.1016/j.infsof.2007.11.001}
    }
    
    Jiang, Tao and Harman, Mark and Hassoun, Youssef Analysis of Procedure Splitability 2008 Proceedings of the 15th Working Conference on Reverse Engineering (WCRE '08), pp. 247-256  inproceedings DOI  
    Abstract: As software evolves there is a tendency for size to increase and structure to degrade, leading to problems for ongoing maintenance and reverse engineering. This paper introduces a greedy dependence-based procedure splitting algorithm that provides automated support for analysis and intervention where procedures show signs of poor structure and over large size. The paper reports on the algorithms, implementation and empirical evaluation of procedure splitability. The study reveals a surprising prevalence of splitable procedures and a strong correlation between procedure size and splitability.
    BibTeX:
    @inproceedings{JiangHH08,
      author = {Tao Jiang and Mark Harman and Youssef Hassoun},
      title = {Analysis of Procedure Splitability},
      booktitle = {Proceedings of the 15th Working Conference on Reverse Engineering (WCRE '08)},
      publisher = {IEEE Computer Society},
      year = {2008},
      pages = {247-256},
      doi = {http://dx.doi.org/10.1109/WCRE.2008.31}
    }
    
    Khan, Usman and Bate, Iain WCET Analysis of Modern Processors using Multi-Criteria Optimisation 2009 Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)  inproceedings DOI  
    Abstract: The Worst-Case Execution Time (WCET) is an important execution metric for real-time systems, and an accurate estimate for this increases the reliability of subsequent schedulability analysis. Performance enhancing features on modern processors, such as pipelines and caches, however, make it difficult to accurately predict the WCET. One technique for finding the WCET is to use test data generated using search algorithms. Existing work on search-based approaches has been successfully used in both industry and academia based on a single criterion function, the WCET, but only for simple processors. This paper investigates how effective this strategy is for more complex processors and to what extent other criteria help guide the search, e.g. the number of cache misses. Not unexpectedly the work shows no single choice of criteria work best across all problems. Based on the findings recommendations are proposed on which criteria are useful in particular situations.
    BibTeX:
    @inproceedings{KhanB09,
      author = {Usman Khan and Iain Bate},
      title = {WCET Analysis of Modern Processors using Multi-Criteria Optimisation},
      booktitle = {Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      doi = {http://dx.doi.org/10.1109/SSBSE.2009.20}
    }
    
    Lakhotia, Kiran and Harman, Mark and McMinn, Phil Handling Dynamic Data Structures in Search Based Testing 2008 Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08), pp. 1759-1766  inproceedings DOI  
    Abstract: There has been little attention to search based test data generation in the presence of pointer inputs and dynamic data structures, an area in which recent concolic methods have excelled. This paper introduces a search based testing approach which is able to handle pointers and dynamic data structures. It combines an alternating variable hill climb with a set of constraint solving rules for pointer inputs. The result is a lightweight and efficient method, as shown in the results from a case study, which compares the method to CUTE, a concolic unit testing tool.
    BibTeX:
    @inproceedings{LakhotiaHM08,
      author = {Kiran Lakhotia and Mark Harman and Phil McMinn},
      title = {Handling Dynamic Data Structures in Search Based Testing},
      booktitle = {Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08)},
      publisher = {ACM},
      year = {2008},
      pages = {1759-1766},
      doi = {http://dx.doi.org/10.1145/1389095.1389435}
    }
    
    Lakhotia, Kiran and McMinn, Phil and Harman, Mark Automated Test Data Generation for Coverage: Haven't We Solved This Problem Yet? 2009 Proceedings of Testing: Academia & Industry Conference - Practice And Research Techniques (TAIC-PART '09)  inproceedings  
    BibTeX:
    @inproceedings{LakhotiaMH09,
      author = {Kiran Lakhotia and Phil McMinn and Mark Harman},
      title = {Automated Test Data Generation for Coverage: Haven't We Solved This Problem Yet?},
      booktitle = {Proceedings of Testing: Academia & Industry Conference - Practice And Research Techniques (TAIC-PART '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      note = {to appear}
    }
    
    Langdon, William B. and Harman, Mark and Jia, Yue Multi Objective Mutation Testing with Genetic Programming 2009 Proceedings of Testing: Academia & Industry Conference - Practice And Research Techniques (TAIC-PART '09)  inproceedings URL 
    Abstract: In academic empirical studies, mutation testing has been demonstrated to be a powerful technique for fault finding. However, it remains very expensive and the few valuable traditional mutants that resemble real faults are mixed in with many others that denote unrealistic faults. These twin problems of expense and realism have been a significant barrier to industrial uptake of mutation testing. Genetic programming is used to search the space of complex faults (higher order mutants). The space is much larger than the traditional first order mutation space of simple faults. However, the use of a search based approach makes this scalable, seeking only those mutants that challenge the tester, while the consideration of complex faults addresses the problem of fault realism; it is known that 90percent of real faults are complex (i.e. higher order). We show that we are able to find examples that pose challenges to testing in the higher order space that cannot be represented in the first order space.
    BibTeX:
    @inproceedings{LangdonHJ09,
      author = {William B. Langdon and Mark Harman and Yue Jia},
      title = {Multi Objective Mutation Testing with Genetic Programming},
      booktitle = {Proceedings of Testing: Academia & Industry Conference - Practice And Research Techniques (TAIC-PART '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      url = {http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/langdon_2009_TAICPART.pdf}
    }
    
    Langdon, William B. and Harman, Mark and Jia, Yue Multi objective higher order mutation testing with GP 2009 Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09), pp. 1945-1946  inproceedings DOI  
    Abstract: Mutation testing is a powerful software engineering technique for fault finding. It works by injecting known faults (mutations) into software and seeing if the test suite finds them. It remains very expensive and the few valuable traditional mutants that resemble real faults are mixed in with many others that denote unrealistic faults. The expense and lack of realism inhibit industrial uptake of mutation testing. Genetic programming searches the space of complex faults to find realistic higher order mutants. Despite the much larger search space, we have found mutants composed of multiple changes to the C source code that challenge the tester and which cannot be represented in the first order space.
    BibTeX:
    @inproceedings{LangdonHJ09b,
      author = {William B. Langdon and Mark Harman and Yue Jia},
      title = {Multi objective higher order mutation testing with GP},
      booktitle = {Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09)},
      publisher = {ACM},
      year = {2009},
      pages = {1945-1946},
      note = {Poster},
      doi = {http://dx.doi.org/10.1145/1569901.1570251}
    }
    
    Lehre, Per Kristian and Yao, Xin On the Impact of Mutation-Selection Balance on the Runtime of Evolutionary Algorithms 2009 (CSR-09-07)  techreport URL 
    Abstract: The interplay between the mutation operator and the selection mechanism plays a fundamental role in the behaviour of evolutionary algorithms (EAs). However, this interplay is still not completely understood. This paper presents a rigorous runtime analysis of a non-elitistic population based EA that uses the linear ranking selection mechanism. The analysis focuses on how the balance between parameter $eta$ controlling the selection pressure in linear ranking selection, and parameter $chi$ controlling the bit-wise mutation rate impact the expected runtime. The results point out situations where a correct balance between selection pressure and mutation rate is essential for finding the optimal solution in polynomial time. In particular, it is shown that there exist fitness functions which can only be solved in polynomial time if the ratio between parameters $eta$ and $chi$ is within a narrow critical interval, and where a small change in this ratio can increase the runtime exponentially. Furthermore, it is shown that the appropriate parameter choice depends on the characteristics of the fitness function. Hence there does in general not exists a problem-independent optimal balance between mutation rate and selection pressure. In addition to original results on EAs, this paper also introduces for the first time new techniques, i.e., branching processes, to the analysis of non-elitist population-based EAs.
    BibTeX:
    @techreport{Lehre09,
      author = {Per Kristian Lehre and Xin Yao},
      title = {On the Impact of Mutation-Selection Balance on the Runtime of Evolutionary Algorithms},
      year = {2009},
      number = {CSR-09-07},
      url = {ftp://ftp.cs.bham.ac.uk/pub/tech-reports/2009/CSR-09-07.pdf}
    }
    
    Lehre, Per Kristian and Yao, Xin Runtime Analysis of Search Heuristics on Software Engineering Problems 2009 Frontiers of Computer Science in China
    Vol. 3(1), pp. 64-72 
    article DOI  
    Abstract: Many software engineering tasks can potentially be automated using search heuristics. However, much work is needed in designing and evaluating search heuristics before this approach can be routinely applied to a software engineering problem. Experimental methodology should be complemented with theoretical analysis to achieve this goal. Recently, there have been significant theoretical advances in the runtime analysis of evolutionary algorithms (EAs) and other search heuristics in other problem domains. We suggest that these methods could be transferred and adapted to gain insight into the behaviour of search heuristics on software engineering problems while automating software engineering.
    BibTeX:
    @article{LehreY09,
      author = {Per Kristian Lehre and Xin Yao},
      title = {Runtime Analysis of Search Heuristics on Software Engineering Problems},
      journal = {Frontiers of Computer Science in China},
      year = {2009},
      volume = {3},
      number = {1},
      pages = {64-72},
      doi = {http://dx.doi.org/10.1007/s11704-009-0006-6}
    }
    
    Lehre, Per Kristian and Yao, Xin On The Impact of the Mutation-Selection Balance on the Runtime of Evolutionary Algorithms 2009 Proceedings of the 10th ACM SIGEVO Workshop on Foundations of Genetic Algorithms, pp. 47-58  inproceedings DOI  
    Abstract: The interplay between the mutation operator and the selection mechanism plays a fundamental role in the behaviour of evolutionary algorithms. However, this interplay is still not completely understood. This paper presents a rigorous runtime analysis of a non-elitistic population based evolutionary algorithm that uses the linear ranking selection mechanism. The analysis focuses on how the balance between parameter ¦Ç controlling the selection pressure in linear ranking selection, and parameter ¦Ö controlling the bit-wise mutation rate impacts the expected runtime.

    The results point out situations where a correct balance between selection pressure and mutation rate is essential for finding the optimal solution in polynomial time. In particular, it is shown that there exist fitness functions which under a certain assumption can be solved in polynomial time if the ratio between parameters ¦Ç and ¦Ö is appropriately tuned to the problem instance class, but where a small change in this ratio can increase the runtime exponentially. Furthermore, it is shown that the appropriate parameter choice depends on the characteristics of the fitness function. Hence there does in general not exists a problem-independent optimal balance between mutation rate and selection pressure.

    The results are obtained using new techniques based on branching processes.

    BibTeX:
    @inproceedings{LehreY09b,
      author = {Per Kristian Lehre and Xin Yao},
      title = {On The Impact of the Mutation-Selection Balance on the Runtime of Evolutionary Algorithms},
      booktitle = {Proceedings of the 10th ACM SIGEVO Workshop on Foundations of Genetic Algorithms},
      publisher = {ACM},
      year = {2009},
      pages = {47-58},
      doi = {http://dx.doi.org/10.1145/1527125.1527133}
    }
    
    Lehre, Per Kristian and Yao, Xin Crossover Can be Constructive when Computing Unique Input Output Sequences 2008
    Vol. 5361Proceedings of the 7th International Conference on Simulated Evolution and Learning (SEAL '08), pp. 595-604 
    inproceedings DOI  
    Abstract: Unique input output (UIO) sequences have important applications in conformance testing of finite state machines (FSMs). Previous experimental and theoretical research has shown that evolutionary algorithms (EAs) can compute UIOs efficiently on many FSM instance classes, but fail on others. However, it has been unclear how and to what degree EA parameter settings influence the runtime on the UIO problem. This paper investigates the choice of acceptance criterion in the (1+1) EA and the use of crossover in the (¦Ì+1) Steady State Genetic Algorithm. It is rigorously proved that changing these parameters can reduce the runtime from exponential to polynomial for some instance classes.
    BibTeX:
    @inproceedings{LehreY08,
      author = {Per Kristian Lehre and Xin Yao},
      title = {Crossover Can be Constructive when Computing Unique Input Output Sequences},
      booktitle = {Proceedings of the 7th International Conference on Simulated Evolution and Learning (SEAL '08)},
      publisher = {Springer},
      year = {2008},
      volume = {5361},
      pages = {595-604},
      doi = {http://dx.doi.org/10.1007/978-3-540-89694-4_60}
    }
    
    Lehre, Per Kristian and Yao, Xin Crossover Can be Constructive when Computing Unique Input Output Sequences 2008 (CSR-08-08)  techreport URL 
    Abstract: Unique input output (UIO) sequences have important applications in conformance testing of finite state machines (FSMs). Previous experimental and theoretical research has shown that evolutionary algorithms (EAs) can compute UIOs efficiently on many FSM instance classes, but fail on others. However, it has been unclear how and to what degree EA parameter settings influence the runtime on the UIO problem. This paper investigates the choice of acceptance criterion in the (1+1) EA and the use of crossover in the (¦Ì+1) Steady State Genetic Algorithm. It is rigorously proved that changing these parameters can reduce the runtime from exponential to polynomial for some instance classes.
    BibTeX:
    @techreport{LehreY08b,
      author = {Per Kristian Lehre and Xin Yao},
      title = {Crossover Can be Constructive when Computing Unique Input Output Sequences},
      year = {2008},
      number = {CSR-08-08},
      url = {ftp://ftp.cs.bham.ac.uk/pub/tech-reports/2008/CSR-08-08.ps.gz}
    }
    
    Lehre, Per Kristian and Yao, Xin Runtime Analysis of (1+1) EA on Computing Unique Input Output Sequences 2007 Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC '07), pp. 1882-1889  inproceedings DOI  
    Abstract: Computing unique input output (UIO) sequences is a fundamental and hard problem in conformance testing of finite state machines (FSM). Previous experimental research has shown that evolutionary algorithms (EAs) can be applied successfully to find UIOs on some instances. However, before EAs can be recommended as a practical technique for computing UIOs, it is necessary to better understand the potential and limitations of these algorithms on this problem. In particular, more research is needed in determining for what instances of the problem EAs are feasible. This paper presents a rigorous runtime analysis of the (1+1) EA on three classes of instances of this problem. First, it is shown that there are instances where the EA is efficient, while random testing fails completely. Secondly, an instance class that is difficult for both random testing and the EA is presented. Finally, a parametrised instance class with tunable difficulty is presented. Together, these results provide a first theoretical characterisation of the potential and limitations of the (1+1) EA on the problem of computing UIOs.
    BibTeX:
    @inproceedings{LehreY07,
      author = {Per Kristian Lehre and Xin Yao},
      title = {Runtime Analysis of (1+1) EA on Computing Unique Input Output Sequences},
      booktitle = {Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC '07)},
      publisher = {IEEE},
      year = {2007},
      pages = {1882-1889},
      doi = {http://dx.doi.org/10.1109/CEC.2007.4424703}
    }
    
    Li, Zheng and Harman, Mark and Hierons, Robert M. Search Algorithms for Regression Test Case Prioritization 2007 IEEE Transactions on Software Engineering
    Vol. 33(4), pp. 225-237 
    article DOI  
    Abstract: Regression testing is an expensive, but important, process. Unfortunately, there may be insufficient resources to allow for the reexecution of all test cases during regression testing. In this situation, test case prioritization techniques aim to improve the effectiveness of regression testing by ordering the test cases so that the most beneficial are executed first. Previous work on regression test case prioritization has focused on Greedy Algorithms. However, it is known that these algorithms may produce suboptimal results because they may construct results that denote only local minima within the search space. By contrast, metaheuristic and evolutionary search algorithms aim to avoid such problems. This paper presents results from an empirical study of the application of several greedy, metaheuristic, and evolutionary search algorithms to six programs, ranging from 374 to 11,148 lines of code for three choices of fitness metric. The paper addresses the problems of choice of fitness metric, characterization of landscape modality, and determination of the most suitable search technique to apply. The empirical results replicate previous results concerning Greedy Algorithms. They shed light on the nature of the regression testing search space, indicating that it is multimodal. The results also show that Genetic Algorithms perform well, although Greedy approaches are surprisingly effective, given the multimodal nature of the landscape.
    BibTeX:
    @article{LiHH07,
      author = {Zheng Li and Mark Harman and Robert M. Hierons},
      title = {Search Algorithms for Regression Test Case Prioritization},
      journal = {IEEE Transactions on Software Engineering},
      year = {2007},
      volume = {33},
      number = {4},
      pages = {225-237},
      doi = {http://dx.doi.org/10.1109/TSE.2007.38}
    }
    
    McMinn, Phil and Binkley, David and Harman, Mark Empirical Evaluation of a Nesting Testability Transformation for Evolutionary Testing 2008 ACM Transactions on Software Engineering Methodology
    Vol. 18(3) 
    article DOI  
    Abstract: Evolutionary testing is an approach to automating test data generation that uses an evolutionary algorithm to search a test object's input domain for test data. Nested predicates can cause problems for evolutionary testing, because information needed for guiding the search only becomes available as each nested conditional is satisfied. This means that the search process can overfit to early information, making it harder, and sometimes near impossible, to satisfy constraints that only become apparent later in the search. The article presents a testability transformation that allows the evaluation of all nested conditionals at once. Two empirical studies are presented. The first study shows that the form of nesting handled is prevalent in practice. The second study shows how the approach improves evolutionary test data generation.
    BibTeX:
    @article{McMinnBH08,
      author = {Phil McMinn and David Binkley and Mark Harman},
      title = {Empirical Evaluation of a Nesting Testability Transformation for Evolutionary Testing},
      journal = {ACM Transactions on Software Engineering Methodology},
      year = {2008},
      volume = {18},
      number = {3},
      doi = {http://dx.doi.org/10.1145/1525880.1525884}
    }
    
    McMinn, Phil and Harman, Mark and Binkley, David and Tonella, Paolo The Species per Path Approach to Search-based Test Data Generation 2006 Proceedings of the 2006 International Symposium on Software Testing and Analysis (ISSTA '06), pp. 13-24  inproceedings DOI  
    Abstract: This paper introduces the Species per Path approach to search-based software test data generation. The approach transforms the program under test into a version in which multiple paths to the search target are factored out. Test data are then sought for each individual path by dedicated 'species' operating in parallel. The factoring out of paths results in several individual search landscapes, with feasible paths giving rise to landscapes that are potentially more conducive to test data discovery than the original overall landscape.The paper presents the results of two empirical studies that validate and verify the approach. The validation study supports the claim that the approach is widely applicable and practical. The verification study shows that it is possible to generate test data for targets with the approach that are troublesome for the standard evolutionary method.
    BibTeX:
    @inproceedings{McMinnHBT06,
      author = {Phil McMinn and Mark Harman and David Binkley and Paolo Tonella},
      title = {The Species per Path Approach to Search-based Test Data Generation},
      booktitle = {Proceedings of the 2006 International Symposium on Software Testing and Analysis (ISSTA '06)},
      publisher = {ACM},
      year = {2006},
      pages = {13-24},
      doi = {http://dx.doi.org/10.1145/1146238.1146241}
    }
    
    Nallur, Vivek and Bahsoon, Rami and Yao, Xin Self-Optimizing Architecture for Ensuring Quality Attributes in the Cloud 2009 Proceedings of the 7th Working IEEE/IFIP Conference on Software Architecture (WICSA '09)  inproceedings URL 
    Abstract: We describe various challenges in ensuring Quality Attributes (QA) of applications hosted in the cloud and hence the perceived quality of service of the cloud as a whole.

    We advocate a self-management/optimisation architecturedriven approach to ensure that Quality Attributes are met. We discuss the limitations of current approaches to selfmanaging architecture. We propose a novel approach, which exploits the El Farol problem as a modelling mechanism for QAs in architectures of applications in the cloud. The approach uses Service Level Agreements (SLA) and Utility Theory to direct the self-optimization. We conclude by looking at directions for further work.

    BibTeX:
    @inproceedings{NallurBY09,
      author = {Vivek Nallur and Rami Bahsoon and Xin Yao},
      title = {Self-Optimizing Architecture for Ensuring Quality Attributes in the Cloud},
      booktitle = {Proceedings of the 7th Working IEEE/IFIP Conference on Software Architecture (WICSA '09)},
      year = {2009},
      note = {to appear},
      url = {http://www.cs.bham.ac.uk/~vxn851/research/documents/wicsa09.pdf}
    }
    
    Nguyen, Cu and Miles, Simon and Perini, Anna and Tonella, Paolo and Harman, Mark and Luck, Michael Evolutionary Testing of Autonomous Software Agents 2009 Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '09), pp. 521-528  inproceedings URL 
    Abstract: A system built in terms of autonomous agents may require even greater correctness assurance than one which is merely reacting to the immediate control of its users. Agents make substantial decisions for themselves, so thorough testing is an important consideration. However, autonomy also makes testing harder; by their nature, autonomous agents may react in different ways to the same inputs over time, because, for instance they have changeable goals and knowledge. For this reason, we argue that testing of autonomous agents requires a procedure that caters for a wide range of test case contexts, and that can search for the most demanding of these test cases, even when they are not apparent to the agents' developers. In this paper, we address this problem, introducing and evaluating an approach to testing autonomous agents that uses evolutionary optimization to generate demanding test cases.
    BibTeX:
    @inproceedings{NguyenMPTHL09,
      author = {Cu Nguyen and Simon Miles and Anna Perini and Paolo Tonella and Mark Harman and Michael Luck},
      title = {Evolutionary Testing of Autonomous Software Agents},
      booktitle = {Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '09)},
      publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
      year = {2009},
      pages = {521-528},
      url = {http://portal.acm.org/citation.cfm?id=1558013.1558085}
    }
    
    Oliveto, Pietro Simone and Lehre, Per Kristian and Neumann, Frank Theoretical Analysis of Rank-based Mutation - Combining Exploration and Exploitation 2009 Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09), pp. 1455-1462  inproceedings URL 
    Abstract: Parameter setting is an important issue in the design of evolutionary algorithms. Recently, experimental work has pointed out that it is often not useful to work with a fixed mutation rate. Therefore it was proposed that the population be ranked according to fitness and the mutation rate of an individual should depend on its rank. The claim is that this allows the algorithm to explore new regions in the search space as well as progress quickly towards optimal solutions. Complementing the experimental investigations, we examine the proposed approach by presenting rigorous theoretical analyses which point out the differences of rank-based mutation compared to a standard approach using a fixed mutation rate. To this end we theoretically explain the behaviour of rank-based mutation on various fitness landscapes proposed in the experimental work and present new significant classes of functions where the use of rank-based mutation may be both beneficial or detrimental compared to fixed mutation strategies.
    BibTeX:
    @inproceedings{OlivetoLN09,
      author = {Pietro Simone Oliveto and Per Kristian Lehre and Frank Neumann},
      title = {Theoretical Analysis of Rank-based Mutation - Combining Exploration and Exploitation},
      booktitle = {Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)},
      publisher = {IEEE},
      year = {2009},
      pages = {1455-1462},
      url = {http://www2.computer.org/portal/web/csdl/doi/10.1109/CEC.2009.4983114}
    }
    
    Poulding, Simon and Emberson, Paul and Bate, Iain and Clark, John An Efficient Experimental Methodology for Configuring Search-Based Design Algorithms 2007 Proceedings of the 10th IEEE High Assurance Systems Engineering Symposium (HASE '07), pp. 53-62  inproceedings DOI  
    Abstract: Many problems in high assurance systems design are only tractable using computationally expensive search algorithms. For these algorithms to be useful, designers must be provided with guidance as to how to configure the algorithms appropriately. This paper presents an experimental methodology for deriving such guidance that remains efficient when the algorithm requires substantial computing resources or takes a long time to find solutions. The methodology is shown to be effective on a highly-constrained task allocation algorithm that provides design solutions for high integrity systems. Using the methodology, an algorithm configuration is derived in a matter of days that significantly outperforms one resulting from months of "trial-and-error" optimisation.
    BibTeX:
    @inproceedings{PouldingEBC07,
      author = {Simon Poulding and Paul Emberson and Iain Bate and John Clark},
      title = {An Efficient Experimental Methodology for Configuring Search-Based Design Algorithms},
      booktitle = {Proceedings of the 10th IEEE High Assurance Systems Engineering Symposium (HASE '07)},
      publisher = {IEEE Computer Society},
      year = {2007},
      pages = {53-62},
      doi = {http://dx.doi.org/10.1109/HASE.2007.19}
    }
    
    Rhys, Sion Ll and Poulding, Simon and Clark, John A. Using Automated Search to Generate Test Data for Matlab 2009 Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09), pp. 1697-1704  inproceedings DOI  
    Abstract: The critical functionality of many software applications relies on code that performs mathematically complex computations. However, such code is often difficult to test owing to the compound datatypes used and complicated mathematical operations performed. This paper proposes the use of automated search as an efficient means of generating test data for this type of software. Taking Matlab as an example of widely-used mathematical software, a technical framework is described that extends previous work on search-based test data generation in order to handle matrix datatypes and associated relational operators. An empirical evaluation demonstrates the feasibility of this approach.
    BibTeX:
    @inproceedings{RhysPC09,
      author = {Sion Ll Rhys and Simon Poulding and John A. Clark},
      title = {Using Automated Search to Generate Test Data for Matlab},
      booktitle = {Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO '09)},
      publisher = {ACM},
      year = {2009},
      pages = {1697-1704},
      doi = {http://dx.doi.org/10.1145/1569901.1570128}
    }
    
    Rohlfshagen, Philipp and Lehre, Per Kristian and Yao, Xin Dynamic Evolutionary Optimisation: An Analysis of Frequency and Magnitude of Change 2009 Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, pp. 1713-1720 (Best Paper Award)  inproceedings DOI  
    Abstract: In this paper, we rigorously analyse how the magnitude and frequency of change may affect the performance of the algorithm (1+1) EAdyn on a set of artificially designed pseudo-Boolean functions, given a simple but well-defined dynamic framework. We demonstrate some counter-intuitive scenarios that allow us to gain a better understanding of how the dynamics of a function may affect the runtime of an algorithm. In particular, we present the function Magnitude, where the time it takes for the (1+1) EAdyn to relocate the global optimum is less than n2log n (i.e., efficient) with overwhelming probability if the magnitude of change is large. For small changes of magnitude, on the other hand, the expected time to relocate the global optimum is e¦¸(n) (i.e., highly inefficient). Similarly, the expected runtime of the (1+1) EAdyn on the function Balance is O(n2) (efficient) for a high frequencies of change and n¦¸(¡Ìn) (highly inefficient) for low frequencies of change. These results contribute towards a better understanding of dynamic optimisation problems in general and show how traditional analytical methods may be applied in the dynamic case.
    BibTeX:
    @inproceedings{RohlfshagenLY09,
      author = {Philipp Rohlfshagen and Per Kristian Lehre and Xin Yao},
      title = {Dynamic Evolutionary Optimisation: An Analysis of Frequency and Magnitude of Change},
      booktitle = {Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation},
      publisher = {ACM},
      year = {2009},
      pages = {1713-1720 (Best Paper Award)},
      note = {Best Paper Award},
      doi = {http://dx.doi.org/10.1145/1569901.1570131}
    }
    
    Sagarna, Ramón An Optimization Approach for Software Test Data Generation: Applications of Estimation of Distribution Algorithms and Scatter Search 2007 School: University of the Basque Country  phdthesis URL 
    Abstract: No abstract
    BibTeX:
    @phdthesis{Sagarna07,
      author = {Ramón Sagarna},
      title = {An Optimization Approach for Software Test Data Generation: Applications of Estimation of Distribution Algorithms and Scatter Search},
      school = {University of the Basque Country},
      year = {2007},
      url = {http://www.cercia.ac.uk/projects/research/SEBASE/pdfs/2007/rs-thesis.pdf}
    }
    
    Sagarna, Ramón and Arcuri, Andrea and Yao, Xin Estimation of Distribution Algorithms for Testing Object Oriented Software 2007 Proceedings of the IEEE Congress on Evolutionary Computation (CEC '07), pp. 438-444  inproceedings DOI  
    Abstract: One of the main tasks software testing involves is the generation of the test cases to be used during the test. Due to its expensive cost, the automation of this task has become one of the key issues in the area. While most of the work on test data generation has concentrated on procedural software, little attention has been paid to object oriented programs, even so they are a usual practice nowadays. We present an approach based on Estimation of Distribution Algorithms (EDAs) for dealing with the test data generation of a particular type of objects, that is, containers. This is the first time that an EDA has been applied to testing object oriented software. In addition to automated test data generation, the EDA approach also offers the potential of modelling the fitness landscape defined by the testing problem and thus could provide some insight into the problem. Firstly, we show results from empirical evaluations and comment on some appealing properties of EDAs in this context. Next, a framework is discussed in order to deal with the generation of efficient tests for the container classes. Preliminary results are provided as well.
    BibTeX:
    @inproceedings{SagarnaAY07,
      author = {Ramón Sagarna and Andrea Arcuri and Xin Yao},
      title = {Estimation of Distribution Algorithms for Testing Object Oriented Software},
      booktitle = {Proceedings of the IEEE Congress on Evolutionary Computation (CEC '07)},
      publisher = {IEEE},
      year = {2007},
      pages = {438-444},
      doi = {http://dx.doi.org/10.1109/CEC.2007.4424504}
    }
    
    Sagarna, Ramón and Lozano, José A. Dynamic Search Space Transformations for Software Test Data Generation 2008 Computational Intelligence
    Vol. 24(1), pp. 23-61 
    article DOI  
    Abstract: Among the tasks in software testing, test data generation is particularly difficult and costly. In recent years, several approaches that use metaheuristic search techniques to automatically obtain the test inputs have been proposed. Although work in this field is very active, little attention has been paid to the selection of an appropriate search space. The present work describes an alternative to this issue. More precisely, two approaches which employ an Estimation of Distribution Algorithm as the metaheuristic technique are explained. In both cases, different regions are considered in the search for the test inputs. Moreover, to depart from a region near to the one containing the optimum, the definition of the initial search space incorporates static information extracted from the source code of the software under test. If this information is not enough to complete the definition, then a grid search method is used. According to the results of the experiments conducted, it is concluded that this is a promising option that can be used to enhance the test data generation process.
    BibTeX:
    @article{SagarnaL08,
      author = {Ramón Sagarna and José A. Lozano},
      title = {Dynamic Search Space Transformations for Software Test Data Generation},
      journal = {Computational Intelligence},
      year = {2008},
      volume = {24},
      number = {1},
      pages = {23-61},
      doi = {http://dx.doi.org/10.1111/j.1467-8640.2007.00321.x}
    }
    
    Sagarna, Ramón and Lozano, José A. Software Metrics Mining to Predict the Performance of Estimation of Distribution Algorithms in Test Data Generation 2007
    Vol. 102/2008Knowledge-Driven Computing. Knowledge Engineering and Intelligent Computations, pp. 235-254 
    incollection DOI  
    Abstract: Test data generation for a software system is a difficult and costly task. Thus, given a program, it would be highly desirable to choose the most adequate approach. A first step towards this is the prediction of the performance of a test data generator. We conduct a preliminary study on the suitability of software metrics for this problem in the context of a generator based on Estimation of Distribution Algorithms (EDAs). EDAs are a family of evolutionary algorithms that have been previously applied to the test data generation problem in softtware testing with promising results. More precisely, we analyze the adequacy of Data Mining techniques for predicting whether the EDAs based generator is able to fulfill branch coverage or not. Results offer interesting conclusions on the predictive capability of software metrics and show Data Mining as powerful field to be further investigated in this area.
    BibTeX:
    @incollection{SagarnaL07,
      author = {Ramón Sagarna and José A. Lozano},
      title = {Software Metrics Mining to Predict the Performance of Estimation of Distribution Algorithms in Test Data Generation},
      booktitle = {Knowledge-Driven Computing. Knowledge Engineering and Intelligent Computations},
      publisher = {Springer-Verlag},
      year = {2007},
      volume = {102/2008},
      pages = {235-254},
      doi = {http://dx.doi.org/10.1007/978-3-540-77475-4_15}
    }
    
    Sagarna, Ramón and Yao, Xin Handling Constraints for Search Based Software Test Data Generation 2008 Proceedings of 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008, pp. 232-240  inproceedings DOI  
    Abstract: A major issue in software testing is the automatic generation of the inputs to be applied to the programme under test. To solve this problem, a number of approaches based on search methods have been developed in the last few years, offering promising results for adequacy criteria like, for instance, branch coverage. We devise branch coverage as the satisfaction of a number of constraints. This allows to formulate the test data generation as a constrained optimisation problem or as a constraint satisfaction problem. Then, we can see that many of the generators so far have followed the same particular approach. Furthermore, this constraint-handling point of view overcomes this limitation and opens the door to new designs and search strategies that, to the best of our knowledge, have not been considered yet. As a case study, we develop test data generators employing different penalty objective functions or multiobjective optimisation. The results of the conducted preliminary experiments suggest these generators can improve the performance of classical approaches.
    BibTeX:
    @inproceedings{SagarnaY08,
      author = {Ramón Sagarna and Xin Yao},
      title = {Handling Constraints for Search Based Software Test Data Generation},
      booktitle = {Proceedings of 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008},
      publisher = {IEEE Computer Society},
      year = {2008},
      pages = {232-240},
      doi = {http://dx.doi.org/10.1109/ICSTW.2008.19}
    }
    
    Tate, Jonathan and Woolford-Lim, Benjamin and Bate, Iain and Yao, Xin Comparing Design of Experiments and Evolutionary Approaches to Multi-Objective Optimisation of Sensornet Protocols 2009 Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09), pp. 1137-1144  inproceedings URL 
    Abstract: The lifespan, and hence utility, of sensornets is limited by the energy resources of individual motes. Network designers seek to maximise energy efficiency while maintaining an acceptable network Quality of Service. However, the interactions between multiple tunable protocol parameters and multiple sensornet performance metrics are generally complex and unknown. In this paper we address this multi-dimensional optimisation problem by two distinct approaches. Firstly, we apply a Design Of Experiments approach to obtain a generalised linear interaction model, and from this derive an estimated near-optimal solution. Secondly, we apply the Two-Archive evolutionary algorithm to improve solution quality for a specific problem instance. We demonstrate that, whereas the first approach yields a more generally applicable solution, the second approach yields a broader range of viable solutions at potentially lower experimental cost.
    BibTeX:
    @inproceedings{TateWBY09,
      author = {Jonathan Tate and Benjamin Woolford-Lim and Iain Bate and Xin Yao},
      title = {Comparing Design of Experiments and Evolutionary Approaches to Multi-Objective Optimisation of Sensornet Protocols},
      booktitle = {Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)},
      publisher = {IEEE},
      year = {2009},
      pages = {1137-1144},
      url = {http://www.cs.york.ac.uk/rts/papers_db_keyed.php?key=R:Tate:2009a}
    }
    
    Wang, Zai and Chen, Tianshi and Tang, Ke and Yao, Xin A Multi-objective Approach to Redundancy Allocation Problem in Parallel-series Systems 2009 Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09), pp. 582-589  inproceedings URL 
    Abstract: The Redundancy Allocation Problem (RAP) is a kind of reliability optimization problems. It involves the selection of components with appropriate levels of redundancy or reliability to maximize the system reliability under some predefined constraints. We can formulate the RAP as a combinatorial problem when just considering the redundancy level, while as a continuous problem when considering the reliability level. The RAP employed in this paper is that kind of combinatorial optimization problems. During the past thirty years, there have already been a number of investigations on RAP. However, these investigations often treat RAP as a single objective problem with the only goal to maximize the system reliability (or minimize the designing cost). In this paper, we regard RAP as a multi-objective optimization problem: the reliability of the system and the corresponding designing cost are considered as two different objectives. Consequently, we can utilize a classical Multi-objective Evolutionary Algorithm (MOEA), named Non-dominated Sorting Genetic Algorithm II (NSGA-II), to cope with this multi-objective redundancy allocation problem (MORAP) under a number of constraints. The experimental results demonstrate that the multi-objective evolutionary approach can provide more promising solutions in comparison with two widely used single-objective approaches on two parallel-series systems which are frequently studied in the field of reliability optimization.
    BibTeX:
    @inproceedings{WangCTY09,
      author = {Zai Wang and Tianshi Chen and Ke Tang and Xin Yao},
      title = {A Multi-objective Approach to Redundancy Allocation Problem in Parallel-series Systems},
      booktitle = {Proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09)},
      publisher = {IEEE},
      year = {2009},
      pages = {582-589},
      url = {http://nical.ustc.edu.cn/uploads/cec2009/P252.pdf}
    }
    
    Wang, Zai and Tang, Ke and Yao, Xin A Multi-Objective Approach to Testing Resource Allocation in Modular Software Systems 2008 Proceedings of the 2008 IEEE Congress on Evolutionary Computation (CEC '08), pp. 1148-1153  inproceedings DOI  
    Abstract: Nowadays, as the software systems become increasingly large and complex, the problem of allocating the limited testing-resource during the testing phase has become more and more difficult. In this paper, we propose to solve the testing-resource allocation problem (TRAP) using multi-objective evolutionary algorithms. Specifically, we formulate TRAP as two multi-objective problems. First, we consider the reliability of the system and the testing cost as two objectives. In the second formulation, the total testing-resource consumed is also taken into account as the third goal. Two multi-objective evolutionary algorithms, Non-dominated Sorting Genetic Algorithm II (NSGA2) and Multi-Objective Differential Evolution Algorithms (MODE), are applied to solve the TRAP in the two scenarios. This is the first time that the TRAP is explicitly formulated and solved by multi-objective evolutionary approaches. Advantages of our approaches over the state-of-the-art single-objective approaches are demonstrated on two parallel-series modular software models.
    BibTeX:
    @inproceedings{WangTY08,
      author = {Zai Wang and Ke Tang and Xin Yao},
      title = {A Multi-Objective Approach to Testing Resource Allocation in Modular Software Systems},
      booktitle = {Proceedings of the 2008 IEEE Congress on Evolutionary Computation (CEC '08)},
      publisher = {IEEE},
      year = {2008},
      pages = {1148-1153},
      doi = {http://dx.doi.org/10.1109/CEC.2008.4630941}
    }
    
    White, David R. and Clark, John and Jacob, Jeremy and Poulding, Simon M. Searching for Resource-Efficient Programs: Low-Power Pseudorandom Number Generators 2008 Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08), pp. 1775-1782  inproceedings DOI URL 
    Abstract: Non-functional properties of software, such as power consumption and memory usage, are important factors in designing software for resource-constrained platforms. This is an area where Search-Based Software Engineering has yet to be applied, and this paper investigates the potential of using Genetic Programming and Multi-Objective Optimisation as key tools in satisfying non-functional requirements. We outline the benefits of such an approach and give an example application of evolving pseudorandom number generators and performing power-functionality trade-offs.
    BibTeX:
    @inproceedings{WhiteCJP08,
      author = {David R. White and John Clark and Jeremy Jacob and Simon M. Poulding},
      title = {Searching for Resource-Efficient Programs: Low-Power Pseudorandom Number Generators},
      booktitle = {Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08)},
      publisher = {ACM},
      year = {2008},
      pages = {1775-1782},
      url = {http://www.cs.bham.ac.uk/~wbl/biblio/gecco2008/docs/p1775.pdf},
      doi = {http://dx.doi.org/10.1145/1389095.1389437}
    }
    
    White, David R. and Poulding, Simon A Rigorous Evaluation of Crossover and Mutation in Genetic Programming 2009
    Vol. 5481Proceedings of the 12th European Conference on Genetic Programming (EuroGP '09), pp. 220-231 
    inproceedings DOI  
    Abstract: The role of crossover and mutation in Genetic Programming (GP) has been the subject of much debate since the emergence of the field. In this paper, we contribute new empirical evidence to this argument using a rigorous and principled experimental method applied to six problems common in the GP literature. The approach tunes the algorithm parameters to enable a fair and objective comparison of two different GP algorithms, the first using a combination of crossover and reproduction, and secondly using a combination of mutation and reproduction. We find that crossover does not significantly outperform mutation on most of the problems examined. In addition, we demonstrate that the use of a straightforward Design of Experiments methodology is effective at tuning GP algorithm parameters.
    BibTeX:
    @inproceedings{WhiteP09,
      author = {David R. White and Simon Poulding},
      title = {A Rigorous Evaluation of Crossover and Mutation in Genetic Programming},
      booktitle = {Proceedings of the 12th European Conference on Genetic Programming (EuroGP '09)},
      publisher = {Springer},
      year = {2009},
      volume = {5481},
      pages = {220-231},
      doi = {http://dx.doi.org/10.1007/978-3-642-01181-8_19}
    }
    
    Yoo, Shin and Harman, Mark Pareto Efficient Multi-Objective Test Case Selection 2007 Proceedings of the 2007 International Symposium on Software Testing and Analysis (ISSTA '07), pp. 140-150  inproceedings DOI  
    Abstract: Previous work has treated test case selection as a single objective optimisation problem. This paper introduces the concept of Pareto efficiency to test case selection. The Pareto efficient approach takes multiple objectives such as code coverage, past fault-detection history and execution cost, and constructs a group of non-dominating, equivalently optimal test case subsets. The paper describes the potential benefits of Pareto efficient multi-objective test case selection, illustrating with empirical studies of two and three objective formulations.
    BibTeX:
    @inproceedings{YooH07,
      author = {Shin Yoo and Mark Harman},
      title = {Pareto Efficient Multi-Objective Test Case Selection},
      booktitle = {Proceedings of the 2007 International Symposium on Software Testing and Analysis (ISSTA '07)},
      publisher = {ACM},
      year = {2007},
      pages = {140-150},
      doi = {http://dx.doi.org/10.1145/1273463.1273483}
    }
    
    Yoo, Shin and Harman, Mark and Tonella, Paolo and Susi, Angelo Clustering Test Cases to Achieve Effective and Scalable Prioritisation Incorporating Expert Knowledge 2009 Proceedings of The 18th International Symposium On Software Testing and Analysis (ISSTA '09), pp. 201-212  inproceedings DOI  
    Abstract: Pair-wise comparison has been successfully utilised in order to prioritise test cases by exploiting the rich, valuable and unique knowledge of the tester. However, the prohibitively large cost of the pair-wise comparison method prevents it from being applied to large test suites. In this paper, we introduce a cluster-based test case prioritisation technique. By clustering test cases, based on their dynamic runtime behaviour, we can reduce the required number of pair-wise comparisons significantly. The approach is evaluated on seven test suites ranging in size from 154 to 1,061 test cases. We present an empirical study that shows that the resulting prioritisation is more effective than existing coverage-based prioritisation techniques in terms of rate of fault detection. Perhaps surprisingly, the paper also demonstrates that clustering (even without human input) can outperform unclustered coverage-based technologies, and discusses an automated process that can be used to determine whether the application of the proposed approach would yield improvement.
    BibTeX:
    @inproceedings{YooHTS09,
      author = {Shin Yoo and Mark Harman and Paolo Tonella and Angelo Susi},
      title = {Clustering Test Cases to Achieve Effective and Scalable Prioritisation Incorporating Expert Knowledge},
      booktitle = {Proceedings of The 18th International Symposium On Software Testing and Analysis (ISSTA '09)},
      publisher = {ACM},
      year = {2009},
      pages = {201-212},
      doi = {http://dx.doi.org/10.1145/1572272.1572296}
    }
    
    Yoo, Shin and Harman, Mark and Ur, Shmuel Measuring and Improving Latency to Avoid Test Suite Wear Out 2009 Proceedings of the IEEE International Conference on Software Testing, Verification, and Validation Workshops (ICSTW '09), pp. 101-110 (Best Paper Award)  inproceedings DOI  
    Abstract: This paper introduces the concept of test suite latency. The more latent a test suite, the more it is possible to repeatedly select subsets that achieve a test goal (such as coverage) without re-applying test cases. Where a test case is re-applied it cannot reveal new information. The more a test suite is forced to re-apply already applied test cases in order to achieve the test goal, the more it has become `worn out'. Test suite latency is the flipside of wear out; the more latent a test suite, the less prone it is to wear out. The paper introduces a theory of test suite latency. It presents results from the empirical study of latency, highlighting the need for latency enhancement. The paper also introduces a strategy and algorithms for improving latency and an empirical study of their effectiveness. The results show that local search is effective at improving the latency of a test suite.
    BibTeX:
    @inproceedings{YooHU09,
      author = {Shin Yoo and Mark Harman and Shmuel Ur},
      title = {Measuring and Improving Latency to Avoid Test Suite Wear Out},
      booktitle = {Proceedings of the IEEE International Conference on Software Testing, Verification, and Validation Workshops (ICSTW '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      pages = {101-110 (Best Paper Award)},
      doi = {http://dx.doi.org/10.1109/ICSTW.2009.10}
    }
    
    Zhan, Yuan and Clark, John A. A Search-based Framework for Automatic Testing of MATLAB/Simulink Models 2008 Journal of Systems and Software
    Vol. 81(2), pp. 262-285 
    article DOI  
    Abstract: Search-based test-data generation has proved successful for code-level testing but almost no search-based work has been carried out at higher levels of abstraction. In this paper the application of such approaches at the higher levels of abstraction offered by MATLAB/Simulink models is investigated and a wide-ranging framework for test-data generation and management is presented. Model-level analogues of code-level structural coverage criteria are presented and search-based approaches to achieving them are described. The paper also describes the first search-based approach to the generation of mutant-killing test data, addressing a fundamental limitation of mutation testing. Some problems remain whatever the level of abstraction considered. In particular, complexity introduced by the presence of persistent state when generating test sequences is as much a challenge at the Simulink model level as it has been found to be at the code level. The framework addresses this problem. Finally, a flexible approach to test sub-set extraction is presented, allowing testing resources to be deployed effectively and efficiently.
    BibTeX:
    @article{ZhanC08,
      author = {Yuan Zhan and John A. Clark},
      title = {A Search-based Framework for Automatic Testing of MATLAB/Simulink Models},
      journal = {Journal of Systems and Software},
      year = {2008},
      volume = {81},
      number = {2},
      pages = {262-285},
      doi = {http://dx.doi.org/10.1016/j.jss.2007.05.039}
    }
    
    Zhan, Yuan and Clark, John A. The State Problem for Test Generation in Simulink 2006 Proceedings of the 8th annual Conference on Genetic and Evolutionary Computation (GECCO '06), pp. 1941-1948  inproceedings DOI  
    Abstract: Search based test-data generation has proved successful for code-level testing. In this paper we investigate the application of such approaches at the higher levels of abstraction offered by Matlab-Simulink models. The presence of persistent state has been shown to be problematic at the code level and such difficulties remain when Matlab-Simulink models are to be tested. In such cases, sequences of inputs that can put the model under test into particular states are needed to enable the underlying test goals to be achieved. Simple search guidance appears to be insufficient and results in a 'flat' cost function landscape. To address this problem, we introduce a technique called tracing and deducing, which helps provide better guidance to the search, allowing our developed tools to home in on the targeted test-data.
    BibTeX:
    @inproceedings{ZhanC06,
      author = {Yuan Zhan and John A. Clark},
      title = {The State Problem for Test Generation in Simulink},
      booktitle = {Proceedings of the 8th annual Conference on Genetic and Evolutionary Computation (GECCO '06)},
      publisher = {ACM},
      year = {2006},
      pages = {1941-1948},
      doi = {http://dx.doi.org/10.1145/1143997.1144319}
    }
    
    Zhang, Yuanyuan and Finkelstein, Anthony and Harman, Mark Search Based Requirements Optimisation: Existing Work & Challenges 2008
    Vol. 5025Proceedings of the 14th International Working Conference, Requirements Engineering: Foundation for Software Quality (RefsQ '08), pp. 88-94 
    inproceedings DOI  
    Abstract: In this position paper, we argue that search based software engineering techniques can be applied to the optimisation problem during the requirements analysis phase. Search based techniques offer significant advantages; they can be used to seek robust, scalable solutions, to perform sensitivity analysis, to yield insight and provide feedback explaining choices to the decision maker. This position paper overviews existing achievements and sets out future challenges.
    BibTeX:
    @inproceedings{ZhangFH08,
      author = {Yuanyuan Zhang and Anthony Finkelstein and Mark Harman},
      title = {Search Based Requirements Optimisation: Existing Work & Challenges},
      booktitle = {Proceedings of the 14th International Working Conference, Requirements Engineering: Foundation for Software Quality (RefsQ '08)},
      publisher = {Springer},
      year = {2008},
      volume = {5025},
      pages = {88-94},
      doi = {http://dx.doi.org/10.1007/978-3-540-69062-7_8}
    }
    
    Zhang, Yuanyuan and Harman, Mark and Mansouri, S. Afshin The Multi-Objective Next Release Problem 2007 Proceedings of the 9th annual Conference on Genetic and Evolutionary Computation (GECCO '07), pp. 1129-1137 (Best Paper Award)  inproceedings DOI  
    Abstract: This paper is concerned with the Multi-Objective Next Release Problem (MONRP), a problem in search-based requirements engineering. Previous work has considered only single objective formulations. In the multi-objective formulation, there are at least two (possibly conflicting) objectives that the software engineer wishes to optimize. It is argued that the multi-objective formulation is more realistic, since requirements engineering is characterised by the presence of many complex and conflicting demands, for which the software engineer must find a suitable balance. The paper presents the results of an empirical study into the suitability of weighted and Pareto optimal genetic algorithms, together with the NSGA-II algorithm, presenting evidence to support the claim that NSGA-II is well suited to the MONRP. The paper also provides benchmark data to indicate the size above which the MONRP becomes non-trivial.
    BibTeX:
    @inproceedings{ZhangHM07,
      author = {Yuanyuan Zhang and Mark Harman and S. Afshin Mansouri},
      title = {The Multi-Objective Next Release Problem},
      booktitle = {Proceedings of the 9th annual Conference on Genetic and Evolutionary Computation (GECCO '07)},
      publisher = {ACM},
      year = {2007},
      pages = {1129-1137 (Best Paper Award)},
      doi = {http://dx.doi.org/10.1145/1276958.1277179}
    }
    

    Created by JabRef on 02/11/2009.