Are Mutants a Valid Substitute for Real Faults in Software Testing?

René Just, Darioush Jalali, Laura Inozemtseva, Michael D. Ernst, Reid Holmes and Gordon Fraser

University of Washington Technical Report UW-CSE-14-02-02

This report is superseded by this paper.

Abstract

A good test suite is one that detects real faults. Because the set of faults in a program is unknowable, this definition is useful neither to practitioners who are creating test suites nor to researchers who are creating and evaluating tools that generate test suites. In place of real faults, testing research often uses mutants, which are artificial faults -- each one a simple syntactic variation -- that are systematically seeded throughout the program under test. Mutation testing is appealing because large numbers of mutants can be automatically generated and used as a proxy for real faults.
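
As a rough illustration of what "a simple syntactic variation" can look like, the sketch below seeds one mutant by hand and checks which of two tests detects it. The function `max_of`, the choice of mutation (replacing `>` with `<`), and both tests are hypothetical examples and are not taken from the report's subject programs or tooling.

```python
def max_of(a, b):
    """Original (assumed correct) function under test."""
    return a if a > b else b


def max_of_mutant(a, b):
    """Mutant: the relational operator '>' is replaced by '<'."""
    return a if a < b else b


def weak_test(f):
    # Equal inputs cannot distinguish the two versions.
    return f(2, 2) == 2


def strong_test(f):
    # Distinct inputs expose the swapped comparison.
    return f(3, 1) == 3


def detects_mutant(test, original, mutant):
    """A test detects a mutant if it passes on the original and fails on the mutant."""
    return test(original) and not test(mutant)


if __name__ == "__main__":
    for name, test in [("weak_test", weak_test), ("strong_test", strong_test)]:
        print(name, "detects mutant:", detects_mutant(test, max_of, max_of_mutant))
```

Here only `strong_test` detects the mutant, which is the sense in which mutant detection is used as a measure of how thorough a test suite is.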

Unfortunately, there is little experimental evidence to support the use of mutants as a proxy for real faults. This paper investigates whether mutants are indeed a valid substitute for real faults -- that is, whether a test suite's ability to detect mutants is correlated with its ability to detect real faults that developers have fixed.
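
The question can be made concrete with a small sketch: for each test suite, record the fraction of mutants it detects (often called its mutation score) and whether it detects a given real fault, then measure how strongly the two are associated. The code below uses a plain Pearson correlation purely for illustration; the report's actual statistical method is not stated on this page, and the four suites and their numbers are made-up toy values, not results from the study.

```python
from math import sqrt


def mutation_score(killed, total):
    """Fraction of seeded mutants that a test suite detects."""
    return killed / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: (mutants detected, mutants total, detects real fault? 1/0)
suites = [(12, 40, 0), (20, 40, 0), (28, 40, 1), (36, 40, 1)]

scores = [mutation_score(killed, total) for killed, total, _ in suites]
detected = [flag for _, _, flag in suites]
r = pearson(scores, detected)

if __name__ == "__main__":
    print(f"correlation between mutation score and real fault detection: {r:.2f}")
```

A value of `r` near 1 would mean that suites with higher mutation scores also tend to detect the real fault; a value near 0 would mean mutant detection says little about real fault detection.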

Our experiments used 357 real faults in 5 open-source applications that together comprise 321,000 lines of source code, and they used both developer-written and generated test suites. We found a statistically significant correlation between mutant detection and real fault detection, even when controlling for code coverage.

Supplementary Material

PDF of the report

BibTeX

@techreport{JJI+14tr,
    author={Just, Ren\'{e} and Jalali, Darioush and Inozemtseva, Laura and Ernst, Michael D. and Holmes, Reid and Fraser, Gordon},
    title={Are Mutants a Valid Substitute for Real Faults in Software Testing?},
    institution={University of Washington},
    number={UW-CSE-14-02-02},
    year={2014},
}