Abstract
Regular expressions (regexes) describe sets of strings using a pattern-based syntax. Writing a regex that
exactly captures the desired set of strings is difficult, in part because a faulty regex is seldom syntactically incorrect, so
faults are rarely detected at parse time. We propose a fault-based approach for generating tests for regexes. We identify fault
classes representing possible mistakes a user can make when writing a regex, and we introduce the notion of *distinguishing
string*, i.e., a string that witnesses a fault. Then, we provide a tool, based on the automata representation of
regexes, for generating distinguishing strings that expose the faults introduced in mutated versions of a regex under test. The
basic generation process is improved by two techniques, namely *monitoring* and *collecting*. Experiments show that the approach
produces compact test suites with a guaranteed fault detection capability, unlike other test generation approaches.
This paper won the **best paper award**. The tool is available at http://cs.unibg.it/mutrex/
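To make the notion of a distinguishing string concrete, the following minimal sketch searches for a string that is accepted by exactly one of two regexes: an original and a mutated version. It is only an illustration under stated assumptions. The tool described above works on the automata representation of regexes, whereas this sketch swaps in a brute-force enumeration of short strings over a small alphabet. The function name `distinguishing_string`, the example regexes, and the mutation (replacing `+` with `*`) are hypothetical and not taken from the paper.

```python
import re
from itertools import product


def distinguishing_string(regex, mutant, alphabet, max_len=4):
    """Return a shortest string (up to max_len) accepted by exactly one
    of the two regexes, or None if no such string exists within the bound.

    Brute-force illustration only; not the automata-based method of the tool.
    """
    original_re = re.compile(regex)
    mutant_re = re.compile(mutant)
    # Enumerate candidate strings in order of increasing length.
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            accepted_by_original = original_re.fullmatch(candidate) is not None
            accepted_by_mutant = mutant_re.fullmatch(candidate) is not None
            if accepted_by_original != accepted_by_mutant:
                return candidate
    return None


# Hypothetical regex under test and a mutant in which '+' became '*'.
original = r"[a-z]+[0-9]"
mutant = r"[a-z]*[0-9]"

witness = distinguishing_string(original, mutant, alphabet="a1")
print(repr(witness))
```

In this example the search returns `"1"`, which the mutant accepts and the original rejects, so a test containing that string would expose the mutation. A result of `None` only means that no distinguishing string exists within the chosen alphabet and length bound; it does not prove that the two regexes are equivalent.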