[TYPES] Systematically testing type systems

Wed Jul 5 10:57:13 EDT 2006

This is not difficult to do in a logic programming language that supports
higher-order or nominal abstract syntax, such as lambdaProlog, Twelf, or
alphaProlog, using "generate-and-test"; the tricky part is getting the tests
right and setting up the search in such a way that you don't miss obvious
short counterexamples yet still terminate in a small amount of time (my
boredom threshold is five seconds).  Automated assertion checking has also
been investigated in a general (constraint) logic programming and in Haskell
(QuickCheck).  I'm not aware of any prior research on applying this idea
specifically to type systems.

Coincidentally, I wrote a short (and very very preliminary) paper on doing
this in alphaProlog for the upcoming Workshop on Mechanizing Metatheory;
this seems like as good a time as any to mention it.  It is not clear to me
how far an exhaustive search approach can be pushed, since the number of
well-typed programs of size n is still exponential in n; something like
QuickCheck's random checking, but guided by the type system and operational
semantics, might be a better way.  However, even doing an exhaustive search
for "small" counterexamples seems likely to be helpful, at least as a sanity
check for trivial bugs.

If you want to have a look at the paper draft, it's online at

http://homepages.inf.ed.ac.uk/jcheney/publications/wmm06-draft.pdf

Big caveat: alphaProlog is not that well supported right now and the code
for automatic checking is not in a public release yet.

Questions, comments, and fisticuffs welcome.

--James

On 7/4/06, Klaus Ostermann <ostermann at informatik.tu-darmstadt.de> wrote:
>
> [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list]
>
> Hi all,
>
> I hope this is not a stupid question, but I wonder whether it is feasible
> to
> test the soundness of a type checker by enumerating all programs in the
> language, reducing each program one step (assuming we have a small-step
> interpreter and a language where the full execution state is in the
> program
> itself) and checking whether subject reduction holds. If the enumeration
> of all
> programs would be smart enough (i.e., produce only syntactically correct
> programs; no two programs in the enumeration are alpha-equivalent) I could
> imagine that it would be feasible to test a type checker for all programs
> less
> than X bytes in size, where X is a reasonable number (e.g., 1KB) - at
> least for
> minimalistic languages as they are common in type research.
>
> Has this ever been tried, or is it obvious that it would not work in
> practice?
>
> Klaus Ostermann
>