[TYPES] A show-stopping problem for modular type classes?

Tue Oct 6 11:21:09 EDT 2009

Hi Brian,

Basically you're observing that a qualified GADT package includes an 
instance declaration, so unpacking it amounts to a using declaration.  
Accordingly, the "Problem" clause of your code would need to be treated 
as a using declaration, and it would be rejected for introducing 
overlapping instances.

It strikes me that in GHC you might well have the same problem with the 
"Ok" constructor.  A qualified existential package also includes an 
instance declaration, and unpacking it too amounts to a using 
declaration.  That ought not be a problem because it's an instance for 
an abstract type, so it cannot overlap any existing instances.  But, my 
understanding is that internally GHC does not really support abstract 
types, so that might not help.

By the way, I think that this issue exposes a design flaw in GADTs.  
GADTs provide a convenient way to do dynamic type analysis by 
integrating type representations into the datatype mechanism.  But, I 
think it's a mistake for GADTs to use actual types for those type 
representations.  Conceptually, the indices for a GADT are data, not 
sets of terms.  A better way to do GADTs would be to introduce a 
"datakind" mechanism, and use algebraic data kinds as the indices for a 
GADT.  You can get a picture of what this would look like from my 1999 
paper "Flexible Type Analysis" with Stephanie Weirich.  (That paper 
didn't have the tie-in with datatypes, which was invented later.)

The relevance to modular type classes is this:  If the index for a GADT 
were an algebraic data kind, rather than an actual type, there would be 
no temptation to try to attach an instance to that index.  The "Problem" 
constructor would be ill-kinded.  Obviously, introducing a datakind 
mechanism and supporting machinery would be a much more substantial fix 
than simply checking for overlapping instances when unpacking a GADT, 
and it also might break existing code.  But I do think it would be a 
better design looking forward.

    -- Karl

Brian Hulley wrote:
> [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]
>
> Hi everyone,
> with regards to the integration of Haskell-style typeclasses into an SML-style module system I've come across a really difficult problem caused by the interaction of GADTs and scoped type-class instances, and am wondering if anyone has any ideas on how it might be solved.
>
> Consider the following Haskell module that compiles under ghc 6.8.2 with -fglasgow-exts enabled:
>
>   module Test where
>      data G a where
>         Ok      :: forall a. Num a => a -> G ()
>         Problem :: forall a. Num a => a -> G a
>
>      analyse :: forall a. (G a, G a) -> a
>      analyse (Ok x, Ok y)           = ()
>      analyse (Ok x, Problem y)      = ()
>      analyse (Problem x, Ok y)      = ()
>      analyse (Problem x, Problem y) = x + y
>
>   $ ghci -fglasgow-exts test.hs
>   GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
>   Loading package base ... linking ... done.
>   [1 of 1] Compiling Test             ( test.hs, interpreted )
>   Ok, modules loaded: Test.
>   *Test> analyse (Problem (1 :: Int), Problem (2 :: Int))
>   3
>   *Test> Leaving GHCi.
>
> Conceptually, the Problem constructor stores a Num dictionary for the type of its value argument, so in the above example each Problem constructor in the tuple stores a Num dictionary for the type Int (as well as an Int value).
>
> Because type class instances in Haskell are global, so that there can only be one instance per type, in the fourth clause of (analyse) even though pattern matching makes available *two* Num dictionaries for Int, the compiler knows that they must be the same and so can just forget about one of them (I assume), allowing the arithmetic operator (+) to be resolved for example to the function stored in the first dictionary.
>
> However in a modular setting, there is no such guarantee, so when compiling (x + y) there would be an ambiguity about which Num Int dictionary to use e.g.:
>
>
>    structure P1 = struct
>       instance Num Int where ...
>
>       val p1 : G Int = Problem 1
>
>    structure P2 = struct
>       instance Num Int where ...
>
>       val p2 : G Int = Problem 2 (* different dictionary from p1 *)
>
>    structure Client = struct
>
>       val x = analyse (p1, p2)   (* 2 different dicts for Num Int *)
>
> (I've written types the same way as in Haskell to avoid confusion and used layout to avoid having to write "end".)
>
> I've sidestepped the issue of how type class instances are defined and made canonical since this does not seem to me to be relevant: the point is that we can have two incompatible dictionaries stored in values of the same type where the types associated with the dictionaries are visible in the scope of the client, because of the way GADTs allow us to refine a type by adding extra predicates.
>
> (In contrast, uses of the Ok constructor would be fine since the type associated with the dictionary (and value) is hidden.)
>
> Many similar examples could be constructed e.g.
>
>     foo :: forall a. Num a => G a -> a -> a
>     foo (Problem x) y = x + y
>     ...
>
> Here the compiler must choose between the GADT dict and the dict sent into the (foo) function from the calling context.
>
> So far the only solution I can think of to the problem of designing a modular language with typeclasses is to simply forbid the use of user-defined predicates for GADT constructors (they are fine in existentials as pointed out above with the Ok constructor).
>
> This might be ok but it seems a bit draconian and also I'm wondering if it reveals a fundamental limitation with the whole concept of typeclasses as used in modern Haskell: are they fundamentally non-modular?
>
> Anyway any thoughts or pointers to literature addressing any of these issues would be welcome,
>
> Thanks, Brian.
>
>
>