[TYPES] A show-stopping problem for modular type classes?

Brian Hulley brianh at metamilk.com
Tue Oct 6 23:28:11 EDT 2009

Sameer Sundresh wrote:
> Brian Hulley wrote:
>> However in a modular setting, there is no such guarantee, so when
>> compiling (x + y) there would be an ambiguity about which Num Int
>> dictionary to use e.g.:
>>    structure P1 = struct
>>       instance Num Int where ...
>>       val p1 : G Int = Problem 1
>>    structure P2 = struct
>>       instance Num Int where ...
>>       val p2 : G Int = Problem 2 (* different dictionary from p1 *)
>>    structure Client = struct
>>       val x = analyse (p1, p2)   (* 2 different dicts for Num Int *)
> When we're type checking the Client module, how do we know that Int has 
> been declared as an instance of Num?  Are you assuming there is also a 
> global instance Num Int, and P1 and P2 are locally overriding it?
> - If yes, would it be acceptable to use the global instance declaration 
> for Num Int within module Client?
> - If no, would it be acceptable to require module Client to explicitly 
> declare or import an instance declaration for Int as a Num?  The 
> alternative--implicitly importing instance declarations into the context 
> where the values p1 and p2 are used--introduces overlapping instances, 
> as Karl Crary pointed out.  

Hi Sameer,

I think my example above may have been a bit unclear. It was supposed to be something like:

     structure P1 :> sig val p1 : G Int end = struct ...
     structure P2 :> sig val p2 : G Int end = struct ...

     structure A = struct
         val analyse : forall a. G a * G a -> a =
              (Ok x, Ok y)           => ()
           |  (Ok x, Problem y)      => ()
           |  (Problem x, Ok y)      => ()
           |  (Problem x, Problem y) => x + y

     structure Client = struct

          val x : Int = A.analyse (P1.p1, P2.p2)

so when typechecking the implementation of (analyse) all we know is that the fourth clause will introduce two conflicting instances for (Num _A) where (_A) is whatever type has been chosen by the caller to instantiate the type variable (a).

These instances would be brought into the scope of the rhs of the fourth clause by pattern matching. For the particular call to (analyse) in the Client module, they would happen to be instances for Num Int, so the point of the example was not that this conflicts with say a standard instance for Num Int that's always in scope but that the two Num Int instances defined in P1 and P2 would conflict with each other.

So in summary when compiling P1, we use the explicit instance for Num Int declared in P1 to construct the value p1 of type G Int, thus p1 contains a dictionary corresponding to this instance declaration, as well as the integer 1, and similarly for P2.
The Client module then calls A.analyse, passing only p1 and p2. A.analyse does not take any dictionary arguments, so it doesn't matter what instances are in scope in the Client module.
As for module A, it also doesn't really matter what instances are in scope since we don't know what type (a) will be instantiated to, though for other examples it would matter (see my reply to Karl for a example).  

> Something to think about is: what should the 
> scope be of implicitly imported instance declarations?

Following Haskell, this would be the rhs of any pattern match clause that destructs a value whose constructor requires a qualified type.

I suppose one idea consistent with the MTC paper [1] would be to separate the revelation of instances from their adoption in the rhs, so that the implementor could explicitly choose to ignore one of them etc. However this still would not address the problem: what if the creator of (p2) really intended that the integer stored in (p2) should participate in arithmetic according to the interpretation of arithmetic specified by the instance decl in P2 instead of just any old interpretation of arithmetic?

> Even if we can't 
> use p1 and p2 together, why shouldn't we be allowed to use p1 and p2 
> separately in two different places within the same module?

I think this should probably be fine since p1 and p2 each contain their own interpretation of integer arithmetic for the values they contain. The only issue is what to do if you try to add the integer wrapped in p1 to another integer assuming that there is already a standard Num Int instance in the enclosing scope - I've discussed one such example in my reply to Karl.

> A related question to ask is: should the module that uses a value be 
> able to choose the instance used for values of a given type?  Or must 
> the dictionary be baked-in to a value at creation time, as in your example?
> - If yes, we would associate type tags with the values p1 and p2, and 
> look up the corresponding dictionary in an instance environment within 
> module Client.  This introduces an additional level of dynamic indirection.

This in turn suggests that analyse would have to have type:

     forall a. (Num a) => G a * G a -> a

to allow the caller to pass the relevant dictionary in. (In general if different variants of the GADT required different dictionaries they would all need to be passed in which could lead to a bit of a dictionary-passing explosion.)

Interestingly this is very like the infamous "contexts on data decls" controversy in Haskell, where

     data (Num a) => Foo a = F a

does not cause values of type (Foo Z) to carry a Num Z dictionary but instead requires all functions operating on such values to accept a (Num Z) dictionary, which in a modular setting actually begins to sound quite attractive.

> - If no, the type checker should require that the instance corresponding 
> to a given value is in scope (and not in conflict with any other 
> instance in scope).  Within the type checker, we would need to annotate 
> the type of a value with which instance declarations it requires, e.g., 
> p2 requires P2.(Num Int).  If P2 imported its (Num Int) instance from 
> some other module P0, then p2 would require P0.(Num Int) instead 
> (otherwise P1 and P2 could both import from P0, but Client wouldn't know 
> they use the same definition).

Taking this idea further we could have a special kind whose inhabitants represent instance environments so that we could store the dictionaries in the values of the datatype but require that both values were created in compatible instance environments:

    val analyse :
           forall (a : *) (i1 : I) (i2 : I).
           (Compatible i1 i2) =>
           (G a)[i1] * (G a)[i2] -> a

Anyway thanks, you've given me a lot of ideas to think about,

Cheers, Brian.

[1] "Modular Type Classes" by Derek Dreyer, Robert Harper, Manuel M. T. Chakravarty, and Gabriele Keller, 2006.

More information about the Types-list mailing list