[TYPES] A show-stopping problem for modular type classes?

Wed Oct 7 04:15:06 EDT 2009

I agree with Brian's analysis: the 'Problem' constructor is only ok because any way of satisfying a Num constraint is as good as any other.   

The ability to have a data constructor capture a constraint of a *non-existential* type variable is newish in GHC.  I added it for several reasons.  Firstly, it is more uniform.  It's great to be able to say "You can give a constructor any old type, with arbitrary constraints".  Secondly, it's useful,  You can have, the constructor for (Set a) capturing the (Ord a) constraint, eg
   data Set a where
      S :: forall a. Ord a => [a]
so that union can have type union :: Set a -> Set a -> Set a.  (Empty set is another matter!)  Incidentally, the union function then illustrates Brian's point beautifully:
	union (S xs) (S ys) = S (merge xs ys)
The Ord constraint needed by merge can be satisfied from either the (S xs) or the (S ys) -- and they'd better be the same Ord implementation!

Thirdly, and most significantly, it's quite hard to *exclude* examples like 'Problem' if we allow general equality constraints (which we now do).  For example,
	Problem2 :: forall a b. (Num b, a~b) => T a
Here Num constrains an "existential" variable... but not really, because the (a~b) constraint makes it equal to 'a'.  In general, it might not be precisely equal but somehow connected:
	Problem3 : forall a b. (Num b, F a ~ G b) => T a
where F and G are type functions.  So it's getting complicated.  Since Haskell *does* have global instances, I decided not to try to find the bottom of this particular tarpit, and instead adopted the simple story: you can give a data constructor an arbitrary set of constraints.  GADTs are merely a special case, in which the type constraints are all equalities. 

All that said, I agree that the ice is thin here. For example, if you allow overlapping instances (which some people like) then I think you could cook up an example in which the choice mattered.  So I am not arguing that GHC's story is the Right Thing.

Karl suggests excluding such examples by indexing GADTs only by things of kind other than '*'.  I'm very sympathetic to datakinds, and am working with Conor McBride to, in effect, add datakinds to GHC.  Lots of GADT-ish programs index GADTs with '*' when it would be much better to index with, say {Bool}.  (See Conor's web page on the Strathclyde Haskell Extensions, SHE.)

But I don't think this issue is directly relevant to Brian's problem.  If we can index GADTs with things of kind {Bool}, say, then we should certainly be able to index type classes with things of this kind too, so the same issue arises.

Simon

| -----Original Message-----
| From: types-list-bounces at lists.seas.upenn.edu [mailto:types-list-
| bounces at lists.seas.upenn.edu] On Behalf Of Karl Crary
| Sent: 06 October 2009 16:21
| To: Brian Hulley
| Cc: types-list at lists.seas.upenn.edu
| Subject: Re: [TYPES] A show-stopping problem for modular type classes?
| 
| [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]
| 
| Hi Brian,
| 
| Basically you're observing that a qualified GADT package includes an
| instance declaration, so unpacking it amounts to a using declaration.
| Accordingly, the "Problem" clause of your code would need to be treated
| as a using declaration, and it would be rejected for introducing
| overlapping instances.
| 
| It strikes me that in GHC you might well have the same problem with the
| "Ok" constructor.  A qualified existential package also includes an
| instance declaration, and unpacking it too amounts to a using
| declaration.  That ought not be a problem because it's an instance for
| an abstract type, so it cannot overlap any existing instances.  But, my
| understanding is that internally GHC does not really support abstract
| types, so that might not help.
| 
| By the way, I think that this issue exposes a design flaw in GADTs.
| GADTs provide a convenient way to do dynamic type analysis by
| integrating type representations into the datatype mechanism.  But, I
| think it's a mistake for GADTs to use actual types for those type
| representations.  Conceptually, the indices for a GADT are data, not
| sets of terms.  A better way to do GADTs would be to introduce a
| "datakind" mechanism, and use algebraic data kinds as the indices for a
| GADT.  You can get a picture of what this would look like from my 1999
| paper "Flexible Type Analysis" with Stephanie Weirich.  (That paper
| didn't have the tie-in with datatypes, which was invented later.)
| 
| The relevance to modular type classes is this:  If the index for a GADT
| were an algebraic data kind, rather than an actual type, there would be
| no temptation to try to attach an instance to that index.  The "Problem"
| constructor would be ill-kinded.  Obviously, introducing a datakind
| mechanism and supporting machinery would be a much more substantial fix
| than simply checking for overlapping instances when unpacking a GADT,
| and it also might break existing code.  But I do think it would be a
| better design looking forward.
| 
|     -- Karl
| 
| 
| Brian Hulley wrote:
| > [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list
| ]
| >
| > Hi everyone,
| > with regards to the integration of Haskell-style typeclasses into an SML-
| style module system I've come across a really difficult problem caused by the
| interaction of GADTs and scoped type-class instances, and am wondering if
| anyone has any ideas on how it might be solved.
| >
| > Consider the following Haskell module that compiles under ghc 6.8.2 with -
| fglasgow-exts enabled:
| >
| >   module Test where
| >      data G a where
| >         Ok      :: forall a. Num a => a -> G ()
| >         Problem :: forall a. Num a => a -> G a
| >
| >      analyse :: forall a. (G a, G a) -> a
| >      analyse (Ok x, Ok y)           = ()
| >      analyse (Ok x, Problem y)      = ()
| >      analyse (Problem x, Ok y)      = ()
| >      analyse (Problem x, Problem y) = x + y
| >
| >   $ ghci -fglasgow-exts test.hs
| >   GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
| >   Loading package base ... linking ... done.
| >   [1 of 1] Compiling Test             ( test.hs, interpreted )
| >   Ok, modules loaded: Test.
| >   *Test> analyse (Problem (1 :: Int), Problem (2 :: Int))
| >   3
| >   *Test> Leaving GHCi.
| >
| > Conceptually, the Problem constructor stores a Num dictionary for the type
| of its value argument, so in the above example each Problem constructor in
| the tuple stores a Num dictionary for the type Int (as well as an Int value).
| >
| > Because type class instances in Haskell are global, so that there can only
| be one instance per type, in the fourth clause of (analyse) even though
| pattern matching makes available *two* Num dictionaries for Int, the compiler
| knows that they must be the same and so can just forget about one of them (I
| assume), allowing the arithmetic operator (+) to be resolved for example to
| the function stored in the first dictionary.
| >
| > However in a modular setting, there is no such guarantee, so when compiling
| (x + y) there would be an ambiguity about which Num Int dictionary to use
| e.g.:
| >
| >
| >    structure P1 = struct
| >       instance Num Int where ...
| >
| >       val p1 : G Int = Problem 1
| >
| >    structure P2 = struct
| >       instance Num Int where ...
| >
| >       val p2 : G Int = Problem 2 (* different dictionary from p1 *)
| >
| >    structure Client = struct
| >
| >       val x = analyse (p1, p2)   (* 2 different dicts for Num Int *)
| >
| > (I've written types the same way as in Haskell to avoid confusion and used
| layout to avoid having to write "end".)
| >
| > I've sidestepped the issue of how type class instances are defined and made
| canonical since this does not seem to me to be relevant: the point is that we
| can have two incompatible dictionaries stored in values of the same type
| where the types associated with the dictionaries are visible in the scope of
| the client, because of the way GADTs allow us to refine a type by adding
| extra predicates.
| >
| > (In contrast, uses of the Ok constructor would be fine since the type
| associated with the dictionary (and value) is hidden.)
| >
| > Many similar examples could be constructed e.g.
| >
| >     foo :: forall a. Num a => G a -> a -> a
| >     foo (Problem x) y = x + y
| >     ...
| >
| > Here the compiler must choose between the GADT dict and the dict sent into
| the (foo) function from the calling context.
| >
| > So far the only solution I can think of to the problem of designing a
| modular language with typeclasses is to simply forbid the use of user-defined
| predicates for GADT constructors (they are fine in existentials as pointed
| out above with the Ok constructor).
| >
| > This might be ok but it seems a bit draconian and also I'm wondering if it
| reveals a fundamental limitation with the whole concept of typeclasses as
| used in modern Haskell: are they fundamentally non-modular?
| >
| > Anyway any thoughts or pointers to literature addressing any of these
| issues would be welcome,
| >
| > Thanks, Brian.
| >
| >
| >