[TYPES] Issues regarding typeclasses and modular languages like ML

Wed Apr 8 23:11:46 EDT 2009

Hi,
Although there is some literature on the issue of integrating 
Haskell-style typeclasses into modular languages such as ML, it appears 
to me that several important issues remain unresolved.

For example, in "Modular Type Classes" [1], the authors model some 
aspects of typeclasses by a stylised use of the ML module system, 
translating a Haskell class declaration into an ML signature with a 
distinguished type component (t), an instance declaration into an ML 
structure or functor, and provide a facility to explicitly declare which 
structures/functors are to be regarded as "canonical" in any specific 
context.

While this is certainly a useful contribution, there are some aspects of 
the Haskell typeclass system that do not appear to be amenable to this 
treatment. Specifically, their treatment seems (to me) to be only 
applicable to those uses of typeclasses where the overloading can be 
statically resolved at compile time, rather than the more general case 
where one wants to have restricted parametric polymorphism, perhaps in 
conjunction with polymorphic recursion and/or existential 
quantification, which involves dynamic generation of (or dispatch from) 
"dictionary arguments" at runtime, and therefore I do not see how 
functor application, which afaict is a strictly compile time activity, 
can be used to model this.

Moreover their solution to the coherence problem in the face of multiple 
named instances does not appear to deal with the issue of partial 
application. This is because they seem to assume that a polymorphic 
function can be translated into a suitable functor application that 
returns a structure containing a monomorphic value, which is indeed 
suggested by the syntax of qualified types in Haskell, but is confounded 
by some experimentation with ghci, which shows that dictionary arguments 
are actually interspersed with "core" arguments so as to maximize 
polymorphism in the face of partial application:

   $ ghci -fglasgow-exts -fno-monomorphism-restriction
   GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
   Loading package base ... linking ... done.
   Prelude> let f x y = show y
   Prelude> :t f
   f :: forall t a. (Show a) => t -> a -> String
   Prelude> let g = f 'c'
   Prelude> :t g
   g :: forall a. (Show a) => a -> String
   Prelude> (g 1, g "hi")
   ("1","\"hi\"")
   Prelude>

The salient point in the above example is that the type of (f) in 
reality is something like:

   Pi (t : *) . Pi (x : t) . Pi (a : * | Show a) . Pi (y : a) . String

so that the dictionary is passed only at the point where we choose how 
to instantiate the type variable (a).

Since a partially applied function could be passed as a polymorphic 
value to be used anywhere in the program, it seems unlikely that there 
is any hope of solving the coherence problem with lexical scoping.

(Aside: two uses of Pi above would usually just be written as arrows but 
I find it easier to understand types if everything is written uniformly 
in quantifier notation - is there a better quantifier instead of Pi for 
this case?)

Of course I am not trying to criticise the above paper in any way, and 
many of these issues would not arise in an ML setting at the moment due 
to the constraints imposed on the type system by the traditional desire 
to continue to support top level HM type inference. It is just that when 
one is trying to design a new language, with different tradeoffs (e.g. 
discarding TI in favour of first class polymorphism and automatic 
type-driven namespace management), one finds that the existing 
literature does not yet provide all the answers regarding the 
integration of typeclasses with a powerful module system.

Another issue that is problematic concerns the question of identity of a 
  type, for example:

   signature S = sig
      type t
      val x : t -> int
   end

   structure M = struct
      type t = string
      val x = String.toInt
   end

   structure N = M :> S

In the above, is N.t the same type as M.t or not (as far as any instance 
declarations inside M / outside M are concerned)? i.e. if one defines an 
instance of some typeclass for M.t would N.t also be a member of the 
class via this instance, and if so, would this not imply that 
abstraction has been broken?

The problem does not arise in the context of Haskell, since Haskell does 
not have any notion of abstract type at all: all types are completely 
concrete and the only abstraction facility is to hide the data 
constructor(s) so that no inhabitants of the type can be constructed 
elsewhere in the program.

I have not been able to find any papers that address the above issue, 
since they all seem to just use a pervasive type like (int) or tuples/ 
lists thereof to illustrate their encoding of typeclasses. The issue 
concerns the meaning of "a posteriori" abstraction in ML, and the 
literature seems to be divided between merely hiding the relationship 
between an abstract type and its implementing type, or actually really 
generating a new distinct type (for the latter see [2]). Where does the 
notion of instance definition for "*the* type t" fit into this question?

The question *may* be addressed in "ML mit Typklassen" [3] but 
unfortunately although the examples are in English the text is in German 
and I think I would need more than just a pocket dictionary to try to 
follow the arguments therein. Does anyone know of an English 
translation, or, if the issue is addressed there, could anyone point me 
to the relevant section since with enough time I probably could 
translate a few paragraphs.

Apologies for the length of this post. Any pointers to literature or 
corrections to my no doubt incomplete understanding would be much 
appreciated.

Thanks,
Brian.

[1] "Modular Type Classes" Derek Dreyer, Robert Harper, Manuel M.T. 
Chakravarty, Gabriele Keller. 2006/10/26

[2] "Generativity and Dynamic Opacity for Abstract Types (Extended 
Version)" Andreas Rossberg. 2003

[3] "ML mit Typklassen" Gerhard Schneider. 4th June 2000.