[TYPES] Questions regarding a clock inference system for data-flow languages

Fri Aug 30 12:01:42 EDT 2013

Hello Rajagopal and TYPES list,

I am doing my PhD under the supervision of prof. M. Pouzet and A.
Cohen, working on extended clock calculi for data-flow langages. I'll
try to answer your questions to the best of my knowledge.

From: Rajagopal Pankajakshan <rajagopal.pankajakshan at live.com>
To: "types-list at lists.seas.upenn.edu" <types-list at lists.seas.upenn.edu>
Subject: [TYPES] Questions regarding a clock inference system for
data-flow languages Date: Fri, 30 Aug 2013 07:15:24 +0000
Sender: "Types-list" <types-list-bounces at lists.seas.upenn.edu>

> During my internship at google this summer I was assigned to study
> type theory applied to stream-processing languages.  One approach we
> considered was that of the so called clock calculi, as found for
> instance in Lucid-Synchrone [3].

First, you may want to read "Clocks at First Class Abstract
Types" (Colaço and Pouzet, EMSOFT'03), because the section dedicated to
clock typing in EMSOFT'04 is indeed very terse. However, the full
system as implemented in Lucid Synchrone has never been described
anywhere, as far as I know.

>
> Lucid-synchrone defines a type inference system to analyse
> synchronisation between data-flow streams. In this calculus, a stream
> clock s can be the base execution clock, the sample of another clock
> s on some condition (a name X or a variable n) or a variable α. The
> paper claims that clock analysis can be performed using
> Hindley-Milner type inference.

It is indeed HM-like, but it needs to be extended with first-class
abstract datatypes (restricted existential types) à la Laufer &
Odersky. See EMSOFT'03 above for references.

> The basic mechanism of type polymorphism is to universally quantify
> the free clocks α1,...,αn and names X1,...,Xm of a definition in its
> context H, here:
>
>     gen(H, cl) = ∀α⃗.∀X⃗.cl where α⃗ = FV(cl) - FV(H) and X⃗ = FN(cl)
> - FN(H)
>
> This allows to reuse that definition where it is used, by considering
> an instantiation relation:
>
>     cl[⃗s/α⃗][⃗c/X⃗] ≤ ∀α⃗.∀X⃗.cl
>
> My first question arises from the understanding that names X do not
> seem to have the same status as variables α in this type system. They
> are generalised and instantiated in the same manner.  For instance,
> consider the following Lucid program:
>     node flip (x) = x when yes where clock yes = true
>
> The boolean stream function flip samples all occurrence of x
> according to a local stream yes. According to the type system [3,
> figure 5], it has clock ∀α.∀X.α→α on X, as the following derivation
> shows:
>
>   |- true : s
>   _________________________________________________________________________
>   |- clock yes = true : (m:s)
>   _________________________________________________________________________
>   [yes : (m:s), x : s] |- x when yes : s on m
>   _________________________________________________________________________
>   [x : s] |- x when yes where clock yes = true : s on m  (m not in
> [x : s])
> _________________________________________________________________________
> |- node flip (x) = x when yes where clock yes = true : [flip : s->s
> on m]
>
>   Gen([],s->s on m) = ∀s∀m.s->s on m

If I'm not mistaken, in your example m is a name, not a condition
variable X. Thus it does not get generalized by Gen(). Indeed this
program gets rejected by Lucid v3, as you note: a condition name may
not escape its scope.

> The opposite stream function flop, below, can also be given type
> ∀t∀n.t-> t on n.
>
> node flop (x) = let x when no where clock no = false : [flop :
> ∀s∀n.t->y on n]
>
> Now, if we let H = [flip,:∀s∀m.s-> s on m,flop:∀t∀n.t-> t on n] then,
> by rule (Inst) in [3, Figure 5], we can assign the same clock to of
> flip and flop with the following type derivation

>     u -> u on o ≤ ∀s.∀m.s->s on m    u -> u on o ≤ ∀t.∀n.t->t on n
>     ______________________________    ______________________________
>     H[x:s] |- flip : u -> u on o    H[x:s] |- flop : u -> u on o
> H[x:s] |- x : s
> _______________________________________________________________________________
> H[x:s] |- flip (x) : u on o    H[x:s] |- flop (x) : u on o
> ______________________________________________________________ H[x:s]
> |- flip (x) + flop (x) : u on o
> ______________________________________________________________ (1)
> H |- node sum (x) = flip (x) & flop (x) : [sum : u-> u on o]

> Doesn't function flipflop deadlock or does this mean that it needs a
> buffer ?

Indeed, from the pure data-flow point of view, this program either
needs an infinite buffer on the left input of (&), or deadlocks if (&)
is synchronous (i.e. has no buffers on its inputs and outputs).

> I tried to figure this out by testing the example with the
> implementation of Lucid-Synchrone available on the web [2].
> Actually, it behaves differently from what the paper says.  For
> instance, if we try to define function flip as above, the compiler
> says:
>
>   "This expression has clock 'c but is used with clock (__c1:'d)"
>
> It refuses to consider the assumption 'c as a sampled clock (__c1:'d)
> generated by the when operator. Shouldn't it infer (__c1:'d) instead ?
> However, if we add a clock declaration to do that:
>
>   let node sample (x, clock y) = x when y
>   let node yes (x) = sample (x, true)
>   let node no (x) = sample (x, false)
>   let node yesandno (x) = yes x & no x
>
> then the compiler replies:
>
>   # lucyc -i yesandno.ls
>   val sample : 'a * bool => 'a
>   val sample :: 'a * 'a -> 'a on ?_y0
>   val yes : 'a => 'a
>   val yes :: 'a -> 'a on ?_y0
>   val no : 'a => 'a
>   val no :: 'a -> 'a on ?_y0
>   val yesandno : bool => bool
>   val yesandno :: 'a -> 'a on ?_y0
>
> and produces a Caml program. But there are so many clocks in it that
> I can't really figure out what it does.

Strange, this code fails to compile on my machine with the same version
of the compiler. The clock types that you get are clearly incorrect.
This bug seems to be triggered by the "clock" keyword on inputs. If you
remove it, you get a saner clock signatures and "yes" and "no" are
rejected:

   val sample : 'a * clock => 'a
   val sample :: 'a * (__c0:'a) -> 'a on __c0
   val yes : 'a => 'a
   File "test.ls", line 2, characters 27-34:
   >let node yes (x) = sample (x, true)
   >                           ^^^^^^^
   This expression has clock 'b * 'c,
   but is used with clock 'b * (__c1:'b).

> Instead, I looked as the user manual [1] to
> determine which of [2] or [3] to focus on. Concerning instantiation,
> page 67, it says something different:
>
>   "The clock (carrier-clock : stream-clock) of a stream e means that
> e has value carrier-clock which has the stream clock stream-clock.
> The value may be either a global value (defined at top-level) or an
> identifier. This identifier is unique and is an abstraction of the
> actual value of e."
>
> I don't understand how this uniqueness rule is specified in the paper
> [3] and, more generally, how it relates to type polymorphism.
> Doesn't it instead refer to existential quantification ?  Also, in
> which cases does it apply in the implementation ?

You are right and I have a hard time figuring out from EMSOFT'04 too.
It appears in a clearer form in EMSOFT'03, but I believe the final
system of LSv3 is still quite a bit different.

The basic idea is simple: you may not have an expression with a clock
"'a on ?_c" if the condition ?_c is not present in the current scope.
Thus, when clocking "e where D", one basically checks that the
conditions "?_c_i" appearing in the clock type of "e" are still present
in current environment.

It may help to consider this property in relation with the code
generation scheme. In the generated code, equations get translated to
imperative statements, each one conditioned according to its clock type
interpreted as a boolean formula. The current values of both clocks
variables and conditions will thus have to be accessible from the
current scope, at run-time.

The compiler could automatically pass condition variables around, but
it is a design choice that prof. Pouzet deliberately rejected.

> In the end, I'm a bit confused and wonder whether polymorphism
> matters at all. As long as one is not using streams of functions,
> doesn't a simple sub-typing rule suffice (to handle a base clock and
> samples "c on x") ?

I am not sure to follow what you have in mind, but note that in Lucid
Synchrone, a node do not have a single "base clock" in general. Please
consider the following example, where f is parametric in two clock
variables that get instantiated to two different clock types in g:

   # cat when.ls
   let node f x y = (0 fby x, 0 fby y)
   let node g x c = f x (x when c)
   # lucyc -i test.ls
   val f : int -> int => int * int
   val f :: 'a -> 'b -> 'a * 'b
   val g : int -> clock => int * int
   val g :: 'a -> (__c0:'a) -> 'a * 'a on __c0

Also, and although experts on this list are far more knowledgeable than
me on this matter, it seems to be common knowledge that type inference
for HM-style parametric polymorphism is simpler (and thus easier to
implement) and more scalable than (non-structural?) subtyping. An
implementation of clock typing following the principles outlined above
and using destructive unification takes less than a thousand lines of
ML code and is very efficient.

To conclude, you may be interested in the heavily simplified version of
clock typing that appears in "Clock-Directed Modular Code Generation
for Synchronous Data-Flow Languages" by Biernacki et al. (LCTES'08),
even if it is restricted to the first-order setting. The paper may also
serve as a nice introduction to the code generation machinery.

Hope that helps, do not hesitate to send me follow-up questions.

-- Adrien Guatto
   http://www.di.ens.fr/~guatto/index_en.html