[TYPES] RFC: ARF - An interprocedural flow-based type checker

Thu Sep 10 11:41:12 EDT 2015

> On the other hand, I may have just reinvented a type system that already
> exists somewhere in the academic literature, since I am not familiar
> enough with type-related terminology to effectively search the literature.

I am not familiar with the Whiley language and with the references that
Gabriel and François provided but I am familiar with some of the
literature on type inference for dynamically typed languages. So here
comes a bit of a brain dump of things from my masters...

On a cursory glance, the types you infer for Python are unions of base
types. This reminds me of the Soft Typing work for Scheme, from the 90s.
Soft typing systems are based on global type inference (the user doesn't
add any type annotations) and use the result of the type inference is a
classification of the primitive operations in the program based on how
safe they are at runtime. For example, in an addition `x + y` if it can
deduce that x and y are numbers, then the addition is certainly safe. If x
and y are always None, then the addition is certainly unsafe and if x and
y can be either a number or None, then the safety of the addition is
uncertain and can only be checked at runtime. One of the interesting
features of soft typing systems is that by classifying operations this way
the output in case of a type error is more informative than just saying "I
failed to typecheck your program because some types didn't match".

The first Soft Typing systems for scheme [1][2][3] were based on HM type
inference with row-polymorphism. Later systems, like MrSpidey for PLT
Scheme [4][5] moved to flow-based analysis, which is more accurate. The
cost is that flow-analysis has to look at the whole program while HM
inference can check modules separately. Inferring
parametrically-polymorphic types using flow analysis is also trickier to
do efficiently[7].

Anyway, it might be interesting for you to dig up an old version of
DrScheme that comes bundled with MrSpidey so you can play a bit with it.

(disconnected brain dump: [3] has a very comprehensive section on the
downsides of using HM inference for typechecking. [6] frames flow analysis
as a more traditional type system and [7] is a very deep survey on
flow-analysis.)

One thing that  you should be aware of is that global type inference is
very brittle, which is partly a reason why people have been moving on to
Optional Typing and Gradual Typing for dynamic languages lately [8]. My
impression is that the niche for global type inference has been automated
program analysis that does not expect any interaction from the programmer.
You see these sometimes inside compilers for dynamic languages (which use
the inference to omit tag checks and things like that) but one notable
example of a global type inference system that is programmer facing is the
Dialyzer tool for Erlang.

Dialyzer is a soft type system that silences warnings for "potentially
unsafe" operations and only complains about things that it is 100% sure
that are wrong. The authors call this a  "success type system" [9], which
instead of focusing on proving that some programs don't go wrong, focuses
on proving that some programs do go wrong. As a result of this more lax
typechecking, Dialyzer can have more efficient inference (if one of its
inferred types ever grows too big it can "give up" and infer the "any"
type instead) and the error messages it reports are almost certainly going
to be bugs (and not just some imprecision in the type checker), which is
nice for their use case of running a linter mature code bases with
thousands of lines of code.

Summing up, I hopefully what I wrote has to do with what you are looking for.

Hugo

----------

[1]
Cartwright, R. & Fagan, M.
Soft Typing Proceedings of the ACM SIGPLAN 1991 Conference on Programming
Language Design and Implementation, ACM, 1991

[2]
Fagan, M.
Soft Typing: an approach to type checking for dynamically typed languages
Rice University, 1991
www.ccs.neu.edu/scheme/pubs/thesis-fagan.ps.gz

[3]
Wright, A. K. & Cartwright, R. A Practical Soft Type System for Scheme
Transactions on Programming Languages and Systems, ACM, 1997

[4]
Flanagan, C.; Flatt, M.; Krishnamurthi, S.; Weirich, S. & Felleisen, M.
Catching bugs in the web of program invariants Proceedings of the ACM
SIGPLAN 1996 Conference on Programming Language Design and Implementation,
Association for Computing Machinery (ACM), 1996
http://cs.brown.edu/~sk/Publications/Papers/Published/ffkwf-mrspidey/

[5]
Flanagan, Cormac
Effective Static Debugging via Componential Set-Based Analysis Rice
University, 1997
https://users.soe.ucsc.edu/~cormac/papers/thesis.pdf

[6]
Palsberg, J. & Schwartzbach, M. I.
Safety Analysis Versus Type Inference for Partial Types Information
Processing Letters, Elsevier North-Holland, Inc., 1992
http://www.cs.ucla.edu/~palsberg/paper/ipl92.pdf

[7]
Midtgaard, J. Control-Flow Analysis of Functional Programs University of
Aarhus, 2007
http://www.brics.dk/RS/07/18/index.html

[8]
Felleisen, M. From Soft Scheme to Typed Scheme: Experiences from 20 Years
of Script Evolution, and Some Ideas on What Works 2009
http://www.ccs.neu.edu/home/matthias/Presentations/stop.html

[9]
Lindahl, T. & Sagonas, K.
Practical type inference based on success typings Proceedings of the 8th
ACM SIGPLAN International Conference on Principles and Practice of
Declarative Programming, Association for Computing Machinery (ACM), 2006
http://it.uu.se/research/group/hipe/papers/succ_types.pdf