[POPLmark] Meta-comment
Robert Harper
rwh at cs.cmu.edu
Fri May 13 13:11:40 EDT 2005
As Karl has mentioned, it would be useful, and it seems entirely
feasible, to augment the Twelf meta-theorem prover to produce a
checkable witness for its inductive proofs.
As to the issue of "trust", I am astonished that there could be any
question here at all. To believe in the consistency of, say, Coq or
HOL is an enormous step: it requires you to accept strong
comprehension principles that admit no simple inductive
characterization to justify them semantically. In sharp contrast, all
that the Twelf prover relies on is inductive reasoning over inductively
defined sets of natural numbers (i.e., the canonical forms of various
types, which are easily Goedel-numbered by standard methods). There is
nothing in logic more rock solid than mathematical induction.
Moreover, the implemented systems mentioned have "features" that go
beyond even the extraordinarily powerful concepts needed to justify the
underlying core theories. If one wishes to be a skeptic, these
features make the systems even more questionable, not less.
Bob
On May 12, 2005, at 6:52 AM, James Cheney wrote:
> On Thu, 2005-05-12 at 01:31 -0400, Daniel C. Wang wrote:
>
>>>> Personally, if I wanted to use first-order encodings I might as
>>>> well be using Coq, Isabelle, or HOL. They are more mature than
>>>> Twelf and I'd say more appropriate for first-order encodings.
>>>>
>>>
>>> My inner snot can't resist pointing out that "mature" is as mature
>>> does. Let's see a few solutions in the other systems Dan Wang
>>> mentioned, and compare. There seems to be a lot of talk about other
>>> systems, and few complete solutions.
>>>
>> Look, the POPLMark is not the only theorem-proving problem in the
>> world! Let's not forget that Coq has recently been used to reprove
>> the four-color theorem. (Google it and I'm sure you'll find it.)
>> Isabelle/HOL now includes a full executable specification of a
>> TCP/IP stack. HOL Light is used to verify the correctness of the
>> algorithms used in the x86 FPU. Let's see a Twelf signature that
>> encodes the theory of the real numbers from first principles; then
>> we can talk. :)
>
> In addition to supporting tactics and user-interactive proof, the three
> systems Dan mentioned (and others) also have very small "trusted theory
> bases" and small "trusted computing bases". By "trust", I mean anything
> that cannot be checked. That is, the mathematical foundations of these
> logics are small, standard, and have been exhaustively studied, the
> type/proof checking algorithms are well understood (and in many cases
> have been formally proved correct in some other system), and as long as
> you stick to the system's methodology (e.g., defining structures in
> terms of existing structures rather than introducing possibly-
> inconsistent axiomatizations in Isabelle/HOL etc), you know that what
> you are doing is consistent with a mathematical foundation that many
> people have studied and attempted to poke holes in, and failed.
>
> These properties do not hold of LF/Twelf when used according to the
> proposed methodology. I am *not* saying that I think LF is
> inconsistent
> or untrustworthy. The underlying LF type theory is well understood and
> has a solid grounding in type theory, and the LF type checking
> algorithm
> is well studied and has a formalized proof of correctness (I believe
> that Randy Pollack and James McKinna's formalization of PTS type
> checking is one example). But the kind of proofs we are interested in
> formalizing (that is, proofs of meta-properties of programming
> languages) are not expressed in LF itself in the Twelf methodology.
>
> Instead, these proofs rely on a much deeper theory of how to do
> induction/recursion on HOAS or how to check worlds/blocks/totality of
> relations defined in LF. (I will call this the meta-logic and
> meta-proof theory of Twelf; maybe this is not the standard term.)
> This (it
> seems to me) brings both a large amount of "paper proof" (of a very
> deep
> and interesting, but by the same token hard to trust nature) back into
> the trusted theoretical base and a large amount of code (e.g. Twelf's
> coverage/totality/world checking) back into the trusted computing base,
> relative to plain LF or other theories. I am not saying that I
> think there is a flaw in Twelf's meta-logic, just that I don't have the
> same confidence in it that I have in, say, pure LF, Coq's type theory,
> ZFC or higher-order logic.
>
>
> In addition, in Twelf one typically describes one's system by
> introducing many Twelf constants describing the syntax, axioms, and
> rules of the system. That is, each new Twelf development is really
> about a new logic encoded in LF. It is easy to write things that look
> right but are inconsistent/inadequate for subtle reasons or just typos.
> From Twelf's point of view this is no problem; it just means that lots
> of types are inhabited. Checking adequacy seems to require a trusted
> paper proof that lies outside the system. (I remember hearing that
> this
> was a problem with adding obvious-seeming derived proof rules to the
> original form of PCC in LF.)
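>
> As a hypothetical illustration of how something can look right but be
> inadequate, consider a small LF signature for addition on unary
> naturals (a sketch of my own; the constant names are mine):
>
> ```
> nat : type.
> z : nat.
> s : nat -> nat.
>
> plus : nat -> nat -> nat -> type.
> plus/z : plus z N N.
> plus/s : plus (s M) N (s K)
>            <- plus M N K.
> ```
>
> If the conclusion of plus/s is mistyped as plus (s M) N K, the
> signature still typechecks and Twelf raises no complaint; the relation
> simply no longer encodes addition. Only the adequacy argument, which
> lives on paper outside the system, catches the error.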
>
> FPCC, I believe, remained foundational while using Twelf by sticking to
> a very small set of axioms, namely those for higher-order logic.
> Everything else was done by introducing definitions and providing
> explicit derivations. This foundational approach means that things
> stop
> working almost right away if the definitions are wrong, that is, the
> canary dies before the miners.
>
> Also, one reason I am interested in nominal logic and related
> techniques is that they provide much of the benefit of higher-order
> techniques (that is, alpha-equivalence comes for free and
> capture-avoiding substitution can be done automatically) while
> remaining much closer to a foundational system (as Christian
> mentioned, the techniques can be implemented definitionally within
> higher-order logic/Isabelle/HOL).
>
> I think that it would be appropriate to differentiate among
> solutions/tools both on the basis of convenience, maturity of
> implementation, and rapid prototyping (where Twelf clearly wins), and
> on
> the basis of mathematical foundations (where Twelf is, to me at least,
> not a convincing win). That is, I don't view the correctness of the
> Twelf solution (or Coq or other putative solutions) to POPLMark as an
> absolute judgment, but only something that can be evaluated relative to
> the trusted theory base and trusted computing base.
>
> This difficulty would be ameliorated if the crucial components of
> Twelf's meta-logic and proof checker were formalized in a foundational
> system (or even by translation to Twelf proofs relative to a small,
> trusted theory). I can understand why providing a formalization of
> Twelf's meta-logic/proof checker in some other system may not seem like
> a worthwhile endeavor to Twelf's developers, but in my view it (or
> something equivalent) is a necessary sanity check. In the absence of
> such a proof, it does not seem fair to me to compare the systems above
> apples-to-apples with Twelf since they may be solving a harder problem.
> The flip side of this is that *if Twelf meta-proofs are correct*, then
> they evidently save you a whole lot of work to prove the same things.
> I
> am not willing to take the "if" part of the previous sentence for
> granted.
>
>
>
> I also think that it would be helpful to draw the following
> distinctions
> in discussions about the expressiveness of Twelf (or any other
> candidate
> system for POPLMarking). It seems to me that there are, roughly
> speaking, three kinds of formalizations possible in Twelf:
>
> 1. A "foundational" formalization in which one writes down the axioms
> of some trusted logic (using HOAS, LF contexts, etc) and then does
> everything specific to a problem by introducing definitions and writing
> down concrete derivations for theorems.
>
> Here Twelf provides some assistance in the form of unification and
> typechecking, but so do most of the other type-theoretic systems; also,
> the latter provide tactics and fine-grained control over interactive
> proof, which Twelf (by itself) does not at present. I think the FPCC
> development is in this category.
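>
> To make this concrete, a foundational signature in this style might
> begin with the textbook LF encoding of (minimal) implicational logic;
> this is a standard sketch, not the actual FPCC signature:
>
> ```
> o : type.              % propositions
> imp : o -> o -> o.     % implication
> pf : o -> type.        % proofs of a proposition
> imp_i : (pf A -> pf B) -> pf (imp A B).
> imp_e : pf (imp A B) -> pf A -> pf B.
> ```
>
> Everything problem-specific is then done by adding definitions and
> exhibiting explicit pf-terms, which the LF typechecker verifies; no
> appeal to Twelf's meta-theorem prover is needed.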
>
> 2. An "elegant" formalization in which the intended methodology of the
> system is used without compromise (e.g. in Twelf, full HOAS used for
> syntax and LF context used for all hypothetical judgments). In such an
> approach, one reasons explicitly (and only) about the things really in
> the theory, such as terms, derivations, and judgments, instead of (for
> example) reasoning by induction on the height of derivations encoded in
> some unpleasant way as trees.
>
> For any language for which such a formalization is adequate, Twelf is
> clearly superior (modulo the foundational concerns I expressed above).
> The problem is, how does one know *a priori* whether a problem has an
> elegant formalization?
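>
> The canonical instance of this approach is the HOAS encoding of the
> untyped lambda calculus, where the LF function space does all the
> work of binding:
>
> ```
> exp : type.
> lam : (exp -> exp) -> exp.
> app : exp -> exp -> exp.
> ```
>
> Here alpha-equivalence and capture-avoiding substitution are
> inherited from LF itself: substituting E for the bound variable of
> lam F is just the LF application F E.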
>
> 3. Something in between, where some features of the methodology are
> used but others are not, or where additional well-formedness judgments
> are needed. E.g., a logic such as linear logic (or bunched implications)
> where one can make full use of alpha-beta-eta-equivalence provided by
> HOAS but where the LF context isn't quite what we want so we need to
> represent it explicitly as a list/tree or use additional
> well-formedness
> judgments to ensure correctness/adequacy. Another example would be
> Andreas Abel's [Merlin 2001] second-order formulation of the lambda-mu
> calculus (with mu : (cont A -> exp B) -> exp B rather than the more
> elegant, but inadequate mu : ((exp A -> exp B) -> exp B) -> exp A). I
> think the POPLMark's FSub problems are also in this category.
>
> For "in-between" formalizations, Twelf may or may not be a win,
> depending on how much work you save by using built-in alpha-beta-eta
> and the LF context versus how much extra work it is to reason about
> low-level things by hand instead of using tactics.
>
> Twelf is clearly beneficial in situation 2, but suppose you have a
> language that combines well-behaved features, e.g. quantification and
> lambda, with some features that aren't handled elegantly, so that you
> need to perform explicit reasoning about representations (e.g.,
> encoding
> heap cell labels as integers and propagating a counter through all
> judgments for generating new labels, or reasoning about distinct names
> in pattern matches). In that case, the parts of your language for
> which
> Twelf provides assistance work great, but you have to do all the other
> reasoning using more primitive tools than in a tactic based prover.
> (This sounds like the situation Dan described with FPCC).
>
> I think that when some people have asked "can Twelf do X", they really
> mean "can Twelf do X 'elegantly' (i.e., using approach 2)?" Since
> elegant is a subjective term, I don't see the point of arguing it. I
> think it would be interesting to determine whether there is a technical
> characterization of those languages for which the Twelf methodology
> works as intended (i.e., without compromises or reverting to a
> foundational approach), although I'm not sure this is even a sensible
> technical question.
>
> --James
>
> _______________________________________________
> Poplmark mailing list
> Poplmark at lists.seas.upenn.edu
> http://lists.seas.upenn.edu/mailman/listinfo/poplmark