[POPLmark] Adequacy for LF
Karl Crary
crary at cs.cmu.edu
Tue May 3 01:23:21 EDT 2005
Hi Steve,
I think this is the root of our disagreement. If the point was to
compare proofs of "the same thing", then I agree that we did not satisfy
the spirit of the challenge. I thought the point was otherwise: to
explore how ready automated systems are for expressing metatheoretical
arguments. Certainly that is what the point *should* be. And if that
is the point, then the process of coming up with the best presentation
is an essential part of the exercise, just as picking the right data
structures are an essential part of implementing an algorithm.
(Remember, proofs are programs!)
This is why Bob and I have been so stubborn on this matter. It would
not have been very difficult for us to express the system with an
evaluation context, but there is no reason whatsoever to do so and good
reason not to. One of the important dividends of a logic framework is
that in helps to to come up with the cleanest possible presentation of
your system. Very often one actually makes minor changes to the
language in response to what one learns by encoding it, although we did
not do so here.
On point, when one formalizes operational semantics using evaluation
contexts, one quickly learns how bad of an idea they are. Evaluation
contexts appear attractive in direct proportion to how much of the
detail associated with them one elides. When formalized, elision is not
an option, and all the shortcomings come front-and-center. (Jerome
pointed out a few of these in his email.)
-- Karl
Stephan Zdancewic wrote:
> Karl Crary wrote:
>
>> There's been some confusion about the adequacy of our encoding of the
>> operational semantics; in particular, why our encoding is adequate,
>> given that the paper presents the semantics using evaluation contexts
>> and our encoding does not. We wanted to clarify why it is adequate,
>> and moreover, why we thought its adequacy was obvious.
>
>
> Hi Karl,
> I would also like to elaborate a bit more on why we originally gave
> the Twelf implementation a grade of "-" with respect to the
> operational semantics. Part of the confusion (and this is probably
> our fault entirely) is that the point of PoplMark is *not* to prove
> the soundness of FSub. Instead, the point is to allow us to compare
> proofs of "the same thing" using different theorem proving and term
> representation strategies. Because different term representations
> will likely need to use different proof techniques, there has to be
> some wiggle room for what exactly "the same thing" is, but by trying
> to keep as many features of the formalism the same across different
> implementations, we had hoped to make comparison easier. In addition,
> the other point of the PoplMark Challenge is to see how the
> technologies scale to handle different language constructs, including
> things like exceptions, etc. that are more cleanly handled by
> evaluation contexts. (See Peter's mail.)
>
> As you rightly point out, the operational semantics for FSub doesn't
> require evaluation contexts and omitting them may make the proofs
> cleaner. In fact, never have we said that your solution was
> inadequate (in the technical sense). Our objection was solely to the
> fact that it didn't follow the spirit of the challenge. We should
> have clarified this from the beginning.
>
> Putting the adequacy of your original encoding aside for the moment,
> let me ask you this: How hard would it be to formalize the operational
> semantics, using evaluation contexts, following the paper version as
> closely as possible? My impression is that it wouldn't be that much
> harder than the existing solution. If it *would* be significantly
> harder, then that in itself is interesting, and I would like to
> understand why.
> Thanks,
> Steve
> _______________________________________________
> Poplmark mailing list
> Poplmark at lists.seas.upenn.edu
> http://lists.seas.upenn.edu/mailman/listinfo/poplmark
>
>
More information about the Poplmark
mailing list