[POPLmark] Comparison of certified code transformations for functional languages?
Karl Crary
crary at cs.cmu.edu
Sat May 12 21:21:09 EDT 2007
Adam Chlipala wrote:
> Karl Crary wrote:
>> I know nothing at all about the development you're looking at, so I
>> can't give you an answer in any case. However, your question about
>> "fairness" suggests that you're looking to draw some sort of
>> conclusion from these numbers, so I'm curious: What sort of
>> conclusion are you looking to draw? You're comparing lines of code
>> in proofs of different results in different systems. It's hard for
>> me to see that the comparison could tell you much of anything.
>
> I don't think that the fact that it's different systems should matter
> much, modulo standard concerns about measuring code size between
> languages. The choice of system is as much open to debate as other
> elements of the formalization strategy. If one system's
> implementations are systematically bulkier, then that suggests a
> useful conclusion about which system to use.
I would agree with you, provided disk space were a primary
consideration. On the contrary, I would argue that the number of lines
is much less important than how easily those lines were written. For
example, the Nuprl proof development system admitted *extremely* compact
proofs, but was less easy to use than more verbose systems like Twelf.
(I believe you just made this point yourself in your other email.)
Adam Chlipala wrote:
> I don't deny that the theorems being proved are different, but they
> aren't enormously different. Some quantitative comparison seems
> better than pure hand-waving. It would naturally be better to have
> apples-to-apples comparisons, but the last time I tried to facilitate
> that by suggesting a compilation benchmark problem on this list, no
> one responded with alternate solutions. ;)
We'll have to disagree on that score. I think a misleading quantitative
comparison is no better than hand-waving. I'm sure we can agree that
minor changes to a theorem can lead to radical differences its
difficulty of proof.
By the way, why can't the POPLmark challenge serve for an
apples-to-apples comparison? It was certainly intended to.
-- Karl
More information about the Poplmark
mailing list