[TYPES] Two-tier reviewing process

Fri Jan 29 09:28:47 EST 2010

Of course, Benjamin's and Steve's discussion about Expertise vs Confidence
indicates that we simply have way too little time to evaluate conference
submissions fairly and robustly -- so why does this community insist on
using this brittle and unreliable mechanism to evaluate scientific work?

-- Matthias

On Jan 29, 2010, at 7:04 AM, Philip Wadler wrote:

> [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]
>
> Just a personal plea here.
>
> The meaning of X, Y, Z was fixed clearly in Nierstrasz's paper:
>
> X: I am an expert in the subject area of this paper.
> Y: I am knowledgeable in the area, though not an expert.
> Z: I am not an expert. My evaluation is that of an informed outsider.
>
> This was a huge improvement over the 'confidence' scores we used to
> give, and which were often used to form a weighted average of
> numerical scores, a horrid scheme which led to the spurious impression
> that a 6.7 paper must be better than a 6.3, say.
>
> Please don't go back to the bad old days of rating for confidence.  Of
> course you should say how confident you are in your rating, but the
> place to do that is in the text.  The only things we need scores for
> are overall rating (ABCD) and expertise (XYZ).  Everything else of
> import can be dealt with in the text of the referee's report.
>
> If you are an expert and you are not confident because the paper is
> intricate, the best service you can render to the PC chair, to the PC,
> to the conference, and to the author is to give an X rating---and then
> explain your confidence level and the reason for it in the review.
>
> Note also that while the quest to find an X rating for every paper is
> good, the best possibility is for a paper to receive both X and Z
> reviews.  (Preferably both high!)
>
> Cheers,  -- P
>
>
>
> On Fri, Jan 29, 2010 at 8:01 AM, Stephan Zdancewic <stevez at cis.upenn.edu> wrote:
>>
>> Following up on Benjamin's comments about "X" reviews.  There are two
>> different axes that are important when understanding a review.  One is
>> the reviewer's *expertise* in the subject of the paper.  Another is the
>> reviewer's *confidence* in his or her assessment carried out in the
>> review.  Using only one score to indicate both leads to some confusion,
>> since the two properties get conflated.  As Benjamin suggests, I often
>> find myself wanting to indicate confidence when I'm not an expert, and
>> sometimes, even though I'm an expert in terms of the related work, I
>> still don't have high confidence in my review (perhaps because it's a
>> really intricate paper).
>>
>> --Steve
>>
>>
>> Benjamin Pierce wrote:
>>>
>>> I'm suspicious of statistics about the number of X reviews.  I know what
>>> X is supposed to mean ("I am an expert in the topic") but in practice I
>>> see many people (including myself) acting as if it means "I understood
>>> the paper completely," and therefore often falling back to Y to
>>> indicate things like "Although I'm an expert, the paper was poorly
>>> explained and I couldn't completely understand it in a reasonable
>>> amount of time."
>>>
>>> This isn't to say that comparing figures for POPL and ICFP is not
>>> worthwhile -- just that the numbers themselves should be taken with a
>>> grain of salt.
>>>
>>>      - Benjamin
>>>
>>>
>>> On Jan 27, 2010, at 9:11 PM, Norman Ramsey wrote:
>>>
>>>
>>>>
>>>>
>>>>> At the POPL discussion, one goal that was raised was to improve the
>>>>> number of "expert" reviews per paper. People are dissatisfied when
>>>>> their paper is rejected by self-proclaimed non-experts.  I believe
>>>>> that Jens pointed out that this year's POPL had 77% of papers with one
>>>>> "X" review.
>>>>>
>>>> I went back and got archival data for ICFP 2007.  ICFP is a
>>>> significantly smaller conference which that year had only 120
>>>> submissions.  110 of 120 submissions (91%) received at least one X
>>>> review.  When comparing these data, here are some points to keep in
>>>> mind:
>>>>
>>>>  - ICFP reviewing was double-blind that year.
>>>>  - Otherwise ICFP used substantially the same review process that
>>>>    POPL uses now.
>>>>  - POPL is probably a broader conference than ICFP, which may make it
>>>>    more difficult to find expert external reviewers.
>>>>
>>>> I remember great difficulty in finding external reviewers for papers
>>>> involving functional programming and XML---many were multi-author
>>>> papers, and this is a small community with a lot of
>>>> cross-fertilization, so there were quite a few papers for which all the
>>>> obvious expert reviewers had conflicts.  (One of the problems with
>>>> double-blind review is that it makes a prudent program chair more
>>>> cautious about conflicts of interest.)
>>>>
>>>>
>>>>
>>>> Norman
>>>>
> -- 
> .\ Philip Wadler, Professor of Theoretical Computer Science
> ./\ School of Informatics, University of Edinburgh
> /  \ http://homepages.inf.ed.ac.uk/wadler/