[TYPES] Two-tier reviewing process

Derek Dreyer dreyer at mpi-sws.org
Fri Jan 29 10:28:16 EST 2010


Hi, Matthias.

I know how you feel about the CS overemphasis on conferences, but I am
nonetheless a bit mystified by your interpretation of the expertise
vs. confidence discussion.  The fact that experts are not always
confident about their conference reviews does not, by itself, count
against the conference model: experts are not always confident about
their journal reviews either.

When I give an expert review for a conference paper, I spend a
significant amount of time reviewing it, typically several days.  For
a journal paper, I may spend more time checking details and reading
it over carefully, but at a high level my expert conference
reviews and my journal reviews are roughly similar in quality.  At the
end of either kind of review period, I may or may not be confident
about my review, usually depending on how well the paper was written.
If the paper was clearly written, I am usually more confident in my
review.  Thus, I feel that the confidence of my expert reviews
reflects (at least to some extent) the quality of the paper.

Am I some weird outlier?  Maybe so.  There are certainly expert
reviewers who clearly "phone it in" due to time constraints or because
they don't care.  But in my (comparatively limited) experience reading
other people's reviews (both of my own papers and of others'), I have
seen a lot of really thorough expert reviews.

It would be nice if program committees reliably gave more weight to
those *thorough* reviews.  My impression is that sometimes they do and
sometimes they don't, and especially when the only thorough review is
an external review, the PC has a tendency to avoid abdicating its
judgmental authority to some external reviewer who cannot engage in
the PC discussion.  (This behavior is understandable, and may
sometimes even be the right thing to do, but it nonetheless means
that external reviews have second-class status.)  I
view that as a real problem, and frankly I'm not sure what the right
solution is.

Derek


On Fri, Jan 29, 2010 at 3:28 PM, Matthias Felleisen
<matthias at ccs.neu.edu> wrote:
> [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]
>
>
> Of course, Benjamin's and Steve's discussion about Expertise vs
> Confidence indicates that we simply have way too little time to
> evaluate conference submissions fairly and robustly -- so why does
> this community insist on using this brittle and unreliable mechanism
> to evaluate scientific work?
>
> -- Matthias
>
> On Jan 29, 2010, at 7:04 AM, Philip Wadler wrote:
>
>>
>> Just a personal plea here.
>>
>> The meaning of X, Y, Z was fixed clearly in Nierstrasz's paper:
>>
>> X: I am an expert in the subject area of this paper.
>> Y: I am knowledgeable in the area, though not an expert.
>> Z: I am not an expert. My evaluation is that of an informed outsider.
>>
>> This was a huge improvement over the 'confidence' scores we used to
>> give, which were often used to form a weighted average of
>> numerical scores, a horrid scheme which led to the spurious impression
>> that a 6.7 paper must be better than a 6.3, say.
>>
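For concreteness, here is a minimal sketch (in Python) of the kind of
confidence-weighted averaging being criticized above.  All scores and
weights are invented purely to illustrate the spurious-precision
problem; this is not any conference's actual formula:

    # Each review: (overall score on a 1..10 scale,
    # reviewer 'confidence' on a 0..1 scale).
    def weighted_average(reviews):
        total_weight = sum(conf for _, conf in reviews)
        return sum(score * conf for score, conf in reviews) / total_weight

    paper_a = [(7, 0.9), (6, 0.5), (7, 0.3)]
    paper_b = [(6, 0.8), (7, 0.5), (6, 0.4)]
    print(f"{weighted_average(paper_a):.1f}")  # 6.7
    print(f"{weighted_average(paper_b):.1f}")  # 6.3

The one-decimal gap invites ranking paper_a above paper_b, even though
it reflects the made-up confidence weights as much as the reviewers'
actual judgments.
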
>> Please don't go back to the bad old days of rating for confidence.  Of
>> course you should say how confident you are in your rating, but the
>> place to do that is in the text.  The only things we need scores for
>> are overall rating (ABCD) and expertise (XYZ).  Everything else of
>> import can be dealt with in the text of the referee's report.
>>
>> If you are an expert and you are not confident because the paper is
>> intricate, the best service you can render to the PC chair, to the PC,
>> to the conference, and to the author is to give an X rating---and then
>> explain your confidence level and the reason for it in the review.
>>
>> Note also that while the quest to find an X rating for every paper is
>> good, the best possibility is for a paper to receive both X and Z
>> reviews.  (Preferably both high!)
>>
>> Cheers,  -- P
>>
>> On Fri, Jan 29, 2010 at 8:01 AM, Stephan Zdancewic
>> <stevez at cis.upenn.edu> wrote:
>>>
>>> Following up on Benjamin's comments about "X" reviews.  There are two
>>> different axes that are important when understanding a review.  One is
>>> the reviewer's *expertise* in the subject of the paper.  Another is
>>> the reviewer's *confidence* in his or her assessment carried out in
>>> the review.  Using only one score to indicate both leads to some
>>> confusion, since the two properties get conflated.  As Benjamin
>>> suggests, I often find myself wanting to indicate confidence when I'm
>>> not an expert, and sometimes, even though I'm an expert in terms of
>>> the related work, I still don't have high confidence in my review
>>> (perhaps because it's a really intricate paper).
>>>
>>> --Steve
>>>
>>>
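To make Steve's two-axes point concrete, here is a minimal sketch of a
review record that keeps expertise and confidence separate (the field
names and values are invented, not any actual review-system schema):

    from dataclasses import dataclass

    @dataclass
    class Review:
        expertise: str   # 'X', 'Y', or 'Z', per the Nierstrasz scale
        confidence: str  # e.g. 'high' / 'medium' / 'low'
        rating: str      # overall rating, 'A'..'D'
        report: str      # the text of the review

    # The intricate-paper case: an expert with low confidence is
    # representable without conflating the two axes into one score.
    r = Review(expertise='X', confidence='low', rating='B',
               report='Intricate development; could not check details.')

A single combined score cannot express this record without losing one
of the two axes.
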
>>> Benjamin Pierce wrote:
>>>>
>>>> I'm suspicious of statistics about the number of X reviews.  I know
>>>> what X is supposed to mean ("I am an expert in the topic") but in
>>>> practice I see many people (including myself) acting as if it means
>>>> "I understood the paper completely," and therefore often falling
>>>> back to Y to indicate things like "Although I'm an expert, the paper
>>>> was poorly explained and I couldn't completely understand it in a
>>>> reasonable amount of time."
>>>>
>>>> This isn't to say that comparing figures for POPL and ICFP is not
>>>> worthwhile -- just that the numbers themselves should be taken with
>>>> a grain of salt.
>>>>
>>>>      - Benjamin
>>>>
>>>>
>>>> On Jan 27, 2010, at 9:11 PM, Norman Ramsey wrote:
>>>>
>>>>
>>>>> [ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list
>>>>>  ]
>>>>>
>>>>>
>>>>>> At the POPL discussion, one goal that was raised was to improve
>>>>>> the number of "expert" reviews per paper.  People are dissatisfied
>>>>>> when their paper is rejected by self-proclaimed non-experts.  I
>>>>>> believe that Jens pointed out that this year's POPL had 77% of
>>>>>> papers with one "X" review.
>>>>>>
>>>>> I went back and got archival data for ICFP 2007.  ICFP is a
>>>>> significantly smaller conference which that year had only 120
>>>>> submissions.  110 of 120 submissions (91%) received at least one X
>>>>> review.  When comparing these data, here are some points to keep in
>>>>> mind:
>>>>>
>>>>>  - ICFP reviewing was double-blind that year.
>>>>>  - Otherwise ICFP used substantially the same review process that
>>>>>    POPL uses now.
>>>>>  - POPL is probably a broader conference than ICFP, which may
>>>>>    make it more difficult to find expert external reviewers.
>>>>>
>>>>> I remember great difficulty in finding external reviewers for papers
>>>>> involving functional programming and XML---many were multi-author
>>>>> papers, and this is a small community with a lot of
>>>>> cross-fertilization, so there were quite a few papers for which all
>>>>> the obvious expert reviewers had conflicts.  (One of the problems
>>>>> with double-blind review is that it makes a prudent program chair
>>>>> more cautious about conflicts of interest.)
>>>>>
>>>>> Norman
>>
>> --
>> .\ Philip Wadler, Professor of Theoretical Computer Science
>> ./\ School of Informatics, University of Edinburgh
>> /  \ http://homepages.inf.ed.ac.uk/wadler/
>>
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.

