Nonparametric testing for agreement among several judges
MAROZZI, Marco
2012
Abstract
This paper addresses whether the rankings of a set of objects given by several judges show agreement or are more or less independent. The most familiar measure of concordance is Kendall's W coefficient. The classical tests for concordance are the Friedman and F tests. Legendre (2005) showed via simulation that the Friedman test is too conservative and less powerful than its permutation version, but his study was very limited. This paper substantially extends Legendre's study. It shows that the Friedman test is too conservative and less powerful than both the F test and the permutation test for concordance, which always have correct size and very similar power. The F test should be preferred because it is computationally much simpler.
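The quantities named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it uses the standard textbook formulas for Kendall's W, the Friedman statistic m(n-1)W, and the F statistic (m-1)W/(1-W), for m judges ranking n objects without ties. The function names are hypothetical. Only the permutation p-value is computed; the reference distributions for the other two tests (chi-square with n-1 degrees of freedom, and F with n-1-2/m and (m-1)(n-1-2/m) degrees of freedom) are left out because the Python standard library does not provide them.

```python
import random


def kendall_w(ranks):
    """Kendall's W for m judges (rows) ranking n objects 1..n, assuming no ties."""
    m, n = len(ranks), len(ranks[0])
    # Sum of the ranks each object received across judges.
    totals = [sum(judge[j] for judge in ranks) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def friedman_statistic(ranks):
    """Friedman chi-square statistic expressed through W: m * (n - 1) * W."""
    m, n = len(ranks), len(ranks[0])
    return m * (n - 1) * kendall_w(ranks)


def f_statistic(ranks):
    """F statistic for concordance: (m - 1) * W / (1 - W)."""
    m = len(ranks)
    w = kendall_w(ranks)
    if w >= 1.0:
        return float("inf")  # perfect agreement, the ratio is unbounded
    return (m - 1) * w / (1 - w)


def permutation_p_value(ranks, n_perm=999, seed=0):
    """Permutation p-value for W: shuffle each judge's ranks independently."""
    rng = random.Random(seed)
    observed = kendall_w(ranks)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        shuffled = [rng.sample(judge, len(judge)) for judge in ranks]
        # Small tolerance guards against floating-point noise in ties of W.
        if kendall_w(shuffled) >= observed - 1e-12:
            hits += 1
    return hits / (n_perm + 1)


if __name__ == "__main__":
    # Illustrative data only: three judges ranking four objects.
    example = [[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]]
    print("W        =", kendall_w(example))
    print("Friedman =", friedman_statistic(example))
    print("F        =", f_statistic(example))
    print("perm. p  =", permutation_p_value(example))
```

The permutation test here is a Monte Carlo approximation whose cost grows with `n_perm`, whereas the F statistic is a single closed-form expression, which is the computational contrast the abstract points to.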