OxfordCroquet Logo

Dr Ian Plummer

Donation Button

http://www.oxfordcroquet.com/tech/appleton/index.asp
Technical
May the Best Man Win?

The Statistican (1995) 44, No. 4, pp. 529-538

By DAVID R. APPLETON

University of Newcastle upon Tyne, UK

[Received March 1995. Final revision July 1995]

Summary

Simulations of various kinds of sporting tournament have been carried out to assess their relative ability to produce as winner the best of the entrants. A strong contender when it is necessary to play relatively few games is the seeded draw and process. When the players are closely matched and there can be more games the round robin played twice is most effective.

Keywords: Croquet; Paired comparisons; Sporting tournaments

1. Introduction

'It takes a good team to win the league, but you need luck to win the cup.' 'You get some funny results over five frames but the best snooker player will win the first to 19.' 'I don't know why they bother seeding tennis tournaments; the seeds always seem to get knocked out.' Observations like these are the clichés of the sports fan, and it is the intention of this paper to look at such points and to assess their validity. What we shall say will often have applicability to many sports, but it will in fact refer almost exclusively to Association Croquet, not only because it is the sport with which the author is most familiar, but also because the ingenuity of managers of croquet events in devising new formats gives us the opportunity to look at many different kinds of tournament.

It is not our intention to describe the game itself; several books are available, from the simple (Mulliner, 1989), through the more detailed (Lamb, 1990) to one which makes it clear that only those with a highly developed interest in decision theory should attempt to play (Wylie, 1985). What we must do, however, is point out that a game is either won or lost-there are no draws-and that the score is generally agreed to be irrelevant: a player may win a close game 26-25, or 26-9 or even (some would maintain) 26-0. The reason for this is that (as in snooker) a game may begin with defensive tactics on either side, then one player may gain a slight initiative and make a break which (possibly after another spell of evenly matched play) proves decisive. The loser may not have done much wrong.

Our main aim will be to determine which form of tournament is most likely to produce as winner the best of the entrants, and we shall do this by simulating various situations. Nevertheless we shall remember that many tournaments are also devised to make sure that all entrants have a reasonable amount of play, and that who wins a friendly week-end event is relatively unimportant. Prestigious events might take place over a week or more, though some are scheduled for a long week-end; club events might involve intermittent matches throughout a season. These considerations affect how many matches it is feasible to schedule for a tournament, and hence its reliability of finding the 'right' winner. Most croquet games take less than three hours, and indeed such a limit may be imposed; some take less than an hour. This variability makes it difficult to estimate how many games or rounds can be completed in a given time.

In Section 2 we shall describe some tournament designs and in Section 3 discuss how ties for first place may be resolved for those designs which do not guarantee a unique winner. We shall then give details of the simulation methods, present the results and conclude with a discussion of the relevance of the simulations to real life.

We shall refer throughout to the (English) Croquet Association as the CA, and to its Scottish equivalent as the SCA.

2. Types of tournament

2.1. The knock-out

The knock-out format is important in its own right and also provides the basis for several other designs of tournament. The format is too well known to need much explanation. If there are n players then n - 1 matches are required, taking [log2 n] rounds. If n is not an integer power of 2 some players receive byes into the second round, which has the unfortunate effect of giving players different numbers of matches; this may be done by the Bagnall-Wild method (Croquet Association, 1989), but we shall not consider its repercussions further here as in the tournaments we have in mind it is usually possible to have either eight or 16 players. The event culminates in a final, which is good for spectators. The CA play their men's and women's tournaments in this format.

2.2. Best-of-three knock-out

Losing a single game, for which an unaccountable stroke of ill fortune may be responsible, is enough to cause elimination from a knock-out tournament if each match consists of only one game. A greater degree of reliability is achieved by playing each match as the best of three games. The extent of this will be investigated. However, apart from taking longer to play, this modification makes the tournament more difficult to schedule as time in each round must be allowed for three games when only two may be required especially in the latter stages when there are fewer matches (although they may be closer), The CA Open Championship is played as a best of three.

It is easy to show that if a player has probability p of winning a single game he has probability p2(3 - 2p) of winning a best-of-three match if results are independent. If we write p = ½ + d this is ½ + 3/2 d - 2d 3.

2.3. Seeded knock-out

Seeding is a well-known way of trying to ensure that the best players will meet late in the tournament and that the final stages will therefore provide the highest quality of play and most excitement. If games go to form, the number one seed (the pre-tournament assessment of the best player) should meet the number two seed in the final, the number four seed in the semi-final, the number eight before that, etc. The second seed should meet the third in the semi-final, number seven in the previous round, etc. There are different ways of organizing seeding; the following (for 16 players) illustrates the scheme used in this study: 1 versus 16; 8 versus 9; 4 versus 13; 5 versus 12; 11 versus 6; 14 versus 3; 10 versus 7; 15 versus 2. It can be difficult to justify seeding unless there is a well-accepted ranking list of the entrants.

2.4. Draw and process

The draw and process arrangement is another attempt to reduce the effect of a single stroke of luck: an individual must lose two games before being eliminated. A draw is made for a knock-out in the usual way, and a second independent knock-out (the process) is formed deterministically from the first. If the order of players in the draw is 1 versus 2, 3 versus 4, ... , 15 versus 16, then the order in the process is 1 versus 9, 5 versus 13, 3 versus 11, 7 versus 15, 2 versus 10, 6 versus 14, 4 versus 12, 8 versus 16. Similar schemes exist for other numbers of entrants (Croquet Association, 1989). What is ensured is that any players who meet in the first or second rounds of the draw cannot meet until the final or semi-final respectively of the process (and vice versa for the first two rounds of the process). If the same player wins both knock-outs, there is no need for a play-off, the lack of which can be a slight anticlimax; otherwise two players have lost only one match each and they play to decide the overall winner.

2.5. Seeded draw and process

The seeded draw and process is a combination of the previous two schemes. The first two seeds are scheduled to meet in both finals. If this occurs there is essentially a best-of-three final consisting of the final matches of the draw and the process, and the play-off.

2.6. Round robin

In contrast with the knock-out (cup) format, the league format is at the other end of the spectrum in terms of matches played. Also known as an American block, this format involves each player playing each of the others once, so that there are n - 1 rounds and ½ n(n - 1) matches. It is therefore impractical, in terms of both playing time and facilities, for most tournaments with a substantial number of players but is feasible for eight players at a week-end tournament. With players over a wide range of ability there may be many games of little interest. Used throughout a season it can prove difficult to complete in some sports once players have no chance of winning it. There is no final to provide a focal point for the tournament, and there may not be a unique winner, but unlike the previous designs it offers each player the same number of matches.

Many books on graph theory use the round robin tournament as an application of a directed graph; a useful summary of the main points is given by Harary and Moser (1966).

2.7. Repeated round robin

The repeated round robin is a popular choice for croquet invitation events with small numbers of players. The CA's Masters tournament and the SCA's Chairman's Salver both use this form. With eight good players it should be possible to play the 14 rounds required over four days, though there is still the possibility that there will be a tie for first place.

2.8. American blocks with play-off

A mixture of the two main forms is provided by having blocks of 4, the winners of which go through to a knock-out. The main problem with this format is the very strong possibility that the six matches per block will fail to provide a unique winner to proceed to the next stage.

2.9. Swiss

The Swiss style of tournament also has some of the attributes of the knock-out and the round robin. A draw is made as for a knock-out tournament, but only the first round is played. Everyone participates in the second round, in which the players are ordered in terms of their number of wins, ties being resolved by maintaining the order of the previous round (or possibly of the original draw). This system continues, except that two players may not meet twice in the tournament. It is usual to play at least two rounds more than would be the case with a knock-out, so that a player may lose once (or more) and still win the tournament. We shall provide an example of a tournament (with eight players), because this is the most difficult type of event to manage. It has the advantage that all players have the same number of matches, and since players usually meet opponents with a similar record in the tournament fewer matches than in a round robin are likely to be one sided. The last round may see the two principal contenders in opposition, but frequently does not because they have played previously. Again there may be a tie for first place.

Suppose that the ties in the first round are as follows, and the winners are shown in bold type: 1 versus 2; 3 versus 4; 5 versus 6, 7 versus 8. Then the second round (with winners again shown) is 1 versus 3, 6 versus 7, 2 versus 4 and 5 versus 8. The natural order for the third round is 1 versus 6, 3 versus 2, 7 versus 8 and 4 versus 5, but players 7 and 8 have already met. As little disruption as possible is caused to the top of the table, and the matches are in fact 1 versus 6, 3 versus 2, 7 versus 5 and 8 versus 4. The present state in terms of games won is 1 (3), 6 (2), 2 (2), 8 (2), 3 (1), 7 (1), 5 (1) and 4 (0). The first player in the list that player 1 has still to play is number 8; number 6 can play number 2; number 3 can play 7 and 5 can meet 4. After 1 versus 8, 6 versus 2, 3 versus 7 and 5 versus 4 we have 1 (4), 2 (3), 8 (2), 6 (2), 7 (2), 5 (2), 3 (1) and 4 (0). For the fifth and final round 1 has only 4, 5 and 7 to play and 7 is highest of these; should number 2 play number 8, number 6 cannot play 5 because they have already met, nor number 3, as this leaves 5 and 4 who have just played one another, but 6 can play 4 and 5 finishes by playing 3. The algorithm for creating a round when the obvious games lead to problems can take several forms, but the exact details are not very important as long as players are confident that a consistent algorithm is being used. The method given by Croquet Association (1989) treats the top and bottom of the table alike; although this is pleasingly symmetric it is illogical if it is more important to ascertain who comes first than who comes last.

An interesting feature of the Swiss tournament is that it can be made the basis of a schedule for a round robin. If we make the assumption that the higher player in the draw always wins, and we treat the bottom of the table as being of equal importance as the top (to maintain symmetry) we can derive a pattern such as that for 12 players (A - L) shown in Table 1. Such symmetry can only be obtained if the number of players is a multiple of 4. The rounds may, of course, be played in any order, so they have been denoted by symbols.

matrix for round robin
Scheme for a round robin for 12 players, derived from extending a Swiss tournament until each player has met all the others.

2.10. Other formats

The descriptions above are of the types of tournament which have been simulated; there are many minor variants. If there are blocks followed by a knock-out the blocks may be seeded or the subsequent knock-out may be best of three, both of which adaptations occur in the Australian Open croquet championship. There may also be consolation events of practically any description after one of the knock-out forms. Sometimes a Swiss tournament is run simultaneously with the main knock- out event and continues as a 'plate' event while a best-of-three final is held. One interesting version known as a Swiss robin tries to simplify the management of a Swiss event by establishing a round robin design as above, drawing the first round at random and determining subsequent rounds by those implied by what would be the top pairing in a Swiss tournament (Hunter, 1990). In 1993 the SCA Open consisted of a Swiss tournament to determine which four players would play off in the semi-finals and final. It could have been anticipated that there would be a tie for fourth place (by four players in fact), necessitating further rounds. The World Championship has begun with blocks of 7, with four players from each progressing to the next stage; again there is likely to be a tie for fourth place in at least one block, At present it has the rather strange feature of letting four of the five players in each block progress to a knock-out stage, using the blocks essentially as a method of seeding that knock-out.

Other formats, including the popular 'Egyptian' system (Hands, 1990), are designed to fit in as many games as possible rather than to determine a winner and will not be considered here; nor will the 'ladder' style of event which is too poorly defined to admit analysis.

3. The question of ties

As we have indicated, any scheme which involves a round robin or a Swiss element may lead to two or more players tying for first (or another) place. If time allows there may be a play-off, which is the most satisfactory method of resolving a tie between two players, but with more than two this option is unlikely to be available, and even with two it frequently happens that this last game must be rushed and is not a fair reflection of the players' abilities. In the simulation studies we have resolved ties by lot, so the frequency with which such schemes identify the best player will be slightly underestimated in comparison with having a proper match. The effect of this will be quantified, but since the option of an extra game may be impossible in practice such an approach is quite reasonable.

Another common way of resolving ties is to give the title to any player who has beaten all the others in the tie. This is only likely to be workable if two players tie in a Swiss tournament or round robin where they have met only once, but though it is an intuitively plausible rule it is in fact illogical, as we shall show with a very small example.

Suppose that in a small tournament of five players A has beaten B, C and D, B has beaten C, D and E, C has beaten D and E, D has beaten E, and E has beaten A. A and B therefore tie for first place and since A beat B he is the overall winner. However, should not the fact that A has lost to the worst player in the field penalize him with respect to B? A possible approach is via Bradley and Terry's (1952) model, which in this context assumes that when n players meet (in pairs) there are n parameters (pi)i >= 0 such that in a game between players i and j the odds of a win for player i are (pi)i/(pi)j. Since these parameters are only identifiable to within a scale factor we may impose an arbitrary condition, e.g. that min{(pi)i = 1. If we define (theta)i  = log (pi)i we may evaluate the (theta)i  by using the GLIM program in Appendix A.

This gives values (theta)a = (theta)b = 2.024, (theta)c = 1.012 and (theta)d = (theta)e, = 0, showing that in the view of this plausible model A and B cannot be separated. However, it might cause some dissension at most tournaments if the Bradley-Terry model were used, particularly if the only competitor who understood it was one of those in the tie and he was applying it on his portable computer! There is also a potential problem in identifiably if a player loses all his games.

A simpler way of attempting to break ties is to count the number of games won by each player's defeated opponents. This is one of the many techniques which have come to croquet from chess tournaments (Kazic (1980), p. 64) and is designed to credit a player with beating the better rather than the worse players. It has the added advantage of making all games in a tournament likely to contribute towards the identification of the winner, thereby adding interest to games which might otherwise have little. However, as can be seen from the example above, it may be at odds with the Bradley-Terry model. Better is the method proposed by David (1987) where each player is given a score of their own wins minus their own losses plus the total number of wins of players that they beat minus the total number of losses of players they lost to. This calculation gives a score of 5 to each of players A and B in the above example.

4. Simulation methods

Since our intention is to compare tournament designs, and not to estimate the chance of the best player winning any particular tournament, we may within reason take whatever model of determining winners that we please. What we have assumed is that there is an underlying rating for each player in a tournament and that these ratings are normally distributed (without loss of generality with zero mean and unit variance). For each game a player's actual rating is his underlying rating with a normal error with mean 0 and variance (rho)2 . If the ratings for two opponents are then xA and xB the probability that A beats B is assumed to be (Phi){k(xA - xB)} for some k, where (Phi)( ) is the cumulative normal distribution function. We shall report briefly the effect of changing (rho) in the context of a knock-out tournament and thereafter give results for (rho) = 0.2. We have used k = 1 to simulate tournaments with considerable disparity of ability and k = 0.25 for those, such as invitation events or events played to handicap, where the standard is more homogeneous. We have considered only tournaments of eight or 16 players. As indicated earlier, when a tie occurs it is broken at random, but results will be reported also for cases when there was a unique winner. When seeding is involved it is assumed that it is done perfectly.

In each case 10,000 simulations were carried out (with different ratings for the participants), so two standard errors of the percentages given correspond to at most one percentage point. All programs were written in APL which uses the random number generator x(n+ 1) =75x(n) (mod 231 - 1).

5. Results

Table 2, row (a), shows the effect of changing the within-player variance relative to the between- player variance for a knock-out tournament of eight players. Not surprisingly the best player wins most often the more consistently players perform. Row (b) shows how the games per match increase with increasing error variance in a best-of-three tournament, as does the average number of games lost by the winner (row (c)). Row (d) shows the frequency of ties in the block stage of a blocks and play-off tournament.

TABLE 2
Effect of varying the within-player variability*

Results for the following values of (rho) :
 
0
0.2
0.5
1
2
(a)
59
58
54
44
32
(b)
2.28
2.29 
2.32
2.38
2.45
(c)
0.74
0.78
0.87
1.05 
1.29
(d)
68
67
61
48
34
(a) Percentage of knock-out tournaments won by the best player;
(b) mean number of games played per match in best-of-three tournaments;
(c) mean number of games lost by the winner in the same tournaments;
(d) percentage of tournaments played as two blocks of four plus a final, where blocks had unique winners.
For all designs there are eight players and k = 1.

The principal results, detailing the percentage of times on which the best player won each style of tournament, are shown in Table 3, and Table 4 gives some further details about Swiss and round robin tournaments. Table 5 shows the percentage of times that there is a unique winner, and the number of games lost by that winner. Of the 25% of occasions when there is no outright winner of a round robin ((rho) = 0.2 and k = 1), the best player is involved in a tie with one other player on 63% (16% of the total) and with more than one other on a further 22% (5% of the total).

TABLE 3
Percentage of times that the best player won each style of tournament outright**

Type of tournament
Frequency (%) for 8 players  
Frequency (%) for 16 players 
 
k = 1
k = 0.25 
k = 1
k = 0.25
Knock-out 
58 
27
51
19
Seeded knock-out
62
30
56
22
Blocks of 4 plus play-off
57
26
51
19
Draw and process
67
31
60
23
Seeded draw and process
71
34
63
25
Best-of-three knock-out 
68
33
62
26
Swiss with random tie-break
59
29
50
21
Round robin with random tie-break
61
31
62
30
Round robin played twice with random tie-break
71
40
71
39

** based on 10 000 simulations of each with (rho) = 0.2 (see text). Tournaments are ordered by the total number of games played.

TABLE 4
Further results for Swiss and round robin tournaments using the same simulations as for Table 3

Type of tournament 
Frequency (%) for 8 players 
Frequency (%) for players 16 players
 
k = 1 
k = 0.25 
k = 1
k = 0.25
Swiss        
Tournaments with unique winner
68
57
62
47
Best player sole winner
48
18
40
12
Best player winner after random tie-break
59
29
50
21
Best player at least shares first place
74
41
70
33
         
Round robin        
Tournaments with unique winner  
75
63
77
69
Best player sole winner 
52 
22
53
23
Best player winner after random tie-break
61
31
62
30
Best player at least shares first place
73
43
72
39

TABLE 5
Percentage of tournaments of different types where there was a unique winner, and games lost by that person+

Type of tournament 
% for the following numbers of games lost: 
Total
 
0
1
2
3
4
> 4
 
8 players              
Draw and process  
44
56
       
100
Best-of-three knock-out
42
40
16
2
   
100
Swiss
30
27
       
57
Round robin 
11
40
11
     
63
Round robin played twice
< 1
4
13
25
25
7
75
               
16 players              
Draw and process  
37
63 
       
100
Best-of-three knock-out
36 
38
20
6
   
100
Swiss
30
17
       
47
Round robin 
< 1
4
16 
27
19 
2
69
Round robin played twice
13
20
21
17
10
6
87

+ k = 0.25, (rho) = 0.2.

Fig. 1 relates the probability that the best player wins to the total number of matches played for toumwnents with eight players and k = 1, (rho) = 0.2. The arrows indicate the effect of playing a single extra game as a tie-breaker when only two competitors have tied for first place.

games vs probability of best player winning

Fig. 1. Probability that the best player of eight wins as a function of the number of games in the tournament (k = 1, (rho)  = 0.2): ^ effect of having a single deciding game in the event of a tie between two competitors

Fig, 2 shows how the probability of winning a particular type of tournament differs for players of all rankings. The tournaments illustrated are for eight players of comparable abilities (k = 0.25, (rho) = 0.2).

probability distribution

Fig. 2. Probabilities that each of the eight competitors wins a particular type of tournament as a function of their actual ranking (k = 0.25, (rho)  = 0.2) (since only outright winners are considered, the probabilities only sum to 57% for the Swiss and 75% for the round robin tournament played twice): ? knock-out; ? seeded draw and process; ? round robin played twice; ? Swiss

6. Discussion

It cannot be denied that many of the most fascinating aspects of competition have been excluded from this investigation. What psychological effects do knock-out tournaments engender compared with round robins? Is either player favoured in the deciding game of a best of three? Is it more difficult to win as a top set or as a defending champion? Are there particular styles of play such that, although A is likely to beat B and B to beat C, nevertheless C is likely to beat A? Is it possible that what really, makes winners is their smaller within-subject variability (as well as general ability)? There is no doubt that these are interesting questions to the sportsman, but we must assume that they are second-order effects in the context of our present investigation.

Similarly, if our purpose had been to find the probability that the best player would win a particular tournament we would have had to justify by recourse to data the assumptions that we have made - but how can we tell who the best player is? Since our aim has been to compare formats we believe our assumptions to be adequate, but we shall emphasize that the results in the tables are for comparative purposes only. The assumption of normality of the distribution of players' ratings is not important in a comparative study: if the distribution is actually skew to the right the best player is more likely to win, and conversely if it is skew to the left he is less likely to.

The results that we have obtained can certainly be used to justify many of the current designs of croquet tournaments. Where the spread of ability is small a good case can be made for the use of the round robin tournament played twice, as long as the title at stake is sufficiently prestigious to justify the amount of time which will be taken up by this format. It is probably out of the question for more than eight (or at most 10) players. The best alternative in most circumstances appears to be the draw and process, and this is the more effective if it is seeded-though because of that it is important that there should be good grounds for the seeding. It compares well with the best-of-three knock-out format and is as easy to schedule, though the absence of a guaranteed final could be a problem, and it should be noted that for eight players there are as many rounds, though not as many matches, as for a round robin.

It is possible to expand slightly on the effect of seeding. We can use Spearman's rank correlation coefficient Rs to summarize the agreement between the correct seeding and the actual draw order in an ordinary knock-out tournament. Since a negative correlation is just as good as a positive correlation (getting the seeding completely wrong produces the correct games), we are only interested in the absolute value of Rs. Its expectation for a sample of size 8 can be shown to be 0.31.

Thus, from Table 3 (k = 3) wwe can see the when Rs = 1 the best player wins 62% of tournaments; when Rs = 0.31 he only wins 58%. If we were to have the worst seeding possible (i.e. Rs = 0) he would win about 56%.

The Swiss tournament is a useful format, as it performs reasonably well in comparison with the round robin in eliciting the best player as winner, and it gives all players an equal number of games, an important criterion if players have travelled long distances. Its main failing is that of the round robin, but more pronounced: its frequent inability to produce a unique winner and the consequent need for a tie-break or the sharing of a trophy. If that is not regarded as a major problem it may be considered to do very well, in as much as the best player often at least ties for first place.

The results in Table 5 indicate that in terms of demanding of the winner that he win every game the competing formats vary. The draw and process and the best-of-three schemes are similar and are more tolerant to a single loss than is the Swiss system. The larger number of games in a round robin, even with eight players, means that a loss is well tolerated, and when this is played twice the winner is likely to have lost several games.

Although the method of allocating wins in the simulation exercise was not based on data from real tournaments, the results suggest quite reasonably that the 'favourite' in a competitive field of 16 might start at odds of about 3:1 or 4:1 against in one of the shorter tournaments, but that more than evens in a round robin tournament played twice would be generous. However, the corresponding results for entrants of more widely varying ability (where for eight players the best has about twice the chance of winning than in the more homogeneous case) indicate the dangers of extrapolating the numbers that we have produced in any other than a comparative sense.
 

Appendix A: GLIM program to apply Bradley-Terry model to example of Section 3

$units          10       $data        a b c d e    res $read

1      -1          0          0         0         1

1        0        -1          0         0         1

1        0         0         -1         0         1

1        0         0          0        -1         0

0        1        -1          0         0         1

0        1         0         -1         0         1

0        1         0          0         1         1

0        0         1         -1         0         1

0        0         1          0         -1        1

0        0         0          1         -1        1

$calc n=1  $yvar     res  $err b  n $fit a + b + c + d - 1  $disp  e $stop

References

Bradley, R. A. and Terry, M. E. (1952) Rank analysis of incomplete block designs: I, The method of paired comparisons. Biometrika, 39, 324-345.
Croquet Association (1989) The Laws of Association Croquet and Golf Croquet and the Regulations for Tournaments, 5th edn. London: Croquet Association.
David, H. A. (1987) Ranking from unbalanced paired-comparison data. Biometrika, 74, 432-436.
Hands, P. (1990) The 'Egyptian' system. Croquet, 207, 26-27.
Harary, F. and Moser, L. (1966) The theory of round-robin tournaments. Am. Math. Mnthly , 73, 231-246.
Hunter, A. (1990) The Swiss robin. Bull. Scot. Croquet Ass., 40, 6.
Kazic, B. M. (1980) The Chess Competitors' Handbook. London: Batsford.
Lamb, W. E. (1990) Croquet: the Skills of the Game. Swindon: Crowood.
Mulliner, S. (1989) Croquet (Play the Game). London: Ward Lock.
Wylie, K. F. (1985) Expert Croquet Tactics. Southampton: Unitset.

Author: David Appleton
All rights reserved © 1995


Updated 28.i.16
About, Feedback
www.oxfordcroquet.com/tech/appleton/index.asp
on www.oxfordcroquet.com
Hits: 21493