IAC 2004 Contest Data Analysis

The pages indexed here contain data from IAC-sponsored regional contests in 2004. The pages are organized as a tree hierarchy:

  1. The top level page contains a list of all contests and summary statistics for all contests.
  2. Clicking on a contest shows a list of categories flown for that contest and summary statistics for that contest.
  3. Clicking on a category shows a list of flights flown in that category along with category results and statistics.
  4. Clicking on a flight shows detailed scores and statistics for that flight.

The regional pages offer a first look at what is going on in the regional events.

The page linked here contains information about errors, corrections, and updates to the data.


The purposes of this work are twofold:

  • First, to get a look inside the workings of TBLP. The summary statistics show how often TBLP adjusts individual scores and overall flight scores from individual judges. The flight pages contain detailed data showing which grades TBLP adjusted. Moving the cursor over the scores exposes the intermediate values TBLP used in doing the math.
  • Second, to compare three methods for ranking pilots.

A look inside of TBLP

The detail figures for TBLP come from an experimental implementation of TBLP (eTBLP) whose results match closely, but not exactly, those of the official scoring program.

The flight detail pages show all pilots' grades from all judges. Individual figure grades and overall scores that eTBLP adjusted are displayed in orange. Grades that eTBLP replaced entirely with the average are displayed in red. Moving the cursor over the figures brings up pop-up frames that contain additional details of the eTBLP calculation.

The page linked here contains additional explanation of the computational details exposed by eTBLP.

Three methods for ranking pilots

The three methods for pilot ranking compared here are as follows:

  • TBLP: the official 2004 rankings, as computed by the official scoring program.
  • The second method, also known as the "average": the method planned for use in 2005.
  • CR: short for "Consensus Ranking." This method places one pilot ahead of another when the majority of judges rank that pilot ahead of the other.

The contests list and the individual contest pages show summary statistics regarding TBLP and how closely results from the three methods correlate. You will find an explanation for those statistics here.

Consensus Ranking

The following result provides an example of consensus ranking. The figures in the columns are the ranks given by each judge. The rows represent the pilots, ordered by consensus rank. Going down the rows, it is easy to verify that the majority of judges rank each pilot ahead of every pilot further down.

pilot           1  2  3  4  5
hubie tolson    1  2  1  3  2
alan bush       2  3  2  1  1
goody thomas    3  1  3  2  3
bob gordon      4  4  4  4  4
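
The majority check described above can be carried out mechanically. The following Python sketch uses the ranks from the example table and counts, for every pair of pilots, how many judges rank the higher-placed pilot ahead. The function names are illustrative only and are not taken from the software behind these pages.

```python
from itertools import combinations

# Ranks from the example table: one entry per pilot, one value per judge.
ranks = {
    "hubie tolson": [1, 2, 1, 3, 2],
    "alan bush": [2, 3, 2, 1, 1],
    "goody thomas": [3, 1, 3, 2, 3],
    "bob gordon": [4, 4, 4, 4, 4],
}


def judges_preferring(a, b):
    """Count the judges who rank pilot a ahead of pilot b (lower rank is better)."""
    return sum(ra < rb for ra, rb in zip(ranks[a], ranks[b]))


def is_majority_order(order):
    """True if most judges rank every pilot ahead of each pilot placed below."""
    n_judges = len(ranks[order[0]])
    return all(
        judges_preferring(a, b) > n_judges / 2
        for a, b in combinations(order, 2)  # a is always placed above b
    )


order = list(ranks)  # table order, top to bottom
for a, b in combinations(order, 2):
    print(f"{a} ahead of {b}: {judges_preferring(a, b)} of 5 judges")
print(is_majority_order(order))
```

For this table every pair is decided by at least three of the five judges, so the listed order satisfies the majority condition.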

The page linked here contains further description of the CR method.


Computing the consensus rank is known in computer science circles as an "NP-complete" problem: in the worst case, the work required grows exponentially with the number of pilots. Andrew Davenport at the IBM T.J. Watson Research Center has developed an efficient pruning algorithm for finding the consensus rank. All of the results in this data set were computed with that algorithm in one or two seconds on a notebook computer. We hope to try larger data sets from the world contests, where there are as many as sixty or seventy pilots in a category.
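
To illustrate why the problem grows so quickly, here is a naive brute-force search in Python. It is not Davenport's pruning algorithm, which is not described on this page. It also assumes a particular objective: the consensus order is taken to be the one that minimizes the number of (judge, pilot pair) cases in which a judge disagrees with the order. That objective and the function names are assumptions made for the sake of the sketch.

```python
from itertools import combinations, permutations


def disagreements(order, judge_ranks):
    """Count (judge, pair) cases where a judge prefers the lower-placed pilot."""
    total = 0
    for a, b in combinations(order, 2):  # a is placed ahead of b
        total += sum(rb < ra for ra, rb in zip(judge_ranks[a], judge_ranks[b]))
    return total


def brute_force_consensus(judge_ranks):
    """Try every possible ordering; feasible only for a handful of pilots."""
    return min(
        permutations(judge_ranks),
        key=lambda order: disagreements(order, judge_ranks),
    )


# The example ranks from this page, deliberately entered out of order.
judge_ranks = {
    "goody thomas": [3, 1, 3, 2, 3],
    "bob gordon": [4, 4, 4, 4, 4],
    "alan bush": [2, 3, 2, 1, 1],
    "hubie tolson": [1, 2, 1, 3, 2],
}
print(brute_force_consensus(judge_ranks))
```

With four pilots the search examines 24 orderings. With ten pilots it would examine 3,628,800, and with sixty the count exceeds 10^81, which is why a pruning method is needed for larger categories.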

The suggestion to try ranking pilots using individual judges' ranks, rather than scores, came from David Harville, also at the IBM T.J. Watson Research Center. Using ranks eliminates differences in the scoring practices of individual judges, yielding a metric that may fairly be compared from one judge to another.
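
A small Python sketch shows the effect of replacing scores with ranks. The scores below are hypothetical, invented only for illustration. The sketch assumes that a higher score is better and that tied pilots share a rank; this page does not say how ties are actually handled.

```python
def scores_to_ranks(scores):
    """Map pilot -> score to pilot -> rank, where rank 1 is the highest score.

    Tied pilots share the same rank (an assumption for this sketch).
    """
    ordered = sorted(scores.values(), reverse=True)
    return {pilot: ordered.index(score) + 1 for pilot, score in scores.items()}


# Hypothetical totals from two judges who use different parts of the scale.
generous = {"pilot A": 92.0, "pilot B": 88.5, "pilot C": 90.0}
strict = {"pilot A": 71.0, "pilot B": 64.0, "pilot C": 69.5}

print(scores_to_ranks(generous))
print(scores_to_ranks(strict))
```

The two judges' raw scores differ by twenty points or more, yet both produce the same ranks, so their opinions can be compared directly.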

Thank you to Lisa Popp and Tom Meyers at IAC for supplying the contest data.

These pages and the experimental TBLP implementation were produced by Douglas Lovell at the IBM T.J. Watson Research Center.

