where \(W_i = \sum_{j=1}^K w_{ij}\) is the number of wins for item \(i\), \(n_{ij} = w_{ij} + w_{ji}\) is the number of comparisons between items \(i\) and \(j\), and \(a\) and \(b\) are the shape and rate parameters of a gamma-distributed prior on \(\pi\): \(p(\pi) = \prod_{i=1}^K \mathcal{G}(\pi_i; a, b)\).
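To make the roles of \(W_i\), \(n_{ij}\), \(a\) and \(b\) concrete, here is a minimal base-R sketch of one update step. It assumes the update has the Caron-Doucet EM form \(\pi_i \leftarrow (a - 1 + W_i) / \bigl(b + \sum_{j \neq i} n_{ij} / (\pi_i + \pi_j)\bigr)\). The function name `bt_update`, the 3-item wins matrix and the value `b = 0.3` are made up for illustration; this is not the package's own implementation.

```r
# One EM update of the strengths pi (illustrative sketch, not package code).
# w[i, j] = number of times item i beat item j; the diagonal is zero.
bt_update <- function(pi, w, a = 1, b = 0) {
  W <- rowSums(w)                        # W_i: total wins for item i
  n <- w + t(w)                          # n_ij: comparisons between i and j
  terms <- n / outer(pi, pi, "+")        # n_ij / (pi_i + pi_j)
  diag(terms) <- 0                       # exclude the j == i term
  (a - 1 + W) / (b + rowSums(terms))
}

# Hypothetical wins matrix for three items
w <- matrix(c(0, 2, 1,
              1, 0, 3,
              2, 1, 0), nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Iterate from a flat start; a and b here are arbitrary illustrative values
pi <- rep(1, 3)
for (iter in 1:100) pi <- bt_update(pi, w, a = 1.1, b = 0.3)
pi
```

With `a = 1` and `b = 0` the prior terms vanish and the step reduces to the classical MM update for the maximum likelihood estimate.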
BradleyTerryScalable
Aims
Fit the Bradley-Terry model to large and sparse data sets
Has to be fast enough
Has to be able to deal with cases when the comparison graph is not fully connected
Easy to use, both in interface and workflow
Workflow: data
btdata(x) to create object of class "btdata"
x can be a data frame, graph, matrix or contingency table
may need to call codes_to_counts() first
summary(btdata)
select_components(btdata, subset)
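The data steps above might be chained as follows for the package's `toy_data`. This is a sketch under assumptions: the order of the outcome codes passed to `codes_to_counts()` (item 1 wins, item 2 wins, draw) and the use of a function as the `subset` argument are recalled from the package documentation and should be checked against `?codes_to_counts` and `?select_components`. The object names `toy_data_4col` and `toy_btdata_sub` are illustrative.

```r
library(BradleyTerryScalable)

# toy_data has columns player1, player2 and outcome (codes W1, W2, D)
data(toy_data)

# Convert outcome codes to win counts; code order assumed to be
# (item 1 wins, item 2 wins, draw)
toy_data_4col <- codes_to_counts(toy_data, c("W1", "W2", "D"))

# Build the btdata object and inspect its connectivity
toy_btdata <- btdata(toy_data_4col)
summary(toy_btdata)

# Keep only the components with more than one item
# (assumes subset may be a function applied to each component)
toy_btdata_sub <- select_components(toy_btdata, function(x) length(x) > 1)
summary(toy_btdata_sub)
```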
Workflow: fit
btfit(btdata, a) to fit model and create object of class "btfit"
If a = 1, finds the MLE on each fully connected component
If a > 1, finds the MAP estimate of \(\pi\)
Methods for btfit object:
summary, fitted, coef, vcov, simulate
btprob(object) for Bradley-Terry probabilities \(p_{ij}\)
Number of items: 4
Density of wins matrix: 1
Fully-connected: TRUE
toy_data
player1 player2 outcome
1 Cyd Amy W1
2 Amy Ben D
3 Ben Eve W2
4 Cyd Dan W2
5 Ben Dan D
6 Dan Eve W2
7 Fin Eve W2
8 Fin Gal W2
9 Fin Han W2
10 Eve Gal W1
11 Fin Gal D
12 Han Gal W1
13 Han Gal W2
14 Amy Dan W1
15 Cyd Amy W1
16 Ben Dan D
17 Dan Amy W2
Number of items: 8
Density of wins matrix: 0.25
Fully-connected: FALSE
Number of fully-connected components: 3
Summary of fully-connected components:
Component size Freq
1 1 1
2 3 1
3 4 1
toy_fit_MAP <- btfit(toy_btdata, a = 1.1)
summary(toy_fit_MAP)
$call
btfit(btdata = toy_btdata, a = 1.1)
$item_summary
# A tibble: 8 × 3
component item estimate
<chr> <chr> <dbl>
1 full_dataset Eve 1.90
2 full_dataset Cyd 0.472
3 full_dataset Han 0.245
4 full_dataset Amy -0.0766
5 full_dataset Gal -0.102
6 full_dataset Ben -0.423
7 full_dataset Dan -0.536
8 full_dataset Fin -1.48
$component_summary
# A tibble: 1 × 4
component num_items iters converged
<chr> <int> <int> <lgl>
1 full_dataset 8 101 TRUE
toy_fit_MLE <- btfit(toy_btdata, a = 1)
summary(toy_fit_MLE, SE = TRUE)
$call
btfit(btdata = toy_btdata, a = 1)
$item_summary
# A tibble: 7 × 4
component item estimate SE
<chr> <chr> <dbl> <dbl>
1 2 Han 0.696 0.911
2 2 Gal 0.413 0.768
3 2 Fin -1.11 1.05
4 3 Cyd 0.592 0.991
5 3 Amy 0.0325 0.699
6 3 Ben -0.243 0.944
7 3 Dan -0.382 0.712
$component_summary
# A tibble: 2 × 4
component num_items iters converged
<chr> <int> <int> <lgl>
1 2 3 6 TRUE
2 3 4 10 TRUE
btprob(object)
Gives the Bradley-Terry probabilities \(\frac{\pi_i}{\pi_i + \pi_j}\)
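As a hand check of this formula, the probabilities can be computed in base R from the component 3 estimates in the MLE summary. The sketch assumes the `estimate` column is on the log scale, \(\lambda_i = \log \pi_i\), which is consistent with the estimates in each component summing to roughly zero. It does not call `btprob()` itself.

```r
# Estimates for component 3, copied from the MLE summary
# (assumed to be log-strengths, lambda_i = log(pi_i))
lambda <- c(Cyd = 0.592, Amy = 0.0325, Ben = -0.243, Dan = -0.382)
pi <- exp(lambda)

# p[i, j] = pi_i / (pi_i + pi_j): probability that row item beats column item
p <- outer(pi, pi, function(x, y) x / (x + y))
diag(p) <- NA   # an item is never compared with itself

round(p, 3)
```

Each pair satisfies \(p_{ij} + p_{ji} = 1\), and \(p_{ij}\) depends only on the difference \(\lambda_i - \lambda_j\), so the centring of the estimates does not affect the probabilities.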
Number of items: 27137
Density of wins matrix: 0.0005642131
Fully-connected: FALSE
Number of fully-connected components: 11297
Summary of fully-connected components:
Component size Freq
1 1 11285
2 2 10
3 3 1
4 15829 1