The classical bootstrap is compared with the Bayesian bootstrap. The underlying difference lies in the distribution over the individual sample weights: the classical bootstrap uses a multinomial distribution, while the Bayesian bootstrap uses a Dirichlet distribution. The Bayesian bootstrap can therefore be considered a smooth version of the classical bootstrap, and the two converge as N grows large. The Dirichlet distribution is the natural prior for probabilities/proportions. The Bayesian bootstrap shown below has higher dispersion; for a statistic like the mean, lower spread is possible by using the Dirichlet weights directly in a weighted mean.
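To make the "smooth version" point concrete, here is a minimal sketch of the two weight distributions (using rdirichlet from the gtools package, loaded here and used throughout): classical bootstrap weights are multinomial counts rescaled by n, hence multiples of 1/n, while Bayesian bootstrap weights are continuous draws from the simplex.
library(gtools) # provides rdirichlet

n <- 4
w.classical <- rmultinom(1, n, rep(1/n, n))[, 1] / n # discrete weights, multiples of 1/n
w.bayes <- rdirichlet(1, rep(1, n))[1, ]             # continuous weights summing to 1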
set.seed(42)
dat <- c(1, 4, 9, 10)
N <- 1e5
mat1 <- matrix(NA, nrow = N, ncol = length(dat)) # classical (multinomial) resamples
mat2 <- matrix(NA, nrow = N, ncol = length(dat)) # Dirichlet-weighted resamples
mat3 <- numeric(N)                               # weighted means with Dirichlet weights
for (i in 1:N) {
  # classical bootstrap: uniform resampling probabilities (the default of sample)
  p.multinomial <- rep(1/length(dat), length(dat))
  mat1[i,] <- sample(dat, length(dat), replace = TRUE, prob = p.multinomial)
  # Bayesian bootstrap: resampling probabilities drawn from a flat Dirichlet
  p.dirch <- rdirichlet(1, rep(1, length(dat)))[1,]
  # higher spread, but more general (applicable to the median etc.)
  mat2[i,] <- sample(dat, length(dat), replace = TRUE, prob = p.dirch)
  # lower dispersion, usable only with a weighted statistic like the mean
  mat3[i] <- weighted.mean(dat, p.dirch)
}
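A quick check of the dispersion claim above (exact values depend on the seed) compares the spread of the bootstrap distribution of the mean under each scheme:
sd(rowMeans(mat1)) # classical bootstrap
sd(rowMeans(mat2)) # Bayesian bootstrap via resampling (higher spread)
sd(mat3)           # Bayesian bootstrap via weighted mean (lower spread)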
Using the boot and bayesboot packages:
library(boot)
library(bayesboot)

mat4 <- boot(dat, function(d, w) mean(d[w]), N)$t[,1] # classical bootstrap via boot
mat5 <- bayesboot(dat, mean, N)$V1                    # Bayesian bootstrap via bayesboot
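The bayesboot package can also use the Dirichlet weights directly instead of resampling, via its use.weights argument; the statistic is then called with the data and the weights, mirroring the lower-dispersion weighted.mean variant above:
mat5w <- bayesboot(dat, weighted.mean, N, use.weights = TRUE)$V1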
The model for the Bayesian bootstrap in JAGS:
bayesian.bootstrap.model <-
"model {
  # flat Dirichlet prior over the weights of the N observations
  p ~ ddirch(rep(1, N))
  for (j in 1:N) {
    # draw each bootstrap index according to the Dirichlet weights
    pick[j] ~ dcat(p[])
    yboot[j] <- y[pick[j]]
  }
  y.samples <- yboot[]
}"
library(rjags)
library(magrittr) # for %>%

jgs <- jags.model(file = textConnection(bayesian.bootstrap.model),
                  data = list('y' = dat, 'N' = length(dat)), n.adapt = 1000)
update(jgs, 1000)
# 3*N iterations thinned by 3 yield N draws of the length-N vector y.samples
out <- jags.samples(jgs, c('y.samples'), n.iter = 3*N, thin = 3)
mat6 <- out$y.samples %>% as.vector %>% matrix(nrow = N, byrow = TRUE)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 0
## Unobserved stochastic nodes: 5
## Total graph size: 17
##
## Initializing model
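As a quick consistency check (a sketch; the exact values depend on the RNG state), the spreads of the package-based and JAGS-based bootstrap distributions of the mean should line up with the loop-based versions above:
sd(mat4)           # boot: should be close to sd(rowMeans(mat1))
sd(mat5)           # bayesboot: should be close to sd(rowMeans(mat2))
sd(rowMeans(mat6)) # JAGS: should also be close to sd(rowMeans(mat2))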