The BUGS Project Examples Volume 2
The BUGS Project Examples Volume 2
The BUGS Project Examples Volume 2
Examples Volume 2
Dugongs: a nonconjugate, nonlinear model Orange trees: a hierarchical, nonlinear model
Multivariate Orange trees: a hierarchical, nonlinear model
Biopsies: latent class model Eyes: normal mixture model Hearts: a mixture model for count data Air: covariate measurement error Cervix: case-control study with errors in covariates Jaw: repeated measures analysis of variance Birats: a bivariate Normal hierarchical model Schools: muliivariate hierarchical model of examination results Ice: non-parametric smoothing in an age-cohort model Beetles: logistic, probit and extreme value models Alli: multinomial logistic model Endo: conditional inference in case-control studies Stagnant: a change point problem Asia: an expert system References: Sorry - an on-line version of the references is currently unavailable. Please refer to the existing Examples documentation available from http://www.mrc-bsu.cam.ac.uk/bugs.
[1]
Examples Volume II
Dugongs
The data are length and age measurements for 27 captured dugongs (sea cows). Carlin and Gelfand (1991) model this data using a nonlinear growth curve with no inflection point and an asymptote as Xi tends to infinity: Yi ~ Normal(i, ), i = 1,...,27 i = - Xi , > 1; 0 < < 1 Standard noninformative priors are adopted for , and , and a uniform prior on (0,1) is assumed for . However, this specification leads to a non conjugate full conditional distribution for which is also non log-concave. The graph and corresponding BUGS code is given below
alpha
beta
gamma
U3
tau
mu[i]
x[i]
sigma
Y[i] for(i IN 1 : N)
[2]
Dugongs model { for( i in 1 : N ) { Y[i] ~ dnorm(mu[i], tau) mu[i] <- alpha - beta * pow(gamma,x[i]) } alpha ~ dnorm(0.0, 1.0E-6) beta ~ dnorm(0.0, 1.0E-6) gamma ~ dunif(0.5, 1.0) tau ~ dgamma(0.001, 0.001) sigma <- 1 / sqrt(tau) U3 <- logit(gamma) }
Examples Volume II
model { for( i in 1 : N ) { Y[i] ~ dnorm(mu[i], tau) mu[i] <- alpha - beta * pow(gamma,x[i]) } alpha ~ dnorm(0.0, 1.0E-6) beta ~ dnorm(0.0, 1.0E-6) logit(gamma) <- U3 tau ~ dgamma(0.001, 0.001) sigma <- 1 / sqrt(tau) U3 ~ dnorm(0, 1.0E-4) }
list(alpha = 1, beta = 1, tau = 1, U3 = 0) mean sd MC_error val2.5pc median val97.5pc start sample
[3]
Examples Volume II
U3 alpha beta gamma sigma 1.912 2.665 0.9753 0.8684 0.09871 0.2609 0.07564 0.07757 0.02941 0.01474 0.01072 0.002835 0.00325 0.00123 2.373E-4 1.415 2.544 0.8274 0.8046 0.07482 1.904 2.655 0.9752 0.8704 0.09716 2.459 2.848 1.132 0.9212 0.1321 2001 2001 2001 2001 2001 9000 9000 9000 9000 9000
Dugongs
[4]
Otrees
Examples Volume II
i1 = log( i1) i2 = log( i2 + 1) i3 = log( i3) The BUGS code is as follows model { for (i in 1:K) { for (j in 1:n) { Y[i, j] ~ dnorm(eta[i, j], tauC) eta[i, j] <- phi[i, 1] / (1 + phi[i, 2] * exp(phi[i, 3] * x[j])) } phi[i, 1] <- exp(theta[i, 1]) phi[i, 2] <- exp(theta[i, 2]) - 1 phi[i, 3] <- -exp(theta[i, 3]) for (k in 1:3) { theta[i, k] ~ dnorm(mu[k], tau[k]) } } tauC ~ dgamma(1.0E-3, 1.0E-3) sigmaC <- 1 / sqrt(tauC) varC <- 1 / tauC for (k in 1:3) { mu[k] ~ dnorm(0, 1.0E-4) tau[k] ~ dgamma(1.0E-3, 1.0E-3) sigma[k] <- 1 / sqrt(tau[k]) } }
[5]
Examples Volume II
Otrees
The current point Metropolis algorithm is used to sample the theta parameters in this model. The Gaussian proposal distribution used for this algorithm adapts for the first 4000 iterations and these samples are discarded from the summary statistics. A further 1000 update burn-in followed by 10000 updates gave the following parameter estimates:
mu[1] mu[2] mu[3] sigma[1] sigma[2] sigma[3] sigmaC theta[1,1] theta[1,2] theta[1,3] theta[2,1] theta[2,2] theta[2,3] theta[3,1] theta[3,2] theta[3,3] theta[4,1] theta[4,2] theta[4,3] theta[5,1] theta[5,2] theta[5,3] mean 5.254 2.22 -5.861 0.2245 0.1342 0.1098 8.025 5.079 2.134 -5.851 5.395 2.207 -5.825 5.079 2.187 -5.908 5.441 2.269 -5.816 5.291 2.299 -5.907 sd 0.1242 0.1252 0.1143 0.1235 0.1219 0.09349 1.216 0.08832 0.1542 0.149 0.05096 0.1245 0.1015 0.09932 0.1351 0.1494 0.04836 0.1395 0.1021 0.06828 0.1351 0.1075 MC_error 0.004513 0.007917 0.008563 0.00357 0.005743 0.005828 0.03895 0.007158 0.01001 0.0126 0.003465 0.008209 0.007943 0.008296 0.008393 0.01298 0.003287 0.009928 0.008087 0.005157 0.009323 0.008937 val2.5pc 5.002 1.994 -6.098 0.07706 0.02447 0.02354 6.03 4.949 1.823 -6.19 5.3 1.962 -6.028 4.945 1.915 -6.286 5.347 2.024 -6.008 5.174 2.05 -6.125 median 5.258 2.216 -5.86 0.1963 0.1009 0.08214 7.89 5.066 2.136 -5.849 5.393 2.205 -5.828 5.06 2.188 -5.89 5.442 2.256 -5.825 5.284 2.295 -5.903 val97.5pc 5.488 2.469 -5.657 0.5306 0.4428 0.3591 10.77 5.326 2.423 -5.583 5.505 2.46 -5.624 5.356 2.447 -5.666 5.543 2.566 -5.591 5.438 2.589 -5.7 start 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 5001 sample 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
[6]
OtreesMVN
Examples Volume II
[7]
Examples Volume II
mean 5.265 2.2 -5.88 0.2581 0.2679 0.2296 7.853 sd 0.1351 0.1656 0.141 0.1145 0.1291 0.1101 1.19 MC_error 0.001577 0.002555 0.002287 0.001681 0.002343 0.001523 0.02499 val2.5pc 4.992 1.874 -6.171 0.1268 0.1191 0.1085 5.923 median 5.263 2.197 -5.877 0.231 0.2368 0.2036 7.715 val97.5pc 5.537 2.522 -5.614 0.558 0.5925 0.5048 10.53 start 5001 5001 5001 5001 5001 5001 5001 sample 10000 10000 10000 10000 10000 10000 10000
OtreesMVN
[8]
Biopsies
Examples Volume II
Combination Multinomial response Session frequency _______________________________________________________ O O (2, 0, 0, 0) 12 M M O (1, 2, 0, 0) 10 + + O (1, 0, 2, 0) 17 ++ ++ ++ (0, 0, 0, 3) 5
The sampling procedure may not detect the area of maximum rejection, which is considered the true underlying state at the time of the session and denoted ti --- the underlying probability distribution of the four true states is denoted by the vector p. It is then assumed that each of the observed biopsies are conditionally independent given this truestate with the restriction that there are no`false positives': i.e. one cannot observe a biopsy worse than the true state. We then have the sampling model bi ~ Multinomial(eti, ni) ti ~ Categorical(p) where bi denotes the multinomial response at session i where ni biopsies have been taken, and ejk is the probability that a true state ti = j generates a biopsy in state k.The no-false-positive restriction means that e12 = e13 = e14 = e23 = e24 = e34 = 0. Spiegelhalter and Stovin (1983) estimated the parameters ej and p using the EM algorithm, with some smoothing to avoid zero estimates. The appropriate graph is shown below, where the role of the true state ti is simply to pick the appropriate row from the 4 x 4 error matrix e. Here the probability vectors ej (j = 1,...,4) and p are assumed to have uniform priors on the unit simplex, which correspond to Dirichlet priors with all parameters being 1. The BUGS code for this model is given below. No initial values are provided for the latent states, since the forward sampling procedure will find a configuration of starting values that is compatible with the expressed constraints. We also note the apparent ``cycle'' in the graph created by the expression nbiops[i] <- sum(biopsies[i,]). This will lead Such ``cycles'' are [9]
Examples Volume II
Biopsies
permitted provided that they are only data transformation statements, since this does not affect the essential probability model. model { for (i in 1 : ns){ nbiops[i] <- sum(biopsies[i, ]) true[i] ~ dcat(p[]) biopsies[i, 1 : 4] ~ dmulti(error[true[i], ], nbiops[i]) } error[2,1 : 2] ~ ddirch(prior[1 : 2]) error[3,1 : 3] ~ ddirch(prior[1 : 3]) error[4,1 : 4] ~ ddirch(prior[1 : 4]) p[1 : 4] ~ ddirch(prior[]); # prior for p }
error[2,1] error[2,2] error[3,1] error[3,2] error[3,3] error[4,1] error[4,2] error[4,3] error[4,4] p[1] p[2] p[3] p[4]
[10]
Eyes
Examples Volume II
Part of the analysis involves fitting a mixture of two normal distributions with common variance to this distribution, so that each observation yi is assumed drawn from one of two groups. Ti = 1, 2 be the true group of the i th observation, where group j has a normal distribution with mean j and precision . We assume an unknown fraction P of observations are in group 2, 1 - P in group 1. The model is thus yi ~ Normal(Ti, ) Ti ~ Categorical(P). We note that this formulation easily generalises to additional components to the mixture, although for identifiability an order constraint must be put onto the group means. Robert (1994) points out that when using this model, there is a danger that at some iteration, all the data will go into one component of themixture, and this state will be difficult to escape from --this matches our experience. obert suggests a re-parameterisation, a simplified version of which is to assume 2 = 1 + , > 0. 1, , , P, are given independent ``noninformative" priors, including a uniform prior for P on (0,1). The appropriate graph and the BUGS code are given below.
[11]
Examples Volume II
name: mean theta 0.0 type: precision stochastic 1.0E-6 density: dnorm upper bound 0 upper bound
Eyes
theta
lambda[1]
P[1:2]
alpha[]
lambda[2]
T[i]
tau
mu[i]
model { for( i in 1 : N ) { y[i] ~ dnorm(mu[i], tau) mu[i] <- lambda[T[i]] T[i] ~ dcat(P[]) } P[1:2] ~ ddirch(alpha[]) theta ~ dnorm(0.0, 1.0E-6)I(0.0, ) lambda[2] <- lambda[1] + theta lambda[1] ~ dnorm(0.0, 1.0E-6) tau ~ dgamma(0.001, 0.001) sigma <- 1 / sqrt(tau) }
[12]
Eyes
mean 0.6014 0.3986 536.8 548.9 3.805 sd 0.08981 0.08981 1.023 1.388 0.726 MC_error 0.002305 0.002305 0.03708 0.03856 0.03322 val2.5pc 0.4267 0.2299 535.0 546.0 2.932 median 0.602 0.398 536.7 548.9 3.652 val97.5pc 0.7701 0.5733 539.0 551.3 6.014 start 1001 1001 1001 1001 1001
Examples Volume II
sample 10000 10000 10000 10000 10000
[13]
Examples Volume II
Hearts
Farewell and Sprott (1988) model these data as a mixture distribution of Poisson counts in which some patients are "cured" by the drug, whilst others experience varying levels of response but remain abnormal. A zero count for the post-drug PVC may indicate a "cure", or may represent a sampling zero from a patient with a mildly abnormal PVC count. The following model thus is assumed: xi ~ Poisson(i) for all patients yi ~ Poisson( i) for all uncured patients P(cure) = To eliminate nuisance parameters li, Farewell and Sprott use the conditional distribution of yi given ti = xi + yi. This is equivalent to a binomial likelihood for yi with denominator ti and probability p = b /(1+b) (see Cox and Hinkley, 1974 pp. 136-137 for further details of the conditional distribution for Poisson variables). Hence the final mixture model may be expressed as follows: P(yi = 0 | ti ) = + (1 ) (1 p) ti P(yi | ti ) = (1 ) (ti! / (yi! (tiyi)!)) (pyi (1 p) (ti yi)
yi = 1,2,...,ti
The BUGS code for this model is given below: model { for (i in 1 : N) { y[i] ~ dbin(P[state1[i]], t[i]) state[i] ~ dbern(theta) [14]
Hearts state1[i] <- state[i] + 1 t[i] <- x[i] + y[i] prop[i] <- P[state1[i]]
Examples Volume II
} P[1] <- p P[2] <- 0 logit(p) <- alpha alpha ~ dnorm(0,1.0E-4) beta <- exp(alpha) logit(theta) <- delta delta ~ dnorm(0, 1.0E-4)
[15]
Examples Volume II
Air
A discrete covariate zj (j = 1,2,3) representing NO2 concentration in the child's bedroom classified into 3 categories is used as a surrogate for true exposure. The nature of the measurement error relationship associated with this covariate is known precisely via a calibration study, and is given by xj = + zj + j where = 4.48, = 0.76 and j is a random element having normal distribution with zero mean and variance 2 (= 1/) = 81.14. Note that this is a Berkson (1950) model of measurement error, in which the true values of the covariate are expressed as a function of the observed values. Hence the measurement error is independent of the latter, but is correlated with the true underlying covariate values. In the present example, the observed covariate zj takes values 10, 30 or 50 for j = 1, 2, or 3 respectively (i.e. the mid-point of each category), whilst xj is interpreted as the "true average value" of NO2 in group j. The response variable is binary, reflecting presence/absence of respiratory illness, and a logistic regression model is assumed. That is yj ~ Binomial(pj, nj) logit(pj) = 1 + 2 xj where pj is the probability of respiratory illness for children in the jth exposure group. The regression coefficients 1 and 2 are given vague independent normal priors. The graphical model is shown below:
[16]
Air
Examples Volume II
tau
theta[2]
p[j]
n[j]
y[j] for(j IN 1 : J)
model { for(j in 1 : J) { y[j] ~ dbin(p[j], n[j]) logit(p[j]) <- theta[1] + theta[2] * X[j] X[j] ~ dnorm(mu[j], tau) mu[j] <- alpha + beta * Z[j] } theta[1] ~ dnorm(0.0, 0.001) theta[2] ~ dnorm(0.0, 0.001) }
Data Inits
Results
A 1000 update burn in followed by a further 10000 updates gave the parameter estimates
[17]
Examples Volume II
mean 12.92 27.21 40.85 -0.9628 0.04927 sd 7.877 7.473 8.721 1.0 0.04276 MC_error 0.4227 0.1946 0.3502 0.08808 0.003759 val2.5pc -3.775 13.05 24.18 -4.233 0.004071 median 13.3 27.01 40.84 -0.7183 0.03857 val97.5pc 26.96 42.63 58.37 0.2104 0.1951 start 1001 1001 1001 1001 1001 sample 10000 10000 10000 10000 10000
Air
Re-parameterised model with centred covariates: model { for( j in 1 : J ) { y[j] ~ dbin(p[j],n[j]) logit(p[j]) <- theta0+ theta[2] * (X[j] - mean(mu[])) X[j] ~ dnorm(mu[j],tau) mu[j] <- alpha + beta * Z[j] } theta0 ~ dnorm(0.0,0.001) theta[2] ~ dnorm(0.0,0.001) theta[1] <- theta0 - theta[2] * mean(mu[]) }
[18]
Cervix
Examples Volume II
d x w Count __________________________ Complete data __________________________ 1 0 0 13 1 0 1 3 1 1 0 5 1 1 1 18 0 0 0 33 0 0 1 11 0 1 0 16 0 1 1 16 _________________________ Incomplete data _________________________ 1 0 318 1 1 375 1 0 701 1 1 535
They fitted a prospective logistic model to the case-control data as follows di ~ Bernoulli(pi) i = 1,...,2044 logit(pi) = 0C + xi i = 1,...,2044 where is the log odds ratio of disease. Since the relationship between d and x is only directly observable in the 115 women with "complete'' data, and because there is evidence of differential measurement error, the following parameters are required in order to estimate the logistic model
[19]
Examples Volume II 1,1 1,2 2,1 2,2 q = = = = = P(w=1 | x=0, d=0) P(w=1 | x=0, d=1) P(w=1 | x=1, d=0) P(w=1 | x=1, d=1) P(x=1)
Cervix
The differential probability of being exposed to HSV (x=1) for cases and controls is calculated as follows
= =
P(x=1 | d=1) P(d=1 | x=1) P(x=1) ----------------------------P(d=1) 1 ----------------------------------------------------1 + (1 + exp 0C + ) / (1 + exp 0C) P(x=1 | d=0) P(d=0 | x=1) P(x=1) ----------------------------P(d=0) 1 1-q -----------------------------------------------------------1 + (1 + exp 0C ) / (1 + exp 0C) q 1-q -------q
= =
The BUGS code is given below. The role of the variables x1 and d1 is to pick the appropriate value of (the incidence of w) for any given true exposure status x and disease status d. Since x and d take the values 0 or 1, and the subscripts for take values 1 or 2, we must first add 1 to each x[i] and d[i] in the BUGS code before using them as index values for . BUGS does not allow subscripts to be functions of variable quantities --- hence the need to create x1and d1 for use as subscripts. In addition, note that 1 and 2 were not simulated directly in BUGS, but were calculated as functions of other parameters. This is because the dependence of 1 and 2 on d would have led to a cycle in the graphical model which would no longer define a probability distribution. model { for (i in 1 : N) { x[i] ~ dbern(q) # incidence of HSV logit(p[i]) <- beta0C + beta * x[i] # logistic model d[i] ~ dbern(p[i]) # incidence of cancer x1[i] <- x[i] + 1 d1[i] <- d[i] + 1 [20]
Cervix
Examples Volume II w[i] ~ dbern(phi[x1[i], d1[i]]) # incidence of w } q ~ dunif(0.0, 1.0) # prior distributions beta0C ~ dnorm(0.0, 0.00001); beta ~ dnorm(0.0, 0.00001); for(j in 1 : 2) { for(k in 1 : 2){ phi[j, k] ~ dunif(0.0, 1.0) } } # calculate gamma1 = P(x=1|d=0) and gamma2 = P(x=1|d=1) gamma1 <- 1 / (1 + (1 + exp(beta0C + beta)) / (1 + exp(beta0C)) * (1 - q) / q) gamma2 <- 1 / (1 + (1 + exp(-beta0C - beta)) / (1 + exp(-beta0C)) * (1 - q) / q) }
Re-parameterised model with centred covariates: model { for (i in 1 : N) { x[i] ~ dbern(q) # incidence of HSV logit(p[i]) <- beta0 + beta * (x[i] - mean(w[])) # logistic model d[i] ~ dbern(p[i]) # incidence of cancer x1[i] <- x[i] + 1 d1[i] <- d[i] + 1 w[i] ~ dbern(phi[x1[i], d1[i]]) # incidence of w } q ~ dunif(0.0, 1.0) # prior distributions [21]
Examples Volume II beta0 ~ dnorm(0.0, 0.00001); beta ~ dnorm(0.0, 0.00001); for(j in 1 : 2) { for(k in 1 : 2){ phi[j, k] ~ dunif(0.0, 1.0) } } # calculate gamma1 = P(x=1|d=0) and gamma2 = P(x=1|d=1) gamma1 <- 1 / (1 + (1 + exp(beta0C + beta)) / (1 + exp(beta0C)) * (1 - q) / q) gamma2 <- 1 / (1 + (1 + exp(-beta0C - beta)) / (1 + exp(-beta0C)) * (1 - q) / q) beta0C <- beta0 - mean(w[]) * beta }
Cervix
[22]
Birats
Examples Volume II
This represents our prior guess at the order of magnitude of the covariance matrix -1 for i (see Classic BUGS manual (version 0.5) section on Multivariate normal models), and is equivalent to the prior specification used by Gelfand et al. Finally, a non-informative Gamma(0.001, 0.001) prior was assumed for the measurement precision c.
[23]
Examples Volume II
Birats
mean[]
mu.beta[1:2]
R[1:2, 1:2]
Omega[ , ]
prec[ , ]
beta[i, 1:2]
sigma
model { for( i in 1 : N ) { beta[i , 1 : 2] ~ dmnorm(mu.beta[], R[ , ]) for( j in 1 : T ) { Y[i, j] ~ dnorm(mu[i , j], tauC) mu[i, j] <- beta[i, 1] + beta[i, 2] * x[j] } } mu.beta[1 : 2] ~ dmnorm(mean[], prec[ , ]) R[1 : 2 , 1 : 2] ~ dwish(Omega[ , ], 2) tauC ~ dgamma(0.001, 0.001) sigma <- 1 / sqrt(tauC)
[24]
Birats
mean 106.6 6.185 6.136 sd 2.35 0.1062 0.4781 MC_error 0.0335 0.001351 0.009095 val2.5pc 101.8 5.981 5.283 median 106.6 6.185 6.1 val97.5pc 111.2 6.397 7.137 start 1001 1001 1001
Examples Volume II
sample 10000 10000 10000
[25]
Examples Volume II
Schools
where i refers to pupil and j indexes school. We wish to specify a regression model for the variance components, and here we model the logarithm of ij (the inverse of the between-pupil variance) as a linear function of each pupil's LRT score. This differs from Goldstein et al.'s model which allows the variance 2ij to depend linearly on LRT. However, such a parameterization may lead to negative estimates of 2ij. Prior distributions The fixed effects k (k=1,...,8), and were assumed to follow vague independent Normal distributions with zero mean and low precision = 0.0001. The random school-level coefficients [26]
Schools
Examples Volume II
kj (k = 1,2,3) were assumed to arise from a multivariate normal population distribution with unknown mean and covariance matrix . A non-informative multivariate normal prior was then specified for the population mean , whilst the inverse covariance matrix = -1 was assumed to follow a Wishart distribution. To represent vague prior knowledge, we chose the the degrees of freedom for this distribution to be as small as possible (i.e. 3, the rank of ). The scale matrix R was specified as
0.1 0.005 0.005 0.005 0.01 0.005 0.005 0.005 0.01
which represents our prior guess at the order of magnitude of . The BUGS code is given below: model { for(p in 1 : N) { Y[p] ~ dnorm(mu[p], tau[p]) mu[p] <- alpha[school[p], 1] + alpha[school[p], 2] * LRT[p] + alpha[school[p], 3] * VR[p, 1] + beta[1] * LRT2[p] + beta[2] * VR[p, 2] + beta[3] * Gender[p] + beta[4] * School.gender[p, 1] + beta[5] * School.gender[p, 2] + beta[6] * School.denom[p, 1] + beta[7] * School.denom[p, 2] + beta[8] * School.denom[p, 3] log(tau[p]) <- theta + phi * LRT[p] sigma2[p] <- 1 / tau[p] LRT2[p] <- LRT[p] * LRT[p] } min.var <- exp(-(theta + phi * (-34.6193))) # lowest LRT score = -34.6193 max.var <- exp(-(theta + phi * (37.3807))) # highest LRT score = 37.3807 # Priors for fixed effects: for (k in 1 : 8) { beta[k] ~ dnorm(0.0, 0.0001) } theta ~ dnorm(0.0, 0.0001) phi ~ dnorm(0.0, 0.0001) # Priors for random coefficients: for (j in 1 : M) { alpha[j, 1 : 3] ~ dmnorm(gamma[1:3 ], T[1:3 ,1:3 ]); alpha1[j] <- alpha[j,1] } # Hyper-priors: [27]
Examples Volume II gamma[1 : 3] ~ dmnorm(mn[1:3 ], prec[1:3 ,1:3 ]); T[1 : 3, 1 : 3 ] ~ dwish(R[1:3 ,1:3 ], 3)
Schools
beta[1] beta[2] beta[3] beta[4] beta[5] beta[6] beta[7] beta[8] gamma[1] gamma[2] gamma[3] phi theta
mean 2.64E-4 0.4219 0.1725 0.125 0.06201 -0.2769 0.1441 -0.1667 -0.6778 0.03135 0.9597 -0.002605 0.5801
sd 9.842E-5 0.06225 0.04834 0.1377 0.1038 0.1875 0.1061 0.1733 0.09568 0.01019 0.08626 0.002829 0.03205
MC_error 2.429E-6 0.00344 0.001427 0.006001 0.004941 0.007158 0.004271 0.006393 0.005593 1.396E-4 0.004849 3.159E-5 3.518E-4
val2.5pc 7.499E-5 0.301 0.07689 -0.1558 -0.1475 -0.6584 -0.05912 -0.4943 -0.8668 0.01139 0.7947 -0.008146 0.5163
median 2.638E-4 0.4206 0.1722 0.1276 0.06323 -0.2728 0.1428 -0.1675 -0.6783 0.03139 0.959 -0.00259 0.5803
val97.5pc 4.558E-4 0.5466 0.2682 0.3939 0.2624 0.08729 0.36 0.1846 -0.4862 0.05137 1.129 0.002927 0.6414
start 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001
sample 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
Estimating the ranks The school-specific intercept j1 measures the 'residual effect' for school j after adjusting for pupil- and school-level covariates. This might represent an appropriate quantity by which to rank schools' performance. We compute the ranks in BUGS using the "rank" option of the "Statistics" menu, which we set for the variable alpha at the same time as we set the "sample monitor" option. Since the rank is a function of stochastic nodes, its value will change at every iteration. Hence we may obtain a posterior distribution for the rank of alpha[, k] which may be summarized by posterior histograms as shown below:
[28]
Ice
Examples Volume II
i agei yeari casesi person-yearsi _____ ________________________________________ 1 1 6 2 41380 2 1 7 0 43650 ... ... 77 13 5 31 13600
In order to pull in the extreme risks associated with small birth cohorts, Breslow and Clayton first consider the exchangeable model casesi ~ Poisson(i) log i = log person-yearsi + agei + yeari k ~ Normal( 0, )
Autoregressive smoothing of relative risks They then consider the alternative approach of smoothing the rates for the cohorts by assuming an auto-regressive model on the 's, assuming the second differences are independent normal variates. This is equivalent to a model and prior distribution casesi ~ Poisson(i) log i = log person-yearsi + agei + yeari 1 ~ Normal( 0, 0.000001 ) 2 | 1 ~ Normal( 0, 0.000001 ) k | 1,...,k-1 ~ Normal( 2 k-1 k-2, )
k>2
We note that 1 and 2 are given "non-informative" priors, but retain a term in order to provide the appropriate likelihood for . For computational reasons Breslow and Clayton impose constraints on their random effects k in order that their mean and linear trend are zero, and counter these constraints by introducing a [29]
Examples Volume II
Ice
linear term b x yeari and allowing unrestrained estimation of j. Since we allow free movement of the 's we dispense with the linear term, and impose a "corner" constraint 1 =0 . model { for (i in 1:I) { cases[i] ~ dpois(mu[i]) log(mu[i]) <- log(pyr[i]) + alpha[age[i]] + beta[year[i]] } betamean[1] <- 2 * beta[2] - beta[3] Nneighs[1] <- 1 betamean[2] <- (2 * beta[1] + 4 * beta[3] - beta[4]) / 5 Nneighs[2] <- 5 for (k in 3 : K - 2) { betamean[k] <- (4 * beta[k - 1] + 4 * beta[k + 1]- beta[k - 2] - beta[k + 2]) / 6 Nneighs[k] <- 6 } betamean[K - 1] <- (2 * beta[K] + 4 * beta[K - 2] - beta[K - 3]) / 5 Nneighs[K - 1] <- 5 betamean[K] <- 2 * beta[K - 1] - beta[K - 2] Nneighs[K] <- 1 for (k in 1 : K) { betaprec[k] <- Nneighs[k] * tau } for (k in 1 : K) { beta[k] ~ dnorm(betamean[k], betaprec[k]) logRR[k] <- beta[k] - beta[5] tau.like[k] <- Nneighs[k] * beta[k] * (beta[k] - betamean[k]) } alpha[1] <- 0.0 for (j in 2 : Nage) { alpha[j] ~ dnorm(0, 1.0E-6) } d <- 0.0001 + sum(tau.like[]) / 2 r <- 0.0001 + K / 2 tau ~ dgamma(r, d) sigma <- 1 / sqrt(tau) }
Ice
mean -1.075 -0.7717 -0.4721 -0.2016 0.1588 0.319 0.4829 0.6512 0.8466 1.059 0.05286 sd 0.2503 0.1584 0.08179 0.03908 0.04625 0.06949 0.08673 0.1066 0.1281 0.1811 0.04374 MC_error 0.008578 0.005506 0.002555 6.68E-4 0.001162 0.002112 0.002982 0.003936 0.00484 0.006206 0.001335 val2.5pc -1.619 -1.107 -0.651 -0.278 0.04592 0.164 0.3022 0.4366 0.5911 0.7041 0.006732 median -1.043 -0.755 -0.463 -0.2018 0.1683 0.3282 0.4896 0.6566 0.8513 1.06 0.04159 val97.5pc -0.6951 -0.5203 -0.338 -0.1166 0.2269 0.4369 0.6469 0.8613 1.094 1.415 0.1625 start 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001
Examples Volume II
sample 100000 100000 100000 100000 100000 100000 100000 100000 100000 100000 100000
logRR[1] logRR[2] logRR[3] logRR[4] logRR[6] logRR[7] logRR[8] logRR[9] logRR[10] logRR[11] sigma
[31]
Examples Volume II
Beetles
We assume that the observed number of deaths ri at each concentration xi is binomial with sample size ni and true rate pi. Plausible models for pi include the logistic, probit and extreme value (complimentary log-log) models, as follows pi = exp( + xi) / (1 + exp( + xi) pi = Phi( + xi) pi = 1 - exp(-exp( + xi)) The corresponding graph is shown below:
[32]
Beetles
Examples Volume II
alpha
beta
alpha.star
x[i]
p[i]
n[i]
r[i] for(i IN 1 : N)
rhat[i]
model { for( i in 1 : N ) { r[i] ~ dbin(p[i],n[i]) logit(p[i]) <- alpha.star + beta * (x[i] - mean(x[])) rhat[i] <- n[i] * p[i] } alpha <- alpha.star - beta * mean(x[]) beta ~ dnorm(0.0,0.001) alpha.star ~ dnorm(0.0,0.001) }
[33]
Examples Volume II
mean -60.79 34.31 3.56 9.932 22.47 33.87 50.03 53.21 59.14 58.68 sd 5.147 2.893 0.9488 1.677 2.091 1.751 1.646 1.102 0.7338 0.4241 MC_error 0.05624 0.03171 0.009435 0.01549 0.01736 0.0152 0.01661 0.01191 0.008143 0.004761 val2.5pc -71.29 28.91 1.997 6.909 18.36 30.32 46.66 50.86 57.52 57.72 median -60.67 34.24 3.463 9.851 22.46 33.89 50.06 53.28 59.22 58.74 val97.5pc -51.17 40.23 5.634 13.43 26.63 37.25 53.12 55.17 60.38 59.36 start 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 sample 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
Beetles
alpha beta rhat[1] rhat[2] rhat[3] rhat[4] rhat[5] rhat[6] rhat[7] rhat[8]
Probit model
mean -35.04 19.79 3.442 10.76 23.48 33.81 49.6 53.27 59.6 59.17 sd 2.646 1.488 1.014 1.692 1.916 1.626 1.648 1.17 0.7542 0.3729 MC_error 0.02934 0.01657 0.0106 0.01684 0.01865 0.01706 0.01865 0.01353 0.008725 0.004308 val2.5pc -40.46 16.9 1.743 7.632 19.79 30.58 46.27 50.76 57.88 58.28 median -34.98 19.75 3.348 10.7 23.47 33.83 49.63 53.33 59.67 59.23 val97.5pc -29.93 22.84 5.693 14.23 27.24 36.96 52.73 55.38 60.84 59.72 start 1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 sample 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
alpha beta rhat[1] rhat[2] rhat[3] rhat[4] rhat[5] rhat[6] rhat[7] rhat[8]
alpha beta rhat[1] rhat[2] rhat[3] rhat[4] rhat[5] rhat[6] rhat[7] rhat[8]
[34]
Endo
Examples Volume II
We denote estrogen exposure as xij for the ith case-control pair, where j=1 for a case and j=2 for a control. The conditional likelihood for the log (odds ratio) is then given by i exp xi1 / (exp xi1 + exp xi2) We shall illustrate three methods of fitting this model. It is convenient to denote the fixed disease status as a variable Yi1 = 1, Yi2 = 0. First, Breslow and Day point out that for case-control studies with a single control per case, we may obtain this likelihood by using unconditional logistic regression for each case-control pair. That is Yi1 ~ Binomial(pi,2) logit pi = (xi1 xi2) Second, the Classic BUGS manual (version 0.5) section on Conditional likelihoods in casecontrol studies discusses fitting this likelihood directly by assuming the model Yi. pij log eij ~ Multinomial(pi., 1) = eij / j eij = xij
Finally, the Classic BUGS manual (version 0.5) shows how the multinomial-Poisson transformation can be used. In general, this will be more efficient than using the multinomiallogistic parameterisation above, since it avoids the time-consuming evaluation of j eij. However, in the present example this summation is only over J=2 elements, whilst the multinomial-Poisson parameterisation involves estimation of an additional intercept parameter for each of the 183 strata. Consequently the latter is less efficient than the multinomial-logistic in this case.
[35]
Examples Volume II
Endo
We note that all these formulations may be easily extended to include additional subject-specific covariates, and that the second and third methods can handle arbitrary numbers of controls per case. In addition, the Bayesian approach allows the incorporation of hierarchical structure, measurement error, missing data and so on. All these techniques are illustrated in the code given below, which includes a transformation of the original summary statistics into full data. In this example, all but the second conditionallikelihood approach are commented out. model { # transform collapsed data into full for (i in 1 : I){ Y[i,1] <- 1 Y[i,2] <- 0 } # loop around strata with case exposed, control not exposed (n10) for (i in 1 : n10){ est[i,1] <- 1 est[i,2] <- 0 } # loop around strata with case not exposed, control exposed (n01) for (i in (n10+1) : (n10+n01)){ est[i,1] <- 0 est[i,2] <- 1 } # loop around strata with case exposed, control exposed (n11) for (i in (n10+n01+1) : (n10+n01+n11)){ est[i,1] <- 1 est[i,2] <- 1 } # loop around strata with case not exposed, control not exposed (n00) for (i in (n10+n01+n11+1) :I ){ est[i,1] <- 0 est[i,2] <- 0 } # PRIORS beta ~ dnorm(0,1.0E-6) ; # LIKELIHOOD for (i in 1 : I) { # loop around strata # METHOD 1 - logistic regression # Y[i,1] ~ dbin( p[i,1], 1) # logit(p[i,1]) <- beta * (est[i,1] - est[i,J])
[36]
Endo
Examples Volume II # METHOD 2 - conditional likelihoods Y[i, 1 : J] ~ dmulti( p[i, 1 : J],1) for (j in 1:2){ p[i, j] <- e[i, j] / sum(e[i, ]) log( e[i, j] ) <- beta * est[i, j] } # METHOD 3 fit standard Poisson regressions relative to baseline #for (j in 1:J) { # Y[i, j] ~ dpois(mu[i, j]); # log(mu[i, j]) <- beta0[i] + beta*est[i, j]; } #beta0[i] ~ dnorm(0, 1.0E-6) }
[37]
Examples Volume II
Stagnant
Note the repeated x's. We assume a model with two straight lines that meet at a certain changepoint xk --- this is slightly different from the model of Carlin, Gelfand and Smith (1992) who do not constrain the two straight lines to cross at the changepoint. We assume Yi i ~ Normal(i, ) = + J[i] (xi - xk) J[i]=1 if i <= k J[i]=2 if i > k
giving E(Y) = at the changepoint, with gradient 1 before, and gradient 2 after the changepoint. We give independent "noninformative'' priors to , 1, 2 and . Note: alpha is E(Y) at the changepoint, so will be highly correlated with k. This may be a very poor parameterisation. Note way of constructing a uniform prior on the integer k, and making the regression parameter depend on a random changepoint. model { [38]
Stagnant for( i in 1 : N ) { Y[i] ~ dnorm(mu[i],tau) mu[i] <- alpha + beta[J[i]] * (x[i] - x[k]) J[i] <- 1 + step(i - k - 0.5) punif[i] <- 1/N } tau ~ dgamma(0.001,0.001) alpha ~ dnorm(0.0,1.0E-6) for( j in 1 : 2 ) { beta[j] ~ dnorm(0.0,1.0E-6) } k ~ dcat(punif[]) sigma <- 1 / sqrt(tau)
Examples Volume II
Data
( click to open )
Examples Volume II model { for(i in 1 : N) { Y[i] ~ dnorm(mu[i], tau) mu[i] <- alpha + beta[J[i]] * (x[i] - x.change) J[i] <- 1 + step(x[i] - x.change) } tau ~ dgamma(0.001, 0.001) alpha ~ dnorm(0.0,1.0E-6) for(j in 1 : 2) { beta[j] ~ dnorm(0.0,1.0E-6) } sigma <- 1 / sqrt(tau) x.change ~ dunif(-1.3,1.1) }
Stagnant
Data
( click to open )
[40]
Stagnant
alpha beta[1] beta[2] sigma x.change mean 0.537 -0.4184 -1.014 0.0221 0.02597 sd 0.02569 0.01511 0.01747 0.003271 0.03245 MC_error 0.001316 6.303E-4 5.38E-4 3.919E-5 0.001668 val2.5pc 0.4895 -0.4468 -1.049 0.0168 -0.03754 median 0.535 -0.419 -1.013 0.02171 0.02868 val97.5pc 0.5881 -0.3876 -0.9799 0.02952 0.0839 start 1001 1001 1001 1001 1001
Examples Volume II
sample 20000 20000 20000 20000 20000
[41]
Examples Volume II
Asia
[42]