What do we mean by "adaptive"?

Chow and Chang (2008), Chow and Chang (2011)

  1. Adaptive dosing: dose-escalate as we become more confident lower dose levels are safe

  2. Treatment-adaptive allocations: continually adjust assignment probabilities to favor balanced allocations

  3. Response-adaptive allocations: continually adjust assignment probabilities to favor the better-performing arm

  4. Adaptive sequential testing: sequentially evaluate data to determine if sufficient evidence to declare success early

Others.

Adaptive sequential testing.

Why is it needed?

  1. If strong evidence for no difference between treatment arms, or greater-than-expected incidence of side effects, then trial should stop early.

  2. If interim data suggest that chance of finding difference is very small, wise to stop and invest resources elsewhere

  3. If early evidence is promising, may want to change patient allocation

How does it work?

Two types of interim analyses:

  1. Pre-planned statistical analyses, often intended to identify treatment arm that is found to be superior before end of trial

  2. Safety and other qualitative analyses conducted by Data Safety Monitoring Board (DSMB). Difficult to pre-plan this behavior

We've already seen an example

  • Simon's two-stage phase 2 designs (Simon, 1989):

  • Pause trial after \(n_1\) patients. Continue only if \(>r_1\) responders

  • After all planned patients have enrolled, hypothesis test is conducted

  • Differences from present context: one arm; only two analyses; testing only for lack of efficacy ("futility")
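
As a small illustration (hypothetical numbers, not Simon's optimal design), the chance of stopping at stage 1 is just a binomial tail probability:

p_stop_stage1 = function(p, n1 = 20, r1 = 3) pbinom(r1, n1, p);  # Pr(at most r1 responders among n1)
p_stop_stage1(p = c(0.10, 0.30));  # likely to stop when p = 0.10, unlikely when p = 0.30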

"Group sequential testing"

  • Basically, allow for rejection of \(H_0\) during trial

  • Notation:

  • \(n_A\) (=\(n_B\)) patients in each arm
  • Plan to conduct total of \(K\) analyses
  • \(m = n_A/K\) patients per analysis per arm

Cumulative patients per arm at each analysis (\(\uparrow\) marks an interim analysis):

arm A: \(m\), \(2m\), \(3m\), \(\cdots\), \(Km = n_A\)
arm B: \(m\), \(2m\), \(3m\), \(\cdots\), \(Km = n_B\)

Group sequential tests, background

  • \(X_{iA} \sim N(\mu_A,\sigma^2)\); \(X_{iB} \sim N(\mu_B,\sigma^2)\)

  • \({H_0}: \mu_A - \mu_B = 0\)
  • \({H_1}: \mu_A - \mu_B = \delta\)

Group sequential tests, background

  • At analysis \(k\), \(k=1,\ldots,K\), construct test statistic \(Z_k\) (computed in the R sketch below): \[ Z_k = \sqrt{\dfrac{km}{2\sigma^2}}\left(\dfrac{1}{km}\sum_{i=1}^{km} X_{iA} - \dfrac{1}{km}\sum_{i=1}^{km} X_{iB}\right) \]

  • How big should \(|Z_k|\) be to reject \(H_0\) at "time" \(k\)?
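
A minimal R sketch (not from the lecture) of how the \(Z_k\) accumulate across interim looks, assuming \(\sigma\) is known and using illustrative values of \(m\) and \(\delta\):

set.seed(1);
K = 5; m = 20; sigma = 1; delta = 0.5;
xA = rnorm(K*m, mean = delta, sd = sigma);
xB = rnorm(K*m, mean = 0, sd = sigma);
sapply(1:K, function(k) {
  n_k = k*m;  # cumulative patients per arm at analysis k
  sqrt(n_k/(2*sigma^2))*(mean(xA[1:n_k]) - mean(xB[1:n_k]));
});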

Group sequential tests, background

  • Strategy based upon joint distribution of \(\{Z_1,\ldots,Z_K\}\):
  1. \(\{Z_1,\ldots,Z_K\}\) is multivariate normal

  2. \(E Z_k = \sqrt{I_k} \delta\), \(I_k = (2\sigma^2/[km])^{-1}\) is information level

  3. \(\mathrm{cov}(Z_{k'},Z_k) = \sqrt{I_{k'}/I_{k}} = \sqrt{k'/k}\), for \(1\leq k'\leq k \leq K\)
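
A quick sketch of the implied correlation matrix in point 3, for an illustrative \(K\):

K = 4;
corr = outer(1:K, 1:K, function(a, b) sqrt(pmin(a, b)/pmax(a, b)));
round(corr, 3);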

Group sequential tests, background

  • Group sequential approaches choose a set of constants \(\{c_1(\alpha), \ldots, c_K(\alpha)\}\), where \(\alpha\) is the desired overall probability of making a type I error at any of the \(K\) analyses, i.e. \[ \begin{aligned} &\Pr(|Z_1| > c_1(\alpha)) \\ &\quad + \Pr(|Z_1| < c_1(\alpha), |Z_2| > c_2(\alpha)) \\ &\quad + \ldots \\ &\quad + \Pr(|Z_1| < c_1(\alpha), |Z_2| < c_2(\alpha), \ldots, |Z_K| > c_K(\alpha)) \\ &\quad = \alpha \end{aligned} \]

Two common group sequential tests

  1. Set each \(c_k(\alpha)\) to constant value \(c_P(K,\alpha)\). Test at each time has same size (Pocock, 1977)

  2. Set threshold to be decreasing: \(c_k(\alpha) = c_B(K,\alpha)\sqrt{K/k}\), i.e. \(c_1(\alpha) > c_2(\alpha) > \ldots > c_K(\alpha)\). Size of test increasing over time (O’Brien and Fleming, 1979)
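
For a sense of the two shapes, here are the boundary sequences for \(K=5\) and \(\alpha=0.05\), using the constants quoted later in these notes (2.413 and 2.040):

K = 5;
c_pocock = rep(2.413, K);        # flat Pocock boundary
c_obf = 2.040*sqrt(K/(1:K));     # decreasing O'Brien-Fleming boundary
round(rbind(Pocock = c_pocock, OBF = c_obf), 3);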

Example, \(K=2\)

\[ \begin{aligned} Z_1 &= \sqrt{\dfrac{m}{2\sigma^2}}\left(\dfrac{1}{m}\sum_{i=1}^{m} X_{iA} - \dfrac{1}{m}\sum_{i=1}^{m} X_{iB}\right)\\ Z_2 &= \sqrt{\dfrac{2m}{2\sigma^2}}\left(\dfrac{1}{2m}\sum_{i=1}^{2m} X_{iA} - \dfrac{1}{2m}\sum_{i=1}^{2m} X_{iB}\right) \end{aligned} \]

  • \(EZ_1 = \sqrt{\dfrac{m}{2\sigma^2}}\delta\); \(\mathrm{var}(Z_1) = 1\)
  • \(EZ_2 = \sqrt{\dfrac{2m}{2\sigma^2}}\delta\); \(\mathrm{var}(Z_2) = 1\)

Example, \(K=2\)

\[ \begin{aligned} &E Z_1 Z_2\\ &\quad= \dfrac{1}{\sqrt{2}} E\left(Z_1 \left[ \sqrt{\dfrac{m}{2\sigma^2}} \dfrac{1}{m}\sum_{i=1}^{2m}\left(X_{iA} - X_{iB}\right)\right]\right)\\ &\quad= \dfrac{1}{\sqrt{2}} E\left(Z_1 \left[ \sqrt{\dfrac{m}{2\sigma^2}}\left( \dfrac{1}{m}\sum_{i=1}^{m}\left(X_{iA} - X_{iB}\right)+ \dfrac{1}{m}\sum_{i=m+1}^{2m}\left(X_{iA} - X_{iB}\right)\right)\right]\right)\\ &\quad= \dfrac{1}{\sqrt{2}} E Z_1^2 + \dfrac{1}{\sqrt{2}} E\left( Z_1 \sqrt{\dfrac{m}{2\sigma^2}} \dfrac{1}{m} \sum_{i=m+1}^{2m}\left(X_{iA} - X_{iB}\right)\right)\\ &\quad= \dfrac{1}{\sqrt{2}} E Z_1^2 + \dfrac{1}{\sqrt{2}} E Z_1 E\left( \sqrt{\dfrac{m}{2\sigma^2}} \dfrac{1}{m} \sum_{i=m+1}^{2m}\left(X_{iA} - X_{iB}\right)\right) \end{aligned} \]

Example, \(K=2\)

(continued from previous slide) \[ \begin{aligned} E Z_1 Z_2 &= \dfrac{1}{\sqrt{2}} \left(\dfrac{m}{2\sigma^2}\delta^2 + 1\right) + \dfrac{1}{\sqrt{2}} \left(\dfrac{m}{2\sigma^2}\delta^2\right) \\ &= \dfrac{1}{\sqrt{2}} + \sqrt{2} \dfrac{m}{2\sigma^2}\delta^2\\ &= \dfrac{1}{\sqrt{2}} + EZ_1 EZ_2 \end{aligned} \]

\(\mathrm{cov}(Z_1,Z_2)= E Z_1 Z_2 - EZ_1 EZ_2 = 1/\sqrt{2}\)

Example, \(K=2\)

So, \[ \begin{aligned} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}\sim N\left(\sqrt{\dfrac{m}{2\sigma^2}}\delta \begin{pmatrix} 1\\ \sqrt{2} \end{pmatrix}, \begin{pmatrix} 1 & 1/\sqrt{2}\\ 1/\sqrt{2} & 1 \end{pmatrix} \right) \end{aligned} \]

\[ \begin{aligned} &\Pr(|Z_1| > c_1(\alpha)) + \Pr(|Z_2| > c_2(\alpha), |Z_1| < c_1(\alpha)) \\ &\quad =1 - \Pr(|Z_2| < c_2(\alpha), |Z_1| < c_1(\alpha)) \end{aligned} \]

Pocock boundary in \(K=2\)

  • Set \(c_P(\alpha) = c\), where \(c\) is solution to \[ \begin{aligned} \alpha = 1 - \int^c_{-c}\int^c_{-c} f_{Z_1,Z_2}(x,y|\delta=0) dx dy \end{aligned} \]

  • Solution is \(c=2.178\) when \(\alpha = 0.05\)
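
A sketch of how this constant can be verified numerically, assuming the mvtnorm package is available (pmvnorm integrates the bivariate normal numerically, so the root is approximate):

library(mvtnorm);
Sigma = matrix(c(1, 1/sqrt(2), 1/sqrt(2), 1), 2, 2);
size_pocock = function(crit) {
  1 - pmvnorm(lower = c(-crit, -crit), upper = c(crit, crit), sigma = Sigma);
}
uniroot(function(crit) size_pocock(crit) - 0.05, interval = c(1.5, 3))$root;
## approximately 2.178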

O'Brien/Fleming boundary in \(K=2\)

  • Set \(c_B(\alpha) = c\), where \(c\) is solution to \[ \begin{aligned} \alpha = 1 - \int^{\sqrt{2}c}_{-\sqrt{2}c}\int^c_{-c} f_{Z_1,Z_2}(x,y|\delta=0) dx dy \end{aligned} \]

  • Solution is \(c=1.977\) when \(\alpha = 0.05\), i.e. \(c_1(0.05) = \sqrt{2} \times 1.977=2.796\); \(c_2(0.05) = 1.977\)
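
The analogous check for the O'Brien-Fleming rule, with \(c_1 = \sqrt{2}c\) and \(c_2 = c\) (reuses Sigma and mvtnorm from the previous sketch):

size_obf = function(crit) {
  1 - pmvnorm(lower = c(-sqrt(2)*crit, -crit), upper = c(sqrt(2)*crit, crit), sigma = Sigma);
}
uniroot(function(crit) size_obf(crit) - 0.05, interval = c(1.5, 3))$root;
## approximately 1.977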

Power of group sequential designs

  • For one-stage design, \(n_A = 2(z_{1-\alpha/2} + z_{1-\beta})^2\dfrac{\sigma^2}{\delta^2}\)

  • Moving to \(K\) stages, \(n_A\) will be [greater than, less than, or equal to?] \(n_A^*\equiv mK\) (\(m\) patients per arm per stage times \(K\) stages)

Power of group sequential designs

  • In order to maintain the overall type I error and power (for a given \(\delta/\sigma\)), adding interim analyses increases the maximum total sample size requirement

  • Tradeoff from conducting multiple analyses is larger maximum possible sample size but smaller average sample size

  • Typically presented as multiplicative "inflation" of one-stage analysis:

  1. Identify one-stage sample size (as usual)

  2. Decide upon number of analyses to conduct (\(K\))

  3. Identify inflation factor \(r(\alpha,\beta,K)\) and set \(n_A^* = r(\alpha,\beta,K) \times n_A\) (per-analysis sample size is \(m=n_A^*/K\))

Example

  • \(K=5\) tests; \(\alpha=0.05\) \(\Rightarrow\) \(c_P(K,\alpha) = 2.413\), \(c_B(K,\alpha) = 2.040\)
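
A sketch checking that these constants give overall two-sided size close to 0.05, again assuming mvtnorm and using the \(\sqrt{k'/k}\) correlation structure from earlier:

library(mvtnorm);
K = 5;
corr = outer(1:K, 1:K, function(a, b) sqrt(pmin(a, b)/pmax(a, b)));
1 - pmvnorm(lower = rep(-2.413, K), upper = rep(2.413, K), corr = corr);                # Pocock
1 - pmvnorm(lower = -2.040*sqrt(K/(1:K)), upper = 2.040*sqrt(K/(1:K)), corr = corr);    # O'Brien-Fleming
## both approximately 0.05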

Example

  • \(1-\beta=0.90\); \(\delta^2/\sigma^2 = 1/16\) \(\Rightarrow\) \(n_A = 2\times(1.96 + 1.28)^2\times 16=337\)

  • Pocock inflation is \(r_P(\alpha=0.05,\beta=0.10,K=5)=1.207\) \(\Rightarrow\) \(m=1.207\times 337/5 = 82\) patients per arm, per analysis

  • O'Brien-Fleming inflation is \(r_B(\alpha=0.05,\beta=0.10,K=5)=1.026\) \(\Rightarrow\) \(m=1.026\times 337/5 = 69\) patients per arm, per analysis
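
The slide's arithmetic, reproduced in R:

(nA = ceiling(2*(qnorm(0.975) + qnorm(0.90))^2*16));
## 337
K = 5;
1.207*nA/K;  # 81.4, i.e. the 82 per arm, per analysis quoted above
1.026*nA/K;  # 69.2, i.e. roughly the 69 per arm, per analysis quoted above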

Group Sequential Testing

  • O'Brien-Fleming more commonly seen:
  1. Usually scientifically undesirable to stop very early, e.g. have little data to evaluate secondary endpoints
  2. First analysis sometimes viewed only as practice run. Are interim analyses feasible to do? What logistical challenges are there?
  3. DSMBs usually unwilling to stop trial early on
  • See also Ch. 10 of Cook and DeMets (2007), and Jennison and Turnbull (1999)

Response Adaptive Allocation

Rosenberger and Lachin (2004), Rosenberger et al. (2001)

Setting:

  • Testing \(H_0: p_A-p_B=0\) versus \(H_1: p_A - p_B = \delta\) when outcomes are binary and \(p_A\) and \(p_B\) correspond to probabilities of response

  • Response is good

  • \(\sigma_A^2 \equiv p_A(1-p_A)\), and \(\sigma_B^2\equiv p_B(1-p_B)\)

  • Question: can we modify \(\Pr(T_i=1)\) after each \(i\) to reflect what we're learning about each arm?

  • Probably want \(\Pr(T_1=1)=1/2\). How should this be increased/decreased for \(i>1\)?

Optimal allocation

  • Recall discussion of setting allocation ratio \(r\) to minimize \(n = (1+r)n_A\)?

  • \(r_\text{opt}=\arg\min_r\, (\sigma^2_A + \sigma^2_B/r)(1+r) =\sqrt{p_B(1-p_B)/(p_A(1-p_A))}\)

  • Called Neyman allocation. Optimal but not ethical (HW5)

Alternative that optimizes ethical outcomes

  • Minimize expected non-responses: \(n_A(1-p_A) + n_B(1-p_B)\)

\[ \begin{aligned} &n_A(1-p_A) + n_B(1-p_B)\\ &= n_A[(1-p_A)+r(1-p_B)]\\ &\propto (\sigma^2_A + \sigma^2_B/r)[(1-p_A)+r(1-p_B)]\\ &= [p_A(1-p_A) + p_B(1-p_B)/r][(1-p_A)+r(1-p_B)]\\ \end{aligned} \]

  • Minimum with respect to \(r\) is [polleverywhere question]
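
A numerical check that can be run after answering the poll: minimize the expression above over \(r\) for illustrative response probabilities (the 0.1 and 0.25 used in the simulation below):

pa = 0.1; pb = 0.25;
enf = function(r) (pa*(1 - pa) + pb*(1 - pb)/r)*((1 - pa) + r*(1 - pb));  # proportional to expected non-responders
optimize(enf, interval = c(0.01, 100))$minimum;
## about 1.58 for these values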

Suggests adaptive allocation procedure

  • Update estimated response probabilities after each patient: \(\hat p_{A,i-1} =\sum_{j=1}^{i-1} O_jT_j\big/\sum_{j=1}^{i-1} T_j\) (and \(\hat p_{B,i-1}\) analogously)

  • Then, the next patient is assigned to arm \(A\), i.e. \(T_i=1\), with probability proportional to \(\sqrt{\hat p_{A,i-1}}\), that is, \(\Pr(T_i = 1) = \sqrt{\hat p_{A,i-1}}\big/\big(\sqrt{\hat p_{A,i-1}} + \sqrt{\hat p_{B,i-1}}\big)\)

  • To correct for small numbers at beginning of trial, recommend

\[ \begin{aligned} \hat p_{A,0} &= 1/2\\ \hat p_{A,i-1} &=\dfrac{\sum_{j=1}^{i-1} O_jT_j + 1}{\sum_{j=1}^{i-1} T_j + 2} \end{aligned} \]

  • After \(n\) patients, calculate \(Z\) test statistic as usual. Based on some assumptions, it is still asymptotically normal.
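
A small sketch of a single allocation-probability update using the smoothed estimates above (next_alloc_prob is a hypothetical helper, not part of the lecture's simulator):

next_alloc_prob = function(trt, resp) {
  # trt = treatment indicators (1 = arm A), resp = binary outcomes observed so far
  pa_hat = (sum(resp*trt) + 1)/(sum(trt) + 2);
  pb_hat = (sum(resp*(1 - trt)) + 1)/(sum(1 - trt) + 2);
  sqrt(pa_hat)/(sqrt(pa_hat) + sqrt(pb_hat));  # Pr(next patient assigned to arm A)
}
next_alloc_prob(trt = c(1, 0, 1, 0), resp = c(1, 0, 0, 0));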

Simulation study

  • Compare simple randomization to response-adaptive randomization to minimize non-responders

  • Compare differences in type I error/power, treatment allocation, and treatment failure

Simulation study

Simulator function

sim_rand_alloc = function(nsim, nsubj, pa, pb, rho = 0.5) {
  # Simulates nsim trials of nsubj patients with true response probabilities
  # pa (arm A) and pb (arm B). If rho is numeric, patients are assigned to arm A
  # with fixed probability rho; otherwise allocation is response-adaptive.
  if(is.numeric(rho)) {
    # Fixed randomization: number allocated to A, then responses within each arm
    all_alloc = rbinom(nsim,nsubj,rho);
    all_alloc_resp = rbinom(nsim,all_alloc,pa);
    all_resp = rbinom(nsim,nsubj-all_alloc,pb) + all_alloc_resp;
  } else {
    #Initializes vector (as long as nsim) to store number of
    #subjects assigned to group 'A' and also makes first assignments
    all_alloc = curr_alloc = rbinom(nsim,1,.5);
    all_resp = curr_resp = rbinom(nsim,1,pb+curr_alloc*(pa-pb));
    all_alloc_resp = curr_alloc*curr_resp;
    for(i in 2:nsubj) {
      # Smoothed ("add one success, one failure") response-rate estimates so far
      pa_hat =  (all_alloc_resp + 1)/(all_alloc + 2);
      pb_hat =  (all_resp-all_alloc_resp + 1)/(i - 1 - all_alloc + 2);
      stopifnot(max(pa_hat,pb_hat)<1 & min(pa_hat,pb_hat)>0);
      # Pr(assign to A) = sqrt(pa_hat)/(sqrt(pa_hat) + sqrt(pb_hat))
      alloc_prob = 1/(1+sqrt(pb_hat/pa_hat));
      curr_alloc = rbinom(nsim,1,alloc_prob);
      all_alloc = all_alloc + curr_alloc;
      # Response probability is pa if assigned to A, pb otherwise
      curr_resp = rbinom(nsim,1,pb+curr_alloc*(pa-pb));
      all_resp = all_resp + curr_resp;
      all_alloc_resp = all_alloc_resp + curr_alloc*curr_resp;
    }
  }
  # Final (smoothed) estimates and the usual two-sample z statistic
  pa_hat =  (all_alloc_resp + 1)/(all_alloc + 2);
  pb_hat =  (all_resp - all_alloc_resp + 1)/(nsubj - all_alloc + 2);
  se_diff = sqrt(pa_hat*(1-pa_hat)/(all_alloc)+
                   pb_hat*(1-pb_hat)/(nsubj-all_alloc));
  zscore = (pa_hat-pb_hat)/se_diff;
  list(zscore = zscore, pa_hat=pa_hat, pb_hat=pb_hat, se_diff = se_diff,
       all_alloc = all_alloc, all_resp = all_resp, all_alloc_resp = all_alloc_resp)
}

Simulation study

Run Simulations, gather results

(nsubj = ceiling(2*(qnorm(.975)+qnorm(.9))^2*(0.1*(1-0.1)+0.25*(1-0.25))/(0.1-0.25)^2));
## [1] 260
nsim = 2e4;
prob_matrix = rbind(c(.1,.1),
                    c(.25,.25),
                    c(.6,.6),
                    c(.9,.9),
                    c(.1,.2),
                    c(.1,.25),
                    c(.1,.3),
                    c(.1,.35),
                    c(.4,.25),
                    c(.4,.55),
                    c(.4,.75),
                    c(.4,.95));

all_reject = all_alloc = all_fail = 
  matrix(NA, nrow(prob_matrix), 2,dimnames = list(paste(prob_matrix[,1],prob_matrix[,2],sep="_"),c("fixed","adaptive optimal")));

for(j in 1:nrow(prob_matrix)) {
  set.seed(10);
  true_pa = prob_matrix[j,1];
  true_pb = prob_matrix[j,2];
  fixed_equal = sim_rand_alloc(nsim, nsubj, true_pa, true_pb, rho = 0.5);
  adapt_opt = sim_rand_alloc(nsim, nsubj, true_pa, true_pb, rho = "Opt");
  
  #Rejection rate    
  all_reject[j,] = formatC(c(mean(abs(fixed_equal$zscore)>qnorm(.975)),
                  mean(abs(adapt_opt$zscore)>qnorm(.975))),format="f",digits=3);

  #Number allocated to arm A
  foo2 = formatC(c(quantile(fixed_equal$all_alloc,p=0.5),
                   quantile(adapt_opt$all_alloc,p=0.5)),format="d",digits=0);
  foo3 = formatC(c(quantile(fixed_equal$all_alloc,p=0.975),
                   quantile(adapt_opt$all_alloc,p=0.975)),format="d",digits=0);
  all_alloc[j,] = paste0(foo2,"(",foo3,")");
  
  #Treatment failures
  foo4 = formatC(c(quantile(nsubj - fixed_equal$all_resp,p=0.5),
                   quantile(nsubj - adapt_opt$all_resp,p=0.5)),format="d",digits=0);
  foo5 = formatC(c(quantile(nsubj - fixed_equal$all_resp,p=0.975),
                   quantile(nsubj - adapt_opt$all_resp,p=0.975)),format="d",digits=0);
  all_fail[j,] = paste0(foo4,"(",foo5,")");
  rm(foo2,foo3,foo4,foo5);
}

Simulation study

Examine results

#rejection rate
knitr::kable(all_reject);
|           | fixed | adaptive optimal |
|:----------|:------|:-----------------|
| 0.1_0.1   | 0.043 | 0.048            |
| 0.25_0.25 | 0.050 | 0.054            |
| 0.6_0.6   | 0.049 | 0.047            |
| 0.9_0.9   | 0.043 | 0.044            |
| 0.1_0.2   | 0.614 | 0.608            |
| 0.1_0.25  | 0.894 | 0.893            |
| 0.1_0.3   | 0.985 | 0.985            |
| 0.4_0.25  | 0.730 | 0.735            |
| 0.4_0.55  | 0.681 | 0.683            |
| 0.4_0.75  | 1.000 | 1.000            |
#treatment allocation (median, 97.5th percentile)
knitr::kable(all_alloc);
|           | fixed    | adaptive optimal |
|:----------|:---------|:-----------------|
| 0.1_0.1   | 130(146) | 130(160)         |
| 0.25_0.25 | 130(146) | 130(154)         |
| 0.6_0.6   | 130(146) | 130(148)         |
| 0.9_0.9   | 130(146) | 130(146)         |
| 0.1_0.2   | 130(146) | 111(138)         |
| 0.1_0.25  | 130(146) | 105(130)         |
| 0.1_0.3   | 130(146) | 100(124)         |
| 0.4_0.25  | 130(146) | 144(167)         |
| 0.4_0.55  | 130(146) | 120(139)         |
| 0.4_0.75  | 130(146) | 111(129)         |
#failures (median, 97.5th percentile)
knitr::kable(all_fail);
|           | fixed    | adaptive optimal |
|:----------|:---------|:-----------------|
| 0.1_0.1   | 234(243) | 234(243)         |
| 0.25_0.25 | 195(208) | 195(209)         |
| 0.6_0.6   | 104(120) | 104(120)         |
| 0.9_0.9   | 26(36)   | 26(36)           |
| 0.1_0.2   | 221(232) | 219(231)         |
| 0.1_0.25  | 215(226) | 211(223)         |
| 0.1_0.3   | 208(220) | 202(215)         |
| 0.4_0.25  | 176(190) | 173(188)         |
| 0.4_0.55  | 136(152) | 135(151)         |
| 0.4_0.75  | 110(126) | 104(119)         |

Controversies

  • Great article by Hey and Kimmelman (2015) critiquing outcome adaptive randomization.

    • Benefits/differences only exist when one arm is substantially better

    • Any difference further mitigated when outcomes occur slowly relative to enrollments

    • Informed consent:
      • By overturning equipoise, i.e. the lack of preference between the two treatment arms, a minority of subjects will still be enrolled to the treatment arm believed to be inferior
      • Do subjects understand that they are not guaranteed enrollment to arm that is believed to be best?
    • Potential for 'drift' in treatment effects. Urgent, less healthy patients enroll early on, whereas healthier patients are implicitly encouraged to wait until later in trial. This potentially changes response rates over course of trial.

  • Robust commentary of article.

    • They characterize adaptive randomization in two-armed trials as unethical. I do not object to the implication that I am unethical because I take their premise to be unethical; these authors are naive (Berry)

    • Don Berry’s response is, in places, vinegary and strange (Hey and Kimmelman)


References

Chow, S.-C. and Chang, M. (2008) Adaptive design methods in clinical trials – a review. Orphanet Journal of Rare Diseases, 3, 11.

Chow, S.-C. and Chang, M. (2011) Adaptive Design Methods in Clinical Trials. CRC press.

Cook, T.D. and DeMets, D.L. (2007) Introduction to Statistical Methods for Clinical Trials. CRC Press.

Hey, S.P. and Kimmelman, J. (2015) Are outcome-adaptive allocation trials ethical? Clinical Trials, 12, 102–106.

Jennison, C. and Turnbull, B.W. (1999) Group Sequential Methods with Applications to Clinical Trials. CRC Press.

O’Brien, P.C. and Fleming, T.R. (1979) A multiple testing procedure for clinical trials. Biometrics, 549–556.

Pocock, S.J. (1977) Group sequential methods in the design and analysis of clinical trials. Biometrika, 64, 191–199.

Rosenberger, W.F. and Lachin, J.M. (2004) Randomization in Clinical Trials: Theory and Practice. John Wiley & Sons, New York.

Rosenberger, W.F., Stallard, N., Ivanova, A., Harper, C.N. and Ricks, M.L. (2001) Optimal adaptive designs for binary response trials. Biometrics, 57, 909–913.

Simon, R. (1989) Optimal two-stage designs for phase II clinical trials. Controlled Clinical Trials, 10, 1–10.