What do we mean by "adaptive"?

Chow and Chang (2008), Chow and Chang (2011)

  1. Adaptive dosing: dose-escalate as we become more confident lower dose levels are safe

  2. Treatment-adaptive allocations: continually adjust assignment probabilities to favor balanced allocations

  3. Response-adaptive allocations: continually adjust assignment probabilities to favor the better-performing arm

  4. Adaptive sequential testing: sequentially evaluate data to determine if sufficient evidence to declare success early

Others.

Adaptive sequential testing.

Why is it needed?

  1. If strong evidence for no difference between treatment arms, or greater-than-expected incidence of side effects, then trial should stop early.

  2. If interim data suggest that chance of finding difference is very small, wise to stop and invest resources elsewhere

  3. If early evidence is promising, may want to change patient allocation

How does it work?

Two types of interim analyses:

  1. Pre-planned statistical analyses, often intended to identify treatment arm that is found to be superior before end of trial

  2. Safety and other qualitative analyses conducted by Data Safety Monitoring Board (DSMB). Difficult to pre-plan this behavior

We've already seen an example

  • Simon's two-stage phase 2 designs (Simon, 1989):

  • Pause trial after \(n_1\) patients. Continue only if \(>r_1\) responders

  • After all planned patients have enrolled, hypothesis test is conducted

  • Differences from present context: one arm; only two analyses; testing only for lack of efficacy ("futility")
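
As a small illustration (hypothetical numbers, not Simon's optimal design), the chance of stopping at stage 1 is just a binomial tail probability:

p_stop_stage1 = function(p, n1 = 20, r1 = 3) pbinom(r1, n1, p);  # Pr(at most r1 responders among n1)
p_stop_stage1(p = c(0.10, 0.30));  # likely to stop when p = 0.10, unlikely when p = 0.30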

"Group sequential testing"

  • Basically, allow for rejection of \(H_0\) during trial

  • Notation:

  • \(n_A\) (=\(n_B\)) patients in each arm
  • Plan to conduct total of \(K\) analyses
  • \(m = n_A/K\) patients per analysis per arm

Cumulative patients per arm at each analysis (\(\uparrow\) marks an interim analysis):

arm A: \(m\), \(2m\), \(3m\), \(\cdots\), \(Km = n_A\)
arm B: \(m\), \(2m\), \(3m\), \(\cdots\), \(Km = n_B\)

Group sequential tests, background

  • \(X_{iA} \sim N(\mu_A,\sigma^2)\); \(X_{iB} \sim N(\mu_B,\sigma^2)\)

  • \({H_0}: \mu_A - \mu_B = 0\)
  • \({H_1}: \mu_A - \mu_B = \delta\)

Group sequential tests, background

  • At analysis \(k\), \(k=1,\ldots,K\), construct test statistic \(Z_k\) (computed in the R sketch below): \[ Z_k = \sqrt{\dfrac{km}{2\sigma^2}}\left(\dfrac{1}{km}\sum_{i=1}^{km} X_{iA} - \dfrac{1}{km}\sum_{i=1}^{km} X_{iB}\right) \]

  • How big should \(|Z_k|\) be to reject \(H_0\) at "time" \(k\)?
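
A minimal R sketch (not from the lecture) of how the \(Z_k\) accumulate across interim looks, assuming \(\sigma\) is known and using illustrative values of \(m\) and \(\delta\):

set.seed(1);
K = 5; m = 20; sigma = 1; delta = 0.5;
xA = rnorm(K*m, mean = delta, sd = sigma);
xB = rnorm(K*m, mean = 0, sd = sigma);
sapply(1:K, function(k) {
  n_k = k*m;  # cumulative patients per arm at analysis k
  sqrt(n_k/(2*sigma^2))*(mean(xA[1:n_k]) - mean(xB[1:n_k]));
});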

Group sequential tests, background

  • Strategy based upon joint distribution of \(\{Z_1,\ldots,Z_K\}\):
  1. \(\{Z_1,\ldots,Z_K\}\) is multivariate normal

  2. \(E Z_k = \sqrt{I_k} \delta\), \(I_k = (2\sigma^2/[km])^{-1}\) is information level

  3. \(\mathrm{cov}(Z_{k'},Z_k) = \sqrt{I_{k'}/I_{k}} = \sqrt{k'/k}\), for \(1\leq k'\leq k \leq K\)
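
A quick sketch of the implied correlation matrix in point 3, for an illustrative \(K\):

K = 4;
corr = outer(1:K, 1:K, function(a, b) sqrt(pmin(a, b)/pmax(a, b)));
round(corr, 3);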

Group sequential tests, background

  • Group sequential approaches choose a set of constants \(\{c_1(\alpha), \ldots, c_K(\alpha)\}\), where \(\alpha\) is the desired overall probability of making a type I error at any of the \(K\) analyses, i.e. \[ \begin{aligned} &\Pr(|Z_1| > c_1(\alpha)) \\ &\quad + \Pr(|Z_1| < c_1(\alpha), |Z_2| > c_2(\alpha)) \\ &\quad + \ldots \\ &\quad + \Pr(|Z_1| < c_1(\alpha), |Z_2| < c_2(\alpha), \ldots, |Z_K| > c_K(\alpha)) \\ &\quad = \alpha \end{aligned} \]

Two common group sequential tests

  1. Set each \(c_k(\alpha)\) to constant value \(c_P(K,\alpha)\). Test at each time has same size (Pocock, 1977)

  2. Set threshold to be decreasing: \(c_k(\alpha) = c_B(K,\alpha)\sqrt{K/k}\), i.e. \(c_1(\alpha) > c_2(\alpha) > \ldots > c_K(\alpha)\). Size of test increasing over time (O’Brien and Fleming, 1979)
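
For a sense of the two shapes, here are the boundary sequences for \(K=5\) and \(\alpha=0.05\), using the constants quoted later in these notes (2.413 and 2.040):

K = 5;
c_pocock = rep(2.413, K);        # flat Pocock boundary
c_obf = 2.040*sqrt(K/(1:K));     # decreasing O'Brien-Fleming boundary
round(rbind(Pocock = c_pocock, OBF = c_obf), 3);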

Example, \(K=2\)

\[ \begin{aligned} Z_1 &= \sqrt{\dfrac{m}{2\sigma^2}}\left(\dfrac{1}{m}\sum_{i=1}^{m} X_{iA} - \dfrac{1}{m}\sum_{i=1}^{m} X_{iB}\right)\\ Z_2 &= \sqrt{\dfrac{2m}{2\sigma^2}}\left(\dfrac{1}{2m}\sum_{i=1}^{2m} X_{iA} - \dfrac{1}{2m}\sum_{i=1}^{2m} X_{iB}\right) \end{aligned} \]

  • \(EZ_1 = \sqrt{\dfrac{m}{2\sigma^2}}\delta\); \(\mathrm{var}(Z_1) = 1\)
  • \(EZ_2 = \sqrt{\dfrac{2m}{2\sigma^2}}\delta\); \(\mathrm{var}(Z_2) = 1\)

Example, \(K=2\)

\[ \begin{aligned} &E Z_1 Z_2\\ &\quad= \dfrac{1}{\sqrt{2}} E\left(Z_1 \left[ \sqrt{\dfrac{m}{2\sigma^2}} \dfrac{1}{m}\sum_{i=1}^{2m}\left(X_{iA} - X_{iB}\right)\right]\right)\\ &\quad= \dfrac{1}{\sqrt{2}} E\left(Z_1 \left[ \sqrt{\dfrac{m}{2\sigma^2}}\left( \dfrac{1}{m}\sum_{i=1}^{m}\left(X_{iA} - X_{iB}\right)+ \dfrac{1}{m}\sum_{i=m+1}^{2m}\left(X_{iA} - X_{iB}\right)\right)\right]\right)\\ &\quad= \dfrac{1}{\sqrt{2}} E Z_1^2 + \dfrac{1}{\sqrt{2}} E\left( Z_1 \sqrt{\dfrac{m}{2\sigma^2}} \dfrac{1}{m} \sum_{i=m+1}^{2m}\left(X_{iA} - X_{iB}\right)\right)\\ &\quad= \dfrac{1}{\sqrt{2}} E Z_1^2 + \dfrac{1}{\sqrt{2}} E Z_1 E\left( \sqrt{\dfrac{m}{2\sigma^2}} \dfrac{1}{m} \sum_{i=m+1}^{2m}\left(X_{iA} - X_{iB}\right)\right) \end{aligned} \]

Example, \(K=2\)

(continued from previous slide) \[ \begin{aligned} E Z_1 Z_2 &= \dfrac{1}{\sqrt{2}} \left(\dfrac{m}{2\sigma^2}\delta^2 + 1\right) + \dfrac{1}{\sqrt{2}} \left(\dfrac{m}{2\sigma^2}\delta^2\right) \\ &= \dfrac{1}{\sqrt{2}} + \sqrt{2} \dfrac{m}{2\sigma^2}\delta^2\\ &= \dfrac{1}{\sqrt{2}} + EZ_1 EZ_2 \end{aligned} \]

\(\mathrm{cov}(Z_1,Z_2)= E Z_1 Z_2 - EZ_1 EZ_2 = 1/\sqrt{2}\)

Example, \(K=2\)

So, \[ \begin{aligned} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}\sim N\left(\sqrt{\dfrac{m}{2\sigma^2}}\delta \begin{pmatrix} 1\\ \sqrt{2} \end{pmatrix}, \begin{pmatrix} 1 & 1/\sqrt{2}\\ 1/\sqrt{2} & 1 \end{pmatrix} \right) \end{aligned} \]

\[ \begin{aligned} &\Pr(|Z_1| > c_1(\alpha)) + \Pr(|Z_2| > c_2(\alpha), |Z_1| < c_1(\alpha)) \\ &\quad =1 - \Pr(|Z_2| < c_2(\alpha), |Z_1| < c_1(\alpha)) \end{aligned} \]

Pocock boundary in \(K=2\)

  • Set \(c_P(\alpha) = c\), where \(c\) is solution to \[ \begin{aligned} \alpha = 1 - \int^c_{-c}\int^c_{-c} f_{Z_1,Z_2}(x,y|\delta=0) dx dy \end{aligned} \]

  • Solution is \(c=2.178\) when \(\alpha = 0.05\)
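
A sketch of how this constant can be verified numerically, assuming the mvtnorm package is available (pmvnorm integrates the bivariate normal numerically, so the root is approximate):

library(mvtnorm);
Sigma = matrix(c(1, 1/sqrt(2), 1/sqrt(2), 1), 2, 2);
size_pocock = function(crit) {
  1 - pmvnorm(lower = c(-crit, -crit), upper = c(crit, crit), sigma = Sigma);
}
uniroot(function(crit) size_pocock(crit) - 0.05, interval = c(1.5, 3))$root;
## approximately 2.178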

O'Brien/Fleming boundary in \(K=2\)

  • Set \(c_B(\alpha) = c\), where \(c\) is solution to \[ \begin{aligned} \alpha = 1 - \int^{\sqrt{2}c}_{-\sqrt{2}c}\int^c_{-c} f_{Z_1,Z_2}(x,y|\delta=0) dx dy \end{aligned} \]

  • Solution is \(c=1.977\) when \(\alpha = 0.05\), i.e. \(c_1(0.05) = \sqrt{2} \times 1.977=2.796\); \(c_2(0.05) = 1.977\)
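
The analogous check for the O'Brien-Fleming rule, with \(c_1 = \sqrt{2}c\) and \(c_2 = c\) (reuses Sigma and mvtnorm from the previous sketch):

size_obf = function(crit) {
  1 - pmvnorm(lower = c(-sqrt(2)*crit, -crit), upper = c(sqrt(2)*crit, crit), sigma = Sigma);
}
uniroot(function(crit) size_obf(crit) - 0.05, interval = c(1.5, 3))$root;
## approximately 1.977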

Power of group sequential designs

  • For one-stage design, \(n_A = 2(z_{1-\alpha/2} + z_{1-\beta})^2\dfrac{\sigma^2}{\delta^2}\)

  • Moving to \(K\) stages, \(n_A\) will be [greater than, less than, or equal to?] \(n_A^*\equiv mK\) (\(m\) patients per arm per stage times \(K\) stages)

Power of group sequential designs

  • In order to maintain the overall type I error and power (for a given \(\delta/\sigma\)), adding interim analyses increases the maximum total sample size requirement

  • Tradeoff from conducting multiple analyses is larger maximum possible sample size but smaller average sample size

  • Typically presented as multiplicative "inflation" of one-stage analysis:

  1. Identify one-stage sample size (as usual)

  2. Decide upon number of analyses to conduct (\(K\))

  3. Identify inflation factor \(r(\alpha,\beta,K)\) and set \(n_A^* = r(\alpha,\beta,K) \times n_A\) (per-analysis sample size is \(m=n_A^*/K\))

Example

  • \(K=5\) tests; \(\alpha=0.05\) \(\Rightarrow\) \(c_P(K,\alpha) = 2.413\), \(c_B(K,\alpha) = 2.040\)
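
A sketch checking that these constants give overall two-sided size close to 0.05, again assuming mvtnorm and using the \(\sqrt{k'/k}\) correlation structure from earlier:

library(mvtnorm);
K = 5;
corr = outer(1:K, 1:K, function(a, b) sqrt(pmin(a, b)/pmax(a, b)));
1 - pmvnorm(lower = rep(-2.413, K), upper = rep(2.413, K), corr = corr);                # Pocock
1 - pmvnorm(lower = -2.040*sqrt(K/(1:K)), upper = 2.040*sqrt(K/(1:K)), corr = corr);    # O'Brien-Fleming
## both approximately 0.05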

Example

  • \(1-\beta=0.90\); \(\delta^2/\sigma^2 = 1/16\) \(\Rightarrow\) \(n_A = 2\times(1.96 + 1.28)^2\times 16=337\)

  • Pocock inflation is \(r_P(\alpha=0.05,\beta=0.10,K=5)=1.207\) \(\Rightarrow\) \(m=1.207\times 337/5 = 82\) patients per arm, per analysis

  • O'Brien-Fleming inflation is \(r_B(\alpha=0.05,\beta=0.10,K=5)=1.026\) \(\Rightarrow\) \(m=1.026\times 337/5 = 69\) patients per arm, per analysis
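
The slide's arithmetic, reproduced in R:

(nA = ceiling(2*(qnorm(0.975) + qnorm(0.90))^2*16));
## 337
K = 5;
1.207*nA/K;  # 81.4, i.e. the 82 per arm, per analysis quoted above
1.026*nA/K;  # 69.2, i.e. roughly the 69 per arm, per analysis quoted above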

Group Sequential Testing

  • O'Brien-Fleming more commonly seen:
  1. Usually scientifically undesirable to stop very early, e.g. have little data to evaluate secondary endpoints
  2. First analysis sometimes viewed only as practice run. Are interim analyses feasible to do? What logistical challenges are there?
  3. DSMBs usually unwilling to stop trial early on
  • See also Ch. 10 of Cook and DeMets (2007), and Jennison and Turnbull (1999)

Response Adaptive Allocation

Rosenberger and Lachin (2004), Rosenberger et al. (2001)

Setting:

  • Testing \(H_0: p_A-p_B=0\) versus \(H_1: p_A - p_B = \delta\) when outcomes are binary and \(p_A\) and \(p_B\) correspond to probabilities of response

  • Response is good

  • \(\sigma_A^2 \equiv p_A(1-p_A)\), and \(\sigma_B^2\equiv p_B(1-p_B)\)

  • Question: can we modify \(\Pr(T_i=1)\) after each \(i\) to reflect what we're learning about each arm?

  • Probably want \(\Pr(T_1=1)=1/2\). How should this be increased/decreased for \(i>1\)?

Optimal allocation

  • Recall discussion of setting allocation ratio \(r\) to minimize \(n = (1+r)n_A\)?

  • \(r_\text{opt}=\arg\min_r\, (\sigma^2_A + \sigma^2_B/r)(1+r) =\sqrt{p_B(1-p_B)/(p_A(1-p_A))}\)

  • Called Neyman allocation. Optimal but not ethical (HW5)

Alternative that optimizes ethical outcomes

  • Minimize expected non-responses: \(n_A(1-p_A) + n_B(1-p_B)\)

\[ \begin{aligned} &n_A(1-p_A) + n_B(1-p_B)\\ &= n_A[(1-p_A)+r(1-p_B)]\\ &\propto (\sigma^2_A + \sigma^2_B/r)[(1-p_A)+r(1-p_B)]\\ &= [p_A(1-p_A) + p_B(1-p_B)/r][(1-p_A)+r(1-p_B)]\\ \end{aligned} \]

  • Minimum with respect to \(r\) is [polleverywhere question]
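
A numerical check that can be run after answering the poll: minimize the expression above over \(r\) for illustrative response probabilities (the 0.1 and 0.25 used in the simulation below):

pa = 0.1; pb = 0.25;
enf = function(r) (pa*(1 - pa) + pb*(1 - pb)/r)*((1 - pa) + r*(1 - pb));  # proportional to expected non-responders
optimize(enf, interval = c(0.01, 100))$minimum;
## about 1.58 for these values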

Suggests adaptive allocation procedure

  • Update estimated response probabilities after each patient: \(\hat p_{A,i-1} =\sum_{j=1}^{i-1} O_jT_j\big/\sum_{j=1}^{i-1} T_j\) (and \(\hat p_{B,i-1}\) analogously)

  • Then, the next patient is assigned to arm \(A\), i.e. \(T_i=1\), with probability proportional to \(\sqrt{\hat p_{A,i-1}}\), that is, \(\Pr(T_i = 1) = \sqrt{\hat p_{A,i-1}}\big/\big(\sqrt{\hat p_{A,i-1}} + \sqrt{\hat p_{B,i-1}}\big)\)

  • To correct for small numbers at beginning of trial, recommend

\[ \begin{aligned} \hat p_{A,0} &= 1/2\\ \hat p_{A,i-1} &=\dfrac{\sum_{j=1}^{i-1} O_jT_j + 1}{\sum_{j=1}^{i-1} T_j + 2} \end{aligned} \]

  • After \(n\) patients, calculate \(Z\) test statistic as usual. Based on some assumptions, it is still asymptotically normal.
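
A small sketch of a single allocation-probability update using the smoothed estimates above (next_alloc_prob is a hypothetical helper, not part of the lecture's simulator):

next_alloc_prob = function(trt, resp) {
  # trt = treatment indicators (1 = arm A), resp = binary outcomes observed so far
  pa_hat = (sum(resp*trt) + 1)/(sum(trt) + 2);
  pb_hat = (sum(resp*(1 - trt)) + 1)/(sum(1 - trt) + 2);
  sqrt(pa_hat)/(sqrt(pa_hat) + sqrt(pb_hat));  # Pr(next patient assigned to arm A)
}
next_alloc_prob(trt = c(1, 0, 1, 0), resp = c(1, 0, 0, 0));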

Simulation study

  • Compare simple randomization to response-adaptive randomization to minimize non-responders

  • Compare differences in type I error/power, treatment allocation, and treatment failure

Simulation study

Simulator function

sim_rand_alloc = function(nsim, nsubj, pa, pb, rho = 0.5) {
  # Simulates nsim trials of nsubj patients with true response probabilities
  # pa (arm A) and pb (arm B). If rho is numeric, patients are assigned to arm A
  # with fixed probability rho; otherwise allocation is response-adaptive.
  if(is.numeric(rho)) {
    # Fixed randomization: number allocated to A, then responses within each arm
    all_alloc = rbinom(nsim,nsubj,rho);
    all_alloc_resp = rbinom(nsim,all_alloc,pa);
    all_resp = rbinom(nsim,nsubj-all_alloc,pb) + all_alloc_resp;
  } else {
    #Initializes vector (as long as nsim) to store number of
    #subjects assigned to group 'A' and also makes first assignments
    all_alloc = curr_alloc = rbinom(nsim,1,.5);
    all_resp = curr_resp = rbinom(nsim,1,pb+curr_alloc*(pa-pb));
    all_alloc_resp = curr_alloc*curr_resp;
    for(i in 2:nsubj) {
      # Smoothed ("add one success, one failure") response-rate estimates so far
      pa_hat =  (all_alloc_resp + 1)/(all_alloc + 2);
      pb_hat =  (all_resp-all_alloc_resp + 1)/(i - 1 - all_alloc + 2);
      stopifnot(max(pa_hat,pb_hat)<1 & min(pa_hat,pb_hat)>0);
      # Pr(assign to A) = sqrt(pa_hat)/(sqrt(pa_hat) + sqrt(pb_hat))
      alloc_prob = 1/(1+sqrt(pb_hat/pa_hat));
      curr_alloc = rbinom(nsim,1,alloc_prob);
      all_alloc = all_alloc + curr_alloc;
      # Response probability is pa if assigned to A, pb otherwise
      curr_resp = rbinom(nsim,1,pb+curr_alloc*(pa-pb));
      all_resp = all_resp + curr_resp;
      all_alloc_resp = all_alloc_resp + curr_alloc*curr_resp;
    }
  }
  # Final (smoothed) estimates and the usual two-sample z statistic
  pa_hat =  (all_alloc_resp + 1)/(all_alloc + 2);
  pb_hat =  (all_resp - all_alloc_resp + 1)/(nsubj - all_alloc + 2);
  se_diff = sqrt(pa_hat*(1-pa_hat)/(all_alloc)+
                   pb_hat*(1-pb_hat)/(nsubj-all_alloc));
  zscore = (pa_hat-pb_hat)/se_diff;
  list(zscore = zscore, pa_hat=pa_hat, pb_hat=pb_hat, se_diff = se_diff,
       all_alloc = all_alloc, all_resp = all_resp, all_alloc_resp = all_alloc_resp)
}

Simulation study

Run Simulations, gather results

(nsubj = ceiling(2*(qnorm(.975)+qnorm(.9))^2*(0.1*(1-0.1)+0.25*(1-0.25))/(0.1-0.25)^2));
## [1] 260
nsim = 2e4;
prob_matrix = rbind(c(.1,.1),
                    c(.25,.25),
                    c(.6,.6),
                    c(.9,.9),
                    c(.1,.2),
                    c(.1,.25),
                    c(.1,.3),
                    c(.1,.35),
                    c(.4,.25),
                    c(.4,.55),
                    c(.4,.75),
                    c(.4,.95));

all_reject = all_alloc = all_fail = 
  matrix(NA, nrow(prob_matrix), 2,dimnames = list(paste(prob_matrix[,1],prob_matrix[,2],sep="_"),c("fixed","adaptive optimal")));

for(j in 1:nrow(prob_matrix)) {
  set.seed(10);
  true_pa = prob_matrix[j,1];
  true_pb = prob_matrix[j,2];
  fixed_equal = sim_rand_alloc(nsim, nsubj, true_pa, true_pb, rho = 0.5);
  adapt_opt = sim_rand_alloc(nsim, nsubj, true_pa, true_pb, rho = "Opt");
  
  #Rejection rate    
  all_reject[j,] = formatC(c(mean(abs(fixed_equal$zscore)>qnorm(.975)),
                  mean(abs(adapt_opt$zscore)>qnorm(.975))),format="f",digits=3);

  #Number allocated to arm A
  foo2 = formatC(c(quantile(fixed_equal$all_alloc,p=0.5),
                   quantile(adapt_opt$all_alloc,p=0.5)),format="d",digits=0);
  foo3 = formatC(c(quantile(fixed_equal$all_alloc,p=0.975),
                   quantile(adapt_opt$all_alloc,p=0.975)),format="d",digits=0);
  all_alloc[j,] = paste0(foo2,"(",foo3,")");
  
  #Treatment failures
  foo4 = formatC(c(quantile(nsubj - fixed_equal$all_resp,p=0.5),
                   quantile(nsubj - adapt_opt$all_resp,p=0.5)),format="d",digits=0);
  foo5 = formatC(c(quantile(nsubj - fixed_equal$all_resp,p=0.975),
                   quantile(nsubj - adapt_opt$all_resp,p=0.975)),format="d",digits=0);
  all_fail[j,] = paste0(foo4,"(",foo5,")");
  rm(foo2,foo3,foo4,foo5);
}

Simulation study

Examine results

#rejection rate
knitr::kable(all_reject);
|           | fixed | adaptive optimal |
|:----------|:------|:-----------------|
| 0.1_0.1   | 0.043 | 0.048            |
| 0.25_0.25 | 0.050 | 0.054            |
| 0.6_0.6   | 0.049 | 0.047            |
| 0.9_0.9   | 0.043 | 0.044            |
| 0.1_0.2   | 0.614 | 0.608            |
| 0.1_0.25  | 0.894 | 0.893            |
| 0.1_0.3   | 0.985 | 0.985            |
| 0.4_0.25  | 0.730 | 0.735            |
| 0.4_0.55  | 0.681 | 0.683            |
| 0.4_0.75  | 1.000 | 1.000            |
#treatment allocation (median, 97.5th percentile)
knitr::kable(all_alloc);
|           | fixed    | adaptive optimal |
|:----------|:---------|:-----------------|
| 0.1_0.1   | 130(146) | 130(160)         |
| 0.25_0.25 | 130(146) | 130(154)         |
| 0.6_0.6   | 130(146) | 130(148)         |
| 0.9_0.9   | 130(146) | 130(146)         |
| 0.1_0.2   | 130(146) | 111(138)         |
| 0.1_0.25  | 130(146) | 105(130)         |
| 0.1_0.3   | 130(146) | 100(124)         |
| 0.4_0.25  | 130(146) | 144(167)         |
| 0.4_0.55  | 130(146) | 120(139)         |
| 0.4_0.75  | 130(146) | 111(129)         |
#failures (median, 97.5th percentile)
knitr::kable(all_fail);
|           | fixed    | adaptive optimal |
|:----------|:---------|:-----------------|
| 0.1_0.1   | 234(243) | 234(243)         |
| 0.25_0.25 | 195(208) | 195(209)         |
| 0.6_0.6   | 104(120) | 104(120)         |
| 0.9_0.9   | 26(36)   | 26(36)           |
| 0.1_0.2   | 221(232) | 219(231)         |
| 0.1_0.25  | 215(226) | 211(223)         |
| 0.1_0.3   | 208(220) | 202(215)         |
| 0.4_0.25  | 176(190) | 173(188)         |
| 0.4_0.55  | 136(152) | 135(151)         |
| 0.4_0.75  | 110(126) | 104(119)         |

Controversies

  • Great article by Hey and Kimmelman (2015) critiquing outcome adaptive randomization.

    • Benefits/differences only exist when one arm is substantially better

    • Any difference further mitigated when outcomes occur slowly relative to enrollments

    • Informed consent:
      • By overturning equipoise, i.e. the lack of preference between the two treatment arms, a minority of subjects will still be enrolled to the treatment arm believed to be inferior
      • Do subjects understand that they are not guaranteed enrollment to arm that is believed to be best?
    • Potential for 'drift' in treatment effects. Urgent, less healthy patients enroll early on, whereas healthier patients are implicitly encouraged to wait until later in trial. This potentially changes response rates over course of trial.

  • Robust commentary of article.

    • They characterize adaptive randomization in two-armed trials as unethical. I do not object to the implication that I am unethical because I take their premise to be unethical; these authors are naive (Berry)

    • Don Berry’s response is, in places, vinegary and strange (Hey and Kimmelman)


References

Chow, S.-C. and Chang, M. (2008) Adaptive design methods in clinical trials – a review. Orphanet Journal of Rare Diseases, 3, 11.

Chow, S.-C. and Chang, M. (2011) Adaptive Design Methods in Clinical Trials. CRC press.

Cook, T.D. and DeMets, D.L. (2007) Introduction to Statistical Methods for Clinical Trials. CRC Press.

Hey, S.P. and Kimmelman, J. (2015) Are outcome-adaptive allocation trials ethical? Clinical Trials, 12, 102–106.

Jennison, C. and Turnbull, B.W. (1999) Group Sequential Methods with Applications to Clinical Trials. CRC Press.

O’Brien, P.C. and Fleming, T.R. (1979) A multiple testing procedure for clinical trials. Biometrics, 549–556.

Pocock, S.J. (1977) Group sequential methods in the design and analysis of clinical trials. Biometrika, 64, 191–199.

Rosenberger, W.F. and Lachin, J.M. (2004) Randomization in Clinical Trials: Theory and Practice. John Wiley & Sons, New York.

Rosenberger, W.F., Stallard, N., Ivanova, A., Harper, C.N. and Ricks, M.L. (2001) Optimal adaptive designs for binary response trials. Biometrics, 57, 909–913.

Simon, R. (1989) Optimal two-stage designs for phase II clinical trials. Controlled Clinical Trials, 10, 1–10.