Non-inferiority Example 1

"Isavuconazole versus voriconazole for primary treatment of invasive mould disease caused by Aspergillus and other filamentous fungi (SECURE): a phase 3, randomised-controlled, non-inferiority trial" (Maertens et al., 2016)

https://www.ncbi.nlm.nih.gov/pubmed/26684607

  • Novel antifungal (isavuconazole): want to demonstrate similar efficacy but improved safety (fewer adverse events) compared to voriconazole

  • "1:1 allocation"; "Roughly 255 patients"; "80% power"; "upper limit of 95% CI"; "treatment difference of 10% or less (prespecified non-inferiority margin)"; "20% mortality rate was assumed for both drugs"

#Per-arm sample size approximately matches
ceiling(2*(qnorm(0.975) + qnorm(0.8))^2/(0.10^2)*(0.2*0.8));
## [1] 252

Non-inferiority Example 2

"Tenofovir alafenamide versus tenofovir disoproxil fumarate, coformulated with elvitegravir, cobicistat, and emtricitabine, for initial treatment of HIV-1 infection: two randomised, double-blind, phase 3, non-inferiority trials" (Sax et al., 2015)

https://www.ncbi.nlm.nih.gov/pubmed/25890673

  • Study of anti-retroviral treatment in untreated HIV-infected patients. Standard formulation of tenofovir causes substantial renal (kidney) toxicity. Two separate studies

  • "two-sided 95% CI"; "840 patients in each study"; "at least 95% power"; "overall response rate of 85% for viral suppression at week 48"; "with a prespecified noninferiority margin of 12%"

#Expect close to 420
ceiling(2*(qnorm(0.975) + qnorm(0.95))^2/(0.12^2)*(0.15*0.85));
## [1] 231
#Closer to their answer if we assume maximum binomial variance?
ceiling(2*(qnorm(0.975) + qnorm(0.95))^2/(0.12^2)*(0.50*0.50));
## [1] 452
  • Let's do some sanity checks:
na = nb = 420;#Stated sample size
nsim = 2e4;
true_pa = 0.85; 
true_pb = 0.85;
noninf_boundary = 0.12;
sim_arma = rbinom(nsim,na,true_pa);
sim_armb = rbinom(nsim,nb,true_pb);
mean_pa = sim_arma / na;
mean_pb = sim_armb / nb;
#Power
mean(((mean_pa - mean_pb) - qnorm(0.975) * sqrt(mean_pa * (1 - mean_pa) / na + mean_pb * (1 - mean_pb) / nb)) > (-noninf_boundary));
## [1] 0.998
#Avg width of CI
mean(2 * qnorm(0.975) * sqrt(mean_pa * (1 - mean_pa) / na + mean_pb * (1 - mean_pb) / nb));
## [1] 0.0965
na = nb = 230;#Our calculation
nsim = 2e4;
true_pa = 0.85; 
true_pb = 0.85;
noninf_boundary = 0.12;
sim_arma = rbinom(nsim,na,true_pa);
sim_armb = rbinom(nsim,nb,true_pb);
mean_pa = sim_arma / na;
mean_pb = sim_armb / nb;
#Power
mean(((mean_pa - mean_pb) - qnorm(0.975) * sqrt(mean_pa * (1 - mean_pa) / na + mean_pb * (1 - mean_pb) / nb)) > (-noninf_boundary));
## [1] 0.952
#Avg width of CI
mean(2 * qnorm(0.975) * sqrt(mean_pa * (1 - mean_pa) / na + mean_pb * (1 - mean_pb) / nb));
## [1] 0.13
  • Magnitude of these differences slightly disconcerting

Recap of strategy for sample size calculations

  1. Calculate Wald-type test statistic

  2. Find its distribution under alternative hypothesis, often a location-shifted Normal

  3. Solve for \(n\)

Most phase 3 outcomes are not Normally distributed

  • We've derived (many) Normal-based sample size formulas

  • Often, outcomes will be yes/no or time-to-event

  • Simplest strategy for sample size calculations:

Time-to-event, background

  • Many (most) phase 3 outcomes are times, e.g.
    • time to disease progression or death
    • time to recurrence
    • time to death
  • Not all patients will experience the outcome of interest and so are not equally informative

  • Question is not "how many patients do I need?" but "how much information do I need?" or "how many events do I need?"

  • Using central limit theorem, sample size formulas we've already studied provide basis for suitable answers

Hazard

  • Usually denoted by \(\lambda(t)\), hazard is rate of occurence; instantaneous probability

  • Mathematically, \[ \begin{aligned} \lambda(t) &= \lim_{\epsilon\rightarrow 0} \dfrac{ \Pr(t<X<t+\epsilon|X>t)}{\epsilon}\\ &= \dfrac{1}{\Pr(X>t)} \lim_{\epsilon\rightarrow 0} \dfrac{\Pr(X>t) - \Pr(X>t+\epsilon)}{\epsilon}\\ &= \dfrac{1}{\Pr(X>t)}\left(-\dfrac{d}{dt} \Pr(X>t)\right)\\ &= \dfrac{f(t)}{\Pr(X>t)} \end{aligned} \]

  • \(\lambda(t)>0\). Larger values indicate events, i.e. death, disease progression, recurrence, exacerbation of symptoms, occurring at higher rate

Risk accumulates with hazard

  • Formally, \(\Pr(X>t) = \exp\{-\int_0^t \lambda(u) du\}\)

Risk accumulates with hazard

  • Formally, \(\Pr(X>t) = \exp\{-\int_0^t \lambda(u) du\}\)

Primary question in many Phase 3 trials

  • Do events occur at higher rate in arm B than arm A?

  • \(\lambda_B(t)/\lambda_A(t)>1\Leftrightarrow\log\lambda_B(t)-\log\lambda_A(t)>0\)

Set-up

  • Assume complete follow-up, time-independent hazards, i.e. \(\lambda_A(t)\equiv\lambda_A\)

  • Collect \(V_1,\ldots,V_{n_A} \sim \text{Exp}(\lambda_A)\) from \(n_A\) iid subjects (experimental arm)
  • Collect \(W_1,\ldots,W_{n_B} \sim \text{Exp}(\lambda_B)\) from \(n_B\) iid subjects (standard of care/control/placebo)

Hypothesis Test

  • \(H_0\): \(\log\lambda_B-\log\lambda_A = 0\)
  • \(H_1\): \(\log\lambda_B-\log\lambda_A = \delta\)

Test Statistic

  • MLE for \(\lambda_A\) is \(\hat\lambda_A = 1/ \bar V\). For large \(n\), \[ \begin{aligned} \hat\lambda_A & \overset{\cdot}{\sim} N\left(\lambda_A ,\dfrac{1}{n_AI(\lambda_A)}\right)\\ & \overset{d}{=} N\left(\lambda_A ,\dfrac{\lambda_A^2}{n_A}\right)\\ \end{aligned} \]

\(I(\lambda_A)\) is expected Fisher information from one subject for estimating \(\lambda_A\).

Heuristic assessment of normality

\((\hat\lambda_A - \lambda_A)/(\lambda_A/\sqrt{n_A})\overset{\cdot}{\sim} N(0,1)\)

set.seed(1);
lambda = 0.025;
base_draws = matrix(rexp(100 * 1e4), nrow = 1e4);
nsamp = 5;
ggplot() + 
  geom_histogram(aes(x = ( 1/rowMeans(base_draws[,1:nsamp,drop=F]/lambda) - lambda)/ (lambda / sqrt(nsamp)),
                     y = ..density..),
                 bins=30) + 
  stat_function(aes(x = c(-2,2)), fun = dnorm, n = 301, args = list(mean = 0, sd = 1)) + 
  theme(text = element_text(size = 14)) + 
  labs(x = "n = 5", y = "");

Consider Delta method to improve convergence

\[ \begin{aligned} -\log \hat\lambda_A = \log \bar V &\overset{\cdot}{\sim}N\left(-\log \lambda_A,E(V_1)^2\dfrac{\lambda_A^2}{n_A}\right)\\ &\overset{d}{=} N(-\log\lambda_A,1/n_A) \end{aligned} \] Similarly, \(\log \bar W \overset{\cdot}{\sim} N(-\log\lambda_B,1/n_B)\)

Applying Delta method

Apply two-arm, constant variance formula

  • \(T = \dfrac{\log \bar V - \log \bar W}{\sqrt{1/n_A + 1/n_B}}\)

  • Let \(n_B = rn_A\)

  • Under \(H_0\), \(T\sim N(0,1)\);
  • Under \(H_1\), \(T\sim N\left(\delta\sqrt{n_A}/\sqrt{1 + 1/r},1\right)\)

  • \(n_A = \dfrac{(z_{1-\alpha} + z_{1-\beta} )^2}{\delta^2}(1 + 1/r)\)
  • \(n_B = r n_A\)

Time-to-event considerations

Sample size formulae assumes that event time is measured for all subjects. Usually cannot count on this:

  1. Trial may end before all subjects experience event ("administrative censoring")

  2. Subjects will not enroll instantaneously ("staggered enrollment")

  3. Subjects may be lost to follow-up before experiencing event ("lost to follow-up")

  • Less follow-up per subject requires enrolling more subjects.

1. Administrative Censoring

  • Trial ends at some time \(t_0\)
  • Some subjects will not have event by time \(t_0\)
  • Data are \(R_i = \min\{V_i, t_0\}\) and \(S_i =\min\{W_i,t_0\}\)

1. Administrative Censoring

  • Likelihood contribution from individual \(i\) in arm A is \(\lambda_A \exp\{-\lambda_A R_i \}\) if \(V_i < t_0\) (exponential density) and \(\Pr(V_i > t_0)= \exp\{-\lambda_A t_0\}\) if \(V_i \geq t_0\)

  • Use this to show that MLE for hazard is \(\hat\lambda_A = \dfrac{\sum_i 1_{[V_i<t_0]}}{\sum_i R_i}\) (Observed events divided by total follow-up)

  • (Expected) Fisher information is \((1-\exp\{-\lambda_A t_0\})/\lambda_A^2\), so that \[ \hat\lambda_A \overset{\cdot}{\sim} N\left(\lambda_A, \dfrac{1}{n_A}\dfrac{\lambda_A^2}{1-\exp\{-\lambda_A t_0\}} \right) \]

1. Administrative Censoring

Applying Delta method,

\[ \begin{aligned} -\log\hat\lambda_A &\overset{\cdot}{\sim} N\left(-\log\lambda_A,\left(\dfrac{1}{\lambda_A}\right)^2\dfrac{1}{n_A}\dfrac{\lambda_A^2}{1-\exp\{-\lambda_A t_0\}} \right)\\ &\overset{d}{=} N\left(-\log\lambda_A,\dfrac{1}{n_A}\dfrac{1}{1-\exp\{-\lambda_A t_0\}} \right) \end{aligned} \]

Compare to no-censoring case: asymptotic variance of \(\log\hat\lambda_A\) multiplicatively increases by \(\left(1-\exp\{-\lambda_A t_0\}\right)^{-1}\), i.e. inverse of probability of arm A subject experiencing event.

As \(t_0\) increases, amount of censoring […]?

1. Administrative Censoring

  • \(T = \dfrac{-\log \hat\lambda_A + \log \hat\lambda_B}{\sqrt{\dfrac{1}{n_A(1- \exp\{-\lambda_A t_0\})} + \dfrac{1}{n_B(1- \exp\{-\lambda_B t_0\})}}}\)
  • Under \(H_0\), \(T\sim N(0,1)\)

  • \(1- \exp\{-\lambda_A t_0\} = \Pr(V_i < t_0)\)

1. Administrative Censoring

\[ \begin{aligned} n_A &= \dfrac{(z_{1-\alpha} + z_{1-\beta} )^2}{\delta^2}\times\\ &\quad([1- \exp\{-\lambda_A t_0\}]^{-1} + [1- \exp\{-\lambda_B t_0\}]^{-1}/r) \end{aligned} \]

For simplicity, may be reasonable to assume

\[ ([1- \exp\{-\lambda_A t_0\}]^{-1} + [1- \exp\{-\lambda_B t_0\}]^{-1}/r)\approx (1+1/r)/\pi, \]

where

\[ \pi = \dfrac{1}{1+r}\left(1 - \exp\{-\lambda_A t_0\}\right) + \dfrac{r}{1+r}\left(1-\exp\{-\lambda_B t_0\}\right) \]

so that

\[ \begin{aligned} n_A &= \dfrac{(z_{1-\alpha} + z_{1-\beta} )^2}{\pi\delta^2}(1 + 1/r) \end{aligned} \]

Power driven by events, not subjects

  • \(\pi\) is crude approximation for average probability of observing event in single subject under \(H_1\)

  • If \(\pi\) is small, more subjects are required

  • Denote \(\pi^{-1}\) by \(\text{IF}\), which always exceeds 1 and stands for "inflation factor".

Word of the day!

Self-check:

  • Test of monoclonal antibody belimumab in autoimmune disease lupus
  • Primary efficacy endpoint was improvement in the Systemic Lupus Erythematosus Responder Index (SRI) at week 52
  • 1:1:1 randomization to one of two dose levels or placebo
  • Powered for comparison of one dose level against placebo arm

  • Text your calculated sample size via PollEverywhere

Self-check:

ceiling(2*((qnorm(0.975) + qnorm(0.90))^2/(0.14^2)) * 0.5^2);
## [1] 269

Reference: "Efficacy and safety of belimumab in patients with active systemic lupus erythematosus: a randomised, placebo-controlled, phase 3 trial" (Navarra et al., 2011)

General strategy for time-to-event considerations

Assumption 1: Censoring mechanism is independent of event mechanism

Assumption 2: Hazard is constant (will discuss this soon)

  1. Apply appropriate Normal-based sample size formula (tells you how many events needed)

  2. Identify estimate of \(\text{IF}=\pi^{-1}\)

  3. Multiplicatively increase (1) by (2) (tells you how many subjects to enroll to get needed events`)

Relationship between expected follow-up and probability of event

  • Let \(C_i\) indicate censoring time, \(R_i = \min\left\{V_i, C_i\right\}\) be observed follow-up, and \(\varepsilon_i = 1_{[V_i<C_i]}\) be event indicator
  • Assume \(C_i\perp V_i\) and \(V_i\) is exponential with rate \(\lambda_A\)
  • Observed information is \(\tfrac{\sum_i \varepsilon_i}{\hat\lambda_A^2}\), where \(\hat\lambda_A = \sum_i \varepsilon_i / \sum_i R_i\). The average information contribution then becomes

\[ \begin{aligned} \dfrac{1}{n_A}\dfrac{\sum_i \varepsilon_i}{\hat\lambda_A^2}\rightarrow_P \dfrac{\Pr(V_i < C_i)}{\lambda_A^2} \equiv \dfrac{\Pr(\text{observing event})}{\lambda_A^2} \end{aligned} \]

  • Can also be written as

\[ \begin{aligned} \dfrac{1}{n_A}\dfrac{\sum_i \varepsilon_i}{\hat\lambda_A^2} = \dfrac{1}{n_A}\dfrac{\sum_i R_i}{\hat\lambda_A} \rightarrow_P \dfrac{E[R_i]}{\lambda_A} \end{aligned} \]

  • Note that \(\lambda_A = 1/E[V_i]\), i.e. inverse of expected follow-up under no censoring or simply expected event time. Putting these together, we have \[ \begin{aligned} \Pr(\text{observing event})=\dfrac{E[R_i]}{E[V_i]} \end{aligned} \]

2. Staggered Enrollment

  • Patients enroll throughout the trial (rather than time zero), \({E_{iA}}\sim \text{Unif}(0,t_0)\), \({E_{iB}}\sim \text{Unif}(0,t_0)\)

  • Data are \(R_i = \min\{V_i, t_0-{E_{iA}}\} \overset{d}{=}\min\{V_i, {E_{iA}}\}\) and \(S_i =\min\{W_i,t_0-{E_{iB}}\}\overset{d}{=}\min\{W_i,{E_{iB}}\}\)

2. Staggered Enrollment

Expected follow-up time (length of "stick") in arm A is \[ \begin{aligned} E[R_i] &= \dfrac{t_0\lambda_A - (1 - \exp\{-\lambda_A t_0\})}{t_0\lambda_A^2} \end{aligned} \] and probability of observing event is \[ \begin{aligned} \Pr(\varepsilon_i = 1) = \Pr(V_i < t_0 - E_{iA}) = \dfrac{t_0\lambda_A - (1 - \exp\{-\lambda_A t_0\})}{t_0\lambda_A} \end{aligned} \]

Similar calculation for Arm B, then calculate \(\pi\) as before:

\[ \begin{aligned} \pi = \dfrac{1}{1+r}\Pr(V_i < t_0 - E_{iA}) + \dfrac{r}{1+r}\Pr(W_i < t_0 - E_{iB}) \end{aligned} \]

3. Additional Follow-up Period

  • All subjects followed for additional period of \(t_1-t_0\), to ensure minimum possible follow-up
  • Data are \(R_i = \min\{V_i, t_1-{E_{iA}}\}\) and \(S_i =\min\{W_i,t_1-{E_{iB}}\}\)

3. Additional Follow-up Period

Expected follow-up time in arm A \[ \begin{aligned} E[R_i] &= \dfrac{t_0\lambda_A - \exp\{-\lambda_A(t_1-t_0)\}(1 - \exp\{-\lambda_A t_0\})}{t_0\lambda_A^2} \end{aligned} \] and probability of observing event is \[ \begin{aligned} \Pr(V_i< t_1-E_{iA}) = \dfrac{t_0\lambda_A - \exp\{-\lambda_A(t_1-t_0)\}(1 - \exp\{-\lambda_A t_0\})}{t_0\lambda_A} \end{aligned} \]

Time-to-event Example 1

  • "Dabrafenib in BRAF-mutated metastatic melanoma: a multicentre, open-label, phase 3 randomised controlled trial" (Hauschild et al., 2012)

https://www.ncbi.nlm.nih.gov/pubmed/22735384

  • BRAF-inhibitor (arm A) versus chemotherapy (arm B).

  • Wanted 3:1 randomization ratio in favor of BRAF-inhibitor arm: \(n_B = (1/3)n_A\), or \(r=1/3\)

  • Median time to progression or death is 2 months on chemotherapy. Expect increase to 6 months on dabrafenib. For exponential distribution, \(\text{median} = \log(2)/\lambda\). So aim to establish

\[ \delta = \log \lambda_B - \log \lambda_A = \log(\log(2)/2) - \log(\log(2)/6) = \log(3) \]

  • Targeting power \(1-\beta = 0.997\); one-sided type I error rate \(\alpha = 0.02\).
r=1/3;
lambdaA = log(2)/6;
lambdaB = log(2)/2;
#na
(na = ceiling((1+1/r)*(qnorm(0.98) + qnorm(0.997))^2/log(lambdaB/lambdaA)^2));
## [1] 77
#nb
(nb = ceiling(r * na));
## [1] 26
  • Our calculation of 103 is close to authors' 102

Time-to-event Example 1

  • Planned to enroll and randomize 200 patients (implies \(\pi = 102/200=0.51\) and \(\text{IF}=1.96\))

  • Enrollment period from Dec 23, 2010 to Sept 1, 2011 (\(t_0=8.25\) months)

  • Administratively censored on Dec 19, 2011 (\(t_1-t_0=3.5\) months)

t0 = 8.25;t1 = t0 + 3.5;
piA = (t0*lambdaA - exp(-lambdaA*(t1-t0))*
                 (1-exp(-t0*lambdaA)))/(t0*lambdaA);
piB = (t0*lambdaB - exp(-lambdaB*(t1-t0))*
                 (1-exp(-t0*lambdaB)))/(t0*lambdaB);
pi = piA/(1+r) + piB*r/(1+r);
1/pi;#IF
## [1] 1.53
  • We'll revisit this soon

Time-to-event Example 2

  • "Afatinib versus cisplatin plus gemcitabine for first-line treatment of Asian patients with advanced non-small-cell lung cancer harbouring EGFR mutations (LUX-Lung 6): an open-label, randomised phase 3 trial" (Wu et al., 2014)

https://www.ncbi.nlm.nih.gov/pubmed/24439929

  • Treatment for non-small cell lung cancer.

  • Testing for \(\delta = \log(1.57)\) (median survival of 7 months in arm A versus 11 months in arm B)

  • Targeting power \(1-\beta = 0.90\); type I error rate \(\alpha = 0.025\).

r=1/2;
lambdaA = log(2)/11;
lambdaB = log(2)/7;
#na 
(na = ceiling((1+1/r)*(qnorm(0.975) + qnorm(0.90))^2/log(lambdaB/lambdaA)^2));
## [1] 155
#nb
(nb = ceiling(r * na));
## [1] 78
  • Thus, require 233 events (authors calculate 217)

Time-to-event Example 2

  • Planned to enroll 330 patients (implies \(\text{IF}=330/217=1.52\))

  • Enrollment period from April 27, 2010 to Nov 16, 2011 (\(t_0=18.5\) months)

  • Administratively censored on Oct 29, 2012 (\(t_1-t_0=11.5\) months)

t0 = 18.5;t1 = t0 + 11.5;
piA = (t0*lambdaA - exp(-lambdaA*(t1-t0))*
                 (1-exp(-t0*lambdaA)))/(t0*lambdaA);
piB = (t0*lambdaB - exp(-lambdaB*(t1-t0))*
                 (1-exp(-t0*lambdaB)))/(t0*lambdaB);
pi = piA/(1+r) + piB*r/(1+r);
1/pi;#IF
## [1] 1.32

4. Lost-to-follow-up

Why were our calculated IFs smaller than papers' IFs? One possible reason: subjects will drop out of studies with long follow-up (for reasons unrelated to outcome), e.g. \(F_{iA}\sim \text{Exp}(\eta)\)

\[ \begin{aligned} R_i &= \min\{V_i, {F_{iA}}, t_1-{E_{iA}}\}\\ &\overset{d}{=}\min\{\min\{V_i,{F_{iA}}\}, t_1-{E_{iA}}\}\\ &\overset{d}{=}\min\{V^*_i, t_1-{E_{iA}}\} \end{aligned} \]

4. Lost-to-follow-up

What is distribution of \(V^*_i\)?

\[ \begin{aligned} \Pr(V^*_i > u) &= \Pr(V_i>u, F_{iA} >u)\\ &= \exp\{-\lambda_A u\} \exp\{-\eta u\}\\ &= \exp\{-(\lambda_A+\eta) u\}\\ \Rightarrow V^*_i&\sim\text{Exp}(\lambda_A+\eta) \end{aligned} \]

4. Lost-to-follow-up

Expected follow-up time in arm A is \[ \begin{aligned} E[R_i] &= \dfrac{t_0(\lambda_A+\eta) - \exp\{-(\lambda_A+\eta)(t_1-t_0)\}(1 - \exp\{-(\lambda_A+\eta) t_0\})}{t_0(\lambda_A+\eta)^2} \end{aligned} \]

and probability of observing event is

\[ \begin{aligned} &\Pr(V_i< \min\{F_{iA}, t_1- E_{iA}\}) =\\ &=\dfrac{\lambda_A\left[t_0(\lambda_A+\eta) - \exp\{-(\lambda_A+\eta)(t_1-t_0)\}(1 - \exp\{-(\lambda_A+\eta) t_0\})\right]}{t_0(\lambda_A+\eta)^2} \end{aligned} \]

5. Non-constant hazards

  • So far, have assumed \(\lambda_A(t)\equiv \lambda_A\) (likewise for \(\lambda_B\)).

  • Can relax this to assume that hazards change over time but remain proportional:

    • \(H_0\): \(\log\lambda_B(t)-\log\lambda_A(t) = 0\)
    • \(H_1\): \(\log\lambda_B(t)-\log\lambda_A(t) = \delta\)

5. Non-constant hazards

  • Very common test for equality of survival curves (or hazards) is non-parametric "log-rank test"

  • Assuming \(H_0\), i.e. equality of curves, and conditional on data, each event should occur with probability proportional to the number at risk (uncensored and free of event) at that time.

5. Non-constant hazards

  • Let \(j=1,\ldots,\pi(n_A+n_B)\) index the number of events observed out of \(n_A+n_B\) total patients

  • If \(n_{Aj}\) arm A subjects and \(n_{Bj}\) arm B subjects at risk at time \(t_j\) (\(n_j = n_{Aj} + n_{Bj}\)), then, knowing that subject will experience event at \(t_j\), expect \(p_{Aj}=n_{Aj}/n_j\) events in arm A and \(p_{Bj}=n_{Bj}/n_j\) events in arm B.

  • We observe event in arm B (suppose), so discrepancy between observed and expected at time \(j\) is \(O_{Bj} - p_{Bj} = 1 - n_{Bj}/n_j= n_{Aj}/n_j\)

  • Variance of this discrepancy is \(V_j = p_{Bj}(1-p_{Bj})\)

5. Non-constant hazards

  • Under \(H_0\), each \(O_{Bj} - p_{Bj}\) has mean zero and no correlation with \(O_{Bj'} - p_{Bj'}\), for \(j\neq j'\)

\[ \begin{aligned} T = \dfrac{\sum_j O_{Bj} - p_{Bj}}{\sqrt{\sum_j p_{Bj}(1-p_{Bj}}} \end{aligned} \]

  • Under \(H_0\), \(T \sim N\left(0,1\right)\)

5. Non-constant hazards

Under \(H_1\),

\[ \begin{aligned} E[O_{Bj} - p_{Bj}] &= \dfrac{\lambda_B(t_j) n_{Bj}}{\lambda_A(t) n_{Aj} + \lambda_B(t_j) n_{Bj}} - \dfrac{ n_{Bj}}{n_{Aj} + n_{Bj}}\\ &= \dfrac{e^\delta n_{Bj}}{ n_{Aj} + e^\delta n_{Bj}}- \dfrac{ n_{Bj}}{n_{Aj} + n_{Bj}}\\ &= \text{...algebra...}\\ &= \left(\dfrac{e^\delta - 1}{n_{Aj} + n_{Bj} e^\delta}\right)\left(\dfrac{n_{Aj}n_{Bj}}{n_{Aj}+n_{Bj}}\right)\\ &\approx \left(\dfrac{\delta}{n_{Aj} + n_{Bj}}\right)\left(\dfrac{n_{Aj}n_{Bj}}{n_{Aj}+n_{Bj}}\right)\quad\text{(next slide)}\\ &=\delta p_{Bj}p_{Aj}=\delta p_{Bj}(1-p_{Bj}) \end{aligned} \]

5. Non-constant hazards

\((n_{Aj} + n_{Bj})\left(\dfrac{e^\delta - 1}{n_{Aj} + n_{Bj} e^\delta}\right)\) versus \(\delta\) for \(n_{Aj}=n_{Bj}=50\)

5. Non-constant hazards

Asymptotically, under \(H_1\),

\[ \begin{aligned} E[T] &\approx \delta \dfrac{\sum_j p_{Bj}(1-p_{Bj}) }{\sqrt{\sum_j p_{Bj}(1-p_{Bj}) }} \\ &= \delta \sqrt{\sum_j p_{Bj}(1-p_{Bj})}\\ &\approx \delta \sqrt{\pi (n_A+n_B)\dfrac{n_An_B}{(n_A+n_B)^2}}\quad\left(\text{using } p_{Bj}\approx n_B/(n_A+n_B)\right)\\ &= \delta \sqrt{\dfrac{\pi n_A}{1+1/r}}\\ \end{aligned} \]

  • \(T \overset{\cdot}{\sim} N\left(\delta\sqrt{\pi n_A}/\sqrt{1 + 1/r},1\right)\)

5. Non-constant hazards

  • Sample size formula is identical (!):

  • \(n_A = \dfrac{(z_{1-\alpha} + z_{1-\beta} )^2}{\pi \delta^2}(1 + 1/r)\)
  • \(n_B = r n_A\)

(Schoenfeld, 1981), (Schoenfeld, 1983)

Discussion

  1. Our approach emphasizes difference between number of events required and number of subjects to enroll in order to observe events.

    • Adds layer of approximation, uncertainty.
    • Actual values of hazards (rather than only relative relationship) matter if patients are censored
  2. Can be difficult to re-create others' power calculations. Important to provide details on assumptions used to calculate power/sample size

  3. Simulation studies to confirm, explore power characteristics very useful

  4. Implict assumption is that censoring mechanism is (i) unrelated to event mechanism and (ii) non-differential across arms

See Chapter 4 (Cook and DeMets, 2007)

References

Cook, T.D. and DeMets, D.L. (2007) Introduction to Statistical Methods for Clinical Trials. CRC Press.

Hauschild, A., Grob, J.-J., Demidov, L.V., Jouary, T., Gutzmer, R., Millward, M., et al. (2012) Dabrafenib in braf-mutated metastatic melanoma: A multicentre, open-label, phase 3 randomised controlled trial. The Lancet, 380, 358–365.

Maertens, J.A., Raad, I.I., Marr, K.A., Patterson, T.F., Kontoyiannis, D.P., Cornely, O.A., et al. (2016) Isavuconazole versus voriconazole for primary treatment of invasive mould disease caused by aspergillus and other filamentous fungi (secure): A phase 3, randomised-controlled, non-inferiority trial. The Lancet, 387, 760–769.

Navarra, S.V., Guzmán, R.M., Gallacher, A.E., Hall, S., Levy, R.A., Jimenez, R.E., et al. (2011) Efficacy and safety of belimumab in patients with active systemic lupus erythematosus: A randomised, placebo-controlled, phase 3 trial. The Lancet, 377, 721–731.

Sax, P.E., Wohl, D., Yin, M.T., Post, F., DeJesus, E., Saag, M., et al. (2015) Tenofovir alafenamide versus tenofovir disoproxil fumarate, coformulated with elvitegravir, cobicistat, and emtricitabine, for initial treatment of hiv-1 infection: Two randomised, double-blind, phase 3, non-inferiority trials. The Lancet, 385, 2606–2615.

Schoenfeld, D. (1981) The asymptotic properties of nonparametric tests for comparing survival distributions. Biometrika, 68, 316–319.

Schoenfeld, D.A. (1983) Sample-size formula for the proportional-hazards regression model. Biometrics, 499–503.

Wu, Y.-L., Zhou, C., Hu, C.-P., Feng, J., Lu, S., Huang, Y., et al. (2014) Afatinib versus cisplatin plus gemcitabine for first-line treatment of asian patients with advanced non-small-cell lung cancer harbouring egfr mutations (lux-lung 6): An open-label, randomised phase 3 trial. The lancet oncology, 15, 213–222.