What we've covered

  • Why randomize? \(\checkmark\)

  • Standard randomized designs \(\checkmark\)

  • How to analyze data from randomized trials \(\times\)

    • Deviations from randomization

    • Adjusting for covariates

Analyzing data from randomized trials

  • Primary analysis for many trials is often a direct comparison of outcome between treatment groups, e.g. t-test, log-rank test, permutation test. Secondary analyses may adjust for specific covariates

  • Question: Patients enroll and are randomized. Some deviate from assignment and move to the other arm. Do you (i) analyze using randomized assignment (say \(T_R\)) or (ii) actual treatment (\(T\))?

Intended DAG

Actual DAG

Conditional treatment effect when assignments are followed

  • Recall that we estimate \(E[O_i(A)|T_i=A]-E[O_i(B)|T_i=B]\) and implicitly equate to \(E[O_i(A)-O_i(B)]\)

  • Suppose there is no \(T\)-\(O\) association in truth. In intended DAG, \(T_R\equiv T\). Does it matter which of these we use?

    • \(E[O_i(A)|T_i=A]-E[O_i(B)|T_i=B]\)

    • \(E[O_i(A)|T_{iR}=A]-E[O_i(B)|T_{iR}=B]\)

Conditional treatment effect when assignments are not followed

  • In actual DAG, \(T_R\) and \(T\) may differ. Now does it matter which we use?

\[ \begin{aligned} &E[O_i(A)|T_i=A]-E[O_i(B)|T_i=B] \\ &= E[U_i|T_i=A]-E[U_i|T_i=B] \\ &\neq 0 \end{aligned} \]

(because of indirect path through \(U\))

\[ \begin{aligned} &E[O_i(A)|T_{iR}=A]-E[O_i(B)|T_{iR}=B]\\ &= E[U_i|T_{iR}=A]-E[U_i|T_{iR}=B] \\ &= 0 \end{aligned} \]

Analysis based on randomized assignment \(T_R\) is called 'Intent-to-Treat'

  • 'As-Treated' analysis may undo the effects of randomization if there is significant deviation from protocol

  • 'Intent-to-Treat' analysis is the appropriate one. Formally, we want to test for the effect of being randomized to treatment rather than the effect of treatment itself
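
The contrast between the two analyses can be illustrated with a small simulation. This is only a sketch under assumed settings that are not from the slides: the outcome depends on an unmeasured factor \(U\) alone (no treatment effect), and a hypothetical deviation rule moves patients randomized to \(B\) with \(U > 1\) over to \(A\).

```python
import numpy as np

def simulate_trial(n=20000, seed=0):
    """Null trial with protocol deviations driven by an unmeasured factor U."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)              # unmeasured prognostic factor U
    t_r = rng.integers(0, 2, size=n)    # randomized arm T_R: 1 = A, 0 = B
    # Hypothetical deviation rule (an assumption for illustration):
    # patients randomized to B with U > 1 switch to A.
    t = np.where((t_r == 0) & (u > 1.0), 1, t_r)   # treatment actually received T
    o = u + rng.normal(size=n)          # outcome depends on U only, not on treatment
    itt = o[t_r == 1].mean() - o[t_r == 0].mean()
    as_treated = o[t == 1].mean() - o[t == 0].mean()
    return itt, as_treated

itt, as_treated = simulate_trial()
print(f"Intent-to-Treat difference: {itt:.3f}")
print(f"As-Treated difference:      {as_treated:.3f}")
```

Under these assumptions the Intent-to-Treat difference should sit near zero, while the As-Treated difference is pushed away from zero because \(U\) now differs between the groups defined by \(T\).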

Covariate restricted randomization

  • Reasons to consider stratifying randomization based upon covariate values:
  1. Pre-specified subgroup analyses

  2. Negative public perception of large imbalances

    • Probability of large overall treatment imbalance decreases with sample size, but chance imbalances on specific covariates still likely
  3. Increased precision (?)

Stratified permuted block randomization

  1. Identify one or two strong prognostic factors

  2. Create separate strata defined by these prognostic factors

  3. Within each stratum, conduct separate, independent permuted block randomization
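
The three steps above can be sketched as follows. The class name `StratifiedBlockRandomizer`, the stratum labels and the seeds are illustrative assumptions, not part of any trial software; a real trial would also need allocation concealment and audit logging.

```python
import random
from collections import defaultdict

class StratifiedBlockRandomizer:
    """Independent permuted block randomization within each stratum."""

    def __init__(self, block_size=4, arms=("A", "B"), seed=None):
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.block_size = block_size
        self.arms = tuple(arms)
        self.rng = random.Random(seed)
        self.pending = defaultdict(list)   # stratum -> unused assignments in current block
        self.history = defaultdict(list)   # stratum -> assignments made so far

    def _new_block(self):
        # A block holds each arm equally often, in random order.
        block = list(self.arms) * (self.block_size // len(self.arms))
        self.rng.shuffle(block)
        return block

    def assign(self, stratum):
        # Open a fresh block for this stratum when the current one is used up.
        if not self.pending[stratum]:
            self.pending[stratum] = self._new_block()
        arm = self.pending[stratum].pop()
        self.history[stratum].append(arm)
        return arm

# Usage: 30 patients arriving in random order across four hypothetical strata.
randomizer = StratifiedBlockRandomizer(block_size=4, seed=2024)
labels = ["Stratum 1", "Stratum 2", "Stratum 3", "Stratum 4"]
arrivals = random.Random(1).choices(labels, k=30)
for stratum in arrivals:
    randomizer.assign(stratum)
for stratum in sorted(randomizer.history):
    print(stratum, "".join(randomizer.history[stratum]))
```

Within any stratum, every completed block of four contains two A's and two B's, so the imbalance inside a stratum never exceeds half the block size.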

Example: Stratified Block Randomization

Block size \(b=4\); after 2 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 A A
2
3

Section 4.3, Rosenberger and Lachin (2004)

Example: Stratified Block Randomization

Block size \(b=4\); after 3 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 A A
B
2
3

Example: Stratified Block Randomization

Block size \(b=4\); after 4 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 B A A
B
2
3

Example: Stratified Block Randomization

Block size \(b=4\); after 5 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 B A A
B B
2
3

Example: Stratified Block Randomization

Block size \(b=4\); after 10 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 B A A A
B B B B
A B
2
3

Example: Stratified Block Randomization

Block size \(b=4\); after 13 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 B A A A
B B B B
A B A
A
2 A
3

Example: Stratified Block Randomization

Block size \(b=4\); after 30 patients

Block Stratum 1 Stratum 2 Stratum 3 Stratum 4
1 B A A A
B B B B
A A B A
A B A B
2 A A A B
B A A A
B B A
A B
3 B

Example: Stratified Block Randomization

  • Subsequent analysis should then adjust for strata, e.g.

    • Stratified Mantel-Haenszel, log-rank tests
    • Putting covariates that defined strata directly in model
    • Random intercept for each stratum
    • Individual baseline hazards (for time-to-event)
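
As one concrete instance of a stratum-adjusted test, here is a minimal sketch of the Mantel-Haenszel chi-square statistic for a binary outcome, written without a continuity correction. The function name and the table layout (rows are arms A and B, columns are event and no event) are assumptions made for illustration.

```python
import numpy as np

def mantel_haenszel_chi2(tables):
    """Mantel-Haenszel statistic (no continuity correction) over 2x2 tables.

    Each table is [[a, b], [c, d]]: rows = arm (A, B), columns = (event, no event).
    Under the null of no association within strata, the statistic is
    approximately chi-square with 1 degree of freedom.
    """
    observed_minus_expected = 0.0
    variance = 0.0
    for table in tables:
        (a, b), (c, d) = np.asarray(table, dtype=float)
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 patients carries no information
        row_a, row_b = a + b, c + d
        col_event, col_none = a + c, b + d
        observed_minus_expected += a - row_a * col_event / n
        variance += row_a * row_b * col_event * col_none / (n**2 * (n - 1))
    return observed_minus_expected**2 / variance

# Usage: two strata with made-up counts.
strata = [[[15, 5], [5, 15]], [[12, 8], [9, 11]]]
print(mantel_haenszel_chi2(strata))
```

The sketch assumes at least one stratum has non-degenerate margins; otherwise the variance term is zero and the statistic is undefined.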

Adjusting for Other Baseline Covariates

  • Question: when analyzing data from randomized trial (not necessarily stratified randomization), should analysis be adjusted for baseline factors?

Case study: Continuous outcomes

  • Two potential analysis models:

    1. \(O = \alpha^* + \delta^*1_{[T=A]} + \beta V + U\); \(U|\{T,V\} \sim N(0,\sigma^2)\)

    2. \(O = \alpha + \delta 1_{[T=A]} + U\); \(U|T \sim N(0,\tau^2)\)

  • Which to use?

Case study: Continuous outcomes

  • Question 1: When can both models hold, i.e. both satisfy all assumptions for linear regression?

  • Question 2: Assuming both models hold, when are treatment effects the same, i.e. \(\delta^* = \delta\)?

  • Question 3: When both models hold, when is there gain in precision (smaller variance) by adjusting for \(V\)?

Case study: Continuous outcomes

  • From model 1: \(E[O|T] = \alpha^* + \delta^*1_{[T=A]} + \beta E[V|T]\), which matches regression mean according to model 2 when \(E[V|T] = \mu + \gamma 1_{[T=A]}\), for some \(\mu\) and \(\gamma\), so that \(\alpha = \alpha^* + \beta\mu\) and \(\delta = \delta^* + \beta\gamma\)

  • And because \(\tau^2=\mathrm{var}[O|T] = \beta^2 \mathrm{var}[V|T] + \sigma^2\), matches constant variance according to model 2 when \(\mathrm{var}[V|T] = \omega^2\), so that \(\tau^2 = \beta^2 \omega^2 + \sigma^2\)

Case study: Continuous outcomes

Thus, answer to Q1 is when \(V|T \sim N(\mu + \gamma 1_{[T=A]}, \omega^2)\), i.e. when \(V\) given \(T\) is normal with mean equal to linear function of \(T\) and constant variance

Case study: Continuous outcomes

  • In that case, \[ \begin{aligned} E[O|T] &= \alpha^* + \delta^*1_{[T=A]} + \beta (\mu + \gamma 1_{[T=A]})\\ &= (\alpha^* + \beta\mu) + (\delta^* + \beta\gamma) 1_{[T=A]}\\ \end{aligned} \]

  • So, answer to Q2 is \(\delta^* = \delta\) when \(\beta=0\) or \(\gamma=0\)
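
The relation \(\delta = \delta^* + \beta\gamma\) can be checked numerically. The sketch below simulates from model 1 with \(V|T\) normal, using arbitrary parameter values chosen only for illustration, and fits both models by least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Illustrative parameter values (assumptions, not from the slides).
alpha_star, delta_star, beta = 1.0, 0.5, 2.0
mu, gamma, omega, sigma = 0.0, 0.3, 1.0, 1.0

t = rng.integers(0, 2, size=n).astype(float)        # indicator 1_[T=A]
v = mu + gamma * t + omega * rng.normal(size=n)     # V | T ~ N(mu + gamma*T, omega^2)
o = alpha_star + delta_star * t + beta * v + sigma * rng.normal(size=n)

# Model 1 (adjusted for V) and model 2 (unadjusted), both by ordinary least squares.
x_adjusted = np.column_stack([np.ones(n), t, v])
x_unadjusted = np.column_stack([np.ones(n), t])
coef_adjusted, *_ = np.linalg.lstsq(x_adjusted, o, rcond=None)
coef_unadjusted, *_ = np.linalg.lstsq(x_unadjusted, o, rcond=None)

delta_star_hat = coef_adjusted[1]
delta_hat = coef_unadjusted[1]
print(f"adjusted estimate:   {delta_star_hat:.3f} (delta* = {delta_star})")
print(f"unadjusted estimate: {delta_hat:.3f} (delta* + beta*gamma = {delta_star + beta * gamma})")
```

Setting `gamma = 0.0` or `beta = 0.0` in this sketch should bring the two estimates together, in line with the answer to Q2.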

Case study: Continuous outcomes

  • For Q3, we compare \(\mathrm{var}(\hat\delta^*)\) to \(\mathrm{var}(\hat\delta)\)

  • For brevity, we'll write \(T\) for the indicator \(1_{[T=A]}\), so that \(T=1\) means arm \(A\). Then, \[ \begin{aligned} \mathrm{var}(\hat\delta) = \tau^2\begin{pmatrix} 1 & ET\\ ET & ET^2 \end{pmatrix}^{-1}_{[2,2]} = \dfrac{\beta^2\omega^2 + \sigma^2}{\mathrm{var}[T]} \end{aligned} \]

Case study: Continuous outcomes

\[ \begin{aligned} \mathrm{var}(\hat\delta^*) &= \sigma^2\begin{pmatrix} 1 & ET & EV \\ ET & ET^2 & ETV \\ EV & ETV & EV^2 \end{pmatrix}^{-1}_{[2,2]} \\ & = \cdots \\ & = \dfrac{\mathrm{var}[V] \sigma^2}{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)} \end{aligned} \]
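
One way to fill in the omitted steps is the partitioned-inverse (Schur complement) identity; this is a sketch of one possible route, not necessarily the one intended here. Writing \(M\) for the second-moment matrix of \((1, T, V)\), the \([2,2]\) entry of \(M^{-1}\) is the reciprocal of what remains of \(ET^2\) after projecting \(T\) onto \((1, V)\):

\[ \begin{aligned} \left(M^{-1}\right)_{[2,2]} &= \left( ET^2 - \begin{pmatrix} ET & ETV \end{pmatrix} \begin{pmatrix} 1 & EV \\ EV & EV^2 \end{pmatrix}^{-1} \begin{pmatrix} ET \\ ETV \end{pmatrix} \right)^{-1} \\ &= \left( \mathrm{var}[T] - \dfrac{\mathrm{cov}^2(T,V)}{\mathrm{var}[V]} \right)^{-1} \\ &= \dfrac{\mathrm{var}[V]}{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)} \end{aligned} \]

The second line uses \(\begin{pmatrix} 1 & EV \\ EV & EV^2 \end{pmatrix}^{-1} = \dfrac{1}{\mathrm{var}[V]}\begin{pmatrix} EV^2 & -EV \\ -EV & 1 \end{pmatrix}\) together with \(\mathrm{cov}(T,V) = ETV - ET \cdot EV\). Multiplying by \(\sigma^2\) gives the stated expression for \(\mathrm{var}(\hat\delta^*)\).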

Case study: Continuous outcomes

  • Then, asymptotic relative efficiency of \(\hat\delta^*\) to \(\hat\delta\) is

\[ \begin{aligned} \dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)}& = \dfrac{\beta^2\omega^2 + \sigma^2}{\mathrm{var}[T]}\dfrac{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)}{\mathrm{var}[V] \sigma^2}\\ & = \dfrac{\beta^2\omega^2 + \sigma^2}{\sigma^2}\dfrac{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)}{\mathrm{var}[V]\mathrm{var}[T]}\\ & = \dfrac{\beta^2\omega^2 + \sigma^2}{\sigma^2}\left(1- \mathrm{cor}^2(T,V)\right)\\ & = \dfrac{1- \mathrm{cor}^2(T,V)}{1 - \dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2}} \end{aligned} \]

Case study: Continuous outcomes

  • \(\dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2}\) resembles the 'coefficient of determination', or proportion of variance explained, \(R^2\):

\[ \begin{aligned} \dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2} = \dfrac{\mathrm{var}(\alpha^* + \delta^*T + \beta V|T)}{\mathrm{var}(\alpha^* + \delta^* T + \beta V + U|T)} \end{aligned} \]

  • This is proportion of variance of \(O\) explained by \(V\) after conditioning on \(T\)

  • Often write \(\sqrt{\dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2}} = \mathrm{cor}(V,O|T)\), the partial correlation between \(V\) and \(O\) given \(T\)

Case study: Continuous outcomes

\[ \begin{aligned} \dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)} = \dfrac{1- \mathrm{cor}^2(T,V)}{1 - \mathrm{cor}^2(V,O|T)} \end{aligned} \]
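
To get a feel for the magnitudes involved, the ratio can be wrapped in a small helper. The function name `relative_efficiency` and the example correlations are illustrative assumptions only.

```python
def relative_efficiency(cor_tv, cor_vo_given_t):
    """Return var(delta_hat) / var(delta_star_hat).

    Values above 1 mean the V-adjusted estimator has the smaller variance;
    values below 1 mean the unadjusted estimator does.
    """
    if abs(cor_tv) > 1 or abs(cor_vo_given_t) >= 1:
        raise ValueError("need |cor(T,V)| <= 1 and |cor(V,O|T)| < 1")
    return (1 - cor_tv**2) / (1 - cor_vo_given_t**2)

# V balanced across arms and moderately prognostic: adjustment helps.
print(relative_efficiency(cor_tv=0.0, cor_vo_given_t=0.5))
# V imbalanced across arms and not prognostic: adjustment hurts.
print(relative_efficiency(cor_tv=0.3, cor_vo_given_t=0.0))
```

With \(V\) balanced across arms and a partial correlation of 0.5, the ratio is \(1/0.75\), i.e. the unadjusted estimator has about a third more variance.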

Case study: Continuous outcomes

  • If \(\mathrm{cor}(V,O|T)=0\), meaning \(V\) and \(O\) have no linear association once \(T\) is accounted for, then \(\dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)} = 1- \mathrm{cor}^2(T,V) \leq 1\)

  • Adjusting for \(V\) when it has no prognostic ability beyond treatment will tend to increase the variance of the estimated treatment effect

Case study: Continuous outcomes

  • If \(\mathrm{cor}(T,V)=0\), meaning \(T\) is unrelated to \(V\) (encouraged by proper randomization), then \(\dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)} = (1- \mathrm{cor}^2(V,O|T))^{-1} \geq 1\)

  • Adjusting for prognostic covariate will decrease variance of test statistic

Case study: Continuous outcomes

  • Note also that, related to Q2,
    • \(\mathrm{cor}(V,O|T) = 0 \Leftrightarrow \beta=0\)
    • \(\mathrm{cor}(T,V) = 0 \Leftrightarrow \gamma=0\)
  • Thus
\[ \begin{array}{l|ll} & \mathrm{cor}(T,V) = 0 & \mathrm{cor}(T,V) \neq 0 \\ \hline \mathrm{cor}(V,O|T) = 0 & \delta=\delta^*;\ \mathrm{var}(\hat\delta)=\mathrm{var}(\hat\delta^*) & \delta=\delta^*;\ \mathrm{var}(\hat\delta) < \mathrm{var}(\hat\delta^*) \\ \mathrm{cor}(V,O|T) \neq 0 & \delta=\delta^*;\ \mathrm{var}(\hat\delta) > \mathrm{var}(\hat\delta^*) & \delta\neq \delta^*;\ \mathrm{var}(\hat\delta)\ (?)\ \mathrm{var}(\hat\delta^*) \end{array} \]

Conclusions not as neat for other types of outcomes

  • In logistic regression, adding other covariates always increases variance of \(\hat\delta\). However, \(\hat\delta\) may be less biased (Robinson and Jewell, 1991)

  • Similar but less definitive findings for time-to-event outcomes (Chastang, Byar and Piantadosi, 1988)

Analysis in Randomized Trial: Recommendations

  • (from strongest to most tentative)
  1. Pre-specify analyses: state what primary analysis, secondary analysis will be before trial starts. Be clear where reported analyses deviate from planned analyses

  2. Stratify only on few, strong prognostic covariates. However, if multi-center trial, center should always be stratification factor

  3. Do not add covariates only to improve precision

  4. Statistically acceptable to ignore imbalances occurring by chance; may not be palatable to broader research community

Randomization: Conclusions

  • Randomization is key to making causal claims: properly conducted, it ensures there is no unmeasured confounding. It cannot address external validity concerns, however, e.g. Simulation 5 preferentially enrolled patients having larger treatment benefit and so over-estimated the population-average treatment effect. Under-estimation of the treatment effect is also possible

  • Restricted randomization used to encourage balance between treatment groups

  • Chance of large imbalance decreases with sample size; bias due to unmeasured confounding does not

References

Chastang, C., Byar, D. and Piantadosi, S. (1988) A quantitative study of the bias in estimating the treatment effect caused by omitting a balanced covariate in survival models. Statistics in Medicine, 7, 1243–1255.

Robinson, L.D. and Jewell, N.P. (1991) Some surprising results about covariate adjustment in logistic regression models. International Statistical Review/Revue Internationale de Statistique, 227–240.

Rosenberger, W.F. and Lachin, J.M. (2004) Randomization in clinical trials: Theory and practice. John Wiley & Sons, New York.