Bios619 Lecture 12: Randomization

What we've covered

Why randomize? \(\checkmark\)
Standard randomized designs \(\checkmark\)
How to analyze data from randomized trials \(\times\)
- Deviations from randomization
- Adjusting for covariates

Analyzing data from randomized trials

Primary analysis for many trials is a direct comparison of outcome between treatment groups, e.g. t-test, log-rank test, permutation test. Secondary analyses may adjust for specific covariates.
Question: Patients enroll and are randomized. Some deviate from assignment and move to the other arm. Do you (i) analyze using randomized assignment (say \(T_R\)) or (ii) actual treatment (\(T\))?

Intended DAG

Actual DAG

Conditional treatment effect when assignments are followed

Recall that we estimate \(E[O_i(A)|T_i=A]-E[O_i(B)|T_i=B]\) and implicitly equate to \(E[O_i(A)-O_i(B)]\)
Suppose there is no \(T\)-\(O\) association in truth. In intended DAG, \(T_R\equiv T\). Does it matter which of these we use?
- \(E[O_i(A)|T_i=A]-E[O_i(B)|T_i=B]\)
- \(E[O_i(A)|T_{iR}=A]-E[O_i(B)|T_{iR}=B]\)

Conditional treatment effect when assignments are not followed

In actual DAG, \(T_R\) and \(T\) may differ. Now does it matter which we use?

\[ \begin{aligned} &E[O_i(A)|T_i=A]-E[O_i(B)|T_i=B] \\ &= E[U_i|T_i=A]-E[U_i|T_i=B] \\ &\neq 0 \end{aligned} \]

(because of the indirect path through \(U\))

\[ \begin{aligned} &E[O_i(A)|T_{iR}=A]-E[O_i(B)|T_{iR}=B]\\ &= E[U_i|T_{iR}=A]-E[U_i|T_{iR}=B] \\ &= 0 \end{aligned} \]

(because randomization makes \(T_R\) independent of \(U\))

Analysis by randomized assignment is called 'Intent-to-Treat'

'As-Treated' analysis may undo the effects of randomization if there is substantial deviation from protocol
'Intent-to-Treat' analysis is the appropriate one. Formally, we want to test for the effect of being randomized to treatment rather than the effect of treatment itself
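The contrast between the two analyses can be checked with a small simulation. This is only a sketch under assumed settings that are not from the lecture: outcome \(O = U + \text{noise}\) with no treatment effect, and a hypothetical deviation rule in which patients randomized to B with \(U > 0.5\) cross over to A. The function names are illustrative.

```python
import random
from statistics import mean

def simulate_trial(n=20000, seed=1):
    """Simulate a two-arm trial with NO effect of treatment on the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)        # unmeasured prognostic factor U
        t_r = rng.choice("AB")         # randomized assignment T_R
        # Assumed deviation rule: patients randomized to B with high U cross over to A
        t = "A" if (t_r == "B" and u > 0.5) else t_r
        o = u + rng.gauss(0.0, 1.0)    # outcome O depends on U only, not on T
        rows.append((t_r, t, o))
    return rows

def difference_in_means(rows, column):
    """Mean outcome in arm A minus arm B, grouping patients by the given column."""
    arm_a = [r[2] for r in rows if r[column] == "A"]
    arm_b = [r[2] for r in rows if r[column] == "B"]
    return mean(arm_a) - mean(arm_b)

rows = simulate_trial()
itt = difference_in_means(rows, 0)         # group by randomized assignment T_R
as_treated = difference_in_means(rows, 1)  # group by treatment actually received T
print(f"intent-to-treat: {itt:.3f}, as-treated: {as_treated:.3f}")
```

Under these assumptions the intent-to-treat difference should sit near zero, while the as-treated difference is pulled away from zero because crossover depends on \(U\).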

Covariate restricted randomization

Reasons to consider stratifying randomization based upon covariate values:

Pre-specified subgroup analyses
Negative public perception of large imbalances
- Probability of large overall treatment imbalance decreases with sample size, but chance imbalances on specific covariates still likely
Increased precision (?)

Stratified permuted block randomization

Identify one or two strong prognostic factors
Create separate strata defined by these prognostic factors
Within each stratum, conduct separate, independent permuted block randomization
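The three steps above can be sketched in code. This is a minimal illustration, not a production allocation system: the class name, its parameters, and the stratum labels are hypothetical, and equal allocation across arms is assumed.

```python
import random

class StratifiedBlockRandomizer:
    """Independent permuted-block randomization within each stratum."""

    def __init__(self, arms=("A", "B"), block_size=4, seed=None):
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.arms = arms
        self.block_size = block_size
        self.rng = random.Random(seed)
        self.pending = {}  # stratum -> assignments remaining in its current block

    def assign(self, stratum):
        """Return the next assignment for a patient in the given stratum."""
        queue = self.pending.get(stratum)
        if not queue:
            # Start a new block: equal numbers of each arm, in random order
            queue = list(self.arms) * (self.block_size // len(self.arms))
            self.rng.shuffle(queue)
            self.pending[stratum] = queue
        return queue.pop()

# Patients arrive one at a time; each stratum fills its own blocks
randomizer = StratifiedBlockRandomizer(block_size=4, seed=619)
arrivals = ["stratum 2", "stratum 3", "stratum 3", "stratum 1", "stratum 1"]
for stratum in arrivals:
    print(stratum, randomizer.assign(stratum))
```

Because each stratum keeps its own block, arms are exactly balanced within a stratum whenever a block completes, and the imbalance within a stratum never exceeds half the block size.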

Example: Stratified Block Randomization

Block size \(b=4\); after 2 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	-	A	A	-

Section 4.3, Rosenberger and Lachin (2004)

Example: Stratified Block Randomization

Block size \(b=4\); after 3 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	-	A	A	-
	-	-	B	-

Example: Stratified Block Randomization

Block size \(b=4\); after 4 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	B	A	A	-
	-	-	B	-

Example: Stratified Block Randomization

Block size \(b=4\); after 5 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	B	A	A	-
	B	-	B	-

Example: Stratified Block Randomization

Block size \(b=4\); after 10 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	B	A	A	A
	B	B	B	B
	-	A	B	-

Example: Stratified Block Randomization

Block size \(b=4\); after 13 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	B	A	A	A
	B	B	B	B
	-	A	B	A
	-	-	A	-
2	-	-	A	-

Example: Stratified Block Randomization

Block size \(b=4\); after 30 patients

Block	Stratum 1	Stratum 2	Stratum 3	Stratum 4
1	B	A	A	A
	B	B	B	B
	A	A	B	A
	A	B	A	B
2	A	A	A	B
	B	A	A	A
	B	-	B	A
	A	-	-	B
3	B	-	-	-

Example: Stratified Block Randomization

Subsequent analysis should then adjust for strata, e.g.
- Stratified Mantel-Haenszel, log-rank tests
- Putting covariates that defined strata directly in model
- Random intercept for each stratum
- Individual baseline hazards (for time-to-event)

Adjusting for Other Baseline Covariates

Question: when analyzing data from randomized trial (not necessarily stratified randomization), should analysis be adjusted for baseline factors?

Case study: Continuous outcomes

Two potential analysis models:
1. \(O = \alpha^* + \delta^*1_{[T=A]} + \beta V + U\); \(U|\{T,V\} \sim N(0,\sigma^2)\)
2. \(O = \alpha + \delta 1_{[T=A]} + U\); \(U|T \sim N(0,\tau^2)\)
Which to use?

Case study: Continuous outcomes

Question 1: When can both models hold, i.e. both satisfy all assumptions for linear regression?
Question 2: Assuming both models hold, when are treatment effects the same, i.e. \(\delta^* = \delta\)?
Question 3: When both models hold, when is there gain in precision (smaller variance) by adjusting for \(V\)?

Case study: Continuous outcomes

From model 1: \(E[O|T] = \alpha^* + \delta^*1_{[T=A]} + \beta E[V|T]\), which matches regression mean according to model 2 when \(E[V|T] = \mu + \gamma 1_{[T=A]}\), for some \(\mu\) and \(\gamma\), so that \(\alpha = \alpha^* + \beta\mu\) and \(\delta = \delta^* + \beta\gamma\)
And because \(\tau^2=\mathrm{var}[O|T] = \beta^2 \mathrm{var}[V|T] + \sigma^2\), matches constant variance according to model 2 when \(\mathrm{var}[V|T] = \omega^2\), so that \(\tau^2 = \beta^2 \omega^2 + \sigma^2\)

Case study: Continuous outcomes

Thus, answer to Q1 is when \(V|T \sim N(\mu + \gamma 1_{[T=A]}, \omega^2)\), i.e. when \(V\) given \(T\) is normal with mean equal to linear function of \(T\) and constant variance

Case study: Continuous outcomes

In that case, \[ \begin{aligned} E[O|T] &= \alpha^* + \delta^*1_{[T=A]} + \beta (\mu + \gamma 1_{[T=A]})\\ &= (\alpha^* + \beta\mu) + (\delta^* + \beta\gamma) 1_{[T=A]}\\ \end{aligned} \]
So, answer to Q2 is \(\delta^* = \delta\) when \(\beta=0\) or \(\gamma=0\)
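The relation \(\delta = \delta^* + \beta\gamma\) can be checked numerically. The sketch below uses hypothetical parameter values (not from the lecture) and deliberately sets \(\gamma \neq 0\), so that \(V\) differs between arms. It fits both models by ordinary least squares using closed-form expressions in sample covariances.

```python
import random
from statistics import mean

def cov(x, y):
    """Sample covariance (divisor n); cov(x, x) is the sample variance."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Hypothetical parameter values chosen for illustration
alpha_star, delta_star, beta = 0.0, 1.0, 2.0
mu, gamma, omega, sigma = 0.0, 0.5, 1.0, 1.0

rng = random.Random(12)
T, V, O = [], [], []
for _ in range(20000):
    t = rng.randint(0, 1)                 # T = 1 means treatment A
    v = rng.gauss(mu + gamma * t, omega)  # V | T is normal with mean mu + gamma*T
    o = alpha_star + delta_star * t + beta * v + rng.gauss(0.0, sigma)
    T.append(t)
    V.append(v)
    O.append(o)

# Model 2 (unadjusted): OLS slope of O on T
delta_hat = cov(T, O) / cov(T, T)

# Model 1 (adjusted for V): OLS coefficient of T in the regression of O on T and V
det = cov(T, T) * cov(V, V) - cov(T, V) ** 2
delta_star_hat = (cov(V, V) * cov(T, O) - cov(T, V) * cov(V, O)) / det

print(f"unadjusted: {delta_hat:.2f} (target delta* + beta*gamma = {delta_star + beta * gamma})")
print(f"adjusted:   {delta_star_hat:.2f} (target delta* = {delta_star})")
```

Setting `gamma = 0.0` or `beta = 0.0` in this sketch should bring the two estimates together, matching the answer to Q2.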

Case study: Continuous outcomes

For Q3, we compare \(\mathrm{var}(\hat\delta^*)\) to \(\mathrm{var}(\hat\delta)\)
For brevity, write \(T\) for the indicator \(1_{[T=A]}\), so that \(T=1\) denotes treatment A. Then, up to a factor of \(1/n\) that is common to both models, \[ \begin{aligned} \mathrm{var}(\hat\delta) = \tau^2\begin{pmatrix} 1 & ET\\ ET & ET^2 \end{pmatrix}^{-1}_{[2,2]} = \dfrac{\beta^2\omega^2 + \sigma^2}{\mathrm{var}[T]} \end{aligned} \]

Case study: Continuous outcomes

\[ \begin{aligned} \mathrm{var}(\hat\delta^*) &= \sigma^2\begin{pmatrix} 1 & ET & EV \\ ET & ET^2 & ETV \\ EV & ETV & EV^2 \end{pmatrix}^{-1}_{[2,2]} \\ & = ... \\ & = \dfrac{\mathrm{var}[V] \sigma^2}{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)} \end{aligned} \]
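
One way to fill in the intermediate step is the cofactor formula for a single entry of a matrix inverse. In this sketch, \(M\) (a symbol introduced here for illustration) denotes the second-moment matrix of \((1, T, V)\), and \(ET^2\), \(EV^2\) denote \(E[T^2]\), \(E[V^2]\):

\[ \begin{aligned} M &= \begin{pmatrix} 1 & ET & EV \\ ET & ET^2 & ETV \\ EV & ETV & EV^2 \end{pmatrix}, \qquad \left(M^{-1}\right)_{[2,2]} = \dfrac{1}{\det M}\det\begin{pmatrix} 1 & EV \\ EV & EV^2 \end{pmatrix} \\ \det\begin{pmatrix} 1 & EV \\ EV & EV^2 \end{pmatrix} &= EV^2 - (EV)^2 = \mathrm{var}[V] \\ \det M &= \det\begin{pmatrix} 1 & ET & EV \\ 0 & \mathrm{var}[T] & \mathrm{cov}(T,V) \\ 0 & \mathrm{cov}(T,V) & \mathrm{var}[V] \end{pmatrix} = \mathrm{var}[T]\mathrm{var}[V] - \mathrm{cov}^2(T,V) \end{aligned} \]

The second form of \(\det M\) comes from subtracting \(ET\) times the first row from the second row and \(EV\) times the first row from the third, which leaves the determinant unchanged. Multiplying \(\left(M^{-1}\right)_{[2,2]}\) by \(\sigma^2\) gives the stated variance.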

Case study: Continuous outcomes

Then, asymptotic relative efficiency of \(\hat\delta^*\) to \(\hat\delta\) is

\[ \begin{aligned} \dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)}& = \dfrac{\beta^2\omega^2 + \sigma^2}{\mathrm{var}[T]}\dfrac{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)}{\mathrm{var}[V] \sigma^2}\\ & = \dfrac{\beta^2\omega^2 + \sigma^2}{\sigma^2}\dfrac{\mathrm{var}[V]\mathrm{var}[T] - \mathrm{cov}^2(T,V)}{\mathrm{var}[V]\mathrm{var}[T]}\\ & = \dfrac{\beta^2\omega^2 + \sigma^2}{\sigma^2}\left(1- \mathrm{cor}^2(T,V)\right)\\ & = \dfrac{1- \mathrm{cor}^2(T,V)}{1 - \dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2}} \end{aligned} \]

Case study: Continuous outcomes

\(\dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2}\) resembles the 'coefficient of determination', or proportion of variance explained, \(R^2\):

\[ \begin{aligned} \dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2} = \dfrac{\mathrm{var}(\alpha^* + \delta^*T + \beta V|T)}{\mathrm{var}(\alpha^* + \delta^* T + \beta V + U|T)} \end{aligned} \]

This is proportion of variance of \(O\) explained by \(V\) after conditioning on \(T\)
Often write \(\sqrt{\dfrac{\beta^2\omega^2}{\beta^2\omega^2 + \sigma^2}} = |\mathrm{cor}(V,O|T)|\), the magnitude of the partial correlation between \(V\) and \(O\) given \(T\)

Case study: Continuous outcomes

\[ \begin{aligned} \dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)} = \dfrac{1- \mathrm{cor}^2(T,V)}{1 - \mathrm{cor}^2(V,O|T)} \end{aligned} \]

Case study: Continuous outcomes

If \(\mathrm{cor}(V,O|T)=0\), meaning \(V\) and \(O\) have no linear association once \(T\) is accounted for, then \(\dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)} = 1- \mathrm{cor}^2(T,V) \leq 1\)
Adjusting for \(V\) when it has no prognostic ability beyond treatment will tend to increase the variance of the estimated treatment effect

Case study: Continuous outcomes

If \(\mathrm{cor}(T,V)=0\), meaning \(T\) is unrelated to \(V\) (encouraged by proper randomization), then \(\dfrac{\mathrm{var}(\hat\delta)}{\mathrm{var}(\hat\delta^*)} = (1- \mathrm{cor}^2(V,O|T))^{-1} \geq 1\)
Adjusting for a prognostic covariate will decrease the variance of the estimated treatment effect
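
The relative efficiency formula and its two special cases are easy to evaluate numerically. A minimal sketch follows; the function name and the example correlation values are illustrative assumptions, not values from the lecture.

```python
def relative_efficiency(cor_tv, cor_vo_given_t):
    """var(delta_hat) / var(delta_star_hat): values above 1 favor adjusting for V."""
    if abs(cor_tv) >= 1 or abs(cor_vo_given_t) >= 1:
        raise ValueError("correlations must lie strictly between -1 and 1")
    return (1 - cor_tv ** 2) / (1 - cor_vo_given_t ** 2)

# V unrelated to O given T, but imbalanced across arms: adjustment loses precision
no_prognostic_value = relative_efficiency(cor_tv=0.3, cor_vo_given_t=0.0)

# V balanced across arms and prognostic: adjustment gains precision
balanced_prognostic = relative_efficiency(cor_tv=0.0, cor_vo_given_t=0.5)

print(round(no_prognostic_value, 3), round(balanced_prognostic, 3))
```

With these inputs the first ratio is \(1 - 0.3^2 = 0.91\) and the second is \(1/(1 - 0.5^2) \approx 1.333\).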

Case study: Continuous outcomes

Note also that, related to Q2,
- \(\mathrm{cor}(V,O|T) = 0 \Leftrightarrow \beta=0\)
- \(\mathrm{cor}(T,V) = 0 \Leftrightarrow \gamma=0\)
Thus

	\(\mathrm{cor}(T,V) = 0\)	\(\mathrm{cor}(T,V) \neq 0\)
\(\mathrm{cor}(V,O|T)=0\)	\(\delta=\delta^*\); \(\mathrm{var}(\hat\delta)=\mathrm{var}(\hat\delta^*)\)	\(\delta=\delta^*\); \(\mathrm{var}(\hat\delta) < \mathrm{var}(\hat\delta^*)\)
\(\mathrm{cor}(V,O|T) \neq 0\)	\(\delta=\delta^*\); \(\mathrm{var}(\hat\delta) > \mathrm{var}(\hat\delta^*)\)	\(\delta\neq \delta^*\); \(\mathrm{var}(\hat\delta)\ (?)\ \mathrm{var}(\hat\delta^*)\)

Conclusions not as neat for other types of outcomes

In logistic regression, adding other covariates always increases variance of \(\hat\delta\). However, \(\hat\delta\) may be less biased (Robinson and Jewell, 1991)
Similar but less definitive findings for time-to-event outcomes (Chastang, Byar and Piantadosi, 1988)

Analysis in Randomized Trial: Recommendations

(from strongest to most tentative)

Pre-specify analyses: state what primary analysis, secondary analysis will be before trial starts. Be clear where reported analyses deviate from planned analyses
Stratify only on few, strong prognostic covariates. However, if multi-center trial, center should always be stratification factor
Do not add covariates only to improve precision
Statistically acceptable to ignore imbalances occurring by chance; may not be palatable to broader research community

Randomization: Conclusions

Randomization is key to making causal claims: properly conducted, it ensures there is no unmeasured confounding. It cannot address external validity concerns, e.g. Simulation 5 preferentially enrolled patients having larger treatment benefit and so over-estimated the population-average treatment effect. Under-estimation of the treatment effect is also possible
Restricted randomization used to encourage balance between treatment groups
Chance of large imbalance decreases with sample size; bias due to unmeasured confounding does not

References

Chastang, C., Byar, D. and Piantadosi, S. (1988) A quantitative study of the bias in estimating the treatment effect caused by omitting a balanced covariate in survival models. Statistics in Medicine, 7, 1243–1255.

Robinson, L.D. and Jewell, N.P. (1991) Some surprising results about covariate adjustment in logistic regression models. International Statistical Review/Revue Internationale de Statistique, 227–240.

Rosenberger, W.F. and Lachin, J.M. (2004) Randomization in clinical trials: Theory and practice. John Wiley & Sons, New York.
