Randomization of subjects into groups can generate statistically equivalent groups, and it does so robustly when group size is large relative to the variability of the subjects. When group sizes are small, however, the expected discrepancy in any covariate under randomization can be unacceptably large, and the problem worsens as the number of groups increases. This is the situation faced in many disciplines where the rarity or expense of subjects makes assembling large groups impractical. In such circumstances, simple randomization fails to reliably produce statistically equivalent groups and therefore fails to support reliable inference. Other prevailing methods, such as pair-wise matching and re-randomization, are likewise impractical for small groups. It is clearly preferable that experiments be conducted with groups that are similar, particularly in the mean and variance of relevant baseline covariates. Here the composition of small statistically equivalent groups is treated as a mathematical optimization problem whose goal is to minimize the maximum difference in both mean and variance between any two groups. The article provides theoretical and computational evidence that groups created by optimization have exponentially lower discrepancy in pre-treatment covariates than those created by randomization or by existing matching methods.
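To make the optimization criterion concrete, the sketch below illustrates one possible (heuristic) way to search for a partition that minimizes the maximum pairwise discrepancy in group means and variances. It is not the article's algorithm: it assumes a single baseline covariate, equal-sized groups, a simple swap-based local search, and an illustrative choice to sum the mean and variance discrepancies into one objective.

```python
import itertools
import numpy as np

def discrepancy(groups):
    """Max pairwise difference in means plus max pairwise difference in variances.
    Summing the two terms is an illustrative weighting, not the article's objective."""
    means = [np.mean(g) for g in groups]
    varis = [np.var(g, ddof=1) for g in groups]
    d_mean = max(abs(a - b) for a, b in itertools.combinations(means, 2))
    d_var = max(abs(a - b) for a, b in itertools.combinations(varis, 2))
    return d_mean + d_var

def assign_groups(x, k, n_iter=20000, seed=0):
    """Heuristic search: start from a random equal-size split, then accept swaps
    of subjects between groups whenever they do not increase the discrepancy."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    groups = [list(chunk) for chunk in np.array_split(rng.permutation(len(x)), k)]
    best = discrepancy([x[g] for g in groups])
    for _ in range(n_iter):
        a, b = rng.choice(k, size=2, replace=False)
        i, j = rng.integers(len(groups[a])), rng.integers(len(groups[b]))
        groups[a][i], groups[b][j] = groups[b][j], groups[a][i]   # trial swap
        cand = discrepancy([x[g] for g in groups])
        if cand <= best:
            best = cand                                           # keep the swap
        else:
            groups[a][i], groups[b][j] = groups[b][j], groups[a][i]  # revert
    return [sorted(g) for g in groups], best

# Example: 24 subjects with one baseline covariate, split into 4 groups of 6.
covariate = np.random.default_rng(1).normal(50, 10, size=24)
groups, disc = assign_groups(covariate, k=4)
print("max pairwise discrepancy (mean + variance):", round(disc, 3))
```

Under these assumptions, the heuristic typically yields groups whose baseline means and variances are far closer than a single random split would produce, which is the qualitative contrast the article formalizes.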