Some popular software packages for sample size calculation have made it relatively straightforward to incorporate futility guidelines into a group sequential design (GSD). These packages offer a variety of methods for determining futility, the most popular being spending functions and conditional power. As a result, designs with futility can be derived quickly and with relatively little thought, which can present serious problems. At times this results in studies with horrendous operating characteristics with respect to the probability of study termination under the alternative hypothesis. In other words, a study may be terminated prematurely even though, under the alternative hypothesis, there is some likelihood that it could have met its primary objective. Our goal is to evaluate strategies for futility in GSDs and to provide guidance on which strategy to use. To that end, we propose various metrics, including ones derived from concepts in diagnostic testing, and give guidelines for their use. We give an example in which we assess common futility strategies for a GSD with a survival endpoint. Examples are also given for two common adaptive strategies for single-arm Phase 2 oncology studies.