Several methods have recently been proposed to model treatment effect (TE) heterogeneity in clinical trials as a function of baseline covariates. A subset of these methods can be used to identify subgroups that respond well to treatment. We generated data simulating trials with non-significant overall TEs and compared the operating characteristics of six subgroup identification procedures: SIDES, Causal Tree (CT), model-based recursive partitioning (MOB), Interaction Trees (IT), a version of Virtual Twins (VT), and a univariable regression model with interactions (UR). Operating characteristics included the number of subgroups identified, sensitivity, specificity, and classification accuracy. We quantified each method's ability to classify subjects by pooling the subgroups with significant TEs and labeling their members as responders and all remaining subjects as non-responders. Weak Type I error control was achieved for all methods using a common procedure. In our results, VT, MOB, and UR were the most likely to identify at least one subgroup. However, classification accuracy was consistently high for VT and SIDES across simulation scenarios and consistently lowest for UR, MOB, and IT.
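The responder labeling and the classification metrics described above can be illustrated with a minimal sketch. It assumes that each identified subgroup is available as a set of subject indices and that the true responder status of every simulated subject is known. The function names `label_responders` and `classification_metrics` and the toy data are hypothetical, chosen for illustration only. They are not taken from the study's code.

```python
from typing import Dict, Iterable, List, Sequence, Set


def label_responders(significant_subgroups: Iterable[Set[int]], n_subjects: int) -> List[bool]:
    """Pool all subgroups with significant TEs; members are labeled responders.

    Assumption: each subgroup is a set of subject indices in range(n_subjects).
    If no subgroup was identified, every subject is labeled a non-responder.
    """
    pooled: Set[int] = set()
    for subgroup in significant_subgroups:
        pooled |= subgroup
    return [i in pooled for i in range(n_subjects)]


def classification_metrics(truth: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy of predicted responder labels."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one subject is required")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    nan = float("nan")
    return {
        # Undefined (NaN) when there are no true responders / non-responders.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(truth),
    }


# Toy example (hypothetical data): 8 subjects, the first 3 are true responders.
truth = [True, True, True, False, False, False, False, False]
# Two overlapping subgroups flagged as having significant TEs.
predicted = label_responders([{0, 1, 3}, {1, 2}], n_subjects=len(truth))
metrics = classification_metrics(truth, predicted)
print(metrics)
```

In this toy case the pooled subgroups cover subjects 0 to 3, so all three true responders are captured and one non-responder is misclassified. Pooling by set union means a subject who falls in several overlapping subgroups is counted only once.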