Health policy decisions regarding patient treatment strategies require consideration of both treatment effectiveness and cost. Optimizing treatment rules for effectiveness alone may yield prohibitively expensive strategies, whereas optimizing for cost alone may yield poor patient outcomes. We propose a two-step approach for identifying an optimally cost-effective and interpretable dynamic treatment regime. First, we develop an approach that combines Q-learning with policy search to estimate an optimal list-based regime under a constraint on expected treatment costs. Second, we propose a procedure to select an optimally cost-effective regime from a set of candidate regimes corresponding to different cost constraints. Our approach can estimate optimal regimes in the presence of time-varying confounding and correlated outcomes, both of which are common in cost-effectiveness analyses. Through simulation studies, we examine the operating characteristics of the estimated optimal treatment rules. We apply our approach to identify an optimally cost-effective regime for assigning adjuvant radiation and chemotherapy to endometrial cancer patients.