TL23: Impact of missing data and their imputations in long-term treatment of chronic auto-immune diseases
*Achim Guettner, Novartis Pharma AG  *Carin Kim, FDA  *Karthinathan Thangavelu, Genzyme

Keywords: multiple imputations, analysis standards

(i) Are there endpoints within a therapeutic area for which the handling of missing data can be harmonized? Which other design characteristics can be harmonized? How can ongoing improvements in statistical methods be taken into account when developing therapeutic area standards? (ii) In which cases can different methods for handling missing data be considered for short-term and long-term timepoints? (iii) How can statisticians raise awareness in the medical community that conclusions from indirect comparisons across compounds based on published data depend on design characteristics? For example, how can statisticians educate non-statisticians about the different assumptions underlying various imputation methods and their impact on interpretation?

For chronic auto-immune diseases such as psoriasis, rheumatoid arthritis or multiple sclerosis, the evaluation of long-term effects is important to physicians and patients when making treatment decisions. Long-term data (typically from extension studies that lack a comparator group) are not necessarily included in product labels, but are available in publications. For example, for psoriasis, maintenance of efficacy is included in product labels for up to 1 year, but is published for periods of up to 5 years. These data are considered in treatment decisions by comparing long-term efficacy and safety across compounds in indirect comparisons. Whereas statisticians would prefer head-to-head comparative studies, the medical community and business analysts run descriptive comparisons based on published data. However, such comparisons disregard design characteristics that need to be considered in the interpretation. For example, in long-term clinical trials, missing data due to treatment interruptions or study discontinuation have to be handled in the statistical analysis, whether the aim is a treatment comparison or the estimation of response rates or disease activity.
Statistical methods for imputing missing data vary and improve over time. In addition, there may be no single method that suits all study designs, and even within a study the appropriateness of a technique may depend on the assessment timepoint (e.g., short-term versus long-term) and on the type of endpoint (e.g., a patient-reported outcome [PRO]-based endpoint could be handled differently from an equally important investigator-reported endpoint within the same study). For treatments within the same indication, this leads to published results that rely on different imputation methods. Comparisons across compounds are then inappropriate not only because of different techniques for handling missing data, but also because of different endpoints and/or design characteristics. Non-statisticians may not understand the different assumptions underlying the imputation methods, or that any imputation of missing data introduces bias, whether conservative or anti-conservative. For effective and safe treatments, the amount of missing data is moderate but not negligible. For (binary) response variables in psoriasis, non-responder imputation is often used for short-term comparisons with controls. As the reasons for drop-out may change during a study, in particular between the short term and the long term, non-responder imputation leads to decreasing response rates over time. To avoid misinterpreting the data, e.g., as a diminishing effect, more adequate imputation methods are sought.
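
The decline under non-responder imputation (NRI) can be illustrated with a small arithmetic sketch. It assumes a hypothetical cohort in which patients who stay on study keep their response status, while a fixed share of the remaining patients drops out each year for reasons unrelated to efficacy. The cohort size, response count, drop-out share and function name are illustrative assumptions, not data from any trial.

```python
def response_rates_over_time(n_patients, n_responders, dropout_fraction, years):
    """Yearly (NRI rate, observed-case rate) for a hypothetical cohort.

    Assumption: each year the same fraction of responders and of
    non-responders still on study drops out; nobody on study changes status.
    """
    results = []
    on_study = float(n_patients)
    responders_on_study = float(n_responders)
    for _ in range(years):
        on_study *= 1.0 - dropout_fraction
        responders_on_study *= 1.0 - dropout_fraction
        # NRI: drop-outs count as non-responders, so the denominator stays
        # at the full cohort while the numerator shrinks.
        nri_rate = responders_on_study / n_patients
        # Observed-case analysis: drop-outs leave numerator and denominator.
        observed_rate = responders_on_study / on_study
        results.append((nri_rate, observed_rate))
    return results


rates = response_rates_over_time(
    n_patients=100, n_responders=80, dropout_fraction=0.10, years=5
)
for year, (nri_rate, observed_rate) in enumerate(rates, start=1):
    print(f"Year {year}: NRI = {nri_rate:.1%}, observed = {observed_rate:.1%}")
```

In this sketch the observed-case rate stays at 80% in every year while the NRI rate falls each year, although no patient on study loses response. Neither number is unbiased in general; the point is only that the downward trend can come from the imputation rule rather than from the treatment.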

Besides non-responder imputation, other options for dealing with missing data include last-observation-carried-forward imputation, conditional non-responder imputation (e.g., non-response is imputed only for drop-outs due to lack of efficacy, but not for others), observed-data analysis (ignoring missing values) and multiple imputation, as described in Little and Rubin (2002), Rubin (1987) and Schaefer (1997). A comparison of these methods on the same dataset showed that decreases in response rates over time are driven by non-responder imputation, whereas the other methods are relatively indistinguishable and, in particular, show constant response rates over time.

In contrast, there are disease areas in which trials do not widely use methods for handling missing data. For example, in relapsing-remitting multiple sclerosis (RRMS) trials, patients with clinical relapses who drop out of the study early could inflate the annualized relapse rate (the typical primary endpoint) in the corresponding treatment group. On the other hand, for the longer-term endpoint of disability progression in RRMS trials, missing values may lead to a loss of power for detecting true treatment effects. Imputation of missing values is seldom considered in the primary analysis of clinical endpoints in RRMS trials, and varying methods of handling missing values are used for the analysis of MRI-based endpoints. In such disease areas, uniform approaches to handling missing values could be introduced in a harmonized manner and could help in understanding clinical trial results in the context of varying levels of missing data (e.g., protocol-driven criteria for withdrawal from study leading to many missing values in one product’s trial versus another product’s trial in which completion rates are very high). Some clinicians have identified a need for harmonized reporting of clinical trial data (Langley and Reich, 2013).
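
The single-imputation options named in this section (non-responder imputation, last observation carried forward, conditional non-responder imputation and observed data) can be sketched on a toy dataset as follows. The five patients, the drop-out reason labels and the function names are hypothetical. For conditional non-responder imputation, the sketch assumes that drop-outs for reasons other than lack of efficacy are left missing and excluded, which is only one possible reading. Multiple imputation is omitted because it requires an explicit imputation model.

```python
# Hypothetical toy data: binary response (1/0) at three visits,
# None = missing after drop-out.
PATIENTS = [
    {"visits": [1, 1, 1], "dropout_reason": None},
    {"visits": [1, 1, None], "dropout_reason": "adverse_event"},
    {"visits": [0, None, None], "dropout_reason": "lack_of_efficacy"},
    {"visits": [0, 0, 0], "dropout_reason": None},
    {"visits": [1, 1, None], "dropout_reason": "relocation"},
]


def impute(patient, visit, method):
    """Value analysed for one patient at one visit, or None if excluded."""
    value = patient["visits"][visit]
    if value is not None:
        return value  # observed values are never altered
    if method == "nri":
        return 0  # every missing value counts as non-response
    if method == "locf":
        earlier = [v for v in patient["visits"][:visit] if v is not None]
        return earlier[-1] if earlier else None
    if method == "conditional_nri":
        # Assumption: only lack-of-efficacy drop-outs become non-responders;
        # all other missing values stay missing.
        return 0 if patient["dropout_reason"] == "lack_of_efficacy" else None
    if method == "observed":
        return None  # missing values are ignored
    raise ValueError(f"unknown method: {method}")


def response_rate(patients, visit, method):
    """Share of responders among patients with a usable value at this visit."""
    values = [impute(p, visit, method) for p in patients]
    used = [v for v in values if v is not None]
    return sum(used) / len(used) if used else None


for method in ("nri", "locf", "conditional_nri", "observed"):
    rates = [response_rate(PATIENTS, visit, method) for visit in range(3)]
    print(method, [round(r, 2) for r in rates])
```

On this toy dataset all four rules agree at the first visit, where nothing is missing, and diverge afterwards. The numbers are purely illustrative and are not meant to reproduce the comparison described above.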
Providing sensitivity analyses to the medical community in the literature, e.g., by applying different analysis techniques, is not always possible because of size restrictions in publications. In addition, innovations and developments in statistical methods change the state-of-the-art methods over time.

References:

Langley, R.G. and Reich, K. (2013). The interpretation of long-term trials of biologic treatments for psoriasis: Trial designs and the choices of statistical analyses affect ability to compare outcomes across trials. British Journal of Dermatology, 169, 1198-1206.

Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data. Wiley Series in Probability and Statistics, Chapter 10.

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

Schaefer, J.L. (1997). Analysis of Incomplete Multivariate Data. Chapman & Hall.