Developing PRO Instruments in Clinical Trials: Issues, Considerations, and Solutions
Wen-Hung Chen, FDA; Cheryl Coon, Outcometrix; Laura Lee Johnson, FDA; Lisa Kammerman, AstraZeneca; Dennis Revicki, Evidera

Keywords: patient-reported outcome (PRO), clinical outcome assessment (COA), clinical trial endpoint, psychometrics, interim analysis

This session aims to address when and how to evaluate the psychometric properties (including meaningful change and responder definitions) of a patient-reported outcome (PRO) or other clinical outcome assessment (COA) used to construct an endpoint in a clinical trial. The focus of this session will be identifying the risks and ramifications of evaluating (for the first time) the psychometric properties of a PRO in the same pivotal clinical trial in which the instrument is used to construct a primary, co-primary, or key secondary endpoint. In particular, the session will consider issues that can arise when these psychometric properties are assessed at an interim analysis within a clinical trial (i.e., within a “psychometric substudy”) and the interim findings are then applied to the interpretation of the final results.

(1) What is gained and what is lost by evaluating the psychometric characteristics of a PRO within a Phase 1, Phase 2, Phase 3, or Phase 4 study?

(2) What are the implications of the loss of statistical power that results when, in order to derive and apply responder definitions within the same study, the sample of patients enrolled in the clinical trial is split into two groups: (a) a subset used to quantitatively evaluate the instrument’s psychometric properties, and (b) a subset in which the tool is used to help establish safety and efficacy?
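
As a rough illustration of the trade-off in question (2), the sketch below compares the power of a two-sample t-test before and after part of the sample is set aside for psychometric work. It is a minimal sketch only: the two-arm parallel design, the standardized effect size, and the sample sizes are all hypothetical, and it relies on Python’s statsmodels package.

```python
# Sketch of the power lost when part of a trial's sample is reserved for a
# psychometric substudy. Assumes a two-arm parallel trial whose PRO-based
# endpoint is analyzed with a two-sample t-test; the standardized effect
# size and the per-arm sample sizes are hypothetical.
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()
effect_size = 0.35           # hypothetical standardized treatment effect
n_per_arm_total = 150        # patients per arm enrolled in the trial
n_per_arm_psychometric = 50  # per-arm subset reserved for psychometric work

power_full = power_calc.power(effect_size=effect_size,
                              nobs1=n_per_arm_total,
                              alpha=0.05)
power_reduced = power_calc.power(effect_size=effect_size,
                                 nobs1=n_per_arm_total - n_per_arm_psychometric,
                                 alpha=0.05)

print(f"Power with {n_per_arm_total} patients/arm (no split): {power_full:.2f}")
print(f"Power with {n_per_arm_total - n_per_arm_psychometric} patients/arm "
      f"after the split: {power_reduced:.2f}")
```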

(3) When data from a single clinical trial (a single sample of patients) are used both to develop a psychometric instrument and to establish efficacy, there is potential for overfitting the instrument and the corresponding responder definition to the specific sample of patients enrolled in that particular trial. To what extent does this limit the generalizability of the efficacy results and the generalizability of the psychometric properties of the tool itself?
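
The sampling variability underlying this generalizability concern can be sketched with a small simulation. The example below assumes a hypothetical anchor-based approach in which the responder threshold is taken as the mean PRO change score among patients rated “minimally improved” on a global anchor; all distributional parameters are invented for illustration.

```python
# Small simulation sketching why a responder threshold derived from a single,
# modest-sized trial sample may not generalize. The anchor-based rule and all
# distributional parameters below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

TRUE_MEAN_CHANGE = 8.0   # population mean change among "minimally improved"
SD_CHANGE = 12.0         # between-patient SD of PRO change scores
N_ANCHOR_GROUP = 40      # "minimally improved" patients in one trial sample

# Derive the threshold in many hypothetical trial samples of the same size.
thresholds = np.array([
    rng.normal(TRUE_MEAN_CHANGE, SD_CHANGE, N_ANCHOR_GROUP).mean()
    for _ in range(5000)
])

print(f"Population threshold:               {TRUE_MEAN_CHANGE:.1f}")
print(f"Mean derived threshold:             {thresholds.mean():.1f}")
print(f"5th-95th percentile across samples: "
      f"{np.percentile(thresholds, 5):.1f} to {np.percentile(thresholds, 95):.1f}")
```

Under these hypothetical settings, thresholds derived from single samples of this size spread over several score points, so a responder definition fit to one trial sample may classify patients quite differently from one derived in another sample.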

(4) What issues are involved in conducting psychometric interim analyses (a.k.a. a “psychometric substudy”), in which the psychometric properties of an instrument (such as responsiveness to change) are evaluated partway through a clinical trial and then the trial either continues or is stopped? For example, what action should be taken if the tool is found not to be responsive and, therefore, cannot demonstrate efficacy?

(5) How do the risks and ramifications involved in conducting a psychometric substudy depend on the position of the COA-based endpoint(s) in the endpoint hierarchy? For example, if a COA is used to construct a secondary endpoint, the risk that an interim analysis poses to the study’s integrity may depend on how closely the primary and secondary endpoints are tied.

(6) In which situations might it be tolerable (statistically and psychometrically speaking) to conduct a psychometric substudy? In which situations would it be truly inadvisable? What are the deciding factors in each case? For example, some deciding factors might include (a) the disease area, (b) the medical product being evaluated, (c) the number of patients enrolled in the trial, (d) how long the instrument has been in use, and (e) where the COA-based endpoint is positioned in the endpoint hierarchy.

(7) Who takes on risk in each case? Drug companies? The FDA? The general public?

(8) What issues are specific to the development of instruments for clinical trials in rare diseases and conditions, where large samples may not be available and one is dependent on the clinical trial data for psychometric analyses?

(9) When using clinical trial data for psychometric evaluation, decisions about item retention and interpretation guidelines should be made blinded to treatment group. What logistical issues arise in masking the psychometrician to treatment group while conducting psychometric analyses of clinical trial data?

(10) The ideal clinical trial scenario in which to evaluate the psychometric properties of an instrument is one in which some patients get better, some remain the same, and some get worse. When assessing the psychometric properties of an instrument in a clinical trial, what are the implications if the treatment is very effective, especially with respect to the impact on defining clinically important differences, the responder definition, and responsiveness? Conversely, what are the implications if the treatment is completely ineffective?
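
As a simple illustration of question (10), the sketch below simulates how the anchor-category distribution might look under a mixed-response scenario versus a highly effective treatment; when nearly all patients report improvement, the “no change” and “minimally improved” cells that anchor-based threshold estimation relies on become sparse. The anchor categories, category probabilities, and sample size are hypothetical.

```python
# Hypothetical illustration: anchor-category counts under a mixed response
# pattern versus a highly effective treatment. Sparse "no change" and
# "minimally improved" cells make anchor-based thresholds hard to estimate.
import numpy as np

rng = np.random.default_rng(1)
categories = ["worsened", "no change", "minimally improved", "much improved"]

# Hypothetical probabilities of each anchor category per scenario.
scenarios = {
    "mixed response":             [0.20, 0.30, 0.30, 0.20],
    "highly effective treatment": [0.02, 0.08, 0.15, 0.75],
}

n_patients = 200
for name, probs in scenarios.items():
    counts = rng.multinomial(n_patients, probs)
    summary = ", ".join(f"{c}: {k}" for c, k in zip(categories, counts))
    print(f"{name:28s} -> {summary}")
```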