Considerable research suggests that regression models with random effects can be used to establish a solid paradigm for the construction of the mathematics and statistics of personalized medicine research and practice, especially in the treatment of chronic diseases. The fact that generalized linear mixed models (GLMMs) have concepts that allow describing patient populations as a whole (the fixed effects) and, simultaneously, concepts that allow describing patients as individuals (the random effects) suggests that these models contain the key ideas for providing personalized medicine with a rigorous mathematical language. Underlying this is the belief that the variability of a random coefficient is not just a mathematical artifact to control for patient heterogeneity, but also the result of real variation in the biological and environmental factors that have made humans develop as individuals. Moreover, solid empirical and theoretical work by the Sheiner School of Pharmacology and others shows that a combination of mixed models with empirical Bayesian feedback (EBF) can be employed successfully in pharmacotherapy individualization and that EBF is well anchored in standard decision theory. Thus, both biological and mathematical arguments support the development of methodological instruments for personalized medicine based on GLMMs.
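The shrinkage idea behind empirical Bayesian feedback can be illustrated with a minimal numerical sketch (not taken from the course materials; the function name and all variance values are hypothetical). In a normal random-intercept model, a patient's individualized estimate is a precision-weighted average of that patient's observed mean and the population mean:

```python
# Illustrative sketch (hypothetical values): empirical Bayes shrinkage
# of an individual patient's mean in a normal random-intercept model.

def eb_individual_estimate(patient_mean, n_obs, pop_mean,
                           var_between, var_within):
    """Shrink a patient's observed mean toward the population mean.

    The weight w is the ratio of the between-patient variance to the
    total variance of the patient's observed mean.
    """
    w = var_between / (var_between + var_within / n_obs)
    return w * patient_mean + (1 - w) * pop_mean

# A patient with few observations is shrunk strongly toward the
# population mean; with many observations the individual data dominate.
few = eb_individual_estimate(patient_mean=12.0, n_obs=2,
                             pop_mean=10.0, var_between=1.0, var_within=4.0)
many = eb_individual_estimate(patient_mean=12.0, n_obs=50,
                              pop_mean=10.0, var_between=1.0, var_within=4.0)
```

With only 2 observations the weight is 1/3 and the estimate stays close to the population mean; with 50 observations the weight rises above 0.9 and the estimate approaches the patient's own data.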
The objective of this half-day course is to introduce the main ideas of generalized linear mixed models, with emphasis on their interpretation from a personalized medicine viewpoint. Pharmacological applications drawn from the extensive professional experience of the instructor will be shown. These applications will include: 1) methods to measure the individual benefit of medical or behavioral treatments; 2) analyses of bioequivalence studies; 3) the study of drug-drug interactions with patient samples, including the examination of the inducing or inhibiting effects of comedications; and 4) the utilization of mixed models in drug dosage individualization. A historical account will first be presented, showing how the idea that random effects models are the key to developing personalized medicine can be traced back to pharmacological and genetic research of the second half of the past century. Examples of data analyses with the SAS and Stata computer packages will be shown.
Some of the applications will be taken from the following publications authored or coauthored by the instructor. A more complete reference list will be provided in the course class notes:
1. Diaz FJ. Measuring the individual benefit of a medical or behavioral treatment using generalized linear mixed-effects models. Stat Med 2016; 35:4077-4092.
2. Diaz FJ, Yeh H-W and de Leon J. Role of statistical random-effects linear models in personalized medicine. Curr Pharmacogenomics Person Med 2012; 10:22-32.
3. Diaz FJ and de Leon J. The mathematics of drug dose individualization should be built with random effects linear models. Ther Drug Monit 2013; 35:276-277.
4. Diaz FJ, Rivera TE, Josiassen RC, et al. Individualizing drug dosage by using a random intercept linear model. Stat Med 2007; 26:2052-2073.
5. Diaz FJ, Cogollo M, Spina E, et al. Drug dosage individualization based on a random-effects linear model. J Biopharm Stat 2012; 22:463-484.
6. Diaz FJ, Berg MJ, Krebill R, et al. Random-effects linear modeling and sample size tables for two special cross-over designs of average bioequivalence studies: the 4-period, 2-sequence, 2-formulation and 6-period, 3-sequence, 3-formulation designs. Clin Pharmacokinet 2013; 52: 1033-1043.
7. Diaz FJ, Eap CB, Ansermot N, et al. Can valproic acid be an inducer of clozapine metabolism? Pharmacopsychiatry 2014; 47:89-96.
8. Botts S, Diaz FJ, Santoro V, et al. Estimating the effects of co-medications on plasma olanzapine concentrations by using a mixed model. Prog Neuro-Psychoph 2008; 32:1453-1458.
9. Diaz FJ, Santoro V, Spina E, et al. Estimating the size of the effects of co-medications on plasma clozapine concentrations using a model that controls for clozapine doses and confounding variables. Pharmacopsychiatry 2008; 41:81-91.
Statisticians are extremely effective at analyzing data, performing simulations, and generating pages upon pages of analysis results. Despite their analytical prowess, however, statisticians continue to struggle to communicate the story hidden within the data to their colleagues. First and foremost, with the high cost of conducting translational clinical research, it is common to collect as much data as possible on as many endpoints as possible. This phenomenon is further reinforced by our limited understanding of biological mechanisms and pathways, including the potential genomic underpinnings of a disease or treatment response. For example, we may have a clear understanding of how a novel therapy induces an efficacious response, but there is typically limited knowledge of the downstream effects of the drug on other body systems. A second challenge to communication lies in the increased use of sensitivity analyses to assess the consistency and robustness of results to varying assumptions. Given the volume of data to review and the variety of analyses to perform, it should come as no surprise that clear insight is often out of reach. In this environment, the traditional means of data summary – tables and listings – are ineffective for gaining insight; visualization is the key to effective communication for the modern statistician. Ben Shneiderman stated that “the purpose of visualization is insight.” Therefore, the goal of this short course is to describe data visualization techniques to aid in the understanding and communication of results from applications in clinical trials and genomics research.
Futility analyses (FA) are increasingly utilized in clinical trials. FA involves interim evaluation of the trial’s primary hypothesis to determine if there is a low probability of a positive result with trial continuation, or if the desired clinically meaningful effects can already be ruled out with reasonable confidence. FA can improve resource efficiency by halting trials of ineffective interventions and enabling sponsors to redirect efforts to more promising pursuits. FA also have ethical advantages in that fewer trial participants may be exposed to ineffective and possibly toxic interventions, and public health advantages in that trial results may be conveyed to the medical community in a more timely fashion.
FA should be carefully planned during trial design and described in the protocol, as there are important statistical and operational consequences. Concerns include control of statistical error rates and the potential for operational bias resulting from interim evaluations. There are varied and expanding statistical tools available for FA. Challenging questions arise during trial design regarding how FA should be conducted, the threshold at which futility should be established, and when futility should be assessed. Non-constancy of the effect size and the familiar limitations of accruing interim data can raise further challenges.
Data Monitoring Committees (DMCs) play a pivotal role in futility evaluation. Ensuring DMC access to appropriate data, ensuring DMC member understanding of futility methodologies, and providing thoughtful and efficient DMC reports describing FA are important for optimal recommendations. In this course, we will describe current practices and recent advances in methodological approaches and procedural issues, and illustrate them with examples and case studies. We describe what FA are, why they are conducted, where and when they should be considered, and how they should be methodologically and operationally conducted.
Drug development has been rapidly globalized. Multi-regional clinical trials (MRCTs) for regulatory submission have been widely conducted in ICH and non-ICH regions. Regulatory agencies currently face challenges in evaluating data from MRCTs for drug approval. To harmonize points to consider in planning and designing MRCTs and to minimize conflicting opinions, an ICH working group was established in late 2014 to create an international guideline for MRCTs (ICH E17).
In September 2016, the US FDA announced the draft guidance entitled ‘‘E17 General Principles for Planning and Design of Multi-Regional Clinical Trials’’. The draft guidance describes general principles for planning and designing multi-regional clinical trials (MRCTs). MRCTs conducted according to the guidance will investigate treatment effects in overall populations with multiple ethnic factors (intrinsic and extrinsic factors as described in the ICH guidance entitled ‘‘E5 Ethnic Factors in the Acceptability of Foreign Clinical Data’’). This half-day short course will (1) review regulatory history in ICH and non-ICH regions regarding MRCTs; (2) describe the key contents in the draft ICH E17 guidance; (3) discuss the statistical methodologies in designing MRCTs; and (4) illustrate relevant concepts using case studies.
Bayesian adaptive trial designs have drawn a tremendous amount of attention from industry, academia, and government, and are increasingly used in practice. These designs have great potential to improve clinical trial ethics and increase the success rate and efficiency of clinical trials. However, due to the newness of such designs, practitioners are less familiar with these methods, especially how to use them in practice. This course will introduce novel Bayesian adaptive designs, with a special focus on immunotherapy and drug combination trials, and illustrate the methodologies with real-world examples. More importantly, the course will provide a step-by-step tutorial to show attendees how to use R and other freely available software programs to design real-world clinical trials, thereby giving attendees hands-on experience.
This short course will provide an exposition on health measurement scales – specifically, on patient-reported outcomes – based on the recently published book "Patient-Reported Outcomes: Measurement, Implementation and Interpretation" (Cappelleri et al., Chapman & Hall/CRC Press, December 2013). Some key elements in the development of a patient-reported outcome (PRO) instrument will be noted. Highlighted here will be the importance of the conceptual framework used to depict the relationship between items in a PRO instrument and the concepts measured by it. The core topics of validity and reliability will be discussed. Validity, which is assessed in several ways, provides evidence of the extent to which the PRO measure taps into the concept it is purported to measure in a particular setting. Reliability of a PRO measure involves its consistency or reproducibility, as assessed by internal consistency and test-retest reliability. Anchor-based and distribution-based approaches to interpreting PRO results will be elucidated in order to make these results useful and meaningful. Illustrations will be provided mainly through real-life published examples and also through selected simulated examples using SAS. Exploratory factor analysis and confirmatory factor analysis, mediation modeling, item response theory, longitudinal analysis, and missing data will be among the topics considered if time permits.
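Internal consistency is commonly quantified with Cronbach's alpha, which compares the sum of the item variances to the variance of the total score. A minimal sketch, using hypothetical item scores rather than data from the book:

```python
# Minimal sketch (hypothetical data): Cronbach's alpha for the internal
# consistency of a k-item PRO scale.

def cronbach_alpha(items):
    """items: one list of respondent scores per item."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance, used consistently for all terms
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(var(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Three items scored by five respondents (hypothetical scores)
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 3],
]
alpha = cronbach_alpha(items)  # about 0.87 for these scores
```

Values of alpha near 1 indicate that the items move together; commonly cited rules of thumb treat roughly 0.7 or higher as acceptable internal consistency, though interpretation depends on the instrument.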
Precision medicine has paved the way for a new era of delivering tailored treatments to patients according to their biological profiles. In combination with innovative clinical design, this has presented drug developers with unprecedented opportunities to engage novel thinking to accelerate drug discovery. In the first part of this course, a step-by-step introduction to basic biology and genetics will be presented, followed by overviews of cutting-edge technologies, such as microarray and next-generation sequencing, that have been widely used to generate omics data. Building on this basic knowledge of biology and omics data, key concepts of precision medicine studies and strategies for how this novel approach can be applied to drug discovery in practice will be discussed. In addition, statistical considerations and challenges posed by omics data, such as data normalization, statistical modeling, and interpretation, will also be discussed. Examples include case studies from the instructors’ work and from the medical literature. The second part of this course will cover design considerations in modern drug development for precision medicine. Different classical and adaptive design options, including platform trial designs, will be introduced with case studies. In addition, related statistical theories and analysis strategies will be covered. No prerequisite knowledge is needed.
Clinical trials in the regulatory environment specify a primary outcome variable to avoid problems of multiplicity. A single outcome measurement is often insufficient to understand the effect of a drug, however. In particular, various events may occur that make the outcome variable unobservable, irrelevant, or nonexistent, or that change its interpretation. The outcome in such cases should be considered to be multivariate: either no such event occurs and the outcome is the value of the primary variable, or an event occurs and the outcome is the ensemble of the fact, the time, and the nature of the event, the observations before the event, and possibly further observations after the event.
In this respect trials are very different from sample surveys. In surveys the problem is not the existence or interpretability of the variable in question but the simple failure to ascertain it. There is no doubt that the value that would have been ascertained is the relevant quantity for analysis, and if it is not ascertained it must be estimated. Methods like those used to deal with missing data in surveys are commonly applied in clinical trials, with disastrous results, requiring implausible interpretations of what “would have” happened under different conditions.
We will discuss ways of defining effects that respect the multivariate nature of the outcome. These effects are of at least five kinds:
1. Actual values notwithstanding intercurrent events.
2. Transformed outcomes taking intercurrent events into account.
3. Values under hypothetical conditions. Careful attention will be given to what hypothetical conditions can yield estimable and interpretable effects and what conditions cannot.
4. Values in a subset without intercurrent events. The difference between this and “completers” or “per protocol” analysis will be carefully explained.
5. Values before an intercurrent event. We will consider when these reasonably represent a benefit to the patient and when they do not.
Recurrent events are repeated occurrences of the same type of event. Endpoints capturing recurrent event information can lead to interpretable measures of treatment effect that better reflect disease burden and are more efficient than traditional time-to-first-event endpoints in the sense that they use the available information beyond the first event.
Recurrent event endpoints are well established in indications where recurrent events are clinically meaningful, treatments are expected to impact the first as well as subsequent events, and the rate of terminal events such as death is very low. Examples include:
• seizures in epilepsy;
• relapses in multiple sclerosis; and
• exacerbations in pulmonary diseases such as chronic obstructive pulmonary disease.
More recently, recurrent event endpoints have also been proposed in other indications where the rate of terminal events is high, e.g. chronic heart failure, but experience in this setting is limited.
In trials using recurrent event endpoints, interest usually lies in understanding the underlying recurrent event process and how this is impacted by explanatory variables such as treatment. In this context, different endpoints and measures of treatment effect – that is, different estimands – can be considered. Depending on the specific setting, some estimands may be more appropriate than others. For example, accounting for the interplay between the recurrent event process and the terminal event process is important in indications where the rate of terminal events is high. The choice of estimand has a direct impact on trial design, conduct, and statistical analyses.
After classical group sequential designs, sample size re-estimation (SSR) has been the most frequently used adaptive design method in confirmatory trials. The usefulness of interim SSR is mainly driven by uncertainty about the true effect size at the planning stage of the study. In general, SSR based on blinded interim analyses of aggregate/overall data is not considered problematic, as those approaches have very limited potential to introduce bias or impair the validity and interpretability of study results. SSR based on unblinded knowledge of interim treatment effects, on the other hand, can raise issues related to type I error inflation and/or operational bias, and has to be approached with greater caution. In planning any unblinded SSR in regulatory applications, clear analytical derivations and/or statistical justifications to demonstrate control of the type I error rate are expected, as well as strategies to mitigate operational bias. There are additional issues in SSR. For example, it can change the minimum detectable effect size, so that a statistically significant result may no longer be clinically meaningful for the study indication. Also, the timing of SSR is critical: an early SSR may result in an unreliable new sample size because of limited accrued data, while a late SSR may be pointless as planned accrual may be completed by that time.
Disease interception is an emerging field that represents a paradigm shift in disease management: treating a pathogenic disease process before it is clinically detectable. For example, subjects at high risk of developing Alzheimer’s may be given a treatment designed to prevent the disease or delay its onset. As another example, early detection of disease in a preclinical disease phase may expand and/or improve treatment options. Evaluating a diagnostic test for use in a screening program designed for disease interception depends on many factors, including screening frequency, test accuracy, clinical benefits of an accurate test result, clinical consequences of an inaccurate test result, effectiveness of available treatments, and competing risks. Even an accurate positive test result for either slowly-progressing asymptomatic disease or high risk of future disease could have the clinical consequence of long treatment duration with potential for adverse events during a patient’s healthy years. Whether or to what extent these consequences are acceptable to patients or regulators as a result of intercepting disease that might or might not be present in occult form, or may or may not occur in the future, is unclear. Increasingly popular methods to assess benefit-risk tradeoffs, such as decision analysis with patient preferences and clinical utility measures, could prove to be particularly valuable for jointly assessing medical tests and therapies in proposed disease interception programs. In this session, patient, academic, and regulatory perspectives on disease interception programs will be provided and illustrated with case studies. Recent advances in clinical utility measures as well as study designs and statistical evaluations will be discussed.
Most of the commonly used methods to analyze time to event data are built on the main assumption of proportional hazards. The results from traditional analyses such as Cox proportional hazards model and log-rank test may be difficult to interpret if the hazard rates from different groups change over time or cross at certain time points. Even though the Cox model with time-dependent covariate has been used when hazards cross, the interpretation of treatment effects in hazard ratios with covariate adjustment is challenging. Methods developed based on log-rank test can still be used to test for significance of the treatment effect but may not provide the best estimate of treatment effects. The goal is to provide a parsimonious method with easily interpretable treatment effect estimates. In this session, the speakers will present different strategies to handle non-proportional hazards issues in different clinical settings.
Discussants: Lee-Jen Wei, Harvard University; Jim Hung, FDA/CDER
This session brings together experts in CMC issues but each speaker works on a distinct class of products. The products range from prescription forms of infant formula to biologics to components of in vitro diagnostic products. The session is designed to expose the audience to key scientific considerations in a fundamental question in manufacturing, namely how to best establish product shelf-life.
The 1984 Drug Price Competition and Patent Term Restoration Act, known as the “Hatch-Waxman Act,” established the modern system of generic drugs in the United States. According to the 2016 Generic Drug Savings & Access in the United States Report published by the Generic Pharmaceutical Association, generic drugs made up 89% of prescriptions dispensed in 2015 but only 27% of total medicine spending. With the passage of the Generic Drug User Fee Act Amendments of 2012, there is an increased emphasis on regulatory science research for generic drugs, including the statistical evaluation of generic transdermal delivery systems and topical patches (hereafter referred to as TDS products). The Office of Generic Drugs in FDA CDER recommends that three studies be submitted in support of TDS product Abbreviated New Drug Applications (ANDAs): a bioequivalence study with pharmacokinetic endpoints, an adhesion study evaluating the TDS adhesion to skin, and an irritation/sensitization study evaluating the skin irritation and sensitization potential of the TDS. To support regulatory approval, in addition to the bioequivalence of PK endpoints, the test product must adhere at least as well as the reference, be no more irritating than the reference, and be no more sensitizing than the reference. This session will discuss statistical issues, challenges, and approaches in the adhesion study and the irritation/sensitization study for generic TDS products.
Successful drug development relies on close collaborations between industry and regulatory agencies. Many interactions take place during the development process to discuss various topics, some of which are statistical. Face-to-face meetings, teleconferences, and written communications all occur. With the US FDA, these interactions can include the pre-IND meeting, end of phase II meeting, pre-submission meeting, and mid-cycle type A or type B review meetings. Cross-functional face-to-face meetings often have limited time allocated to relevant statistical issues. Decisions made at these meetings are formally documented and generally binding to the clinical development program. In addition, type C meetings may be held to discuss various other topics, such as protocol development, analysis plans, protocol amendments, submission orientation, or trial-specific questions. When a sponsor requests an interaction with a regulatory body, a few months may pass before an actual meeting can occur. As new and innovative products are being developed and made available to patients, more innovative and non-conventional trial designs and statistical analyses are being implemented, such as meta-analysis methodologies, Bayesian designs in drug development, adaptive designs in rare disease settings, pragmatic trials, PK/PD modeling, and inclusion of the patient’s perspective in drug development and decision making. Enhanced communications between sponsors and regulators are critical to ensure successful development programs. In July 2016, the FDA released its PDUFA VI goals letter, which provided specific important goals, including enhancing FDA-sponsor communications, in part through a team of CDER/CBER staff dedicated to working more frequently with sponsors. This town hall session will include a panel and open audience engagement to explore enhanced strategies for more efficient interactions between industry and regulators in light of the PDUFA VI goals.
Randomization in clinical trials is one of the cornerstones of experimental design, reducing bias and supporting the assumption that treatment group outcomes are independently distributed. A variety of randomization designs have been proposed in the literature for cluster randomized trials and individual subject randomized trials.
In individual subject randomized trials, adaptive randomized designs have become increasingly popular. Adaptations in the randomization scheme respond to ethical considerations and to the need to find effective therapies as early as possible in the drug development process. The adaptive design framework offers an opportunity to make the “right decision early” by learning from the accrued data during the study and adapting the randomization probabilities to randomly assign more patients to the more promising treatment arms. Practical considerations and challenges in implementing such dynamic designs will be addressed to show how to accelerate quantitative decision making in drug development.
Cluster randomized trials with relatively few clusters have also been widely used in recent years for evaluation of health-care strategies. On average, randomized treatment assignment achieves balance in both known and unknown confounding factors between treatment groups; in practice, however, investigators can introduce only a limited amount of stratification and cannot balance all the important variables simultaneously. It is therefore crucial to develop innovative randomization designs to meet this challenge.
In this session, a presentation will address an innovative randomization design for cluster randomized trials from a regulatory perspective. In addition, the session will illustrate different response-adaptive randomization procedures in light of accurate benefit-risk assessment and regulatory considerations.
Decisions about treatments are complex and often involve trade-offs between multiple, often conflicting, assessments of benefit and risk. Decision makers or stakeholders choose between alternatives and therefore are the source of preference scores and weights. Bayesian analysis allows formal utilization of prior information with repeated updates of knowledge from new data, and thus is a natural choice to support such Benefit-Risk (BR) trade-offs and decision making. Bayesian BR approaches could allow one to explore the variability of the BR scores and weights in the presence of uncertainty in a sequential manner as information accrues. Perspectives from industry and FDA on the emerging challenges and importance of assessing BR preferences will be presented; speakers and panelist(s) will discuss various research experiences with Bayesian BR methods to offer possible solutions, and to share the lessons they have learned.
The high failure rate in clinical trials remains one of the key causes of rising R&D costs in industry. It is therefore essential that sponsors make well-informed Go/No Go (GNG) decisions to advance promising treatments, or halt ineffective ones, at several time points during drug development. Utilizing appropriate statistical methods, such as Bayesian methods, is key to this decision-making process. This session will showcase recent advancements in statistical methods and tools for GNG decision making. Richard Simon from NCI will present a novel method for assessing the strength of evidence for making GNG decisions using Bayesian posterior probabilities. Pat Mitchell from AstraZeneca will present a dual target decision-making framework and specialized software tool that can accommodate both Frequentist and Bayesian approaches. Jim Bolognese from Cytel will present a case study comparing Bayesian designs with and without an informative prior to traditional group sequential options.
A Transdermal Delivery System (TDS) is designed to slowly deliver the active substance(s) through the intact skin. To ensure the safe and effective use of transdermal systems, the active substance(s) should be delivered through the skin at an adequate rate that is maintained for an appropriate time during system application, and the system should not irritate the skin. Regulatory guidances [1, 2] are available for the development of generic applications. The United States Pharmacopeia has a general chapter on product quality tests for topical and transdermal drug products [3]; however, there is no regulatory guidance on evaluating adhesion for new drug development. In this session, speakers from both industry and a regulatory agency will present and discuss the latest knowledge and experience on evaluation of TDS performance in new drug development. In addition, we may also present and discuss the USP Chapter <3> on Topical and Transdermal Drug Products and the EMA Guideline on Quality of Transdermal Patches. References: 1. FDA draft guidance for industry, Assessing Adhesion with Transdermal Delivery Systems and Topical Patches for ANDAs, 2016. 2. EMA, Guideline on quality of transdermal patches, 2014. 3. United States Pharmacopeia, Chapter <3>: topical and transdermal drug products – product quality tests, 2016.
In addition to analytical and nonclinical studies, clinical PK/PD studies and comparative clinical efficacy studies may be conducted to assess whether there is a clinically meaningful difference between the biosimilar product and the reference product from the PK/PD, efficacy, and immunogenicity perspectives. Bioequivalence margins and estimated statistics for outcome variables, derived from reference product labeling and the literature, are used in designing these biosimilar studies. However, due to limited available information, the reference point estimate and the corresponding variation in the outcome variables may not be reliably estimated. For example, from the immunogenicity perspective, the sponsor may not have a reliable estimate of the anti-drug antibody rate for the reference product. Studies designed using these unreliably estimated parameters may be over- or under-powered. Adaptive designs are often proposed to tackle these limitations. In this session, presenters from FDA and industry will discuss their experience in using adaptive designs in biosimilar product development.
This session will be tied to the release of the FDA’s draft guidance on multiple endpoints in clinical trials and will serve as the forum to discuss key topics in the guidance document, including analysis of multiple endpoints, composite endpoints, subgroup analyses, gatekeeping strategies, etc. A summary of the guidance document will be given by Dr. Lisa LaVange (FDA) and the guidance will be discussed by multiplicity experts from academia and industry. This will include an overview of general guidelines for multiplicity adjustment strategies in confirmatory trials with multiple clinical objectives. The session will be aimed at a broad audience of statisticians involved in the design and analysis of confirmatory clinical trials. Speakers: Lisa LaVange, FDA; Ralph D’Agostino, Boston University; Alex Dmitrienko, Mediana Inc
A 'promising zone' design of a clinical trial allows an increase in the study's sample size when the unblinded interim estimate of the treatment effect looks promising. Despite many years of research and discussion in the regulatory and statistical literature, the general usefulness of the 'promising zone' design and the relative ranking of different sample size re-assessment rules remain open to debate. The goal of the session is to discuss recent methodological work in this area. We will compare the relative attractiveness of different sample size re-estimation rules in different settings.
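The quantity on which promising zone decisions are typically based is the conditional power given the interim data. A minimal sketch, assuming a one-sided 2.5% test, independent-increment z-statistics, and the "current trend" assumption (the interim effect estimate persists); the thresholds and example numbers are illustrative, not a specific published rule:

```python
import math

# Sketch: conditional power under the current trend, the basis for
# classifying an interim result as unfavorable / promising / favorable.

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def conditional_power(z1, t, z_alpha=1.959963984540054):
    """Conditional power given interim z-statistic z1 at information
    fraction t, assuming the interim effect estimate is the truth.

    The final statistic is Z = Z1*sqrt(t) + Z2*sqrt(1-t), so rejection
    requires the second-stage statistic Z2 to clear the hurdle below.
    """
    hurdle = (z_alpha - z1 * math.sqrt(t)) / math.sqrt(1 - t)
    # expected second-stage mean under the interim effect estimate
    drift = z1 * math.sqrt((1 - t) / t)
    return 1 - norm_cdf(hurdle - drift)

# An interim z of 1.0 at half the information gives modest conditional
# power (a candidate for the promising zone, where the sample size
# would be increased); a larger interim z needs no increase.
cp_mid = conditional_power(z1=1.0, t=0.5)
cp_high = conditional_power(z1=2.0, t=0.5)
```

In a promising zone rule, the sample size is increased only when the conditional power falls in an intermediate band (for example between some lower threshold and the target power), chosen so that the conventional final test still controls the type I error rate.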
The Bayesian approach is becoming increasingly powerful and popular in clinical trial design, monitoring, and analysis, and FDA has issued guidance on Bayesian statistical methods in the design and analysis of medical device clinical trials. To ensure that Bayesian methods are well understood and broadly utilized for design and analysis throughout the medical product development process, and to improve industrial, regulatory, and economic decision making, the Bayesian Scientific Working Group (BSWG) of the DIA was formed in 2011. The BSWG hopes to offer clearer solutions to commonly occurring obstacles that have limited industry confidence in this area, through the voluntary contributions of pharmaceutical company, contract research organization (CRO), academic, and federal agency statisticians who provide their perspectives on drug development program methods that would be more efficient while maintaining statistical integrity.
A composite endpoint combining several outcomes of clinical interest is frequently used as the primary endpoint in clinical trials. A main advantage of such an endpoint is that the event rate of the composite is higher than that of any of its components alone, resulting in a smaller sample size. However, conventional statistical methods for composite endpoints suffer from two major limitations: first, all components are treated as equally important, and second, in time-to-event analyses, the first event analyzed may not be the most important component. To address these limitations, statistical methods have recently been developed to construct and analyze composite endpoints that respect the order of clinical importance among the outcomes, such as (1) the net chance of a better outcome (or proportion in favor of treatment), (2) the weighted composite endpoint, and (3) the win ratio. This session will focus on the third approach, the win ratio.
The win ratio approach was introduced by Pocock et al. in 2012. It compares each patient in the treatment group with every patient in the control group to determine the winner, loser, or tie within each pair. Each pairwise comparison starts with the most important outcome, with lower-priority outcomes used only if higher-priority outcomes are missing or result in a tie. The win ratio is the ratio of the number of wins in the treatment group to the number of wins in the control group.
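As a minimal sketch of the pairwise logic described above, the following Python fragment compares every treatment patient with every control patient on prioritized outcomes. All data and function names are hypothetical, and a real analysis would also need to handle censoring in time-to-event outcomes and missing components; this only conveys the counting mechanics.

```python
from itertools import product

def compare_pair(treat, ctrl):
    # Outcomes are ordered from most to least important; higher is better.
    # A lower-priority outcome is consulted only when the higher ones tie.
    for t, c in zip(treat, ctrl):
        if t > c:
            return 1    # treatment patient wins this pair
        if t < c:
            return -1   # control patient wins this pair
    return 0            # tie on all outcomes

def win_ratio(treat_group, ctrl_group):
    # Compare every treatment patient with every control patient.
    results = [compare_pair(t, c) for t, c in product(treat_group, ctrl_group)]
    wins = results.count(1)     # pairs won by the treatment group
    losses = results.count(-1)  # pairs won by the control group
    return wins / losses

# Hypothetical data: each patient is (survival time, 1/hospitalizations),
# ordered by clinical importance, higher being better on both.
treated = [(24, 1.0), (18, 0.5), (30, 1.0)]
control = [(20, 1.0), (18, 0.5), (15, 0.3)]
print(win_ratio(treated, control))  # 7 wins vs 1 loss -> 7.0
```

In this toy example one of the nine pairs is a tie; ties are simply excluded from the ratio, which is one of the interpretational issues the session discusses.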
During the past few years, the win ratio has been applied in the design and analysis of several clinical trials, and there have been new methodological developments. This session will focus on (1) the generalized approach published by Dong et al. in 2016, which enables users to define winners, losers, and ties based on their specific study settings; (2) the stratified win ratio for analyzing clinical trials with stratified randomization (to be published), compared with the unweighted win ratio and an inverse-variance-weighted win ratio; and (3) statistical issues in using the win ratio for clinical trial design and analysis, as well as interpretation of win ratio results.
We will illustrate and discuss these recent methodological developments and our practical experiences with clinical trial examples from the perspectives of academia, sponsors, and regulatory agencies.
Discussant(s): Junshan Qiu, FDA; James Hung, FDA
The development of biosimilars is an emerging area. Although several regulatory guidelines have been issued, the associated statistical methodologies continue to evolve. For example, statistical accommodations for the limited availability of biological materials and lots have been developed. In this session, statistical challenges and opportunities concerning biosimilar development will be highlighted. In particular, the thought process that led to the FDA recommendation of the tiered approach to analytical similarity testing and equivalence margin setting, and current EMA considerations of statistical methods for comparative assessments of quality attributes, will be discussed. The session consists of two presentations, one by an expert statistician representing an industry perspective and one by a statistician representing a regulatory perspective, who will together provide fresh insight on the application of, and issues surrounding, statistical approaches to analytical similarity.
Combination therapies have shown great success in recent oncology drug development programs, including combinations of immunotherapies in NSCLC and combinations of PIs/IMiDs with other drugs in multiple myeloma. Yet it is challenging to find the optimal combination of doses with an acceptable toxicity profile: combinatorial choices exist with respect to escalating or de-escalating doses of one or more drugs. When multiple drugs are investigational, it becomes difficult to attribute observed safety signals to individual drugs, which in turn makes escalation decisions difficult.
Another challenge exists within master protocol designs: hundreds of clinical trials would be needed within the traditional randomized clinical trial paradigm to test all possible combination therapies for different cancer types.
In this session, practical considerations for drug combination studies will be presented. A panel will discuss their perspectives on the following questions: 1. What are the panelists' experiences with, and recommendations about, rule-based and model-based designs in dose finding for combination therapies? 2. Are there any specifics in designing studies for molecularly targeted agents, in particular regarding the validity of the assumption of a monotonically increasing relationship between dose and efficacy? 3. What are the special considerations for combination studies in immuno-oncology? 4. What regulatory issues arise when using novel trial designs (e.g., basket or umbrella) for combination therapies?
In this late-breaking session, Drs. Lisa LaVange (FDA CDER) and Deborah Ashby (Imperial College London) will discuss their journeys to becoming recognized leaders in statistics, a male-dominated scientific discipline. They will share the lessons they have learned on scientific research, career development, and effective leadership. Drs. Weili He (AbbVie) and Telba Irony (FDA CBER) will moderate this inspiring event.
The concept of pragmatic clinical trials (PCTs) dates back to the 1960s (Schwartz and Lellouch, 1967). Many “large and simple” trials conducted in the 1980s and 1990s in cardiovascular disease can be considered PCTs. Current PCTs are primarily designed to compare strategies for the prevention, diagnosis, and treatment of diseases, conducted by academic medical institutes and sponsored by organizations such as the Patient-Centered Outcomes Research Institute (PCORI). Recently, new interest in PCTs has emerged for marketing authorization applications of new drugs, with increasing regulatory attention to using real-world evidence (RWE) in drug approval. GSK’s Salford Lung Study (New et al., 2014) was the first phase III pragmatic clinical trial supporting registration of a new drug. The EMA Adaptive Pathways approach (EMA, 2016) encourages the use of PCTs to generate RWE for final drug approval. We expect PCTs to be increasingly adopted in new drug development as regulatory acceptance of, or requirements for, RWE evolve. This session will discuss the future use of PCTs for new drug or new indication approval from pharmaceutical industry, regulatory, and academic perspectives. Unlike traditional randomized controlled trials, PCTs enroll diverse populations, follow real-world clinical practice, do not enforce adherence by patients and physicians, and compare different real-world treatment alternatives. Appropriate guidelines for the design, conduct, and analysis of PCTs should be developed before PCTs can be used for drug registration purposes. How PCTs will play a role in the drug development and approval process is worth investigating further in the near future.
Clinical trials with adaptive designs (ADs) use accumulating subject data to modify the parameters of an ongoing study without compromising the integrity of the study. ADs are employed with the goal of being more efficient than a standard design, with efficiencies coming from various aspects, for example, possibly increasing power with fewer subjects or moving a compound through clinical development more expeditiously. With two guidance documents published by the Food and Drug Administration and with ongoing theoretical advancements, it is of interest to review how adaptive designs are carried out in practice, to understand possible barriers to AD utilization, so that we can continue to move forward with valuable clinical trial innovation. In this session, two parallel sets of surveys will be presented. The first is a set of four consecutive surveys conducted by the Drug Information Association (DIA) Adaptive Designs Scientific Working Group (ADSWG) Survey Subteam. These four surveys each span a four-year period (2000-2003, 2004-2007, 2008-2011, and 2012-2015), with earlier results published by Quinlan et al. in 2010 and Morgan et al. in 2014. The surveys share a consistent core of questions to track AD usage trends, while additional questions were added in later versions to better understand current circumstances. Respondents include both industry and academic institutions. In addition to this first set of surveys, reviews of literature and registry entries over the same four-year intervals will be summarized for a wider understanding of current and past acceptance of ADs. The second set of surveys was conducted across FDA centers (including CBER, the Center for Biologics Evaluation and Research, and CDRH, the Center for Devices and Radiological Health) on what reviewers see across submissions, as well as on challenges and pitfalls. Caroline Morgan will serve as discussant.
The Clinical Trials Transformation Initiative Data Monitoring Committees Project aimed to 1) describe the current landscape of DMC use and conduct, 2) clarify the purpose of and rationale for using a DMC, 3) identify best practices for independent DMC conduct, 4) describe effective communication practices between independent DMCs and trial stakeholders during all phases of DMC activity, and 5) identify strategies for preparing a robust pool of DMC members. To these ends, the project conducted a survey to assess the current use and conduct of DMCs and training practices for DMC members, and convened focus groups to gain an in-depth understanding of needs and best practices related to DMC use. High-level survey and focus group findings will be presented.
Based on data gathered through these evidence-gathering activities and on feedback from discussion at an expert meeting the project convened, the project team, made up of a diverse group of stakeholders from across the clinical trials enterprise, developed recommendations intended to improve the quality and efficiency of DMCs. The recommendations will be presented and will cover: 1. the role of the DMC, including issues related to DMC access to unblinded data and DMC independence; 2. DMC composition, including issues related to conflicts of interest and the use of patient advocates on DMCs; 3. communication related to the DMC, including communication among the DMC, trial sponsor, statistical analysis center, IRBs, and regulatory bodies; 4. the DMC charter, including a sample table of contents and points for consideration; and 5. training of DMC members and statistical analysis center statisticians, including suggested training formats and apprenticeship opportunities.
Additional tools related to the following will also be presented: 1. specific responsibilities of the DMC; 2. best practices for the conduct of DMC meetings.
Speaker 1: Karim Calis, FDA Speaker 2: Ray Bain, Merck
In the 2014 concept paper for ICH E9 (Revision 1), “Addendum to Statistical Principles for Clinical Trials on Choosing Appropriate Estimands and Defining Sensitivity Analyses in Clinical Trials”, two problems were identified: the incorrect choice of, and unclear definitions for, estimands, and the absence of a framework for planning, conducting, and interpreting sensitivity analyses. These problems could lead to inconsistencies in inference and decision making within and between regulatory regions. Consequently, an ICH E9(R1) Expert Working Group has been formed to provide recommendations on these problems, with a draft Addendum expected to be released in the second half of 2017. Following the initial recommendations, this session will discuss the intended impact of the suggested framework and any challenges for broad implementation in clinical trials. Topics include types of estimands related to various study objectives, statistical methods for handling different choices of estimands, and defining a set of sensitivity analyses for each estimand. Estelle Russek-Cohen will serve as discussant.
Former FDA Commissioner Robert Califf challenged us at last year’s workshop about how our National Clinical Research System is flawed. One problem he highlighted is that we are not focusing enough on deciding who benefits from which treatment. To answer Dr. Califf’s question specifically, we have started sharing information across companies, e.g., using platform trials, which not only compare treatments from different companies in the same trial but also incorporate important biomarker information. This question is also being addressed through the FDA's Drug Trial Snapshots, which provide consumers information about whether there were any differences in benefits and side effects among sex, race, and age groups. Clinical researchers may expand on this approach by exploring additional subgroups to provide more information to the FDA for consideration for inclusion in communication with consumers. FDA speakers will present their views on how patients should be informed about expectations of how a drug should perform, both in terms of the average treatment effect and in specific subsets of the population. Academic and/or industry leaders will present case studies where emerging subgroup identification methods provide reproducibility in identifying patients who benefit from a treatment. Since the numbers of patients in some groups are too limited to allow meaningful comparisons, another topic to be discussed is how to combine clinical trial data and real-world evidence to improve subgroup findings based on clinical trial data alone.
Meaningful clinical benefit in drug development should be evaluated by how patients “feel, function and survive”. How patients feel and function is usually captured by patient-reported outcomes (PROs). New oncology compounds are showing unprecedented efficacy on objective efficacy endpoints (survival and radiographic endpoints), and PRO data complement these objective endpoints in characterizing the patient experience. Almost every oncology and hematology pivotal trial collects PRO data; however, given the unique trial design characteristics of cancer trials, PRO endpoints can be biased. They are therefore rarely used as primary or key secondary endpoints, nor are they included in the drug label. Challenges for assessing PROs in oncology include, but are not limited to: lack of agreed-upon instruments (questionnaires); trial designs not optimized for PROs; and lack of standardization in data analysis and presentation. There is renewed interest in optimizing the collection of high-quality patient-centered data for the benefit-risk determination of cancer drugs. It is critical that we understand and find ways to mitigate the challenges associated with PRO data obtained in cancer clinical trials. In this session, speakers from FDA and industry will share their experience and discuss how PROs in oncology and hematology trials can be used to support overall drug development and regulatory approval.
Physicians often compare medical test results from an individual patient against a set of results from those deemed to be in good health, for the purpose of deciding how to manage medical care for the individual patient. This set of medical test results from those in good health is called a reference database. A common summary of reference databases is the reference interval, which is an interval that encompasses a specified percentage of the data in the reference database (e.g., 95%). Though the aforementioned comparison appears to be straightforward, there are many intricate study design and statistical issues associated with reference databases, including how to select individuals for a reference database and how to derive reference intervals (e.g., linear regression, quantile regression, nonparametric estimation of stratified data). As medical tests become more sophisticated (e.g., reference databases for multiple outputs in ophthalmological testing, reference databases for highly sensitive assays such as troponin), there are new study design and statistical analysis challenges associated with reference databases and reference intervals. Given the importance of the reference database and reference interval concepts in the medical testing paradigm, this session aims to discuss study design and statistical analysis issues for medical tests with reference databases or reference intervals, from the perspectives of academia, industry, and government.
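As a sketch of the simplest of the derivation methods mentioned above, a nonparametric 95% reference interval can be read off the empirical quantiles of the reference database. The function below is illustrative only: it uses one of several possible quantile conventions (linear interpolation), and guidelines such as CLSI EP28 prescribe specific rank-based rules and minimum reference sample sizes.

```python
def reference_interval(values, coverage=0.95):
    """Nonparametric reference interval: the central `coverage` fraction
    of the reference data, estimated by empirical quantiles."""
    lower_p = (1 - coverage) / 2      # e.g. 0.025 for a 95% interval
    upper_p = 1 - lower_p             # e.g. 0.975
    xs = sorted(values)
    n = len(xs)

    def quantile(p):
        # Empirical quantile by linear interpolation between order
        # statistics (one convention among several used in practice).
        h = (n - 1) * p
        lo = int(h)
        hi = min(lo + 1, n - 1)
        return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

    return quantile(lower_p), quantile(upper_p)

# Hypothetical reference database of 100 healthy-subject results.
lo, hi = reference_interval(list(range(1, 101)))
print(lo, hi)
```

A patient result falling outside (lo, hi) would then flag further clinical attention; the session's point is that deriving (lo, hi) well, especially for stratified or multi-output tests, is far from trivial.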
The growing area of data science has placed greater focus on visualization, predictive analytics, and machine learning. This is in contrast to the drug development setting, where biostatisticians have tended to focus on techniques suited to the confirmatory paradigm, even when dealing with questions that are more exploratory in nature. Pre-planned hypothesis testing and static tables are often used even for exploratory analyses or in the earlier phases of development. Such an approach fails to do justice to the data collected in clinical research and in many cases does not address the questions of real interest. In addition, interpretation is complicated by multiplicity and selection bias. Tukey, who emphasized the need for both exploratory and confirmatory paradigms, forcefully argued that in science important and relevant questions are generated from data exploration. He went on to refute the notion that exploratory analysis is just descriptive statistics, stating that it is an attitude, requiring flexibility and a reliance on display, not a bundle of techniques. How can the thoughts of Tukey be combined with the latest developments in data science to improve exploratory analysis in drug development? In this session, we aim to examine this question through a series of case studies. Recent developments in dynamic visualization, machine learning, and software tools will be discussed.
The role of modeling and simulation has grown forcefully and rapidly in the pharmaceutical and device industries. Increasingly, the practice of utilizing scientific innovation and adaptive design methodologies needs to be examined through the lens of simulation.
This session will share the work of a cross-pharma working group, composed of statisticians from industry and the FDA, brought together within the Adaptive Design Scientific Working Group. The team's work focuses on best practices and recommendations for the conduct and reporting of modeling and simulation across the most frequently used adaptive designs.
When adaptive designs are an integral part of a compound/device development program, the simulation report is to be regarded as a regulatory document. As stated in the 2010 FDA guidance on adaptive designs, detailed documentation of the simulations and their results should accompany the study protocol. The session will illustrate how the team has expanded on the recommendations for trial simulation reporting provided in Gaydos et al. (Drug Information Journal: 43, 2009) and will provide the regulatory perspective as well as specific examples of key features of the simulation report. Commonly used adaptive designs are categorized into four main classes: dose escalation designs, dose ranging designs, designs that enable sample size re-estimation and early stopping, and multi-stage confirmatory designs.
Missing data and treatment adherence are common challenges in clinical trials. The final concept paper for ICH E9 Revision 1 (2014), “Addendum to Statistical Principles for Clinical Trials on Choosing Appropriate Estimands and Defining Sensitivity Analyses in Clinical Trials”, pointed out that “Incorrect choice of estimand and unclear definitions for estimands lead to problems in relation to trial design, conduct and analysis and introduce potential for inconsistencies in inference and decision making”. How should an appropriate estimand be clearly defined and chosen for a specific study design? What primary and sensitivity analyses should be used to estimate the selected estimand? Do the answers differ between superiority trials and non-inferiority or bioequivalence/biosimilar trials? Much work remains to be done to resolve these challenges and to understand the implications and interpretation of the different statistical methods for handling different choices of estimands in drug development.
With advancements in science and technology, research in drug development for orphan diseases, including pediatric trials, is of great interest. Evidence of this is demonstrated by the increasing number of fast track and breakthrough submissions that FDA has received during the past two years. Even though a variety of approaches have been proposed to cope with the challenges associated with the conduct and analysis of small-sized trials, it is not clear whether a consensus has been reached in terms of what level of evidence can be used to make regulatory decisions. In particular, when the safety of the study drug is a concern, how to factor in the benefits and risks during the drug evaluation and approval process can be very challenging. In these cases, rather than meeting the requirement of two adequate and well-controlled trials based on a single pre-specified primary endpoint for efficacy, it may be necessary to rely on different types of study endpoints and innovative study designs in assessing the “totality” of the evidence. The ultimate goal is to be flexible and at the same time maintain the standards for our evaluation of the drug’s safety and efficacy.
In this session we will focus on research areas dealing with small-sized trials, where the major concern is the lack of sufficient study power to demonstrate efficacy. Methods for efficiently utilizing innovative designs and statistical analyses, and even for borrowing external or historical trial data, will be discussed. For example, although “relaxing” the level of confidence used in determining the non-inferiority margin seems to be an option for small trials, it is not clear how much advantage this approach gains compared to the use of superiority tests with “relaxed” alpha requirements.
Wearable devices provide the opportunity to measure how patients function in their daily lives in the real world. This creates an opportunity to transform the functional endpoints used in clinical trials from artificial standardized measurements observed in a clinical setting into something more meaningful to patients. The measurements vary by device but can include heart rate, number of steps taken, measurements of activity, and estimates of energy expenditure. These variables can be collected for each minute over an extended period of time, such as two weeks. Data collection presents challenges, including educating patients on how to use the device and minimizing missing data. Analysis of the data also has many challenges, including how to summarize the data collected and how to handle missing data. Some approaches reduce the data to summary measures, such as intensity categories (sedentary, light, moderate, and vigorous activity). These approaches depend on specifying cut points for intensity, such as <100 activity counts (sedentary activity), 100-300 activity counts (light activity), etc. These cut points are somewhat arbitrary, and reducing the data in this way has the potential to lose information. Other approaches use mixed-effects models, functional data analysis, and principal components. Some of these more sophisticated methods can handle missing data and use all of the data in a more meaningful way. The session will focus on how these methods can best be used to define a clinically meaningful measure of a treatment effect.
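The cut-point reduction described above can be sketched in a few lines. Note that only the sedentary (<100) and light (100-300) thresholds come from the text; the moderate/vigorous cut points and all function names below are hypothetical placeholders, since thresholds in practice are device- and population-specific.

```python
def classify_minute(counts):
    """Map one minute of activity counts to an intensity category.
    The 100 and 300 thresholds follow the text; 1000 for vigorous
    is an illustrative placeholder, not an established standard."""
    if counts < 100:
        return "sedentary"
    elif counts < 300:
        return "light"
    elif counts < 1000:
        return "moderate"
    return "vigorous"

def daily_summary(minute_counts):
    """Minutes spent in each intensity category over one wear period.
    None marks non-wear/missing minutes, which are simply skipped here;
    principled missing-data handling is exactly the open issue noted above."""
    summary = {"sedentary": 0, "light": 0, "moderate": 0, "vigorous": 0}
    for c in minute_counts:
        if c is not None:
            summary[classify_minute(c)] += 1
    return summary

# Hypothetical five-minute stream with one missing minute.
print(daily_summary([50, 150, None, 500, 2000]))
```

The information loss the session highlights is visible here: a minute at 99 counts and a minute at 0 counts both collapse to "sedentary", which motivates the functional and mixed-model alternatives mentioned above.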
Agreement studies are often conducted to evaluate whether a new method is equivalent to an existing method, so that the existing method can be replaced by the new one or the two methods used interchangeably. In medical device regulation, the 510(k) regulatory pathway requires that a device to be marketed be demonstrated to be at least as safe and effective, that is, substantially equivalent, to a legally marketed device (the predicate). Equivalence in safety and effectiveness is often evaluated in an agreement study, in which agreement between the two methods is assessed. Agreement assessment can be used in evaluating the acceptability of a new or generic process, methodology, and/or formulation in areas of lab performance and instrument/assay validation, for both quantitative and qualitative device outputs. This session will discuss statistical issues in agreement studies.
A well-designed meta-analysis can provide valuable information for safety and effectiveness assessment in the regulation of medical products. However, there are many statistical considerations in using meta-analysis for regulatory decision making. A meta-analysis may be subject to bias, such as publication bias, and heterogeneity in the study populations, study designs, and study conduct can create difficulties in generalizing statistical inference and interpreting results. Quality assessment of the publications selected for a meta-analysis, covering aspects such as blinding and missing data, is crucial in the evaluation. In this session, we will invite speakers from industry, academia, and FDA to share their thoughts on using meta-analysis for regulatory decision making.
Drug development has rapidly become globalized. Multi-regional clinical trials (MRCTs) for regulatory submission have been widely conducted in ICH and non-ICH regions. To harmonize points to consider in planning and designing MRCTs and to minimize conflicting opinions, an ICH working group was established in late 2014 to create an international guideline on MRCTs (ICH E17). This guideline is intended to describe general principles for the planning and design of MRCTs, with the aim of encouraging the use of MRCTs in global regulatory submissions. The draft ICH E17 was issued for public comment in 2Q2016.
Missing data raise concerns in statistical analyses in areas such as patient-reported outcomes (PROs), cost-effectiveness analyses (CEA), and efficacy analyses of new treatments in clinical trials. In the first situation, individuals with complete information tend to be systematically different from those with missing data within each provider, and systematic biases may result from differing proportions of non-response among providers. Similar bias may occur in the third situation, where patients who drop out may respond to treatment systematically differently from those who stay. Inappropriate methods for handling missing data may lead to misleading results and can ultimately affect regulatory decisions and the determination of whether an intervention is of good value. In this session, concerns with conventional missing data methods will be discussed and new, innovative methods will be introduced for these areas.
The ICH E14 Q&A was revised in December 2015 and now enables pharmaceutical companies to use concentration-QTc (C-QTc) modeling as the primary analysis for assessing the QTc prolongation risk of new drugs. Because the C-QTc modeling approach uses all data from varying dose levels and time points, a reliable assessment of QTc prolongation can be based on smaller-than-usual TQT trials, or on single- and/or multiple-dose escalation (SAD/MAD) studies during early-phase clinical development, in order to meet the regulatory requirements of the ICH E14 guideline.
In the revised document, the E14 Implementation Working Group intentionally did not provide the technical details on how to perform and report C-QTc modeling to support regulatory submissions. The rationale for this omission is that specific analysis methodology is likely to evolve over time as pharmaceutical and regulatory scientists implement this approach across drugs with diverse pharmacokinetic (PK) and pharmacodynamic (PD) attributes.
In 2016, the E14 Implementation Working Group tasked an expert group of statisticians and pharmacometricians from industry and regulatory agencies with providing the technical details on how to design, perform, report, and review C-QTc analyses to support regulatory decisions. This group developed a White Paper proposing current best practices for designing studies that use C-QTc modeling as the primary analysis, conducting a C-QTc analysis, reporting the results of the analysis to support regulatory submissions, and reviewing the analysis for regulatory decisions. The recommendations within the White Paper provide opportunities for increasing efficiency in this safety evaluation.
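As a toy illustration of the C-QTc idea of pooling data across dose levels and time points, the sketch below fits a simple least-squares line of baseline-corrected QTc change on drug concentration. Real C-QTc analyses are considerably richer (mixed-effects models accounting for time, baseline, and between-subject variability, as discussed in the White Paper); everything here, including the data, is a hypothetical simplification.

```python
def fit_cqtc_line(conc, dqtc):
    """Ordinary least squares fit of delta-QTc (ms) on concentration.
    A deliberately simplified stand-in for the mixed-effects C-QTc
    models used in practice. Returns (intercept, slope)."""
    n = len(conc)
    mx = sum(conc) / n
    my = sum(dqtc) / n
    sxx = sum((x - mx) ** 2 for x in conc)
    sxy = sum((x - mx) * (y - my) for x, y in zip(conc, dqtc))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical pooled observations across doses and time points:
conc = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]   # plasma concentration (ng/mL)
dqtc = [0.1, 0.8, 1.9, 4.2, 8.1, 15.8]  # baseline-corrected QTc change (ms)
intercept, slope = fit_cqtc_line(conc, dqtc)

# In a real analysis, the model-predicted delta-QTc (with its upper
# confidence bound) at the high clinical exposure would be compared
# against the 10 ms threshold of regulatory concern.
print(intercept + slope * 6.0)
```

Because every dose level and time point contributes a point to the fit, the slope can be estimated precisely from far fewer subjects than a conventional TQT trial requires, which is the efficiency gain the revised Q&A enables.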
Discussant: Christine Garnett, FDA/CDER
Patient-reported outcome (PRO) is the term used for health data provided directly by the patient through a system of reporting, without interpretation of the patient’s response by a clinician or anyone else. PROs have attracted considerable attention from researchers because information coming directly from patients can provide valuable insight that observers cannot. A PRO instrument captures data from patients for measuring treatment benefit or risk in medical product clinical trials. In this session, speakers from academia, industry, and regulatory agencies will present their current research on important methodological issues in analyzing PROs and provide case studies and examples of PRO instrument development and validation in clinical studies.
Development of a diagnostic classifier involves three steps. First the model is developed; then its performance is checked during internal validation and the model is modified if necessary; at that point the model must be fixed before an independent dataset is obtained to validate the classifier and assure that it works in future patients. If data collection for external validation starts before the classifier is finalized, those developing the classifier should be blinded to these data. There is a lot of confusion about the three steps, particularly between internal and external validation. With real-world data becoming more acceptable, this is becoming even more of an issue. How independent the datasets are has a big impact on the apparent performance of the classifiers, and whether developers were blinded to the validation data while building the model is even harder to determine when the data were not collected prospectively. In this session, we will discuss ways to design studies so that unbiased performance estimates of the classifiers can be obtained.
Speakers: Lakshmi Vishnuvajjala, CDRH/FDA; Ravi Varadhan, Johns Hopkins University; Susan H. Gawel, Northwestern University Institute of Public Health and Medicine, Feinberg School of Medicine; Xiaoqing (Quinnie) Yang, Abbott Diagnostics R&D Statistics.
Patients randomized to the control arm in clinical trials of new oncology/hematology products are often permitted to switch treatments after disease progression, frequently to the new therapy. When this occurs, the control arm of the trial is contaminated by the new treatment, and a standard intention-to-treat (ITT) analysis does not fully address the question that may be of greatest interest, that is, what is the safety and effectiveness of the new treatment compared to the control treatment. Several methods are available to evaluate the overall survival benefit adjusted for treatment switching, for example, rank-preserving structural failure time models (RPSFTM) and inverse-probability-of-censoring weighting (IPCW). These are commonly proposed by sponsors as sensitivity analyses to evaluate the overall survival benefit adjusted for switching. However, how well these models estimate the real survival benefit remains unclear. It is also unclear how these adjusted survival benefits can be used in the regulatory decision making process, as opposed to the reimbursement decision making process. In some European countries, the overall survival benefit of the new therapy is directly linked to medical reimbursements or payments to physicians and patients, so accurate point estimates of the overall survival benefit (and the uncertainty around them) are critical. In the regulatory setting it is unclear how critical point estimates of overall survival benefits are for decision making. Speakers: Erik Bloomquist, US FDA; Uwe Siebert, UMIT, Austria; Nicholas R Latimer, University of Sheffield, UK
We see in many aspects of our society the power of images to convey an idea. How can we harness this power within biostatistics to communicate more effectively with others and to gain deeper insight ourselves? Faced with a large number of endpoints to summarize, and with the increasing reliance on sensitivity analyses to assess the robustness of study design features and results, statisticians struggle to uncover and adequately portray the story hidden within the data. A well-designed graph can be the quickest way to convey what the data have to say, but it takes time to a) frame the question based on the available data and b) design and refine the graph to meet this purpose. In this overview, we illustrate how to construct well-designed graphs for study design and analysis, with specific applications to drug safety, subgroups, and post-market surveillance.
Comparative-effectiveness (CE) research aims to produce evidence regarding the effectiveness and safety of medical products outside of randomized, well-controlled clinical trials. In recent years, real-world evidence (RWE) research has become an increasingly important component of biopharmaceutical (pharmaceutical, biologic, and medical device) product development and commercialization. There is a growing industry need for broader data and information on real-world effectiveness and safety, both of which influence the eventual reimbursement and utilization of new products. The ultimate decision is driven by regulators, public and private payers, and prescribers, all of whom seek to better understand the impact of a new product on patients throughout their treatment journey. Many countries already make reimbursement decisions based on RWE. In the United States, the FDA and the Centers for Medicare and Medicaid Services have agreed to work together more closely to allow the use of RWE in drug approval and reimbursement. Unlike randomized controlled trials (RCTs), which remain the gold standard for drug approval, RWE comes from the outcomes of heterogeneous patients as treated in real-world clinical practice. The relevant data sources include phase IV trials, pragmatic trials, registries, post-authorization safety/efficacy studies, observational studies (prospective and retrospective), pharmacoeconomic studies, and expanded access/compassionate use programs of a drug. The absence of randomization and the multifarious nature of the data create methodological challenges in generating quality evidence, including the choice of the right methodology for the proper design and analysis of RWE studies. The next hurdle is to ensure proper synthesis of results from RWE with other types of evidence to make better healthcare decisions and to support a product throughout its lifecycle.
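One of the methodological challenges created by the absence of randomization, confounding by indication, can be sketched with simulated (entirely hypothetical) data: the crude treated-versus-untreated comparison is biased because sicker patients are more likely to receive the new drug, while a simple standardization over the prognostic factor recovers the true effect. This is only one of many possible adjustment strategies (propensity scores and matching are common alternatives).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical observational data: sicker patients (x = 1) are far more
# likely to receive the new drug, confounding the crude comparison.
n = 50_000
x = rng.integers(0, 2, n)                    # prognostic factor
p_treat = np.where(x == 1, 0.8, 0.2)         # treatment assignment depends on x
t = rng.random(n) < p_treat
# Outcome: true treatment effect +1.0, sickness effect -2.0 (assumed values).
y = 1.0 * t - 2.0 * x + rng.normal(0.0, 1.0, n)

crude = y[t].mean() - y[~t].mean()           # biased by confounding

# Standardization: estimate the effect within each stratum of x, then
# average over the marginal distribution of x.
effects, weights = [], []
for stratum in (0, 1):
    mask = x == stratum
    effects.append(y[mask & t].mean() - y[mask & ~t].mean())
    weights.append(mask.mean())
adjusted = float(np.average(effects, weights=weights))

print(round(crude, 2), round(adjusted, 2))   # crude is badly biased; adjusted ≈ 1
```

The same logic underlies more elaborate RWE analyses: adjustment only works for confounders that are measured, which is why design choices and data quality dominate the methodology discussion.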
It is well recognized that treatment effects may not be homogeneous across a study population. Patients want to know whether a medicine will work for them as individuals with their own specific characteristics. Subgroup analysis is therefore an important step in the assessment of evidence from clinical trials. In the confirmatory phase, it can be a critical strategy when conclusions for the overall study population might not hold. On the other hand, it is an integral part of early development, used to identify the appropriate patient population and increase the probability of success of a clinical program; at this stage the analyses are exploratory in nature. One notable distinction between confirmatory and exploratory subgroup analyses relates to the effort devoted to planning. The goal of a confirmatory subgroup analysis is to provide sufficient evidence for a decision, whereas exploratory analyses are rather hypothesis generating. Whether confirmatory or exploratory in nature, the investigation of subgroups poses statistical, interpretational, and regulatory challenges. Confirmatory subgroup analyses are known to be prone to statistical and methodological issues such as inflation of the type I error due to multiple testing, low power, inappropriate statistical analyses, or lack of pre-specification. Although powering within every possible subgroup is not mandatory, powering within specialized subgroups of interest is imperative for proper interpretation. The primary challenge with exploratory subgroup analyses, however, is making decisions using only limited information, which increases the chance of false detection. Using naive estimates of the treatment effect to find a subgroup with a high treatment effect induces random high bias and can be misleading. Therefore, analyses in both settings need distinct statistical methodologies to address these problems appropriately.
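The inflation of the type I error from multiple subgroup testing can be sketched with a small simulation (all sample sizes hypothetical): under a null of no treatment effect in any subgroup, testing 10 disjoint subgroups each at the two-sided 5% level produces at least one "significant" subgroup in roughly 1 - 0.95^10 ≈ 40% of trials.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate trials with NO true treatment effect anywhere, then test K
# disjoint subgroups each at the two-sided 5% level.
n_trials, K, m, z_crit = 2000, 10, 20, 1.96   # m patients per arm per subgroup

false_positive_trials = 0
for _ in range(n_trials):
    treat = rng.normal(0.0, 1.0, (K, m))      # outcomes under the null
    ctrl = rng.normal(0.0, 1.0, (K, m))
    # z-statistic for the treatment-control difference within each subgroup
    # (known unit variance, so z is exactly standard normal under the null).
    z = (treat.mean(axis=1) - ctrl.mean(axis=1)) / np.sqrt(2.0 / m)
    if np.any(np.abs(z) > z_crit):            # any subgroup "significant"?
        false_positive_trials += 1

fwer = false_positive_trials / n_trials
print(fwer)   # close to 1 - 0.95**10 ≈ 0.40, far above the nominal 0.05
```

This is why confirmatory subgroup claims require pre-specification and multiplicity adjustment, and why an impressive effect found by screening many exploratory subgroups carries random high bias.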
In recognition of these challenges and of the need for proper interpretation of subgroup analyses in medical product development, regulators have devoted effort in recent years to developing guidance on subgroup analysis (CHMP 2010 and FDA 2012, the latter in the context of enrichment strategies).
Discussant: Kathleen Fritsch, CDER, FDA
Many of the statistical issues encountered in studies intended for animal drug approvals are similar to those in regulatory human clinical trials. However, there are also statistical issues and challenges unique to regulatory animal drug studies, often related to various experimental designs to support drug indications for specific animal species. In this session, we will present animal drug studies reviewed by the Center for Veterinary Medicine (CVM) and discuss statistical issues and challenges associated with these studies.