The oncology drug development paradigm has changed over the last several years. With the advent of novel targeted therapies and immunotherapies, clinical trial design in oncology has evolved to accelerate drug development and give patients timely access to highly effective therapies. Study designs such as seamless expansion cohorts and single-arm studies are increasingly used to support accelerated regulatory approvals.
In rare disease settings with unmet need (e.g., patients with rare mutations, pediatric patients), interpreting results from a single-arm study requires the construction of an external control arm. Constructing contemporaneous and clinically relevant external cohorts from high-quality, real-world databases could provide a robust comparator for evaluating the effectiveness of promising therapies in oncology drug development.
The presentations in this session will examine whether contemporaneous longitudinal data from curated electronic health record (EHR) databases of cancer patients can be used to (1) create a control arm for a single-arm study and (2) support novel design approaches such as a “hybrid control” in a randomized trial (enhancing the control arm). We intend to evaluate the reliability of developing external controls from real-world data through case studies. For the development of a “hybrid control” arm, we will evaluate appropriate methodologies, including Bayesian approaches, for bringing the appropriate level of evidence from real-world data into a clinical trial to enhance the control, while preserving the internal validity of such novel designs.
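As a minimal sketch of how Bayesian borrowing from real-world data might down-weight an external cohort when enhancing a control arm, the power-prior update below works through a binary response rate. All counts and the discount parameter a0 are hypothetical, chosen only to illustrate the mechanics.

```python
def power_prior_posterior(x_c, n_c, x_h, n_h, a0, a=1.0, b=1.0):
    """Beta posterior for a control response rate when external
    (real-world) data are discounted by a power a0 in [0, 1]:
    a0 = 0 ignores the external cohort, a0 = 1 pools it fully."""
    alpha = a + x_c + a0 * x_h
    beta = b + (n_c - x_c) + a0 * (n_h - x_h)
    return alpha, beta

# Hypothetical counts: 10/40 responders in the concurrent control,
# 60/200 in the external real-world cohort, borrowed at a0 = 0.5.
alpha, beta = power_prior_posterior(10, 40, 60, 200, a0=0.5)
posterior_mean = alpha / (alpha + beta)   # about 0.289
```

Choosing a0 (or estimating it adaptively) is where the design debate lies: too much borrowing threatens internal validity when the external cohort differs systematically from the trial population.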
Ever since the report “The Prevention and Treatment of Missing Data in Clinical Trials” was published by the National Research Council in 2010, the prevention and handling of missing data has been a prominent regulatory topic, drawing attention from researchers and trialists as well as from statisticians and clinicians. Furthermore, the draft ICH E9 addendum on estimands and sensitivity analysis in clinical trials was issued in mid-2017, and it has been consulted throughout trial design and analysis by both regulators and industry. Despite the many efforts and novel strategies to minimize missing data, missingness is inevitable in most clinical trials, however small its impact on the final treatment-effect evaluation may be. Hence, most regulatory agencies continue to recommend pre-specified missing data handling strategies for the primary analyses, along with sensitivity analyses. Such recommendations have pushed missing data research and implementation to a new height. The most popular primary missing data handling strategies are variations of multiple imputation, while the most favored sensitivity analysis is the tipping-point analysis based on the primary multiple imputation analysis. Nevertheless, most research has focused on superiority trials with binary or continuous endpoints. There is still room for advancement in less common trial designs and endpoints, such as non-inferiority trials and time-to-event and longitudinal endpoints. The seemingly straightforward multiple imputation and tipping-point analyses cannot be adopted easily in these situations or, more importantly, may not constitute a reasonable investigation of the impact of missing data on the study conclusions. This session will venture to address some of these unique challenges with revamped multiple imputation models and tipping-point analyses.
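For a continuous endpoint in a superiority trial, the delta-adjusted (“tipping point”) multiple-imputation analysis described above can be sketched as follows. The imputation model, simulated data, and delta grid are illustrative only; Rubin’s rules are used for pooling.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def mi_tipping_point(y_trt_obs, n_miss, y_ctl, deltas, n_imp=50):
    """Delta-adjusted multiple imputation for a treatment-vs-control mean
    difference: missing treatment-arm values are drawn from the observed
    treatment distribution and shifted by delta, then results are pooled
    with Rubin's rules. Returns the two-sided p-value for each delta."""
    pvals = []
    for delta in deltas:
        ests, variances = [], []
        for _ in range(n_imp):
            imputed = rng.normal(y_trt_obs.mean(), y_trt_obs.std(ddof=1),
                                 n_miss) + delta
            y_trt = np.concatenate([y_trt_obs, imputed])
            diff = y_trt.mean() - y_ctl.mean()
            n1, n2 = len(y_trt), len(y_ctl)
            sp2 = ((n1 - 1) * y_trt.var(ddof=1) +
                   (n2 - 1) * y_ctl.var(ddof=1)) / (n1 + n2 - 2)
            ests.append(diff)
            variances.append(sp2 * (1 / n1 + 1 / n2))
        qbar = np.mean(ests)                      # pooled estimate
        total_var = (np.mean(variances) +         # within-imputation
                     (1 + 1 / n_imp) * np.var(ests, ddof=1))  # + between
        z = qbar / np.sqrt(total_var)
        pvals.append(2 * stats.norm.sf(abs(z)))
    return np.array(pvals)

# Illustrative trial: true effect 1.0, 200 of 1000 treatment outcomes missing.
y_trt_obs = rng.normal(1.0, 2.0, 800)
y_ctl = rng.normal(0.0, 2.0, 1000)
pvals = mi_tipping_point(y_trt_obs, 200, y_ctl, deltas=[0.0, -1.0, -2.0])
```

Scanning increasingly pessimistic deltas shows how severe a departure from the missing-at-random imputation model must be before significance is lost, which is the tipping point; as the session notes, this recipe does not transfer directly to non-inferiority or time-to-event settings.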
The 2017 draft addendum to the ICH E9 guideline on Statistical Principles for Clinical Trials introduces an estimand framework aimed at aligning trial objectives and statistical analyses through a precise definition of the inferential quantity of interest, the estimand. One of the attributes of an estimand is the strategy for handling intercurrent events, i.e., post-baseline events that preclude observation of the endpoint of interest or affect its interpretation, so as to reflect the scientific question of interest. Although causal estimands are not explicitly mentioned in the addendum, the hypothetical and principal stratum strategies for addressing intercurrent events lead to causal estimands.
This session will focus on causality in a time-to-event setting and provide examples where a causal estimand in a drug development program is desirable. The principal stratum strategy, rarely applied in drug development, will be in scope. Alternatives to the hazard ratio effect measure will be embedded in the estimand framework, reviewed under causality considerations, and their underlying assumptions and possible sensitivity analyses will be discussed.
This session will also consider causal inference methodology applied in oncology while analyzing overall survival in the presence of treatment switching. Different estimands in this setting will be presented, illustrating the impact of the estimand choice on study design, data collection, trial conduct, analysis, and interpretation.
The session will include three talks by members of the Joint EFSPI SIG for Estimands in Oncology and FDA representatives.
How to optimize study design has been extensively discussed for superiority/efficacy trials, but seldom for bioequivalence (BE) or biosimilar studies. For locally acting generic drugs, a three-arm parallel clinical endpoint BE study is often used to establish BE between a generic (T) and an innovator drug (R). BE is established if equivalence is demonstrated between T and R AND superiority is established for T over placebo (P) and for R over P. In practice, however, there are times when the equivalence test passes but one or both superiority tests fail. For certain biosimilar products, both pharmacokinetic (PK) and clinical efficacy equivalence are required. However, the assumed PK and efficacy parameters are sometimes inaccurate, resulting in either failure to establish equivalence or unnecessary costs. Therefore, optimizing study designs for BE and biosimilar studies to improve their effectiveness and efficiency is a pressing task under both the Biosimilar User Fee Amendments (BsUFA II) and the Generic Drug User Fee Amendments (GDUFA II). In this session, three speakers from FDA and academia/industry will discuss various optimized study designs for BE and biosimilar studies. Speaker 1 will discuss the challenges encountered when reviewing clinical endpoint BE studies using case examples and provide recommendations for study designs. Speaker 2 will introduce a novel adaptive clinical endpoint study with interim analysis and sample size re-estimation to avoid an over-powered or under-powered trial. Speaker 3 will discuss an adaptive seamless design for establishing PK and efficacy equivalence in biosimilar development to remedy the risk of mis-specifying PK and efficacy parameters. An expert in the field will comment and provide recommendations based on the three talks.
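The three-arm decision rule described above — equivalence of T and R plus superiority of each over placebo — can be sketched for a binary response endpoint. The margin, alpha level, and normal-approximation confidence intervals below are illustrative choices, not a regulatory prescription.

```python
import numpy as np
from scipy import stats

def diff_ci(x, y, alpha=0.05):
    """Normal-approximation (1 - 2*alpha) CI for a difference in
    response rates, as used by two one-sided tests (TOST)."""
    px, py = x.mean(), y.mean()
    se = np.sqrt(px * (1 - px) / len(x) + py * (1 - py) / len(y))
    z = stats.norm.ppf(1 - alpha)
    return px - py - z * se, px - py + z * se

def be_decision(t, r, p, margin=0.20, alpha=0.05):
    """BE is declared only if T vs R is equivalent within +/- margin
    AND both T and R are superior to placebo (one-sided at alpha)."""
    lo_tr, hi_tr = diff_ci(t, r, alpha)
    equivalence = (lo_tr > -margin) and (hi_tr < margin)
    t_beats_p = diff_ci(t, p, alpha)[0] > 0
    r_beats_p = diff_ci(r, p, alpha)[0] > 0
    return equivalence and t_beats_p and r_beats_p

# Hypothetical arms: 60% response on T and R, 30% on placebo, n = 200 each.
t = np.array([1] * 120 + [0] * 80)
r = np.array([1] * 120 + [0] * 80)
p = np.array([1] * 60 + [0] * 140)
```

The sketch makes the session’s motivating failure mode concrete: with a weaker placebo separation the equivalence test can still pass while a superiority test fails, which is exactly the scenario the adaptive designs of Speakers 2 and 3 aim to mitigate.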
Designing appropriate clinical trials to support approval of products for the treatment of rare diseases is hard. FDA grants orphan designation to products intended to treat diseases or conditions affecting fewer than 200,000 people in the US, but some rare diseases may affect 1,000 subjects or fewer. One might assume rare means homogeneous, but that is far from the truth: patients with a disorder may suffer daily from differing subsets of a long list of symptoms. Because the disease is rare, the natural history is not always well understood, and accruing patients is difficult, the development of meaningful endpoints is hard and poses statistical challenges. We are proposing a session with an introduction from a physician with extensive experience designing clinical trials for rare diseases (e.g., inherited disorders such as Gaucher’s disease). This will be followed by two statisticians who will talk about ways of approaching complex endpoints in clinical trials (we have asked L.-J. Wei of Harvard, who is interested but not yet confirmed, and George Kordzakhia, an FDA statistician with extensive experience in rare diseases, will provide some perspectives on the subject). Then we will have an interactive panel discussion (with the speakers and two additional panel members, one from pharma and one from FDA) with opportunity for audience involvement. Dr. Mike Hale, an experienced moderator who heads Biostatistics at Shire, a pharma company with serious interest in developing products for rare diseases, will moderate the session.
Master protocols, including basket, umbrella, and platform trials, provide improved efficiency for addressing broader questions on the effects of multiple drugs and/or in multiple sub-populations in one trial, as compared to multiple independent trials. In addition to the operational complexities, the statistical challenges are not trivial, and the development of novel statistical designs and analysis methods to accommodate those challenges has fallen behind. FDA recently published a guidance on master protocols with a major focus on design and statistical considerations. This includes sample size considerations for achieving adequate power in nonrandomized studies, comparative analysis when using a common control arm, allocation of biomarker-defined subgroups, and adaptive design strategies for sample size modification, adding and dropping arms, etc. Although the master protocol concept has gained a lot of momentum recently, statistical issues and practical challenges need to be fully addressed to pave the way for its broader application. In this session, experts and practitioners will share their experience with basket, umbrella, and platform trials, with a focus on innovative statistical considerations.
Medical diagnostic tests often provide a binary response: a positive or negative result for a target condition. However, in some cases, incorporating a “gray zone” or intermediate zone, leading to more than two results, seems reasonable. Using an intermediate/gray zone to define a 3x2 table is more appropriate than ignoring the test scores in these zones. The six-cell matrix (3x2 table), however, would serve limited purpose if clinicians cannot apply the additional information provided by the conditional operating characteristics to make effective patient-management decisions. A decision-analytic approach utilizing pre-test and post-test probabilities of the target condition can be used for efficient categorization of more than two results for such tests.
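The pre-test/post-test reasoning above can be sketched with category-specific likelihood ratios from a hypothetical 3x2 table (positive / gray zone / negative, by diseased / non-diseased). All counts are made up for illustration.

```python
def post_test_probability(pre_test, lr):
    """Bayes' rule on the odds scale: post-test odds = pre-test odds * LR."""
    odds = pre_test / (1 - pre_test)
    post_odds = odds * lr
    return post_odds / (1 + post_odds)

# Hypothetical 3x2 table (counts per 100 in each disease group):
#              diseased   non-diseased
# positive        80           10
# gray zone       15           20
# negative         5           70
lr_positive = (80 / 100) / (10 / 100)   # 8.0
lr_gray = (15 / 100) / (20 / 100)       # 0.75
lr_negative = (5 / 100) / (70 / 100)    # about 0.071
```

With a 30% pre-test probability, the three categories map to distinct post-test probabilities (roughly 0.77, 0.24, and 0.03 here), which is the extra information a gray zone preserves and a forced dichotomy throws away.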
With the enactment of the 21st Century Cures Act, stakeholders from industry, academia, and regulatory agencies have been exploring how machine learning and artificial intelligence (AI) can help promote medical innovation and accelerate medical product development. Machine learning and AI technologies have been developed to support bedside clinical decision-making. They also open many exciting opportunities for developing innovative ways to synthesize evidence from clinical trial data and real-world big data to support pharmaceutical development. While machine learning and AI are quickly evolving in medical fields, many stakeholders feel that we are getting into uncharted waters, e.g., a lack of experience, especially in the regulatory area, and technical uncertainties such as “black box” machine learning/AI tools used in clinical decision support. Since machine learning and AI are new to regulatory science, presentation and discussion of this topic in the regulatory context have been rare. This session will provide an open forum for speakers from industry, academia, and regulatory agencies to share their up-to-date methodological research, practical experience, and regulatory considerations. We hope that this session will bring machine learning and AI closer to the clinical/pharmaceutical statistics community, and inspire discussion, collaboration, and consensus building on this new front.
Demonstration of vaccine efficacy (VE) has relied on evaluating clinical disease endpoints through randomized, double-blind, placebo-controlled trials (RCTs) in selected populations. However, this is not a one-size-fits-all approach, as it may not always be feasible when the disease incidence rate is low. Potential causes of a low incidence rate include the availability of a previously licensed vaccine and diseases that are rare outside of outbreaks. In such cases, immune response may be used to infer VE when correlates of protection have been established.
In contrast, observational methods are more common in the assessment of vaccine effectiveness (VEV) post licensure, due to ethical concerns over the inclusion of a placebo group when an efficacious vaccine is readily available. Furthermore, VE is estimated under the restricted and well-defined conditions of clinical trials, which may differ markedly from the actual effectiveness in the field when conditions are less ideal (e.g., poor compliance) or the population is different. Real-world evidence can complement data from conventional large-scale RCTs and shed light on the effectiveness of a vaccine. In combination with post-licensure surveillance, observational VEV studies play a crucial role in evaluating the benefits and risks of a vaccine, especially when the actual incidence rate differs from that observed in the pre-licensure studies.
In this session, speakers from industry, public health, and regulatory agencies will discuss contemporary challenges in the evaluation of vaccine efficacy and effectiveness. Topics include, but are not limited to, study design, statistical considerations for traditional efficacy RCTs, and utilizing real-world evidence (e.g., observational studies) to assess vaccine effectiveness.
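As a simple numerical anchor for the discussion, VE in an RCT is commonly estimated as one minus the ratio of incidence rates between arms; the counts below are hypothetical.

```python
def vaccine_efficacy(cases_vax, persontime_vax, cases_ctl, persontime_ctl):
    """VE = 1 - (incidence rate among vaccinees / incidence rate among controls)."""
    rate_vax = cases_vax / persontime_vax
    rate_ctl = cases_ctl / persontime_ctl
    return 1 - rate_vax / rate_ctl

# Hypothetical trial: 10 cases over 10,000 person-years in the vaccine arm
# versus 50 cases over 10,000 person-years in the placebo arm.
ve = vaccine_efficacy(10, 10_000, 50, 10_000)   # 0.8, i.e., 80% efficacy
```

Observational effectiveness studies target the same quantity, but the rates must first be adjusted for confounding, which is where the design and statistical challenges discussed in this session arise.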
The FDA’s approval of three gene therapy products in 2017 opened the door to a radically new class of treatments, and potentially cures, for diseases that were thought intractable decades ago. Additionally, product development in rare diseases has been challenging, and for many rare diseases the responsible genes have been identified. As a result, there is high hope that gene therapies, through gene editing, replacement, or delivery of a new gene, can treat and even cure diseases. These factors have motivated rapid development of gene therapies for various severe diseases. In addition to existing issues pertaining to product development for rare diseases, gene therapies have unique features that distinguish them from small molecules and other biologic products, for example, a single dose with a lifetime effect, and the safety implications of permanent modification of the human genome. These features pose challenges in the design and assessment of safety and efficacy in clinical development programs. FDA recently refreshed and published a number of guidances on the development of gene therapies, for example, on considerations in hemophilia gene therapy development. Therefore, it is beneficial for the community to discuss, for the first time at this workshop, the issues and considerations pertaining to the design and development of gene therapies. In this session, we invite speakers from FDA and industry to discuss statistical considerations and broader development issues important to the design of gene therapy clinical trials and the assessment of clinical data, such as first-in-human study design, dose selection, randomization, utilization of external data, efficacy endpoints, and safety considerations. Three speakers will provide their perspectives on this topic:
• Shiowjen Lee, PhD, Center for Biologics Evaluation and Research (CBER), FDA
• Jessie Gu, PhD, Novartis
• John Zhong, PhD, Biogen
The recent rapid development of big data analytical methods makes it possible to bring artificial intelligence (AI), which mimics human cognitive functions with computers, into the healthcare domain. AI can not only reveal clinically relevant information in massive amounts of healthcare data, but also assist clinical decision-making. For example, the IBM Watson system, which includes both machine learning and natural language processing modules, may provide treatment recommendations that are coherent with physician decisions. The US Food and Drug Administration (FDA) published the guidance Software as a Medical Device (SaMD): Clinical Evaluation in 2017, and in 2018 permitted marketing of a medical device that uses deep learning detectors to search for lesions specific to diabetic retinopathy. To explore recent AI developments in regulatory science, this session will cover topics from both scientific and regulatory perspectives.
Heterogeneity of treatment effects (HTE) is the variation in how individuals respond to a treatment. Treatment benefit often varies among individuals in a trial due to differences in important disease characteristics. In current practice, the primary analysis focuses on the treatment benefit observed in the intent-to-treat (ITT) population. However, this "average" benefit may not be applicable to all patients in the trial because of possibly heterogeneous benefit in different subgroups. The conventional way of assessing HTE is subgroup analysis using data from each subgroup independently: the trial population is divided into subgroups based on each potentially influential disease characteristic, and separate analyses are performed. This traditional subgroup analysis is prone to false positive signals of treatment benefit, and the estimated treatment effects have high variability due to the limited data in each subgroup. HTE is also observed in the study-level parameters of meta-analyses because of imbalances in important disease characteristics (known as “effect modifiers”). Appropriate assessment of HTE is critical to regulators, pharmaceutical companies, policy makers, researchers, and patients. The key challenges are assessing heterogeneity and identifying subgroups of patients likely to benefit from the treatment with a certain degree of precision. Current practice does not address these concerns adequately.
This session will focus on different statistical approaches for assessing HTE in confirmatory and exploratory analyses, along with their advantages and disadvantages. It will further reflect on some recent state-of-the-art methodologies, such as Bayesian shrinkage. The speakers and discussant will consider possible strategies for communicating HTE to all stakeholders involved in clinical trials. This session will feature four prominent participants (three speakers and one discussant) from industry, academia, and a regulatory agency.
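One simple instance of the shrinkage idea mentioned above is an empirical-Bayes normal-normal model that pulls noisy subgroup estimates toward their precision-weighted mean. The subgroup effects and standard errors below are hypothetical, and the between-subgroup variance is estimated in a DerSimonian-Laird style.

```python
import numpy as np

def eb_shrink(est, se):
    """Empirical-Bayes shrinkage of subgroup treatment effects using a
    method-of-moments estimate of between-subgroup variance tau^2."""
    w = 1.0 / se**2
    mu = np.sum(w * est) / np.sum(w)            # precision-weighted mean
    q = np.sum(w * (est - mu)**2)               # heterogeneity statistic
    denom = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(est) - 1)) / denom)
    shrink = se**2 / (se**2 + tau2)             # 1 = full shrinkage to mu
    return shrink * mu + (1 - shrink) * est

# Four hypothetical subgroup effects with a common standard error of 0.3.
est = np.array([0.8, 0.1, 0.4, -0.2])
se = np.full(4, 0.3)
shrunk = eb_shrink(est, se)
```

The most extreme subgroups move the most, which is precisely how shrinkage tempers the false-positive-prone independent subgroup analyses described above.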
The FDA guidance on real-world evidence issued in August 2017 and the Framework for FDA’s Real-World Evidence Program released in December 2018 describe current practice in the use of real-world data (RWD) for evidence generation and outline a blueprint for evaluating RWD and real-world evidence (RWE) for use in regulatory decisions on the safety and effectiveness of medical products. While regulatory agencies gain more experience in using RWD and RWE in product approvals, there are still substantial challenges, and hence opportunities, in deriving RWE from a variety of RWD sources for regulatory use in product development and decision-making. This session comprises four presentations by statisticians from academia, industry, and FDA: two focus on practice and perspectives on the use of RWD and RWE in regulatory decision-making for medical devices and drug products, respectively, and the other two on methodological aspects of using RWD for causal inference, which is essential in transforming RWD into valid RWE.
Speakers and Their Affiliations and Presentation Titles:
Yi Huang, PhD University of Maryland Baltimore County Email: yihuang@umbc.edu Presentation Title: Comparison of Causal Methods for Average Effect Estimation Allowing Covariate Measurement Error Using Simulation Studies
Jie Chen, PhD Merck Research Laboratory Email: jie_chen@merck.com Presentation Title: Real world data, machine learning and causal inference
Rongmei Zhang, PhD US FDA, Center for Drug Evaluation and Research Email: rongmei.zhang@fda.hhs.gov Presentation Title: Real World Evidence for Regulatory Decision Making in Drug Safety and Efficacy
Lilly Yue, PhD US FDA, CDRH Email: lilly.yue@fda.hhs.gov Presentation Title: Incorporating Real World Evidence for Regulatory Decision Making in Medical Device Evaluation
Basket trials are clinical studies that test one investigational product in patients who share the same gene mutation but have different cancer types or other cancer characteristics. This contrasts with the traditional design in which each clinical trial studies one drug in one indication. Basket trials have attracted growing interest given the rapid development of precision medicine and biomarker discovery. While allowing one master protocol to simultaneously evaluate one drug in multiple cancer types brings efficiency and speed to drug development (FDA Guidance, 2018), it does not mean clinical trialists can take a “discount” on safeguarding patient safety and trial integrity.
A data monitoring committee (DMC) is usually employed to monitor safety and efficacy during the trial. The added complexities of a basket trial bring both operational and statistical challenges for DMCs in achieving the goal of safeguarding patient safety. The composition of the DMC may differ, considering that multiple cancer types are studied under the master protocol. Although Renfro (2016) noted several statistical challenges in master protocol designs, there has been little discussion of the safety perspective of basket trials. Should DMCs be organized by cancer type, or should one committee monitor all the sub-studies? How would DMCs be alerted to a safety signal from one sub-study without overreacting, while still reacting properly, in the other sub-studies? How will interim analysis results from some sub-studies impact the entire master protocol? Additional questions may be considered.
In this session, the presenters will share thoughts from both industry and regulatory perspectives on their experiences implementing safety monitoring strategies in basket trials, and on what they have learned about improving the organization of DMCs and the appropriate interpretation of data.
Mobile devices with apps are increasingly common and are starting to be used in clinical investigations for many purposes. Several FDA guidances relate to this, including Use of Electronic Health Record Data in Clinical Investigations (July 2018), Computerized Systems Used in Clinical Investigations (May 2007), Mobile Medical Applications (February 9, 2015), and Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims (December 2009). Potential subjects use apps to locate clinical investigations. Entities that conduct clinical investigations (e.g., pharmaceutical companies, CROs, private and public research groups) use apps and/or mobile devices to find participants (recruitment), to assess eligibility, and to collect clinical investigation data. They are being promoted as reducing timelines, finding the right participants, and obtaining more accurate data for scales, PROs, etc., in the sense of real-world, real-time data from the participants themselves. This is happening across therapeutic areas and for all phases of clinical investigations. Statisticians for clinical investigations need to understand how these tools can be helpful and what the challenges are. Some devices bring in almost continuous data, and there is a need to decide how to use these data. Some devices may bring in data not directly related to the investigation, and the value of those data for research must be determined. We propose to invite individuals who have experience with mobile devices and apps in clinical investigations to discuss their value, the challenges, and the impacts on all aspects of clinical investigations, such as design, assessments, and analyses, as well as individuals who understand the guidances.
Time-to-event outcomes are often used as the primary endpoint for clinical trials in many disease areas. Most randomized controlled trials with a time-to-event outcome are designed and analyzed using the log-rank test and the Cox model under the assumption of proportional hazards (PH). The log-rank p-value evaluates the statistical significance of the treatment effect, and the hazard ratio (HR) from the Cox model quantifies that effect. The log-rank test is most powerful under PH. In practice, however, non-PH patterns are often observed in clinical trials. In particular, patterns of delayed treatment effects have been observed recently across immuno-oncology trials. To mitigate the resulting power loss, an increase in sample size and/or a delay in study readout is needed, which often delays the availability of the therapy to patients with unmet medical needs. Alternative tests and estimation methods under non-PH for the primary analysis could increase the probability of success, shorten the time to bring new treatments to patients, and provide a more accurate description of the treatment effect. In this session, speakers from industry, a health authority, and academia will propose novel methods for analyzing time-to-event outcomes under non-PH, compare the operating characteristics of different methods, and discuss statistical considerations in the design, conduct, and analysis of clinical studies when non-PH is suspected or has been observed. The presentations and discussion will generate new ideas, build common ground in ongoing discussions, and hopefully provide guidance for future clinical research.
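One concrete alternative often considered for delayed-effect settings is the Fleming-Harrington family of weighted log-rank tests, sketched from scratch below; the simulated survival data are illustrative only.

```python
import numpy as np
from scipy import stats

def fh_logrank(time, event, group, rho=0.0, gamma=1.0):
    """Fleming-Harrington G(rho, gamma) weighted log-rank test.
    rho = gamma = 0 gives the standard log-rank test; rho = 0, gamma = 1
    up-weights late differences, as with delayed treatment effects."""
    order = np.argsort(time)
    time, event, group = time[order], event[order], group[order]
    s = 1.0                          # pooled Kaplan-Meier, left-continuous
    u = v = 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        d = ((time == t) & (event == 1)).sum()
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()
        w = s**rho * (1 - s)**gamma  # weight uses S(t-)
        u += w * (d1 - d * n1 / n)
        if n > 1:
            v += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
        s *= 1 - d / n               # update pooled KM after weighting
    z = u / np.sqrt(v)
    return z, 2 * stats.norm.sf(abs(z))

# Illustrative data: exponential survival, hazard ratio 0.5, no censoring.
rng = np.random.default_rng(0)
time = np.concatenate([rng.exponential(1.0, 200), rng.exponential(2.0, 200)])
event = np.ones(400, dtype=int)
group = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])
z, p = fh_logrank(time, event, group, rho=0.0, gamma=1.0)
```

Comparing the operating characteristics of such weighted tests against the standard log-rank test under various non-PH patterns is exactly the kind of exercise the session speakers will discuss.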
Machine learning (ML) has an increasing number of applications to drugs, devices, and other areas of health care. ML has been used for: analyzing quantitative structure-activity relationships (QSARs) and absorption, distribution, metabolism, excretion, and toxicity (ADMET) models for drug discovery (Panteleev, et al. 2018); identifying skin cancer from images (Stanford AI Lab); recognizing the locations of transcription start sites (TSSs) in a genome sequence (Libbrecht and Noble, 2017); text-mining for pharmacovigilance (Cocos, et al. 2017); and predicting adverse reactions using FAERS data (Chen, 2018).
Statisticians and ML developers share many common goals, such as prediction, classification (supervised learning), and clustering (unsupervised learning). Statistics and ML also share many techniques and methods. However, ML models cannot be understood in the same manner as traditional statistical models; for example, the concept of building a machine-learned structure from observed data differs from modeling based on a pre-specified model structure. Machine learning involves a different approach to data analysis than many statisticians are used to, yet statisticians’ expertise regarding uncertainty, inference, and trial design is invaluable in developing and evaluating its use in the medical arena. This session invites experienced academic, industry, and regulatory speakers to discuss recent developments and applications of ML, and the roles statisticians can play.
Highlights: Dr. Grace Kim will present an application using a quantitative computer-aided diagnosis score from volumetric HRCT scan images as a clinical study primary endpoint. Dr. Youran (Ryan) Qi will present a new framework for simulating a Phase 3 clinical trial from Phase 2 trial data and real-world data, with an innovative deep learning method to predict Ctrough and the treatment effect in the Phase 3 trial. Dr. Jae Joon Song will present a project that uses machine learning to generate real-world evidence, monitor changes in prescription opioid use, and guide proactive pharmacovigilance of drug abuse.
Since the establishment of the Best Pharmaceuticals for Children Act (BPCA) in 2002 and the Pediatric Research Equity Act (PREA) in 2003, there has been significant progress in pediatric drug development. However, substantial challenges still exist, such as logistical, technical, and ethical barriers. Specifically, children are vulnerable, cannot consent for themselves, and may not respond to medications in the same way as adults. Moreover, often in pediatric settings there are no known active comparators, parents are reluctant to put their children at risk, and the pediatric population generally has low disease prevalence. These challenges translate to pediatric studies having small sample sizes and lacking suitable control groups.
With the realization of these challenges and the consideration of having adequate evidentiary standards for pediatric drug development, it is critical that we incorporate innovative statistical design and analysis methods to develop needed pediatric drugs. Based on prior successful clinical trials and statistical research, many approaches have been proposed to tackle the challenges. These include the application of two-stage re-randomization designs, the utilization of historical control information to reduce the size of the placebo arm, Bayesian designs and analyses, and basket designs to enroll pediatric patients with multiple indications having a single targeted biomarker, etc.
In this session, three speakers are invited to share their experience with innovative designs and advanced statistical methodologies in pediatric drug development. A presenter from FDA will summarize the statistical designs applied or proposed for pediatric trials; an academic presenter will discuss a Bayesian sequentially monitored trial that allows early stopping for efficacy or futility; and a representative from industry will share research and experience on using a Bayesian design through a case example.