Online Program

Thursday, February 19
Thu, Feb 19, 7:00 AM - 6:30 PM
Napoleon Foyer

SC1 Practical Data Mining: Challenges and Solutions
Thu, Feb 19, 8:00 AM - 5:30 PM
Napoleon C3
Instructor(s): Richard D. De Veaux, Williams College

Large data sets (or Big Data) are becoming more common as our ability to collect and store data increases. Many new tools and methods are now available to both the experienced analyst and casual user. Unfortunately, there is a strong belief—due in large part to a series of popular Big Data books—that good results are guaranteed with just powerful algorithms and a lot of data. Instead, success is dependent on the skill and domain knowledge of the analyst and the quality and relevance of the data. However, by using principles of statistical engineering and sound statistical knowledge, the chance of success in these problems is significantly increased. Through a series of case studies, we will show how to be successful in Big Data problems. We will show applications of many current and popular algorithms, as well as when and where they are most successful.

Outline & Objectives

What are Data Mining and Big Data, and where are they being used?
How to pose the problem and get started in Data Mining.
The role of Graphics.
Models -- algorithms and assessment.
Challenges and solutions.
Case Studies -- the role of the statistician as collaborator in data mining.

Objectives: Develop a clear understanding of how to define a data mining problem. Develop the skills needed to attack Big Data problems in a logical and sequential manner, applying principles of statistical engineering and sound statistical thinking. Clarify the common mistakes of data mining and how to avoid them.

About the Instructor

Dick De Veaux has taught dozens of data mining courses around the world. He is currently the C. Carlisle and Margaret Tippit Professor of Statistics at Williams College, where he has taught since 1994. He is a fellow of the ASA, serves on the Board of Directors as a Council of Sections representative, and is the newly elected Chair of the Section on Statistical Learning and Data Mining. He is the author (with Paul Velleman and Dave Bock) of Intro Stats, Stats: Data and Models, and Stats: Modeling the World and (with Norean Sharpe and Paul Velleman) of Business Statistics and Business Statistics: A First Course. He is currently working on an introductory book for scientists and engineers with Doug Montgomery, to be published by Wiley.

Relevance to Conference Goals

The course is relevant to all three goals. Clearly it will cover data modeling and analysis and use big data as the main focus. It will also emphasize the importance of teamwork, problem definition and communication.

SC2 From Statistical Consultant to Effective Leader
Thu, Feb 19, 8:00 AM - 5:30 PM
Napoleon C1
Instructor(s): Roger W. Hoerl, Union College; Ronald D. Snee, Snee Associates, LLC

This workshop is designed to enhance the leadership skills of statisticians working in business, industry, and government. The goal is to help statisticians transition from being viewed as passive consultants to being viewed as proactive leaders within their organizations. Issues addressed include understanding what statistical leadership is and how it differs from consulting, why it is important to be viewed as leaders, and critical leadership skills required. As part of the course, each participant will develop a personal action plan to enhance their leadership in their own work environment.

Outline & Objectives

Workshop Agenda

• Introductions, purpose, agenda and participant expectations
• Why we need to be leaders
• What management expects of leaders
• What changes are needed by the statistics profession and statisticians individually:
  o Presentation followed by brainstorming and discussion of brainstormed list
• Effective leadership skills and how to develop them
• Developing action plans for the profession and on a personal basis
• Personalize leadership development plans
• Discussion of selected personal plans
• Wrap up and path forward

Workshop Design

The workshop will integrate presentation and discussion of material from critical books and articles on leadership, sharing of personal experiences (participants and workshop leaders) in leading and working with leaders, brainstorming of problems and solutions, and development of action plans. The session will be highly interactive, enabling extensive participation by all.

About the Instructor

The instructors, Ron Snee and Roger Hoerl, are well versed in the subject of preparing statisticians to become better leaders. Their publications on statistical leadership document their work on the subject. They have presented the proposed workshop on more than five occasions over the years. The workshop content, design, and delivery have worked well with a variety of statistician audiences, including those at the JSM, QPRC, and ASA/ASQ Fall Technical Conference.

Snee, R. D. and R. W. Hoerl (2004) “Statistical Leadership – as Traditional Workplace Roles Change, Learn to Transition from Consultant to Leader”, Quality Progress, October 2004, 83-85.

Snee, R. D. (2005) “Leading Business Improvement: A New Role for Statisticians and Quality Professionals”, Quality and Reliability Engineering International, Volume 21, 2005, 235-242.

Snee, R. D., R. W. Hoerl and A. N. Patterson (2008) “In with The Right Crowd; Getting Management on Board to Support Statisticians’ Roles and Initiatives”, Quality Progress, May 2008, 70-73.

Snee, R. D. and R. W. Hoerl (2011) “Leadership—Essential for Developing the Discipline of Statistical Engineering”, Quality Engineering, Volume 24, April-June 2012.

Relevance to Conference Goals

This workshop fits nicely with Theme 1: Communication, Impact, and Career Development. The development of leadership and management skills is the purpose of the workshop. Both of the instructors have held leadership and managerial positions in their careers: Snee at DuPont, Bell Atlantic (now Verizon), and Joiner Associates, and Hoerl at General Electric.

SC3 What Can We Learn from Software Engineers?
Thu, Feb 19, 8:00 AM - 12:00 PM
Napoleon C2
Instructor(s): Paul Teetor, Quant Development LLC

Do any of the following problems sound familiar? Your organization is swimming in SAS or R code. You’ve saved numerous versions because you can’t afford to lose anything. People are unsure of which version is best. Testing your code is difficult. You’ve cut-and-pasted your code so often you’re seeing the same parts over and over. Everyone does their work differently, and people can’t share code easily. The code is now so convoluted that newcomers cannot understand it. The thought of major changes makes your head hurt. Software engineers have spent decades dealing with these problems, and the result is a body of best practices for managing software. These best practices are an art and not well known outside the discipline. This course will explain the techniques of software engineering and how they apply to managing your software. Topics range from code-level practices to design issues and project control. The course will focus on software engineering in the context of R, which provides a rich environment for statistical programming. Participants are expected to arrive with R and RStudio installed on their laptops. Some familiarity with R is required.

Outline & Objectives

The course alternates between explanation and exercises.

Coding standards – Adopting a consistent style.
Exercise: Clean up dirty code.

Defensive programming – Protecting yourself from others... and yourself.
Exercise: Add defensive checks to a program.

Code walk-throughs – Shine a light on your code.
Exercise: Walk-through code together: bad code, good code.

Version control – Keeping track of your intellectual property.
Demonstration: Create a simple code repository.

Unit testing – Start with the quality control.
Exercise: Write and execute unit tests.

Libraries – Don't reinvent the wheel.
Exercise: Create a simple library.

Refactoring – Apply your powers of abstraction.
Exercise: Create a variation on a function, then refactor both versions.

Modularity – Keep your design strong and flexible.
Example: Distinguishing input vs. calculation vs. presentation.

Execution environments – Separate sandboxes for developing, testing, and production.

Integration testing – Be confident that everything works harmoniously.

Parting words – A few lessons from project management.
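
As a rough illustration of two of the outline topics above, defensive programming and unit testing, here is a minimal sketch. It is written in Python rather than R (the language the course uses) and is not taken from the course materials; the function name `trimmed_mean`, its checks, and the test cases are hypothetical and chosen only for illustration.

```python
import unittest


def trimmed_mean(values, trim=0.1):
    """Mean after dropping a fraction `trim` of the values from each tail.

    Hypothetical example function. The defensive checks fail fast with a
    clear message instead of returning a silently wrong answer.
    """
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    data = list(values)
    if not data:
        raise ValueError("values must not be empty")
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in data):
        raise TypeError("values must be numeric")
    data.sort()
    k = int(len(data) * trim)  # number of points dropped from each tail
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)


class TrimmedMeanTests(unittest.TestCase):
    def test_outlier_is_dropped(self):
        self.assertEqual(trimmed_mean([1, 2, 3, 4, 100], trim=0.2), 3.0)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            trimmed_mean([])

    def test_non_numeric_input_is_rejected(self):
        with self.assertRaises(TypeError):
            trimmed_mean([1, "2", 3])
```

Saved to a file, the tests can be run with `python -m unittest <file name>`. Writing the checks and the tests alongside the function is one way to keep "quality control" at the start of the work rather than the end.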

About the Instructor

I am both a professional software engineer and a professional statistician. My consulting practice focuses on the financial services sector, where I build quantitative applications by combining those twin backgrounds. I have over 30 years of professional experience.

My public speaking and writing have been quite popular. Last year, I was asked to give the Introduction to R Workshop at the University of Chicago (Financial Mathematics program). This year, I've been asked to give that workshop plus a keynote speech at the Great Plains R User's Group conference. My talks have been accepted three times for the annual R/Finance Conference. I am a frequent presenter at the Chicago R User's Group. I've taught undergraduate classes in statistics and in computer programming.

I spoke at CSP 2014, where my presentation on Bootstrapping Time Series Data received excellent evaluations.

I am the author of the R Cookbook (O'Reilly, 2011), one of the top-selling books for R.

My degrees are in Computer Science (BS, Cornell University; MS, Northwestern University) and Applied Statistics (MS, DePaul University).

Relevance to Conference Goals

I see a surprising number of statisticians and analysts who are “swimming in their software”. I've taught software engineering techniques to my clients, and they say it revolutionized the management of their code base. Practitioners and organizations that adopt these techniques have experienced higher quality results with less work and less chaos. Participants in this course will bring that benefit to their organizations.

SC4 How to Start and Run an Independent Statistical Consulting Business
Thu, Feb 19, 8:00 AM - 12:00 PM
Napoleon D3
Instructor(s): Stephen David Simon, P.Mean Consulting

An independent statistical consulting job is both rewarding and challenging. If you follow this career path, you will need to learn many business skills. This course will review practical issues you will face in setting up an independent consulting business. Should you set up a limited liability corporation or a subchapter S corporation? Should you bill by the hour or the project? What insurance do you need? Should you have a standard contract in place prior to any consulting work? In addition to these legal and accounting requirements, there are human issues that you as an independent consultant will have to face. Your most important job is finding new clients. The best method, by far, is “word of mouth,” and there are several strategies you can adopt to enhance your visibility and increase the number of referrals you receive. You also need to know how to keep your current clients happy. This class will include several small-group exercises during which you will share your thoughts and experiences on how to handle specific cases involving independent statistical consulting. No specific knowledge about business models, accounting, or legal issues will be assumed.

Outline & Objectives

Course outline: The first lecture will cover the types of independent consultants and contrast these with a consultant who is part of a larger organization. This will be followed by a small-group exercise in which students discuss their career goals for the next one and five years. This is followed by lectures on company types (sole proprietorship, partnership, limited liability corporation) and a discussion of the pros and cons of billing by the hour versus by the project.

A second small-group exercise presents a hypothetical consulting project and asks each group to prepare an estimate of either the entire project cost or the hours needed. Additional lectures will cover contracts, accounting, and insurance.

The last two lectures will discuss finding new clients and keeping existing clients happy. This includes a third small group exercise on a hypothetical consulting scenario that has gone sour. Students will discuss whether they should end the consulting relationship or find ways to get the interaction back on track.

The target audience is anyone who is considering a career as an independent consultant or who is curious about the advantages and disadvantages of this type of work.

About the Instructor

Steve Simon is a part-time independent statistical consultant with P.Mean Consulting and part-time faculty member in the Department of Biomedical and Health Informatics at the University of Missouri-Kansas City.

He presented a short course at the inaugural meeting of the Conference on Statistical Practice in 2012, "Promoting Your Consulting Career in the Era of Web 2.0" and has led a roundtable discussion on the same topic at the 2011 Joint Statistical Meetings. A brief summary of this talk is on his website:

He also was a panel member at the 2011 JSM on "Successful Statistical Consulting: The Practicalities" and discussed "How Independent Statistical Consulting is Different." For a brief overview of this presentation, see

Dr. Simon is the author of a book published by Oxford University Press, "Statistical Evidence in Medical Trials: What Do the Data Really Tell Us?" He has a website with over 1,300 pages on statistics, research ethics, and evidence-based medicine and is an active participant in the Statistical Consulting Section Discussion Board.

Relevance to Conference Goals

The first theme of the Conference on Statistical Practice is "Communication, Impact, and Career Development." This class addresses a very specific type of career development, starting and running an independent consulting business. The course will cover business skills, such as billing, contracts, and marketing that participants need to advance their careers. This course will emphasize communication with clients and customers, both to attract new clients and to keep existing clients happy.

SC5 An Overview of Clustering: Finding and Extracting Group Structure in High-Dimensional Data
Thu, Feb 19, 8:00 AM - 12:00 PM
Instructor(s): Rebecca Nugent, Carnegie Mellon University; Samuel Ventura, Carnegie Mellon University
Clustering is the search for similar or homogeneous subgroups in a population, say, of consumers, patients, genes, images, text documents, or anything that can possibly contain group structure. For example, consumers might be divided into market segments based on their preferences and spending habits. In public health, we might be interested in predicting which outcome group a patient is likely to be in given their symptoms, past history, and current treatment. In document clustering, the goal is to group similar pieces of text (e.g., blogs, emails, posts, letters, articles, etc.) based on the words used, the frequency, and other text features. In all cases, the goal is to extract structure from potentially high-dimensional data. The difficulty, however, often lies in which clustering approach to adopt, particularly given that results are rarely independent of approach. This tutorial will give an overview of algorithmic and statistical approaches to clustering with an emphasis on how to choose an approach and its related parameters. Note that while we use the statistical software package R, these methods are available on other platforms.

Outline & Objectives

Our primary goal is to provide the practitioner with a solid background in the variety of available clustering approaches and their related assumptions, necessary parameter choices, cluster shapes and sizes, and advantages/disadvantages. The practitioners will also gain skills in critiquing and interpreting their final cluster solution and identifying unstable or undesirable clusters.

Topics include (may be interspersed as appropriate):

Deterministic Algorithms
- Hierarchical Linkage Clustering
- K-Means (including fuzzy version)
- K-Medoids

Statistical Approaches
- Parametric mixture models/model-based clustering
- Nonparametric bump hunting or mode finding
- Spectral Clustering or Image Segmentation

Longitudinal Clustering

Validation and Visualization
- Uncertainty
- Cluster Validation Strength
- Silhouettes
- Stripes and Neighborhoods
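
To make two of the listed topics concrete, the sketch below pairs a bare-bones K-means (Lloyd's algorithm) with the average silhouette width as a validation measure. It is an illustration only, written in Python rather than the R used in the course, and it is not the instructors' code; the function names `kmeans` and `mean_silhouette`, the toy data, and the choice to pass starting centers explicitly are all assumptions made for this example.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm from explicit starting centers; returns (labels, centers)."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


def mean_silhouette(points, labels):
    """Average silhouette width; values near 1 suggest well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of p's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (made up): two tight groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centers = kmeans(points, centers=[(0, 0), (10, 10)])
print(labels, centers, round(mean_silhouette(points, labels), 3))
```

The starting centers are passed in explicitly because K-means can converge to different partitions from different starts, which is one reason the choice of approach and its parameters matters.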

About the Instructor

Professor Nugent is an Associate Teaching Professor in the Department of Statistics at Carnegie Mellon University. Her research primarily focuses on finding and visualizing high-dimensional structure. She was the 2009 Chikio Hayashi Award recipient (a Young Promising Researcher award presented by the International Federation of Classification Societies). She has served as the President of the Classification Society (of North America) and is active in the ASA Sections on Statistical Computing and Statistical Graphics. She has taught undergraduate and graduate classes in statistical learning, regression, document clustering, and record linkage, among others. She has also won several teaching awards, including the Elliott Dunlap Smith Award for Distinguished Teaching and Educational Service.

Samuel L. Ventura is a PhD Candidate in the Department of Statistics at Carnegie Mellon University. His research focus is on large-scale clustering and classification techniques. He also brings extensive statistical computing experience. Sam has been an invited speaker at several statistical learning conferences and has taught several summer courses at CMU.

Relevance to Conference Goals

With the advent of "Big Data" sets and cheaper, more ubiquitous data collection, we have more data than we can handle. Describing and characterizing the structure in these high-dimensional data sets is paramount. Being able to reduce the complexity of your statistical analysis by homing in on the underlying group structure may increase your analysis options. In addition, with a dual focus of making informed decisions about choice of clustering approach and summarizing, visualizing, and interpreting the final clusters, attendees will be more confident and better positioned to interface with their clients and deliver statistically sound results that directly correspond to real, implementable action items in practice (e.g., a different strategy for each group).

SC6 Building Your Professional Brand
Thu, Feb 19, 1:30 PM - 5:30 PM
Napoleon D3
Instructor(s): Bill Williams, Organizational Learning Consultant

The world of work is full of people with ambition and aspirations to do bigger things as their careers progress. While the rules for success—most of which are unwritten—vary from organization to organization, two ingredients are always essential: 1) your current performance on the job and 2) the potential other people see in you. How people view your performance and potential is derived only in part by what you know and the functional expertise you possess. The rest is based on the image you project and the exposure to other people your job affords you. In this session, we’ll examine both the impression you want others to have of you as a professional—your “brand”—and how your communications can influence the impressions of others. You will define the brand you would most like people to associate with you and consider how to manage your behavior to support your brand, particularly when communicating with senior managers and leaders.

Outline & Objectives


Understand what is meant by the term “personal brand”
Identify the characteristics of your ideal personal brand – the experience you want others to have when working with you
Determine actions you can take and behaviors you can display when working with others – whether in-person or virtually – that will support your ideal personal brand

What is a brand and what characterizes a brand, for better or worse?
What brands do you value and why? Whose “personal brand” do you respect?
Brainstorm: what do you ideally want to characterize your personal brand?
- Capabilities: talents, skills and knowledge
- Behavioral characteristics: influencing style, composure, communication style
Who are your key constituents at work?
- What impressions of you do you want them to have?
- What do you want them to value in the way you support and collaborate with them?
Putting it all together – bringing your personal brand to life:
- With individuals in-person
- With groups in-person
- In writing
- In virtual communication

About the Instructor

Bill Williams is an organizational learning consultant and has been part of the Conference on Statistical Practice since its inaugural year.

Relevance to Conference Goals

This session aligns with the communication, impact, and career development track by focusing on how to position oneself for success within an organization.

SC7 Design of Not-Simple Graphs
Thu, Feb 19, 1:30 PM - 5:30 PM
Instructor(s): Richard M. Heiberger, Temple University

Complex data analyses may require complex graphs to place the full information of the analysis into a form the intended client will be able to read. In our opinion, graphs are the heart of most statistical analyses; the corresponding tabular results are formal confirmations of our visual impressions. Data analysts are responsible for the display of data with graphs and tables that summarize and represent the data and the analysis. The graphs are often the best means of communication between the data analyst and client. This course will emphasize the design of graphical displays that best represent the message of an analysis.

Outline & Objectives

We will look at many examples of graphs, from simple to complex. We need to begin with simple graphs to learn the vocabulary of graphs. We then proceed to more complex graphs and see how they are constructed by using the same graphic vocabulary. The examples come from journal articles, textbooks, and general publications. We will mostly show good examples, but will of necessity show some not-good examples (and then revise them) to emphasize how the principles we recommend have been derived and why they are important for communication between the data analyst and the client. Most of the examples will be from the medical/pharmaceutical areas or from social sciences. The concepts are much more broadly applicable. The graphs we show will be drawn using the graphics functions in R because that platform offers substantial capabilities for producing graphs customized to the particular needs and visions of the analyst. They could be drawn in any other graphical system that has a reasonably rich set of graphical primitives.

About the Instructor

Professor of Statistics at Temple University. Chair (2011) of the Statistical Computing Section of the ASA. Consulting experience in the pharmaceutical and social science areas. I designed and programmed the AEdotplot, the now standard display for adverse events in clinical trials. I coauthored a paper in the Handbook of Data Visualization. Consultant for a US Government agency on the design of visualizations to make their data more accessible. I have several packages available for the R system. My most recent book, R through Excel (Springer, 2009) with Erich Neuwirth, shows how to access the high quality of R graphics directly from the comfort of the familiar Excel spreadsheet.
I have a recent paper: Heiberger, R., Robbins, N. (2014). "Design of Diverging Stacked Bar Charts for Likert Scales and Other Applications." Journal of Statistical Software, 57 (5), 1-32.
I am preparing the second edition of Heiberger, Richard M., and Burt Holland (anticipated 2015), Statistical Analysis and Data Display: An Intermediate Course with Examples in R, Springer, New York. I presented a session on "Structured Sets of Graphs" at the 2014 CSP.

Relevance to Conference Goals

On completion, the course participants will have examples of and experience with complex graphs. They will be able to look at new data situations and analyses and to design graphs that will communicate the analyst's intended message to the reader. Better communication skills improve performance, and improved performance enhances professional development.

SC8 Text Analytics: Integrating Topic, Opinion, and Sentiment Analysis
Thu, Feb 19, 1:30 PM - 5:30 PM
Napoleon C2
Instructor(s): Edward R. Jones, Texas A&M Statistical Services
This workshop discusses current statistical approaches to conducting a linked analysis of reviewer comments, sentiments, and ratings. Today, statisticians have powerful tools available for integrating the analysis of structured and unstructured data. Reviewer and customer comments can be used with their ratings and other background information to build models linking ratings, opinions, and emotions. Done well, this provides a more complete picture of what people think and feel about services and products.

Outline & Objectives

Learning Objectives:

(1) Gain an understanding of the terminology and concepts of opinion, topic and sentiment analysis in text analytics.

(2) Understand an effective process for modeling reviewer comments with structured data.

(3) Gain an understanding of available software, both freeware and commercial, for conducting a linked analysis of text and structured data.


(1) Introduction to Text Analytics: Terminology and Software - Today and Tomorrow

(2) Opinion and Topic Analysis: Techniques and Tools for Discovering Opinions and Topics

(3) Sentiment Analysis: Extracting Emotional Content

(4) Discussion and Questions

About the Instructor

Over 10 years of experience in development of commercial analytics software, and over 15 years of experience teaching and applying techniques in analytics and quality assurance. Formerly an examiner for the Malcolm Baldrige National Quality Award. Currently teaches advanced analytics at Texas A&M University and mentors graduate students in analytics competitions.

Relevance to Conference Goals

Attendees enjoy CSP because of its applied nature and because they leave with new knowledge and tools useful in their career and work. For many, this workshop provides a new tool: a statistical approach for exploring the relationship between what people think and what they say and feel.

What people think is often captured as structured data, usually in the form of answers to closed questions such as "On a scale from 1 to 5, how satisfied are you with our product?"
What people say is acquired from answers to open-ended questions such as "Why are you satisfied or not satisfied with our product?"
What people feel about what they are saying is discovered using sentiment analysis.

This workshop provides participants the background needed to start applying opinion, topic and sentiment analysis in their work. The approaches and tools are illustrated using customer review data.
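
The distinction between what people think, say, and feel can be sketched in a few lines. The example below is only a toy, not the workshop's method or software: it pairs a structured 1-to-5 rating with a crude lexicon-based sentiment score for the accompanying comment. The word lists `POSITIVE` and `NEGATIVE`, the function `sentiment_score`, and the reviews are all made up for illustration, and real sentiment analysis tools are far richer.

```python
import re

# Hypothetical toy lexicon; real sentiment lexicons are far larger.
POSITIVE = {"great", "love", "excellent", "easy", "reliable"}
NEGATIVE = {"poor", "hate", "broken", "slow", "confusing"}


def sentiment_score(comment):
    """Return (positives - negatives) / matched words, in [-1, 1]; 0.0 if none match."""
    words = re.findall(r"[a-z']+", comment.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


# Made-up reviews: a 1-5 rating (what people think) and a comment (what they say).
reviews = [
    (5, "Great product, easy to set up and very reliable."),
    (2, "Setup was confusing and the app is slow."),
    (4, "Love the design, but shipping was slow."),
]

# The score is a rough stand-in for what people feel about what they say.
for rating, comment in reviews:
    print(rating, sentiment_score(comment), comment)
```

Comparing the rating with the score row by row is the simplest form of the linked analysis described above; a review whose rating and sentiment disagree is often the one worth reading in full.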

Collaboration Corner
Thu, Feb 19, 2:00 PM - 5:00 PM
Napoleon AB
Meet in the Collaboration Corner in the front of Napoleon Ballroom. Recommend a topic or sign up for a topic recommended by someone else on the bulletin boards in this area.

PS1 Poster Session 1 & Opening Mixer
Thu, Feb 19, 5:30 PM - 7:00 PM
Napoleon AB

1 Simulating Confidential Epidemiological Data Sets
Ragheed Fadhil Al-Dulaimi, Hunter College
2 Here’s How I Helped a Client Forecast Sales of Her New Product!
Michael Latta, Coastal Carolina University YTMBA Research & Consulting
3 The Collaborating Statistician: Writing for Peer Review in the Scientific Literature
Alexandra L. Hanlon, University of Pennsylvania
4 Effective Communication with Clients to Estimate Effect Size for Power Analysis
Min-Kyung Jung, New York Institute of Technology College of Osteopathic Medicine
5 Training and Evaluating New Student Consultants at a University Consulting Center
Aaron Rendahl, University of Minnesota
6 Independent Means T-test or Robust Alternatives: A Guide to Selecting the Best Tool for Inferences
Anh P. Kellermann, University of South Florida
7 Optimizing Medical Chart Review Sample Size Reduction with a Monte Carlo Simulation
Qin Wen, Humana Inc.
8 Testing Homogeneity of Variance in One-Factor ANOVA Models: A Plethora of Approaches to Consider
Jeffrey D. Kromrey, University of South Florida
9 Economic Impact of Maternal Mortality in Africa: A Panel Data Approach
Emmanuel Thompson, Southeast Missouri State University
10 Statistics in Defense
Victoria Cox, Dstl
11 Clustering Box Office Score Dynamics Using Dynamic Time Warping
Kevin Harris, NC A&T State University
12 Using Six Sigma to Reduce Recyclables in Trash on a College Campus
Diane Evans, Rose-Hulman Institute of Technology
13 Analysis of Weather, Temporal, Population, and Socioeconomic Factors in Determining Crime Rates in Five U.S. Cities and Projections for the Future
Zhangxin Xue, Southern Methodist University
Exhibits Open
Thu, Feb 19, 5:30 PM - 7:00 PM
Napoleon AB

Friday, February 20
Fri, Feb 20, 7:30 AM - 5:30 PM
Napoleon Foyer

Continental Breakfast
Fri, Feb 20, 7:30 AM - 8:30 AM
Napoleon AB

Exhibits Open
Fri, Feb 20, 7:30 AM - 6:30 PM
Napoleon AB

GS1 Keynote Address
Fri, Feb 20, 8:00 AM - 9:00 AM
Napoleon C

8:05 AM Communication: A Two-Way Street
David Morganstein, Westat
CS01 Mentoring
Fri, Feb 20, 9:15 AM - 10:45 AM
Napoleon D1&D2
Chair(s): Eric Vance, LISA, Virginia Tech

How Mentoring Can Help with the Practice of Statistics: A Panel Discussion
Sarah Kalicin, Intel Corporation; Amarjot Kaur, Merck Research Labs; David Kline, The Ohio State University; David Morganstein, Westat; LeAnna Stork, Monsanto Co.
CS02 Special Estimation
Fri, Feb 20, 9:15 AM - 10:45 AM
Chair(s): Wei-Ting Hwang, Univ. of Pennsylvania School of Medicine

9:20 AM Understanding and Estimating Treatment Effect Heterogeneity Using Adaptive Ensemble Methods
Diane M. Richardson, VA Center for Health Equity Research and Promotion
10:05 AM Much Ado About Almost Nothing: How to Deal with Limited Data
Stephen W. Looney, Georgia Regents University
CS03 Predictive Analytics in Health Care
Fri, Feb 20, 9:15 AM - 10:45 AM
Napoleon C
Chair(s): Nancy Wang, Celerion

9:20 AM Risk Quantification for Branch Management
Momoko Fukasawa, Deloitte Touche Tohmatsu, LLC
CS04 Software for Analytics and Data Mining
Fri, Feb 20, 9:15 AM - 10:45 AM
Chair(s): Michael Latta, Coastal Carolina University YTMBA Research & Consulting

9:20 AM Real-Time Analytics Using Business Intelligence Software
Sam Weerahandi, Pfizer
10:05 AM Taming the Big Data Beast at Texas Parks and Wildlife: Using Business Intelligence Tools and Value-Added Data to Evolve a Culture of Data-Driven Decisionmaking
John Taylor, Texas Parks and Wildlife
Refreshment Break, sponsored by Texas A&M Statistical Services
Fri, Feb 20, 10:45 AM - 11:00 AM
Napoleon AB

CS05 Leadership and Influence
Fri, Feb 20, 11:00 AM - 12:30 PM
Napoleon D1&D2
Chair(s): Jay N. Mandrekar, Mayo Clinic

11:05 AM Understanding and Working with Difficult People
Colleen Mangeot, Cincinnati Children's Hospital Medical Center
11:50 AM The Influential Manager: Messages That Get Buy-In
Bill Williams, Organizational Learning Consultant
CS06 Business Applications
Fri, Feb 20, 11:00 AM - 12:30 PM
Chair(s): Qin Liu, The Wistar Institute

11:05 AM Business Applications of Statistical Sampling
Laura Schweitzer, PwC
11:50 AM Applying Econometric Time Series Methods to CCAR Requirements
Kenneth Sanford, SAS Institute
CS07 Text Analytics and Dimension Reduction Methods
Fri, Feb 20, 11:00 AM - 12:30 PM
Napoleon C
Chair(s): Elise Roberts, Johns Hopkins University Applied Physics Laboratory

11:05 AM Practical Text Analytics
Heath Rushing, Adsurgo LLC
11:50 AM Sparse Partial Robust M Regression
Sven Serneels, BASF Corp.
CS08 Exploratory and Interactive Graphics
Fri, Feb 20, 11:00 AM - 12:30 PM
Chair(s): Jim Li, Procter & Gamble

11:05 AM Visualizing Data with Exploratory Data Analysis
Wendy L. Martinez, U.S. Bureau of Labor Statistics
11:50 AM Interactive Graphics Connect People to Data—with Some R Shiny Examples
Jean V. Adams, US Geological Survey - Great Lakes Science Center
Lunch (on own)
Fri, Feb 20, 12:30 PM - 2:00 PM

CS09 Effective Collaboration
Fri, Feb 20, 2:00 PM - 3:30 PM
Napoleon D1&D2
Chair(s): Kim Love-Myers, Statistical Consulting Center, University of Georgia

2:05 PM Structuring Effective Statistical Collaborations and Consultations
Eric Vance, LISA, Virginia Tech
2:50 PM Communicating Effectively in Statistical Collaborations and Consultations
Heather Smith, Cal Poly
CS10 Special Designs
Fri, Feb 20, 2:00 PM - 3:30 PM
Chair(s): Marie Kraska, Auburn University

2:05 PM Predictive Statistical Modeling of Clinical Trial Operation
Vladimir Anisimov, Quintiles
2:50 PM Analysis Plans for Doubly Repeated Measures Designs
Jeff Burton, Pennington Biomedical Research Center
CS11 Bootstrapping Applications
Fri, Feb 20, 2:00 PM - 3:30 PM
Napoleon C
Chair(s): Phillippa Spencer, Dstl

2:05 PM Bootstrapping Time Series Data
Paul Teetor, Quant Development LLC
2:50 PM Bootstrapping Confidence Intervals for Effect Sizes (and Other Weird Things)
Erin Smith, University of Louisville
CS12 Statistical Plans and Charts
Fri, Feb 20, 2:00 PM - 3:30 PM
Chair(s): Mary J. Kwasny, Northwestern University

2:05 PM Best Practice: The Data Analysis Plan—A Blueprint for Success
Kathleen A. Jablonski, The Biostatistics Center, The George Washington University
2:50 PM Using Charts to Present Your Results to Nonstatisticians
Jay Arthur, KnowWare International
Refreshment Break, sponsored by Texas A&M Statistical Services
Fri, Feb 20, 3:30 PM - 3:45 PM
Napoleon AB

CS13 Career Development
Fri, Feb 20, 3:45 PM - 5:15 PM
Napoleon D1&D2
Chair(s): Shelley Brock Roth, Westat

3:50 PM Career Development Opportunities for Statisticians
LeAnna Stork, Monsanto Co.
4:35 PM G.R.O.W.—An Empowering Model for Career Success
Colleen Mangeot, Cincinnati Children's Hospital Medical Center
CS14 Interpretation
Fri, Feb 20, 3:45 PM - 5:15 PM
Chair(s): Zoran Bursac, University of Tennessee Health Science Center

3:50 PM Statistical Methods for Bridging the Gap Between Interpretative and Predictive Analysis
Michael Regier, WVU, Department of Biostatistics
4:35 PM Using the Delta Method to Generate Means and Confidence Intervals from a Linear Mixed Model on the Original Scale, When the Analysis Is Done on the Log Scale
Brandy R. Sinco, University of Michigan School of Social Work
CS15 Social Media Applications
Fri, Feb 20, 3:45 PM - 5:15 PM
Napoleon C
Chair(s): Pete Doe, Nielsen

3:50 PM Garbage in Garbage Out: Acquisition and Quality Assessment of Social Media Data in Health Research
Yoonsang Kim, University of Illinois at Chicago
4:35 PM Use of Social Media Data as a Lead Indicator to Predict Retail Sales Performance
Li Zhang, Alliance Data Systems
CS16 Dynamic Reporting Tools
Fri, Feb 20, 3:45 PM - 5:15 PM
Chair(s): Alex Gilgur, Google

3:50 PM Creating an Easy-to-Use, Dynamic, Flexible Summary Table Macro with P-values in SAS for Research Studies
Amy Arlene Gravely, VA Medical Center, CCDOR
4:35 PM Reporting Results with R, R Markdown, and Shiny
Garrett Grolemund, RStudio, Inc.
PS2 Poster Session 2 & Refreshments
Fri, Feb 20, 5:15 PM - 6:30 PM
Napoleon AB

1 Comparing Linear Mixed Models Between Statistical Software
Danielle Guffey, Baylor College of Medicine
3 Developing Successful Career Relationships: Mentoring, Coaching, and Sponsorship
Sarah Kalicin, Intel Corporation
4 Please Don't Shoot the Messenger: Delivering Negative Results
David R Bristol, Statistical Consulting Services Inc
5 How to Apply Missing Data Techniques in Practice
Katherine M. Wright, Loyola University Chicago, Northwestern University
6 Piece-Wise Mixed Effect Model for Renal Function Data Analysis in Transplantation Patients
Zailong Wang, Novartis Pharmaceutics
7 Sample Size Estimation for Multiply Matched, Noninferiority Case-Control Studies with Binary Exposures
Charles Gene Minard, Baylor College of Medicine
8 Classification of Time Series Using Similarity Analysis
David J. Corliss, Wayne State University, Ford Motor Company
9 Viewing Problems Through a Multilevel Modeling Lens
Paul Roback, St. Olaf College
11 Liability Survival Analysis
Joseph Michael, Deloitte
12 Type 3 Statistics in SAS Procedures: What Do They Really Mean?
Leann Myers, Tulane University
13 Investigating Earthquake Magnitude Interdependency Through Stochastic Declustering
Devon Osgood Cook, California State University, Fullerton
14 Understanding Change Through Different Methodological Lenses
Jie Liao, Alliance Data Systems, Inc.
15 A Comparison of the Forecast of Pork Carcasses Futures by Three Methods: A SETAR Model, a Seasonal ARIMA Model, and Holt-Winters Smoothing
Gustavo Ramirez-Valverde, Colegio de Postgraduados
16 Adjusting Survey Mode Differences: Illustration of a Linear Equating Method
Sangeeta Agrawal, Gallup
Saturday, February 21
Sat, Feb 21, 7:30 AM - 2:30 PM
Napoleon Foyer

Exhibits Open
Sat, Feb 21, 8:00 AM - 1:00 PM
Napoleon AB

PS3 Poster Session 3 & Continental Breakfast
Sat, Feb 21, 8:00 AM - 9:15 AM
Napoleon AB

1 Are You Really Who We Think You Are? Recognizing and Controlling Biases in Statistical Analyses of Linked Data
Sigurd Wilson Hermansen, Westat
2 Predicting Buying Behavior: IT Software Customer Clustering with R and Weka
Emiliana Inez Patlan, SolarWinds
3 Technicians to Collaborators: Changing the Paradigm of Student Consulting at the University of Georgia
Kim Love-Myers, Statistical Consulting Center, University of Georgia
4 Communicating Applied Statistics Through Online Courses and Consulting
James Landis Rosenberger, Penn State University
5 Enrollment Modeling with Random Staggered Site Start-Up Times
Bradley Thomas Ferguson, Quintiles
6 Variability, Redundancy, and Reduction Using Principal Components Analysis
Marie Kraska, Auburn University
8 Multiple Imputation for Missing Data in Longitudinal Research Synthesis: Identifying and Overcoming Assumptions in Software
David Kline, The Ohio State University
9 Partial Least Squares Structural Equation Modeling as an Analysis Tool in Epidemiological Studies
Kaushal Raj Chaudhary, Sanford Research
10 Experimental Design for Testing with Multiple Segments
Jinguo Gao, Dr.
11 Lessons Learned from Observational Studies: Considerations in Propensity Score Matching
Adin-Cristian Andrei, Northwestern University
12 Modeling Assessment Data with a Hierarchical Approach
Jimmy Wong, California Polytechnic State University, San Luis Obispo
13 An Empirical Investigation of the Impact of Measurement Error on Propensity Score Analysis
Patricia Rodríguez de Gil, University of South Florida
14 The Strength of Combining Public and Restricted Data: Tips for Using the Research Data Center (RDC)
Ellen E. Bishop, RTI, International
CS17 Presenting to Executives and Other Non-Statisticians
Sat, Feb 21, 9:15 AM - 10:45 AM
Napoleon D1&D2
Chair(s): Felicia Hardnett, Centers for Disease Control and Prevention

9:20 AM Communications to Boards of Directors and Nonstatisticians
Joyce Nilsson Orsini, Fordham University Graduate School of Business
10:05 AM Coming Out of the Casket: Techniques for Becoming a More Effective Speaker
Eric Stephens, Vanderbilt University Medical Center
CS18 Method Reviews
Sat, Feb 21, 9:15 AM - 10:45 AM
Chair(s): Dennis Lee Eggett, Brigham Young University

9:20 AM Speed Dating with Regression Methods
David J. Corliss, Wayne State University, Ford Motor Company
10:05 AM Are Data Science and Analytics Just New Names for Statistics?
Peter Bajorski, Rochester Institute of Technology
CS19 Analytics & Big Data Survey Review & Interpretation – Panel & Audience Discussion
Sat, Feb 21, 9:15 AM - 10:45 AM
Napoleon C
Chair(s): K. Blayne Easter, The Vanguard Group

9:20 AM Analytics and Big Data Survey Review and Interpretation: Panel and Audience Discussion
Edward R. Jones, Texas A&M Statistical Services; Amarjot Kaur, Merck Research Labs; Elizabeth Kolodziej, Texas A&M University; Heath Rushing, Adsurgo LLC; F. Michael Speed, SAS Institute Inc
CS20 Effective Graphs
Sat, Feb 21, 9:15 AM - 10:45 AM
Chair(s): Lasonja Kennedy, Independent Consultant

9:20 AM Correspondence Analysis
Jessica Thomson, USDA Agricultural Research Service
10:05 AM How to Avoid Some Common Graphical Mistakes
Naomi B. Robbins, NBR
Refreshment Break, sponsored by Texas A&M Statistical Services
Sat, Feb 21, 10:45 AM - 11:00 AM
Napoleon AB

CS21 Ethics and Impact
Sat, Feb 21, 11:00 AM - 12:30 PM
Napoleon D1&D2
Chair(s): Steven B. Cohen, Agency for Healthcare Research and Policy

11:05 AM Teaching Ethics in Statistical Consulting
Alan C. Elliott, Southern Methodist University
11:50 AM Enhancing Communication Skills for Making Organizational Impact and Career Development
Jay N. Mandrekar, Mayo Clinic
CS22 Trial Enrollment
Sat, Feb 21, 11:00 AM - 12:30 PM
Chair(s): Runhua Shi, LSU School of Medicine

11:05 AM Modeling Enrollment with Random Staggered Site Start-Up Times
Bradley Thomas Ferguson, Quintiles
11:50 AM Enrollment, Events Prediction, and Statistical Power Prediction for Event-Driven Trials
Vladimir Anisimov, Quintiles
CS23 Population Modeling
Sat, Feb 21, 11:00 AM - 12:30 PM
Napoleon C
Chair(s): Mary Sailors, Chevron

11:05 AM Bayesian Spatial Joint Modeling of Asthma Admission and Readmission for Identifying High-Risk Neighborhood
Bin Huang, Cincinnati Children's Hospital Medical Center
11:50 AM Determining Different Population Distributions Using NHANES BMI Data
William Johnson, Pennington Biomedical Research Center
CS24 Special Uses of R
Sat, Feb 21, 11:00 AM - 12:30 PM
Chair(s): Bryan Stanfill, CSIRO

11:05 AM Sparse Matrix Computation in R with an Application to GEEs
Lee S. McDaniel, LSUHSC, School of Public Health
11:50 AM Using R in a Regulated Environment
Keaven M. Anderson, Merck Research Laboratories
Collaboration Corner
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon AB
Meet in the Collaboration Corner in the front of Napoleon Ballroom. Recommend a topic or sign up for a topic recommended by someone else on the bulletin boards in this area.

PCD1 Interactive Predictive Modeling with JMP 12 Pro: Keeping It in the Flow
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon C3
Instructor(s): Mia L. Stephens, JMP Division of SAS; Scott Lee Wise, SAS Institute, JMP Division
Interactive predictive modeling in JMP Statistical Software from SAS is more than building models. It allows you to take advantage of interactive and dynamic graphs and advanced analytic tools, keeping data visualization, analysis, and modeling in the flow. In this talk, we will use case studies to see how to explore and prepare data using the Column Switcher, Data Filter, Recode, and Graph Builder. We will use the Partition platform, Fit Model, and Generalized Regression platforms, as well as tools such as the Prediction Profiler and the Solution Path in JMP Pro 12, to interactively explore parameters and select potential models. Finally, we’ll compare a variety of competing models using Model Comparison.

PCD2 Tessera: Open Source Tools for Big Data Analysis in R
Sat, Feb 21, 2:00 PM - 4:00 PM
Instructor(s): Landon Sego, Pacific Northwest National Laboratory; Amanda White, Pacific Northwest National Laboratory
Tessera is a set of R-based tools to enable data scientists to explore and analyze large, complex data. The Tessera computational environment is powered by divide and recombine (D&R), an approach for dividing data into subsets and computing on them in parallel. At the front end, the analyst programs in R. At the back end is a distributed parallel computation environment such as Hadoop. In between are three Tessera packages: DataDR, Trelliscope, and RHIPE. The DataDR R package provides a high-level interface to D&R operations, making specification of divisions, analytic methods, and recombinations easy. The interface is designed to be back end agnostic, so it can harness new distributed computing technologies as needed. Trelliscope is a scalable visualization tool in which data sets are divided into subsets and a visualization method is applied to each subset and shown in a multi-panel trellis display. This framework has proven to be a powerful mechanism for all data, large and small. RHIPE is the R and Hadoop Integrated Programming Environment. RHIPE allows an analyst to run Hadoop MapReduce jobs from within R. RHIPE is used by DataDR when the back end is Hadoop.
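
As a rough illustration of the divide and recombine idea described above, the following Python sketch divides records by a key, applies an analytic method to each subset in parallel, and recombines the per-subset results by averaging. It is not Tessera, DataDR, or RHIPE code: the function names, the sample records, and the use of a local thread pool in place of a Hadoop back end are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def divide(records, key):
    """Divide: group records into subsets by the value of `key`."""
    subsets = defaultdict(list)
    for rec in records:
        subsets[rec[key]].append(rec)
    return dict(subsets)


def subset_mean(subset):
    """Analytic method applied independently to one subset (hypothetical choice)."""
    return mean(rec["value"] for rec in subset)


def divide_and_recombine(records, key):
    subsets = divide(records, key)
    # Apply the method to every subset in parallel; a thread pool stands in
    # for a distributed back end here.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(subset_mean, subsets.values()))
    per_subset = dict(zip(subsets.keys(), results))
    # Recombine: one simple option is to average the per-subset results.
    combined = mean(list(per_subset.values()))
    return per_subset, combined


# Made-up example data, for illustration only.
records = [
    {"site": "A", "value": 1.0},
    {"site": "A", "value": 3.0},
    {"site": "B", "value": 10.0},
    {"site": "B", "value": 20.0},
]
per_subset, combined = divide_and_recombine(records, "site")
print(per_subset, combined)
```

Because each subset is processed independently, the same structure could in principle be handed to a different parallel back end without changing the divide or recombine steps.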

PCD3 Mathematica and Statistical Computing
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon D1
Instructor(s): Michael Kelly, Wolfram Research Inc.
Mathematica is the world’s leading symbolic and numerical software, pioneering the use of symbolic functional programming for the representation of mathematical, statistical, and computational objects in a universal, consistent, and high-level language that has allowed for a systematic treatment of the entire area of statistical analysis. Unlike other statistical programs that are mainly numerical, Mathematica combines the many advantages of symbolic representation of mathematical statistics with the numerical capabilities of advanced and novel algorithms. See

PCD4 Rating College Football Teams: A Case Study on Integrating Minitab with Statistical Programming Languages
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon D2
Instructor(s): Daniel Griffith, Minitab; Eduardo Santiago, Minitab, Inc.
College football is a sport with highly variable outcomes and teams that play highly unbalanced schedules due to conference affiliation, a large pool of potential opponents, and incentives that disfavor competitive balance. Despite these difficulties, it is highly desirable for fans, media, and the playoff selection committee to rate teams as accurately as possible. Using an unconventional method, the case study demonstrates how teams can be rated with minimal effect from uncontrollable aspects of the game. The method is performed using a combination of Minitab Statistical Software for its ease of use and graphical capabilities integrated with a statistical programming language for complex routines.

T1 A Case Study in Big Data Analytics
Sat, Feb 21, 2:00 PM - 4:00 PM
Instructor(s): Patrick Hall, SAS Enterprise Miner

So what exactly do you do when faced with a huge data set from which you are to derive insights? This happens in banks, insurance companies, government agencies, manufacturing centers, and other institutions all the time. This tutorial illustrates best practices for mining large data sets in the context of a case study. Participants will learn real-world techniques to explore and preprocess data; to select, extract, and engineer the most predictive features; to build the best predictive model for the job at hand; and to leverage predictive analytics to make decisions for their organization. Instructors also will point out common pitfalls and trade-offs inherent to contemporary Big Data approaches. SAS Enterprise Miner will be used for the analyses, but the focus will be on the methods and not the software. Participants will have access to the example data for further study.

Outline & Objectives

The objectives of this tutorial are for the participants to become familiar with general terminology, best practices, and practical, contemporary approaches for working with large data sets including the following:

• Data exploration

• Data preparation

• Data reduction techniques such as sampling, feature selection, feature extraction, and feature engineering

• Statistical and machine learning approaches for predictive analytics

• Scoring large data sets with predictive models

The tutorial will address the subjects outlined below:

1. Understanding the primary goal of a project in terms of inference or prediction: GLM and decision trees vs. machine learning algorithms.

2. Buzzwords: What are "Analytics", "Big Data", and "Machine Learning"?

3. Big Data Best Practices

4. Data Preparation and Exploration

5. Data Reduction

6. Prediction and Scoring

About the Instructor

Patrick designs new data mining and machine learning technologies for SAS. He is a Cloudera certified data scientist and a certified SAS Enterprise Miner predictive modeler. Patrick has two patent applications for his recent work in unsupervised learning. He studied computational chemistry before receiving an MS degree in analytics from the Institute for Advanced Analytics at North Carolina State University.

Relevance to Conference Goals

This tutorial will teach practical techniques that are broadly applicable for analysts, data scientists, machine learning engineers, and statisticians who mine large data sets in academia or industry.

T2 An Introductory Tutorial on Mixed Models
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon C1
Instructor(s): Funda Gunes, SAS Institute Inc.

Mixed model analysis is one of the cornerstones of modern statistics. It extends the general linear model for independent and equivariant data by allowing a more flexible covariance for the error term. Using mixed models, you can fit models to a variety of data that follow the normal distribution, including repeated measurements and data from a randomized block design. This tutorial introduces the basics of mixed model methodology and illustrates the analysis of linear mixed models in typical applications, with numerous examples using the MIXED procedure in SAS/STAT software. This tutorial also includes an overview of other mixed modeling procedures in SAS, giving a brief introduction to analyzing generalized linear mixed models by using the GLIMMIX procedure and discussing the scenarios in which you would use nonlinear mixed models and the NLMIXED procedure. Prerequisites are a working knowledge of the general linear model and basic matrix algebra.
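
For orientation, the linear mixed model referred to above is commonly written in the following standard textbook form. The notation here is an assumption for illustration and is not taken from the tutorial materials: y is the response vector, X and Z are the fixed-effects and random-effects design matrices, and G and R are the covariance matrices of the random effects and the errors.

```latex
\[
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\varepsilon},
  \qquad
  \boldsymbol{\gamma} \sim N(\mathbf{0}, \mathbf{G}),
  \quad
  \boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{R}),
  \quad
  \operatorname{Cov}(\boldsymbol{\gamma}, \boldsymbol{\varepsilon}) = \mathbf{0}
\]
\[
  \operatorname{Var}(\mathbf{y}) = \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top} + \mathbf{R}
\]
```

Setting Z = 0 and R = σ²I recovers the general linear model with independent, equal-variance errors; the added flexibility comes from allowing other choices of G and R.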

Outline & Objectives

• Designed experiments
• Fixed effects versus random effects
Linear mixed models – MIXED procedure
• Randomized block design
• Nested mixed models
• Repeated measures
• High performance mixed models analysis
Generalized Linear Mixed Models – GLIMMIX procedure
Nonlinear Mixed Models – NLMIXED procedure

About the Instructor

Funda Gunes is a research statistician in the statistical applications department at SAS. She completed her PhD in statistics from North Carolina State University with a focus on model selection methods. As a research statistician in statistical R&D at SAS, she often gives expository talks and two-hour tutorials on a variety of topics, including mixed models, model selection, and Bayesian statistics. These presentations emphasize basic concepts and introduce applied statisticians to new methodology with relevant examples.

Relevance to Conference Goals

I attended CSP twice in the last few years. I especially enjoyed meeting statistical practitioners and learning how they use statistics in their work. Based on that experience, I think the CSP audience is a perfect fit for the content and level of this proposed tutorial, and I would love to participate.

T3 Speak & Connect: Harnessing PowerPoint
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon D3
Instructor(s): Andrew Causley, Speak & Connect

Data-heavy presentations can overload an audience with information quickly, causing them to tune out. Learn how to create and deliver PowerPoint presentations that are interesting, effective, and memorable! It’s a fresh approach, one that combines information with effective visuals and personal engagement to connect with an audience in a credible and captivating manner. If you can answer YES to any of the following questions, you should attend this tutorial. Have some of your slides been loaded with text, bullet points, or complex data? Have there been times when you’ve read off your slides? Has your audience ever looked bored, inattentive, or asleep? Learn how to share data and information and create better decks.

Outline & Objectives

Create and deliver PowerPoint presentations that effectively educate your audience by keeping them engaged and motivated.

• The Do’s and Don’ts of slide presentation software (PowerPoint)
• Verbal vs Visual communication
• Slide Types
• Slide Test
• Interacting with the audience and the technology

• Pin-point the message behind the information
• Transform data heavy slides into effective visuals
• When to use (and not use) PowerPoint
• How to create slides that are clear, crisp, and high-impact
• A sure-fire method to assess the effectiveness of your slides
• Ideas to Open with impact
• How to properly utilize support materials
• Structuring modular presentations
• Effective ways to Close

About the Instructor

A veteran broadcast professional, Debra Stamp is President of Debra Stamp Productions and Principal Trainer for Speak & Connect, a growing business communication enterprise featuring communication skills coaching, instruction on creating truly effective visuals, and guidance in developing and delivering strong, memorable messages.

Debra is a professional voice talent and continues to work in radio, television, and business using her communication skills to inform, train, and entertain listeners and viewers around the world.

As a coach and trainer, she shares her techniques and years of experience with enthusiasm and energy. Her warm personal teaching style creates a learning atmosphere that’s friendly, interactive, and fun.

Andrew Causley is the owner of Ballistic Fish Studios and Lead Trainer for Speak & Connect, a growing business communication enterprise featuring communication skills coaching, instruction on creating truly effective visuals, and guidance in developing and delivering strong, memorable messages.

Creating and capturing compelling visuals is what he does best, and he has the technical know-how to get the most out of presentation software.

Relevance to Conference Goals

The key to successfully using PowerPoint lies in recognizing that it is a Visual Aid, there to support the presenter.

Simplified slides, focused messages, and engaged audiences are just a few of the benefits experienced when coming out from behind the data.

Once participants experience this tutorial, they will never look at PowerPoint in the same way again!

T4 Tutorial on Parallel Programming in R
Sat, Feb 21, 2:00 PM - 4:00 PM
Napoleon C2
Instructor(s): Miranda Fix, Colorado State University; Josh Hewitt, Colorado State University; Henry Randall Scharf, Colorado State University

This tutorial introduces participants to high-performance computing in R for analyzing research data and developing practical analytics. R is a free, open source programming language that concisely supports a wide range of statistical computing and machine learning needs. Modern data sets are large and computational procedures can be intense, which may become prohibitive to practical data analysis projects. This tutorial introduces participants to workflows and packages that let practitioners use R to take advantage of the power of modern computing resources like multicore architectures and cloud technologies. Applications and examples include demonstrating parallel forms of popular classic and machine learning methods, using bootstrapping and cross validation to estimate uncertainty and accuracy, simulating data to analyze “what if” scenarios, and discussing related topics. Demonstrations will be presented with R. Attendees are encouraged to bring laptops with R installed so they may follow along and experiment with these tools.

Outline & Objectives

This tutorial's goal is to introduce participants to parallel computing ideas and give them tools they can use to scale their analyses up so they can handle large datasets and problems in their organization. The tutorial will advance through several sections that:

- Introduce parallel computing and identify common “parallelizable” tasks.
- Explain and demonstrate statistical procedures that excel with parallelization.
- Use several parallel programming R packages.
- Show integrations between R and the Hadoop/MapReduce cloud computing framework.
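
To give a sense of what a "parallelizable" task looks like, the sketch below splits bootstrap replicates across workers and combines them into a percentile interval for a mean. It is written in Python rather than R and is not drawn from the tutorial handouts: the function names, the sample data, the seeding scheme, and the choice of a thread pool are assumptions for illustration. For CPU-bound work in Python a process pool would usually be the better fit, and threads are used here only to keep the example portable.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def bootstrap_chunk(args):
    """Compute one worker's share of bootstrap means with its own seeded RNG."""
    data, n_reps, seed = args
    rng = random.Random(seed)
    # Each replicate resamples the data with replacement and records the mean.
    return [mean(rng.choices(data, k=len(data))) for _ in range(n_reps)]


def parallel_bootstrap(data, n_reps=1000, n_workers=4, seed=0):
    """Return an approximate 95% percentile interval for the mean."""
    reps_per_worker = n_reps // n_workers
    # Replicates are independent, so they can be split across workers freely.
    tasks = [(data, reps_per_worker, seed + i) for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        chunks = list(pool.map(bootstrap_chunk, tasks))
    means = sorted(m for chunk in chunks for m in chunk)
    lower = means[int(0.025 * len(means))]
    upper = means[int(0.975 * len(means)) - 1]
    return lower, upper


# Made-up sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]
lower, upper = parallel_bootstrap(data)
print(f"approximate 95% interval for the mean: ({lower:.2f}, {upper:.2f})")
```

Giving each worker its own seed keeps the result reproducible regardless of the order in which workers finish.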

Pre-requisites: Familiarity with data analysis methods and general-purpose R programming.

About the Instructor

Henry Scharf is a PhD student at Colorado State University with a background in both teaching and computational statistics. He received his Masters in Education from the University of Arizona, and has worked as an instructor at CSU. He currently works in conjunction with the National Renewable Energy Laboratory on questions surrounding prioritized compression of massive datasets sensitive to specific secondary analysis.

Josh Hewitt is an MS/PhD student at Colorado State University with strong interests in teaching and statistical theory and computing. He holds a Masters degree in Applied Mathematics and Statistics from The Johns Hopkins University and has worked on big data and analytic development projects with Booz Allen Hamilton for the United States Government.

Miranda Fix is an MS/PhD student at Colorado State University with experience in teaching and statistical consulting. She earned her Masters degree in Quantitative Ecology from the University of Washington, where she served as a teaching assistant and led R tutorials for several courses. She is currently working with the National Center for Atmospheric Research on analyzing large climate datasets.

Relevance to Conference Goals

This tutorial relates to Theme 3: Big Data Prediction and Analytics, and Theme 4: Software, Programming, and Graphics. The tutorial introduces participants to key ideas and examples in parallel statistical computing that enable practical Big Data and Analytic projects. The tutorial simultaneously exposes attendees to R packages they can use to analyze data and develop analytics in their own organizations.

Refreshment Break, sponsored by Texas A&M Statistical Services
Sat, Feb 21, 4:00 PM - 4:15 PM
Napoleon AB

GS2 Closing General Session
Sat, Feb 21, 4:15 PM - 5:30 PM
Napoleon C3
CSP Steering Committee chair, Sylvia Miller Dohrmann, and vice chair, Jim Rutherford, will lead a panel of CSP committee members as they summarize the conference and gather your feedback. Each panelist will speak for five minutes to share their conference experience. Discussion will then be extended to the audience for Q&A and feedback on how well the overall objectives of the conference were met, including areas of improvement for the future. The closing session is also a great time to let members of the CSP Steering Committee know if you are interested in helping out with future conferences.