Online Program

Thursday, February 20
SC4 Career Development Within Your Organization
Thu, Feb 20, 8:00 AM - 12:00 PM
Palma Ceia IV
Instructor(s): William Williams, Organizational Learning Consultant

There are two fundamental keys to successfully pursuing opportunities within your organization: self-knowledge and the ability to represent your capabilities to other people. This workshop will help you with both. Through an assessment, you’ll identify and describe specifically what you’re good at and where your strongest interests and skills lie. This will allow you to make sound decisions about where to focus your energies for enhancing your career. We will include information about how to network effectively within your organization to locate pockets of opportunity or identify potential guides and mentors.

Outline & Objectives

1. Identify your top interest and skill areas.
2. Determine the contributions those can make in your current organization, and the areas within your organization that align with those skills and interests.
3. Learn to network effectively with people in those areas of your organization so that you can best position yourself to find new opportunities that support your organization's objectives.

About the Instructor

Bill Williams is an Organizational Learning Consultant and has been a part of the Conference on Statistical Practice since its inaugural year.

Relevance to Conference Goals

This session will help participants learn both how to better navigate their careers and how to have a more positive impact on their organizations.

 
SC5 Modern Regression for Big Data Problems
Thu, Feb 20, 8:00 AM - 12:00 PM
Bayshore VI
Instructor(s): Simon J. Sheather, Texas A&M University

In the past, regression applications have focused on modeling relationships based upon a relatively small amount of data. Many of these arise from statistically designed experiments or field trials. However, regression modeling is being applied increasingly to problems involving massively large and complex data and retrospective data collected routinely by businesses and government organizations. Does this change the approach statisticians take to modeling using regression techniques?

This workshop explores this question and provides concrete, practical advice for applying modern regression to solving big data problems. The presenter is author of A Modern Approach to Regression with R.

Outline & Objectives

1. Challenges and Issues for Applying Regression Modeling to Big Data Problems.

2. Practical Approaches and Advice on Using Regression Modeling in Modern Applications.

3. Modeling average airline ticket price across more than 5000 routes in the USA.

4. Modeling interest in NFL games using social media and other data.

5. Modeling loan defaults: a case study involving logistic regression.

About the Instructor

Professor Sheather brings a wide scope of experience in both management and the integration of analytics into organizations and businesses. He received his BSc (Hons) degree from Melbourne and a Ph.D. in Statistics from La Trobe. Currently Simon is Professor and Head of the Statistics Department at Texas A&M University. Previously he was a faculty member at the Australian Graduate School of Management at the University of New South Wales.

Simon has over 20 years of experience applying analytics and statistical methods in business. His clients included banks, biotechnology, hospitality service companies, fashion, transportation, real estate, consumer products and government. During this time he published over 75 papers and 2 books. He is listed on the ISIHighlyCited.com website among the top one-half of one percent of all mathematical scientists for citations of published work.

Relevance to Conference Goals

The conference attracts statisticians involved in the practice of statistics in companies and organizations internationally. This workshop discusses how a well-known tool in statistical practice can be used to address modeling issues involving big data. It also discusses how modern regression modeling can be misused when applied to big data problems.

 
SC6 Practical Bayesian Computation Using SAS
Thu, Feb 20, 8:00 AM - 12:00 PM
Bayshore VII
Instructor(s): Fang Chen, SAS Institute Inc.

This half-day course reviews the basic concepts of Bayesian inference and focuses on the practical use of Bayesian computational methods. The objectives are to familiarize statistical programmers and practitioners with the essentials of Bayesian computing and equip them with computational tools through a series of worked-out examples that demonstrate sound practices for a variety of statistical models and Bayesian concepts.

The first part of the course provides a gentle introduction to Bayesian inference and covers the fundamentals of prior distributions and concepts in estimation. The course also will cover MCMC methods and related simulation techniques, emphasizing the interpretation of convergence diagnostics in practice.
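
As a rough illustration of the kind of MCMC method mentioned above, the following is a minimal random-walk Metropolis sketch. It is not taken from the course materials and is not the algorithm used by any SAS procedure; Python is used only for brevity. The data values, the normal prior with standard deviation 10, the known sampling standard deviation of 1, and all function names are assumptions made for this illustration.

```python
import math
import random

def log_posterior(mu, data, prior_sd=10.0, sigma=1.0):
    """Unnormalized log posterior for a normal mean with a normal prior."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - mu) / sigma) ** 2 for y in data)
    return log_prior + log_lik

def metropolis(data, n_draws=20000, step_sd=0.5, start=0.0, seed=1):
    """Random-walk Metropolis sampler for the mean parameter mu."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data)
    draws = []
    for _ in range(n_draws):
        # Propose a move from a symmetric normal distribution.
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws

# Hypothetical data, used only to exercise the sampler.
data = [1.2, 0.8, 1.5, 0.9, 1.1]
draws = metropolis(data)
kept = draws[2000:]  # discard an initial burn-in period
posterior_mean = sum(kept) / len(kept)
print("estimated posterior mean:", round(posterior_mean, 3))
```

Because this model is conjugate, the estimate can be checked against the closed-form posterior mean, sum(y) / (n + 1/prior_sd^2). In practice the retained draws would also be examined with convergence diagnostics before being used for inference.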

The second part of the course involves applications using Bayesian capabilities in SAS/STAT software in the GENMOD, LIFEREG, PHREG, and FMM procedures. Examples will include methods such as linear regression, generalized linear models, survival analysis, and finite mixture models.

The third part of the course takes a topic-driven approach to cover broad Bayesian topics such as random-effects models, sensitivity analysis, prediction, and model assessment.

Outline & Objectives

Part I - Introduction to Bayesian statistics (30 to 40 minutes)
A. Concepts in Bayesian Methods
a. Motivations and Difference between Classical and Bayesian Inference
b. Estimation (point and interval)
c. Prior Distributions
B. Computational Methods
a. Markov Chain Monte Carlo
b. Metropolis and Gibbs Samplers
C. Convergence Diagnostics
a. Terminologies
b. Diagnostic Tests
c. Visualization
d. Assessing Simulation Variability

Part II - Bayesian Computation using SAS (1 hour)
A. Introduction
B. Procedures with Bayesian Capabilities
a. GENMOD
b. PHREG
c. LIFEREG
d. FMM
C. Statistical Models and Topics (not necessarily in this order)
a. Linear Regression
b. Generalized Linear Model
c. Cox Regression and Piecewise Exponential Model
d. Frailty Model
e. Finite Mixture Model

Part III - Additional Bayesian Topics
A. Primer on PROC MCMC
B. Statistical Topics (not necessarily in this order)
a. Inference on Functions of Parameters

About the Instructor

Fang Chen is a Senior Manager of Bayesian Statistical Modeling and a member of the Advanced Analytics Division at SAS Institute Inc. Among his responsibilities are the development of Bayesian analysis software and the MCMC procedure. He has written about Bayesian modeling using the MCMC procedure and taught a continuing education course on practical Bayesian computation at JSM. Prior to joining SAS Institute, he received his degree in statistics from Carnegie Mellon University in 2004.

Relevance to Conference Goals

Attendees will understand basic concepts and computational methods of Bayesian statistics, and how to deal with practical issues that arise from Bayesian analysis. Attendees will also be able to program using SAS/STAT procedures with Bayesian capabilities to implement various Bayesian models.

 
SC7 An Introduction to R for Data Analysts
Thu, Feb 20, 1:00 PM - 5:00 PM
Bayshore VI
Instructor(s): Robert Kabacoff, Management Research Group

R has become one of the most popular languages for data analysis and graphics. This course will provide a practical introduction to this comprehensive platform. Participants will learn to import data into R from a variety of sources; clean, recode, and restructure data; and apply R’s many functions for summarizing, modeling, and graphing data. Both basic and more advanced forms of data analysis will be covered. Additional topics include navigating R’s comprehensive help systems, practical advice for processing data, common programming mistakes to avoid, and useful functions for data mining.

Outline & Objectives

I. Introduction – An introduction to R: R syntax and data structures; working interactively and in batch; alternative IDEs and GUIs; adding functionality through packages; common programming mistakes; getting unstuck – where to find answers to your questions.
II. Data Management – Importing, cleaning, and reformatting data: transforming and recoding variables; subsetting, merging, and aggregating data; control structures; user-written functions.
III. Graphics – Taking advantage of R’s powerful graphics: creating basic and advanced graphs; customizing and combining graphs; innovative methods for visualizing complex data.
IV. Statistical Analysis and Data Mining – Using R for description, prediction, and classification: descriptive statistics and multi-way tables; ANOVA variants; regression (e.g., linear, logistic, Poisson), classification trees, cluster analysis, and other multivariate methods; dealing effectively with missing data; going further.

About the Instructor

Dr. Robert Kabacoff has twenty-five years of experience teaching and consulting in the areas of statistics and computing for academia, healthcare, business, and government. Currently he is Vice President of Research for Management Research Group, a global human resource development firm, where he has provided research and statistical consultations to organizations around the world for the past 15 years. Prior to joining MRG he was a Professor of Psychology in the Center for Psychological Studies at Nova Southeastern University, where he taught graduate courses in research methodology, statistical computing and statistics.

Dr. Kabacoff is the author of the book R in Action: Statistics and graphics with R (www.manning.com/kabacoff) and maintains Quick-R (http://www.statmethods.net), a popular tutorial site on the R language. In the past two years he has taught workshops on R programming for such organizations as the Association for Computing Machinery, the Society for Industrial and Organizational Psychology, and the United States Department of Defense.

Relevance to Conference Goals

Using freely available open source software, participants will learn practical data analytic skills covering the full range of the research endeavor, from data acquisition and data munging, to building and testing models, visualizing results, and communicating findings to their constituencies. Additionally, suggested paths for continued learning (including online resources and help resources) are provided.

 
SC8 Peering into the Future: Introduction to Time Series Methods for Forecasting
Thu, Feb 20, 1:00 PM - 5:00 PM
Palma Ceia IV
Instructor(s): David A. Dickey, North Carolina State University

This workshop will provide a practical guide to time series analysis and forecasting, focusing on examples and applications in modern software. Students will learn how to recognize autocorrelation when they see it and how to incorporate autocorrelation into their modeling. Models in the ARIMA class and their identification, fitting, and diagnostic testing will be emphasized and extended to models with deterministic trend functions (inputs) and ARMA errors. Diagnosing stationarity, a critical feature for proper analysis, will be demonstrated. After the course, students should be able to identify, fit, and forecast with this class of time series models and be aware of the consequences of having autocorrelated data. They should be able to recognize nonstationary cases in which the differences in the data, rather than the levels, should be analyzed. Underlying ideas and interpretation of output, rather than code, will be emphasized. No previous experience with any particular software is needed. Examples will be computed in SAS, but most modern statistical packages such as SPSS, R, STATA, etc. can be used for time series analysis.
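
As a rough illustration of the levels-versus-differences point above, the following minimal sketch simulates a random walk (a simple nonstationary series) and compares the lag-1 sample autocorrelation of the levels with that of the first differences. It is not taken from the course materials, which use SAS; Python is used only for brevity, and the simulated series and all function names are assumptions made for this illustration.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return num / denom

def first_difference(x):
    """Return the series of changes x[t] - x[t-1]."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

def simulate_random_walk(n, seed=0):
    """Simulate a random walk: each level is the previous level plus noise."""
    rng = random.Random(seed)
    level = 0.0
    walk = []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    return walk

walk = simulate_random_walk(500)
diffs = first_difference(walk)
print("lag-1 autocorrelation, levels:", round(lag1_autocorr(walk), 3))
print("lag-1 autocorrelation, differences:", round(lag1_autocorr(diffs), 3))
```

For a series like this, the levels show autocorrelation close to 1 while the differences behave like uncorrelated noise, which is the pattern that suggests analyzing differences. A formal unit root test would normally be used to back up the visual or numerical impression.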

Outline & Objectives

Outline of course topics:

(1) Identifying and fitting ARMA models,

(2) Diagnostics,

(3) Incorporating inputs: Regression with Time Series Errors,

(4) Intervention Analysis,

(5) Nonstationarity: Unit Roots and Stochastic Trends,

(Optional: Seasonal models, time permitting)

Benefits of the course include an understanding of new issues encountered when data are taken over time and how to deal with these issues. Not only are new techniques of analysis necessary, which the student will learn, but additional terminology arises in these cases.
Examples and practical interpretation along with the strengths and weaknesses of competing forecasting methodologies will be emphasized.
I hope to give examples of interesting data analyses that can be used as templates for analyzing the participants' data when they return home.

About the Instructor

David A. Dickey received his PhD in statistics in 1976 from Iowa State University working with Dr. Wayne A. Fuller. Their “Dickey-Fuller” test is a part of most modern time series software packages. He is on the ISI’s list of highly cited researchers and is an ASA Fellow. Dickey is William Neal Reynolds Professor of Statistics at North Carolina State University where he does time series research, teaches graduate level methods courses, does consulting, and mentors graduate students. He is coauthor of several books on statistics, including “The SAS System for Forecasting Time Series,” a publication of SAS Institute. He has presented at many conferences including the 2013 ASA Conference on Statistical Practice and several JSM sessions. He has been a contract instructor for SAS Institute since 1981 teaching courses in statistical methodology, including time series, and has helped write some of their course notes. Recently Dickey has been teaching for NC State University's Institute for Advanced Analytics which offers an intensive applied Master’s degree in a 9-month cohort program. He has appointments in Economics and the NCSU Financial Math program.

Relevance to Conference Goals

The student will be better able to communicate intelligently with clients having data taken over time by learning the terms and the concepts behind them. The benefits of being able to better forecast what is going to happen next should be of obvious value to any company collecting data over time. The successful student should be able to carry out an analysis of time dependent data from model identification, through fitting and diagnostic checking, all the way to producing forecasts.

 
SC9 Text Analytics
Thu, Feb 20, 1:00 PM - 5:00 PM
Bayshore VII
Instructor(s): Edward R. Jones, Texas A&M Statistical Services

Text Analytics is a new interdisciplinary area that blends methodology from statistics, computer science, and natural language processing. Understanding the terminology and general approach to the statistical analysis of large collections of text data is increasingly critical to connecting statisticians to important Big Data problems.

Computer scientists have developed sophisticated algorithms for extracting and compiling complex summaries of text data. Statisticians have adapted statistical methods for text analytics to solve sophisticated business and government problems. The field is evolving rapidly as the available data and applications change. In the beginning, text analytics involved the analysis of simple word counts. Now, with available software for natural language processing, text analytics is challenged with the analysis of contextual information.
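
As a rough illustration of the simple word-count analysis mentioned above, the following minimal sketch tokenizes short documents and tallies term frequencies. It is not taken from the workshop materials; the sample sentences, the tokenization rule, and the function name are assumptions made for this illustration.

```python
import re
from collections import Counter

def word_counts(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical documents, used only to show the counting step.
docs = [
    "The flight was late and the service was poor.",
    "Great service, great price!",
]

for doc in docs:
    print(word_counts(doc).most_common(3))
```

Counts like these ignore word order and context, which is the limitation that natural language processing methods aim to address.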

This half-day workshop explores the terminology, common methodology, and software for analysis of large, complex text data.

Outline & Objectives

1. Text Analytics - History and Terminology
2. Concept and Content Extraction
3. Summarization and Categorization
4. Content Management & Sentiment Analysis
5. Useful Approaches to Applying Text Analytics

About the Instructor

Dr. Jones has a Ph.D. degree in Statistics from Virginia Tech and a B.S. in Computer Science from Texas A&M University - Commerce. Currently he teaches data mining and analytics at Texas A&M University. He also mentors graduate students in data mining and analytics team competitions. He is also co-founder and Vice President of Texas A&M Statistical Services.

He has over 10 years of experience in the development of statistical and data mining software for companies in Silicon Valley and Rogue Wave Software. He supervised the design, development and testing of the IMSL (International Mathematics and Statistics Library) data mining software.

Relevance to Conference Goals

The Conference on Statistical Practice attracts hundreds of statistical practitioners and researchers. With increasing applications involving text mining and text analytics, this workshop will provide the background applied statisticians can use to expand their field of practice to include problems involving text analytics and text mining.