Thursday, February 20 |
SC4 Career Development Within Your Organization
|
Thu, Feb 20, 8:00 AM - 12:00 PM
Palma Ceia IV
|
Instructor(s): William Williams, Organizational Learning Consultant
|
|
|
There are two fundamental keys to successfully pursuing opportunities within your organization: self-knowledge and the ability to represent your capabilities to other people. This workshop will help you with both. Through an assessment, you’ll identify and describe specifically what you’re good at and where your strongest interests and skills lie. This will allow you to make sound decisions about where to focus your energies for enhancing your career. We will include information about how to network effectively within your organization to locate pockets of opportunity or identify potential guides and mentors.
|
Outline & Objectives
1. Identify your top interest and skill areas.
2. Determine the contributions those can make in your current organization, and identify areas within your organization that align with those skills and interests.
3. Learn to network effectively with people in those areas of your organization so that you can best position yourself to find new opportunities that support your organization's objectives.
About the Instructor
Bill Williams is an Organizational Learning Consultant and has been a part of the Conference on Statistical Practice since its inaugural year.
Relevance to Conference Goals
This session will help participants learn how to better navigate their careers and how to have a more positive impact on their organizations.
|
|
|
SC5 Modern Regression for Big Data Problems
|
Thu, Feb 20, 8:00 AM - 12:00 PM
Bayshore VI
|
Instructor(s): Simon J. Sheather, Texas A&M University
|
|
|
In the past, regression applications have focused on modeling relationships based upon a relatively small amount of data. Many of these arise from statistically designed experiments or field trials. However, regression modeling is being applied increasingly to problems involving massively large and complex data and retrospective data collected routinely by businesses and government organizations. Does this change the approach statisticians take to modeling using regression techniques?
This workshop explores this question and provides concrete, practical advice for applying modern regression to big data problems. The presenter is the author of A Modern Approach to Regression with R.
|
Outline & Objectives
1. Challenges and Issues for Applying Regression Modeling to Big Data Problems.
2. Practical Approaches and Advice on Using Regression Modeling in Modern Applications.
3. Modeling average airline ticket price across more than 5000 routes in the USA.
4. Modeling interest in NFL games using social media and other data.
5. Modeling loan defaults: a case study involving logistic regression.
About the Instructor
Professor Sheather brings a wide scope of experience in both management and the integration of analytics into organizations and businesses. He received his BSc (Hons) degree from Melbourne and a Ph.D. in Statistics from La Trobe. Currently Simon is Professor and Head of the Statistics Department at Texas A&M University. Previously he was a faculty member at the Australian Graduate School of Management at the University of New South Wales.
Simon has over 20 years of experience applying analytics and statistical methods in business. His clients included banks, biotechnology, hospitality service companies, fashion, transportation, real estate, consumer products and government. During this time he published over 75 papers and 2 books. He is listed on the ISIHighlyCited.com website among the top one-half of one percent of all mathematical scientists for citations of published work.
Relevance to Conference Goals
The conference attracts statisticians involved in the practice of statistics in companies and organizations internationally. This workshop discusses how a well-known tool in statistical practice can be used to address modeling issues involving big data. It also discusses how modern regression modeling can be misused when applied to big data problems.
|
|
|
SC6 Practical Bayesian Computation Using SAS
|
Thu, Feb 20, 8:00 AM - 12:00 PM
Bayshore VII
|
Instructor(s): Fang Chen, SAS Institute Inc.
|
|
|
This half-day course reviews the basic concepts of Bayesian inference and focuses on the practical use of Bayesian computational methods. The objectives are to familiarize statistical programmers and practitioners with the essentials of Bayesian computing and equip them with computational tools through a series of worked-out examples that demonstrate sound practices for a variety of statistical models and Bayesian concepts.
The first part of the course provides a gentle introduction to Bayesian inference and covers the fundamentals of prior distributions and concepts in estimation. The course also will cover MCMC methods and related simulation techniques, emphasizing the interpretation of convergence diagnostics in practice.
The second part of the course involves applications using Bayesian capabilities in SAS/STAT software in the GENMOD, LIFEREG, PHREG, and FMM procedures. Examples will include methods such as linear regression, generalized linear models, survival analysis, and finite mixture models.
The third part of the course takes a topic-driven approach to cover broad Bayesian topics such as random-effects models, sensitivity analysis, prediction, and model assessment.
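As a rough illustration of the kind of sampler the course description refers to, the following is a minimal random-walk Metropolis sketch in Python rather than SAS. It is not taken from the course materials: the model (a normal mean with known standard deviation and a normal prior), the function names, the data values, the step size, and the burn-in length are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(mu, data, prior_mean=0.0, prior_sd=10.0, sigma=1.0):
    """Unnormalized log posterior of a normal mean (sigma assumed known)."""
    log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
    log_lik = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_lik


def metropolis(data, n_iter=20000, step=0.5, seed=1):
    """Random-walk Metropolis sampler; returns every draw, burn-in included."""
    rng = random.Random(seed)
    mu = 0.0  # arbitrary starting value (assumption)
    current = log_posterior(mu, data)
    draws = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        draws.append(mu)
    return draws


if __name__ == "__main__":
    sample = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0]  # made-up data
    kept = metropolis(sample)[2000:]  # drop an assumed burn-in period
    print("posterior mean estimate:", sum(kept) / len(kept))
```

For this conjugate model the posterior mean is available in closed form, which makes it a convenient check on whether the chain has converged.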
|
Outline & Objectives
Part I - Introduction to Bayesian statistics (30 to 40 minutes)
A. Concepts in Bayesian Methods
a. Motivations and Difference between Classical and Bayesian Inference
b. Estimation (point and interval)
c. Prior Distributions
B. Computational Methods
a. Markov Chain Monte Carlo
b. Metropolis and Gibbs Samplers
C. Convergence Diagnostics
a. Terminologies
b. Diagnostic Tests
c. Visualization
d. Assessing Simulation Variability
Part II - Bayesian Computation using SAS (1 hour)
A. Introduction
B. Procedures with Bayesian Capabilities
a. GENMOD
b. PHREG
c. LIFEREG
d. FMM
C. Statistical Models and Topics (not necessarily in this order)
a. Linear Regression
b. Generalized Linear Model
c. Cox Regression and Piecewise Exponential Model
d. Frailty Model
e. Finite Mixture Model
Part III - Additional Bayesian Topics
A. Primer on PROC MCMC
B. Statistical Topics (not necessarily in this order)
a. Inference on Functions of Parameters
About the Instructor
Fang Chen is a Senior Manager of Bayesian Statistical Modeling and a
member of the Advanced Analytics Division at SAS Institute Inc. Among his
responsibilities are the development of Bayesian analysis software and the MCMC procedure. He has written about Bayesian modeling using the MCMC
procedure and taught a continuing education course on practical Bayesian
computation at JSM. Prior to joining SAS Institute, he received his
degree in statistics from Carnegie Mellon University in 2004.
Relevance to Conference Goals
Attendees will understand basic concepts and computational methods of
Bayesian statistics, and how to deal with practical issues that arise
from Bayesian analysis. Attendees will also be able to program using
SAS/STAT procedures with Bayesian capabilities to implement various
Bayesian models.
|
|
|
SC7 An Introduction to R for Data Analysts
|
Thu, Feb 20, 1:00 PM - 5:00 PM
Bayshore VI
|
Instructor(s): Robert Kabacoff, Management Research Group
|
|
|
R has become one of the most popular languages for data analysis and graphics. This course will provide a practical introduction to this comprehensive platform. Participants will learn to import data into R from a variety of sources; clean, recode, and restructure data; and apply R’s many functions for summarizing, modeling, and graphing data. Both basic and more advanced forms of data analysis will be covered. Additional topics include navigating R’s comprehensive help systems, practical advice for processing data, common programming mistakes to avoid, and useful functions for data mining.
|
Outline & Objectives
I. Introduction – An introduction to R: R syntax and data structures; working interactively and in batch; alternative IDEs and GUIs; adding functionality through packages; common programming mistakes; getting unstuck – where to find answers to your questions.
II. Data Management – Importing, cleaning, and reformatting data: transforming and recoding variables; subsetting, merging, and aggregating data; control structures; user-written functions.
III. Graphics – Taking advantage of R’s powerful graphics: creating basic and advanced graphs; customizing and combining graphs; innovative methods for visualizing complex data.
IV. Statistical Analysis and Data Mining – Using R for description, prediction, and classification: descriptive statistics and multi-way tables; ANOVA variants; regression (e.g., linear, logistic, Poisson), classification trees, cluster analysis, and other multivariate methods; dealing effectively with missing data; going further.
About the Instructor
Dr. Robert Kabacoff has twenty-five years of experience teaching and consulting in the areas of statistics and computing for academia, healthcare, business, and government. Currently he is Vice President of Research for Management Research Group, a global human resource development firm, where he has provided research and statistical consultations to organizations around the world for the past 15 years. Prior to joining MRG he was a Professor of Psychology in the Center for Psychological Studies at Nova Southeastern University, where he taught graduate courses in research methodology, statistical computing and statistics.
Dr. Kabacoff is the author of the book R in Action: Statistics and graphics with R (www.manning.com/kabacoff) and maintains Quick-R (http://www.statmethods.net), a popular tutorial site on the R language. In the past two years he has taught workshops on R programming for such organizations as the Association for Computing Machinery, the Society for Industrial and Organizational Psychology, and the United States Department of Defense.
Relevance to Conference Goals
Using freely available open source software, participants will learn practical data analytic skills covering the full range of the research endeavor, from data acquisition and data munging, to building and testing models, visualizing results, and communicating findings to their constituencies. Additionally, suggested paths for continued learning (including online resources and help resources) are provided.
|
|
|
SC8 Peering into the Future: Introduction to Time Series Methods for Forecasting
|
Thu, Feb 20, 1:00 PM - 5:00 PM
Palma Ceia IV
|
Instructor(s): David A. Dickey, North Carolina State University
|
|
|
This workshop will provide a practical guide to time series analysis and forecasting, focusing on examples and applications in modern software. Students will learn how to recognize autocorrelation when they see it and how to incorporate autocorrelation into their modeling. Models in the ARIMA class and their identification, fitting, and diagnostic testing will be emphasized and extended to models with deterministic trend functions (inputs) and ARMA errors. Diagnosing stationarity, a critical feature for proper analysis, will be demonstrated. After the course, students should be able to identify, fit, and forecast with this class of time series models and be aware of the consequences of having autocorrelated data. They should be able to recognize nonstationary cases in which the differences in the data, rather than the levels, should be analyzed. Underlying ideas and interpretation of output, rather than code, will be emphasized. No previous experience with any particular software is needed. Examples will be computed in SAS, but most modern statistical packages such as SPSS, R, STATA, etc. can be used for time series analysis.
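To make the point about levels versus differences concrete, here is a small Python sketch (not from the course, which uses SAS). It simulates a random walk, a standard example of a nonstationary series, and compares the lag-1 sample autocorrelation of the levels with that of the first differences. The function names, series length, and seed are assumptions made for illustration, and this informal comparison is not a formal unit root test such as Dickey-Fuller.

```python
import random


def autocorr(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


def difference(series):
    """First differences: y[t] - y[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def random_walk(n, seed=0):
    """Simulate y[t] = y[t-1] + e[t] with standard normal shocks."""
    rng = random.Random(seed)
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


if __name__ == "__main__":
    y = random_walk(500)
    print("lag-1 autocorrelation, levels:     ", round(autocorr(y), 3))
    print("lag-1 autocorrelation, differences:", round(autocorr(difference(y)), 3))
```

The levels typically show autocorrelation close to 1 that dies out very slowly, while the differenced series behaves like uncorrelated noise, which is the pattern that suggests modeling the differences.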
|
Outline & Objectives
Outline of course topics:
(1) Identifying and fitting ARMA models,
(2) Diagnostics,
(3) Incorporating inputs: Regression with Time Series Errors,
(4) Intervention Analysis,
(5) Nonstationarity: Unit Roots and Stochastic Trends,
(Optional: Seasonal models, time permitting)
Benefits of the course include an understanding of new issues encountered when data are taken over time and how to deal with these issues. Not only are new techniques of analysis necessary, which the student will learn, but additional terminology arises in these cases.
Examples and practical interpretation along with the strengths and weaknesses of competing forecasting methodologies will be emphasized.
I hope to give examples of interesting data analyses that can be used as templates for analyzing the participants' data when they return home.
About the Instructor
David A. Dickey received his PhD in statistics in 1976 from Iowa State University working with Dr. Wayne A. Fuller. Their “Dickey-Fuller” test is a part of most modern time series software packages. He is on the ISI’s list of highly cited researchers and is an ASA Fellow. Dickey is William Neal Reynolds Professor of Statistics at North Carolina State University where he does time series research, teaches graduate level methods courses, does consulting, and mentors graduate students. He is coauthor of several books on statistics, including “The SAS System for Forecasting Time Series,” a publication of SAS Institute. He has presented at many conferences including the 2013 ASA Conference on Statistical Practice and several JSM sessions. He has been a contract instructor for SAS Institute since 1981 teaching courses in statistical methodology, including time series, and has helped write some of their course notes. Recently Dickey has been teaching for NC State University's Institute for Advanced Analytics which offers an intensive applied Master’s degree in a 9-month cohort program. He has appointments in Economics and the NCSU Financial Math program.
Relevance to Conference Goals
The student will be better able to communicate intelligently with clients having data taken over time by learning the terms and the concepts behind them. The benefits of being able to better forecast what is going to happen next should be of obvious value to any company collecting data over time. The successful student should be able to carry out an analysis of time dependent data from model identification, through fitting and diagnostic checking, all the way to producing forecasts.
|
|
|
SC9 Text Analytics
|
Thu, Feb 20, 1:00 PM - 5:00 PM
Bayshore VII
|
Instructor(s): Edward R. Jones, Texas A&M Statistical Services
|
|
|
Text Analytics is a new interdisciplinary area that blends methodology from statistics, computer science, and natural language processing. Understanding the terminology and general approach to the statistical analysis of large collections of text data is increasingly critical to connecting statisticians to important Big Data problems.
Computer scientists have developed sophisticated algorithms for extracting and compiling complex summaries of text data. Statisticians have adapted statistical methods for text analytics to solve sophisticated business and government problems. The field is evolving rapidly as the available data and applications change. In the beginning, text analytics involved the analysis of simple word counts. Now, with available software for natural language processing, text analytics is challenged with the analysis of contextual information.
This half-day workshop explores the terminology, common methodology, and software for analysis of large, complex text data.
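The "simple word counts" mentioned above as the starting point of text analytics can be sketched in a few lines of Python. This is an illustrative assumption, not workshop material: the tokenizing rule (lowercase runs of letters and apostrophes), the function name, and the sample sentences are all made up for the example.

```python
import re
from collections import Counter


def word_counts(documents):
    """Count word occurrences across a collection of text documents."""
    counts = Counter()
    for doc in documents:
        # Crude tokenizer (assumption): lowercase letters and apostrophes only.
        counts.update(re.findall(r"[a-z']+", doc.lower()))
    return counts


if __name__ == "__main__":
    reviews = ["The service was great", "Great food, slow service"]  # made-up text
    for word, count in word_counts(reviews).most_common(3):
        print(word, count)
```

Counts like these ignore word order and context, which is the limitation that natural language processing methods aim to address.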
|
Outline & Objectives
1. Text Analytics - History and Terminology
2. Concept and Content Extraction
3. Summarization and Categorization
4. Content Management & Sentiment Analysis
5. Useful Approaches to Applying Text Analytics
About the Instructor
Dr. Jones has a Ph.D. degree in Statistics from Virginia Tech and a B.S. in Computer Science from Texas A&M University - Commerce. Currently he teaches data mining and analytics at Texas A&M University. He also mentors graduate students in data mining and analytics team competitions. He is also co-founder and Vice President of Texas A&M Statistical Services.
He has over 10 years of experience in the development of statistical and data mining software for companies in Silicon Valley and Rogue Wave Software. He supervised the design, development and testing of the IMSL (International Mathematics and Statistics Library) data mining software.
Relevance to Conference Goals
The Conference on Statistical Practice attracts hundreds of statistical practitioners and researchers. With increasing applications involving text mining and text analytics, this workshop will provide the background applied statisticians can use to expand their field of practice to include problems involving text analytics and text mining.
|
|
|