Online Program

Friday, February 20
CS07 Text Analytics and Dimension Reduction Methods
Fri, Feb 20, 11:00 AM - 12:30 PM
Napoleon C

Practical Text Analytics (302901)

*Heath Rushing, Adsurgo LLC 

Keywords: text mining, text analytics, data mining, cluster analysis, natural language processing, string processing

An estimated 80% of the data in most organizations are unstructured, such as text. This session will provide an overview of new, easily implemented methods for finding previously unknown relationships in a collection of unstructured data. Techniques used for predictive analytics and data mining will also be explored, using text from sources such as email, survey comments, incident reports, free-form data fields, websites, research reports, blogs, social media, and other text fields to discover potentially useful and actionable business insights. Multiple demonstrations will use example data sets covering applications to fraud detection, accident investigations, open-ended survey questions, social media, and authorship. Participants will be guided through end-to-end examples: assembling disparate text sources relevant to a business problem, creating a structured data table, applying analytical and graphical methods such as tabulation, decision trees, and cluster analysis, and discovering useful and actionable relationships. Software packages such as R, SAS Text Miner, Statistica, and JMP will be used.
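
As a rough illustration of the step from raw text to a structured data table, followed by simple tabulation, here is a minimal Python sketch using only the standard library. It is not taken from the session, which names R, SAS Text Miner, Statistica, and JMP as its software. The sample comments, the stopword list, and the function names `tokenize` and `document_term_table` are all hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical free-text comments, invented for illustration only.
comments = [
    "Driver reported brake failure before the accident",
    "Brake failure reported again after service",
    "Customer praised the quick and friendly service",
]

# A tiny, assumed stopword list; real projects use much longer ones.
STOPWORDS = {"the", "and", "a", "an", "of", "before", "after", "again"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def document_term_table(docs):
    """Return (vocabulary, rows), where rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


vocabulary, rows = document_term_table(comments)

# Tabulation: total frequency of each term across all documents.
term_totals = dict(zip(vocabulary, (sum(col) for col in zip(*rows))))

# Print the structured table: one row per document, one column per term.
print(vocabulary)
for row in rows:
    print(row)
```

Each row of the resulting table is a numeric representation of one document, so the same table could in principle be passed to a decision tree or clustering routine in whichever package is being used.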