Abstract Details

Activity Number: 256 - Contributed Poster Presentations: Section on Statistical Learning and Data Science
Type: Contributed
Date/Time: Monday, July 29, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #306600
Title: Connecting Diverse Data with the Power of Natural Language Processing Methods
Author(s): Tracy Schifeling* and Murat Tasan
Companies: Bluprint and Bluprint
Keywords: data science; media processing; natural language processing; combining data; Bluprint

At Bluprint, an online education and e-commerce company for all things arts and crafts, we work with diverse sources of data. We have more than 1,400 online video classes, 8,000 blog articles, 20,000 products for sale, and 100,000 project images uploaded by our maker community, plus plenty of external data sources, such as the images from our members’ Pinterest and Instagram boards, where they curate and share their crafting passions. Our data science team’s challenge is to link these data to create both useful internal tools for our colleagues and delightful customer-facing features. In this talk -- aimed at an audience of applied statisticians and data scientists -- I’ll share how we’ve been able to leverage existing media processing tools (e.g., video transcription and image recognition services) to map many of these data sources to a common space of words (i.e., our crafting ‘dictionary’). From there, we use natural language processing methods as the bridge to describe, compare, and connect these once-disparate data types. These connections allow us to more easily model our customers’ crafting journeys and ultimately deliver products that enhance their crafting experience.
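
As a rough illustration of the idea of a common space of words, the sketch below compares items of different types once each has been reduced to a list of words. It is not the authors' pipeline: the abstract does not name a specific method, so TF-IDF weighting with cosine similarity is used here as one plausible stand-in. The item names and word lists are hypothetical, and the upstream step (transcription or image labeling) is assumed to have already run.

```python
import math
from collections import Counter

# Hypothetical items, each assumed to be already mapped to words by an
# upstream tool (a video transcript, image-recognition labels, product text).
items = {
    "class:intro_to_knitting": "knitting yarn needles stitch cast on scarf".split(),
    "image:member_project_1": "scarf yarn knitting wool".split(),
    "product:watercolor_set": "watercolor paint brush paper palette".split(),
}


def tfidf_vectors(docs):
    """Return a sparse TF-IDF vector (dict of word -> weight) per item."""
    n_docs = len(docs)
    # Document frequency: in how many items does each word appear?
    doc_freq = Counter(word for words in docs.values() for word in set(words))
    vectors = {}
    for name, words in docs.items():
        term_freq = Counter(words)
        vectors[name] = {
            # Smoothed inverse document frequency keeps every weight positive.
            word: (count / len(words))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in term_freq.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


vectors = tfidf_vectors(items)
# A video class and a member image share crafting vocabulary ...
sim_knit = cosine(vectors["class:intro_to_knitting"], vectors["image:member_project_1"])
# ... while the class and an unrelated product share none.
sim_paint = cosine(vectors["class:intro_to_knitting"], vectors["product:watercolor_set"])
```

Under these assumptions, the knitting class scores closer to the member's scarf image than to the watercolor product, even though one started as video and the other as a photo. That is the kind of cross-type connection the abstract describes.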

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program