Abstract:
|
People who use probabilistic topic models often need to update those models as new or revised data arrive. There has been little research on transfer learning for Latent Dirichlet Allocation (LDA) that would enable such updates. As a result, applied practitioners face an unpleasant tradeoff: either a model goes stale, becoming less useful over time, or it must be re-trained from random initialization, which breaks continuity with the old model's topics. This research explores two complementary methods for transfer learning in LDA. The first uses the topic-word distributions from a previously trained LDA model as a prior for a new model. The second performs a post-hoc alignment that maps the word-topic counts sampled on the original data set onto the new or updated data set.
|