Abstract:
|
Factor models are commonly used for dimensionality reduction when analyzing high-dimensional data. Unfortunately, data often come with concomitant variables that can dominate the estimated latent representation, a problem that arises in settings such as fair machine learning and domain adaptation. We modify the objective function of dimensionality reduction methods to penalize the predictability of the concomitant variables from the latent representation. This yields a minimax formulation that finds a latent representation that encodes the primary data while remaining unpredictive of the concomitant data. We present three minimax, or adversarial, solutions to this type of objective function and highlight key differences between the formulations. Remarkably, a PCA-like objective yields an analytic solution computed by eigendecompositions on an augmented space. For general factor models, we show how neural networks can be used to approximate the objectives efficiently. We apply these techniques to both synthetic and real datasets, including electrophysiological recordings and survey data, and demonstrate that the estimated factors can yield better representations for common objectives.
|