From: Jeff Bilmes <bilmes@ee.washington.edu>
Subject: [cs-ugrads] [Speech-seminar] Research Seminar Speech Recognition: Karen Livescu, TTI-Chicago, 8/25 3-4pm, PAC AE108
To: speech-seminar@crow.ee.washington.edu
Multi-view Learning of Speech Feature Spaces
Karen Livescu
TTI-Chicago
Thursday, August 25th, 3:00-4:00pm
PAC (Paul Allen Center) AE108
Many learning tasks (classification, regression, clustering) can be
improved when multiple views of the data are available. The “views”
may be natural ones, such as audio vs. images vs. text, or more
abstract ones, such as arbitrary subsets of the observation vector.
Multi-view learning algorithms, such as co-training, take advantage of
the relationships between the views. In this work, we explore
two-view learning of feature spaces for speech processing tasks.
Given two views of the training data, we learn a transformation of
each view that, in some sense, best predicts the other view. We can
then apply the learned transformations even when only one view (e.g.
audio) is available at test time. For this talk, I will focus on work
using canonical correlation analysis (CCA), in which a linear
projection of each view is learned, such that the two views’
projections are maximally correlated. I will describe experiments on
clustering tasks, speaker identification, and phonetic classification.
Time permitting, I will describe additional ongoing work in speech and
language at TTI-Chicago.
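
For readers unfamiliar with CCA, below is a minimal sketch of the two-view
idea described in the abstract: fit linear projections of two views that are
maximally correlated on training data, then apply the learned projection to a
single view (e.g., audio) at test time. This is not the speaker's
implementation; the data are synthetic and scikit-learn's CCA estimator is
used purely for illustration.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_train, n_test = 1000, 200
d_audio, d_other, k = 40, 30, 10   # view dimensions, no. of CCA components

# Synthetic data with shared latent structure so the views are correlated.
latent = rng.standard_normal((n_train + n_test, k))
audio = latent @ rng.standard_normal((k, d_audio)) \
        + 0.5 * rng.standard_normal((n_train + n_test, d_audio))
other = latent @ rng.standard_normal((k, d_other)) \
        + 0.5 * rng.standard_normal((n_train + n_test, d_other))

# Fit CCA on training data where both views are available.
cca = CCA(n_components=k)
cca.fit(audio[:n_train], other[:n_train])

# At test time only the audio view is available; apply its learned projection.
audio_test_features = cca.transform(audio[-n_test:])
print(audio_test_features.shape)   # (200, 10)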