Up this week on Tuesday:
UNIVERSITY OF WASHINGTON
Computer Science and Engineering
COLLOQUIUM
SPEAKER: Pradeep Ravikumar, CMU/UT Austin
TITLE: Statistical Machine Learning and Big-p, Big-n, Complex Data
DATE: Tuesday, February 12, 2013
TIME: 3:30pm
PLACE: EEB-105
HOST: Carlos Guestrin
ABSTRACT:
Drawing upon fields as disparate as economics, psychology, operations
research, and statistics, the subfield of statistical machine learning has
provided practically successful tools ranging from search engines to
medical diagnosis, image processing, speech recognition, and a wide array
of problems in science and engineering. However, over the past decade,
faced with modern data settings, off-the-shelf statistical machine
learning methods are frequently proving insufficient. These modern
settings pose three key challenges, which largely come under the rubric of
“Big Data”: (a) the data might have a large number of features, in what we
will call “Big-p” data, to denote the fact that the dimension “p” of the
data is large, or (b) the data might have a large number of data
instances, in what we will call “Big-n” data, to denote the fact that the
number of samples “n” is large, or (c) the data-types could be complex:
such as permutations, or strings, or graphs, which typically lie in some
large discrete space. A key approach in addressing such “Big Data”
settings has involved leveraging systems-related approaches such as
parallel and distributed algorithms, as well as architecture and
algorithms for efficient, possibly distributed, data access and storage.
In this talk, we will discuss the complementary approach of statistical
modeling, one that is importantly tuned to each of these three aspects of
modern statistical machine learning: big-p data, big-n data, and complex
data-types.
Statistical machine learning for Big-p data, with more variables than
samples, has been the focus of considerable research over the last decade.
It is now well understood that estimation with strong statistical
guarantees is still possible under such high-dimensional settings provided
we impose suitable constraints on the model space.
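As one concrete illustration of such a structural constraint (an illustrative sketch, not drawn from the talk itself): under a sparsity constraint, the Lasso can recover a signal even when the number of features p far exceeds the number of samples n. A minimal proximal-gradient (ISTA) version in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 50, 200, 5              # n samples, p features (p >> n), k true nonzeros
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 3.0               # the true model is 5-sparse
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# ISTA (proximal gradient) for the Lasso objective:
#   minimize (1/2n) ||y - X b||^2 + lam * ||b||_1
lam = 0.1
L = np.linalg.norm(X, 2) ** 2 / n     # Lipschitz constant of the smooth part
beta = np.zeros(p)
for _ in range(500):
    grad = X.T @ (X @ beta - y) / n
    z = beta - grad / L
    beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold

print(np.flatnonzero(np.abs(beta) > 0.5))  # estimated support (ideally the first 5 indices)
```

The soft-thresholding step is what enforces sparsity: despite n < p, the estimator concentrates on the few truly active coordinates.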
Accordingly, we will discuss a unified framework for learning general
structurally constrained high-dimensional models (such as models that are
sparse, low-rank, and so on).

For Big-n data, a key subfield that is
increasingly gaining importance is that of non-parametric models, where
the model components potentially lie in infinite-dimensional spaces. A key
caveat to the widespread use of these models has been the larger number
of observations required by these models as compared to parametric
methods, but this is much less of a problem in Big-n settings.
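To illustrate this flavor of estimation (again an illustrative sketch, not the talk's framework): the Nadaraya-Watson kernel estimator makes no parametric assumption about the regression function, and its hunger for samples is easily satisfied when n is large:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000                              # "Big-n": plenty of observations
x = rng.uniform(-3, 3, n)
y = np.sin(x) + 0.3 * rng.standard_normal(n)

def nw_estimate(x0, x, y, h=0.2):
    """Nadaraya-Watson kernel regression: a locally weighted average of y,
    with Gaussian weights of bandwidth h centered at the query point x0."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

# With n this large, the estimate near pi/2 lands close to sin(pi/2) = 1,
# despite never positing a parametric form for the regression function.
print(nw_estimate(np.pi / 2, x, y))
```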
Accordingly, we will discuss a unified framework of structurally
constrained semi-parametric models (such as sparse additive models and so
on).

For complex-typed data, even standard machine learning questions such
as devising suitable loss functions, and devising suitable statistical
models that respect interesting structure, are still outstanding. We will
address some of these questions for the specific complex data-type of
permutations.
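One simple example of a loss function on permutations (illustrative only; the talk's treatment is more general) is the Kendall tau distance, which counts the pairwise disagreements between two rankings:

```python
from itertools import combinations

def kendall_tau_distance(p, q):
    """Number of item pairs ranked in opposite order by p and q --
    a natural distance on the discrete space of permutations."""
    pos_p = {item: i for i, item in enumerate(p)}  # item -> rank in p
    pos_q = {item: i for i, item in enumerate(q)}  # item -> rank in q
    return sum(
        1
        for a, b in combinations(p, 2)
        if (pos_p[a] - pos_p[b]) * (pos_q[a] - pos_q[b]) < 0
    )

print(kendall_tau_distance([0, 1, 2, 3], [0, 1, 2, 3]))  # 0: identical rankings
print(kendall_tau_distance([0, 1, 2, 3], [3, 2, 1, 0]))  # 6: fully reversed
```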
Bio:
Pradeep Ravikumar received his B.Tech. in Computer Science and Engineering
from the Indian Institute of Technology, Bombay, and his PhD in Machine
Learning from the School of Computer Science at Carnegie Mellon
University. He was then a postdoctoral scholar at the Department of
Statistics at the University of California, Berkeley. He is now an
Assistant Professor in the Department of Computer Science, at the
University of Texas at Austin. He is also affiliated with the Division of
Statistics and Scientific Computation, and the Institute for Computational
Engineering and Sciences at UT Austin. His thesis received honorable
mentions for the ACM SIGKDD Dissertation Award and the CMU School of
Computer Science Distinguished Dissertation Award. He is also a recipient
of the NSF CAREER Award.
Refreshments will be served in the room prior to the talk.
*NOTE* This lecture will be broadcast live via the Internet. See
http://www.cs.washington.edu/
Email: talk-info@cs.washington.edu
Info: http://www.cs.washington.edu/
(206) 543-1695
The University of Washington is committed to providing access, equal
opportunity and reasonable accommodation in its services, programs,
activities, education and employment for individuals with disabilities.
To request disability accommodation, contact the Disability Services
Office at least ten days in advance of the event at: (206) 543-6450/V,
(206) 543-6452/TTY, (206) 685-7264 (FAX), or email at
dso@u.washington.edu.