Speaker: Kevin Gimpel
Date: Monday, April 12, 2010
Time: Noon
Venue: GHC 6115
Abstract:
Two developments have considerably sped up machine learning: distributed computing (on multi-core or "cloud" architectures) and rapidly converging online learning algorithms. In this talk, we combine the two. Distributed computing has largely been paired with "batch" algorithms such as EM and L-BFGS, in which the entire training dataset is processed once per iterative update; our approach instead makes more frequent updates asynchronously, in either a pure online or a mini-batch setting. Asynchronous updates can introduce error, but in certain settings, such as online gradient-based optimization of convex objectives, the approach has convergence guarantees similar to those of other online learning algorithms. We first consider this setting and present a series of experiments exploring practical issues for a structured prediction task in natural language processing: named-entity recognition.
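To make the idea of asynchronous mini-batch updates concrete, here is a minimal sketch in Python. It is not the speaker's implementation: it is a serial simulation in which each gradient is computed from a parameter snapshot that is a fixed number of updates old, which is one simple way to model the staleness that asynchrony introduces. The least-squares objective, the synthetic data, and all names and settings (`make_data`, `async_sgd`, `delay`, `lr`) are assumptions chosen for illustration.

```python
import random
from collections import deque

def make_data(n=200, seed=0):
    """Synthetic 1-D regression data (assumed for illustration): y = 3x + 1 + noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return data

def minibatch_gradient(params, batch):
    """Gradient of the mean of 0.5 * (w*x + b - y)^2 over one mini-batch."""
    w, b = params
    gw = gb = 0.0
    for x, y in batch:
        err = w * x + b - y
        gw += err * x
        gb += err
    return gw / len(batch), gb / len(batch)

def async_sgd(data, delay=3, batch_size=10, epochs=50, lr=0.1, seed=1):
    """Simulate asynchronous mini-batch SGD with stale gradients.

    Each gradient is evaluated at the parameters from `delay` updates ago
    and then applied to the current parameters. delay=0 is ordinary SGD.
    """
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled
    params = (0.0, 0.0)
    # Ring buffer of the last delay+1 parameter vectors; index 0 is the oldest.
    snapshots = deque([params] * (delay + 1), maxlen=delay + 1)
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            stale = snapshots[0]
            gw, gb = minibatch_gradient(stale, batch)
            params = (params[0] - lr * gw, params[1] - lr * gb)
            snapshots.append(params)
    return params

if __name__ == "__main__":
    for d in (0, 3):
        w, b = async_sgd(make_data(), delay=d)
        print(f"delay={d}: w={w:.3f}, b={b:.3f}")
```

With a small step size, the stale-gradient run should approach the same solution as the synchronous one on this convex objective; larger delays or step sizes can destabilize it, which is the kind of error the abstract refers to.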
We also consider settings that are not yet supported by theoretical results. We apply an online version of EM (Cappé and Moulines, 2009) to two unsupervised structured learning tasks: (1) word alignment for machine translation, and (2) unsupervised part-of-speech tagging. For the former, we use a model whose log-likelihood function is concave, while the latter fits the more common unsupervised learning scenario with a non-concave objective. In both cases we find significant speed-ups over batch algorithms, with no observable problems arising from the use of asynchronous updates. In addition, we present experimental results from running asynchronous mini-batch algorithms on M45, a large cluster running the Hadoop MapReduce framework. We find that, while MapReduce is not an ideal fit for these algorithms, they do converge faster than batch algorithms on the same hardware, and we expect that the MapReduce framework may become more appropriate for asynchronous learning as problem sizes continue to grow.
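For readers unfamiliar with online EM, the following Python sketch shows the general stepwise scheme in the style of Cappé and Moulines (2009): after each observation, the expected sufficient statistics are interpolated toward that observation's statistics with a decaying step size, and the parameters are re-estimated from them. It is a toy illustration only, not the models used in the talk; the two-component, unit-variance Gaussian mixture, the step-size schedule `(k + 2) ** (-alpha)`, and the function names are all assumptions.

```python
import math
import random

def sample_mixture(n=2000, seed=0):
    """Toy data (assumed): equal mixture of N(-2, 1) and N(2, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0 if rng.random() < 0.5 else 2.0, 1.0) for _ in range(n)]

def stepwise_em(xs, mu_init=(-1.0, 1.0), alpha=0.7):
    """Online EM for a 2-component Gaussian mixture with known unit variance.

    s0[j] tracks the running expectation of the component indicator z_j and
    s1[j] the running expectation of z_j * x. Parameters are recomputed from
    these statistics after every observation.
    """
    mu = list(mu_init)
    pi = [0.5, 0.5]
    s0 = pi[:]
    s1 = [pi[j] * mu[j] for j in range(2)]
    for k, x in enumerate(xs):
        # E-step on a single observation: posterior over components.
        lik = [pi[j] * math.exp(-0.5 * (x - mu[j]) ** 2) for j in range(2)]
        total = sum(lik)
        post = [value / total for value in lik]
        # Decaying step size; alpha in (0.5, 1] is the usual requirement.
        eta = (k + 2) ** (-alpha)
        for j in range(2):
            s0[j] = (1.0 - eta) * s0[j] + eta * post[j]
            s1[j] = (1.0 - eta) * s1[j] + eta * post[j] * x
        # M-step: closed-form parameters from the running statistics.
        pi = s0[:]
        mu = [s1[j] / s0[j] for j in range(2)]
    return pi, mu

if __name__ == "__main__":
    pi, mu = stepwise_em(sample_mixture())
    print("weights:", [round(p, 2) for p in pi])
    print("means:  ", [round(m, 2) for m in mu])
```

Unlike batch EM, which revisits the whole dataset before each parameter update, this scheme updates after every example; a mini-batch variant would average the posteriors over a few examples before interpolating.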
This is joint work with Dipanjan Das and Noah Smith.