Upcoming Talks

The Large-Scale Lunch is a monthly lunchtime presentation and discussion about the use of cluster, distributed, parallel, and other large-scale computing methods to solve current applied computer science research problems. The LSL is intended to help spread knowledge about how to do research that involves massive amounts of data and/or computation on modern, multiprocessor computing equipment. The ideal talk provides two types of insight:
  • insight into the basic research problem, why it is important, and how it was addressed, and
  • insight into the large-scale implementation problem, and how it was addressed.
Talks are typically informal and are followed (or perhaps interrupted) by questions and discussion. A typical talk has about 30 minutes of content, leaving time for discussion and for getting lunch. This series is sponsored by Yahoo! and is open to all interested members of the SCS/ECE/etc. communities. We hope these events will be especially useful for those currently using or planning to use Hadoop on the M45 cluster. We will not be sending regular emails to these lists about future events, so if you would like to be notified of upcoming events, please join our mailing list. To subscribe, send mail to

large-scale-request@mailman.srv.cs.cmu.edu

with message body "subscribe". Alternatively, go to the list information page:

https://mailman.srv.cs.cmu.edu/mailman/listinfo/large-scale

and subscribe there. Information about upcoming events will also be posted on our website.

Automated problem diagnosis for Hadoop

Speaker: Soila Pertet
Date: Wednesday, October 21st
Time: 12:00 noon - 1:00pm
Location: Gates Hillman Complex 4405

Abstract:
Performance problems in software frameworks such as Hadoop, which
support long-running, parallelized, data-intensive computations, can
hamper cost-management efforts in cloud-computing environments. Manual
diagnosis does not scale in such environments because of the number of
nodes and the number of performance metrics to be analyzed on each
node. This talk provides an overview of our group’s research in
automated problem diagnosis (what we call "fingerpointing") for
Hadoop. We discuss three aspects of our research, namely: (i) a
diagnosis approach that synthesizes resource usage data from the OS
and task-execution flows from the Hadoop logs to diagnose problems,
(ii) an automated, online diagnosis framework that transparently
extracts different time-varying data sources and implements our
diagnosis algorithms as plug-in modules, and (iii) visualization tools
for Hadoop that provide programmers insight into the execution
patterns of their jobs. Our visualization tools have been checked into
the Hadoop repository under the Chukwa project.
