Posted to common-user@hadoop.apache.org by shahab mehmandoust <sh...@gmail.com> on 2008/11/04 18:51:31 UTC
Recovery from Failed Jobs
Hello,
I want to parse the lines of an access log, line by line, with map/reduce. I
want to know: once my access log is in HDFS, am I guaranteed that every
line will be processed and the results will end up in the output dir? In other
words, if a job fails, does Hadoop know where it failed, and can it
recover from that point so no data is lost?
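To make it concrete, here is a rough sketch of the kind of mapper I have in
mind, written against the org.apache.hadoop.mapred API; LogLineMapper and the
field index are just placeholders for my real parsing:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Placeholder mapper: emits (requested path, 1) for each access-log line.
public class LogLineMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);

  public void map(LongWritable offset, Text line,
                  OutputCollector<Text, IntWritable> out, Reporter reporter)
      throws IOException {
    // Assuming common log format, field 6 of a space-split line is the
    // request path, e.g. ... "GET /index.html HTTP/1.0" 200 2326
    String[] fields = line.toString().split(" ");
    if (fields.length > 6) {
      out.collect(new Text(fields[6]), ONE);
    }
  }
}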
Thanks,
Shahab
Re: Recovery from Failed Jobs
Posted by Alex Loddengaard <al...@cloudera.com>.
With regard to checkpointing, not yet. This JIRA is a prerequisite:
<http://issues.apache.org/jira/browse/HADOOP-3245>
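That said, you should get the guarantee you're asking about even without
mid-task checkpointing: when a task attempt fails, the framework re-runs the
entire task from the beginning of its input split, and the job only succeeds
(and only then is its output left in the output dir) once every task has
succeeded. The number of retries is configurable; roughly, as an untested
sketch:

import org.apache.hadoop.mapred.JobConf;

public class RetryConfig {
  public static void main(String[] args) {
    // A failed task is re-executed from the start of its input split,
    // up to this many attempts, before the whole job is failed.
    JobConf conf = new JobConf(RetryConfig.class);
    conf.setMaxMapAttempts(4);     // mapred.map.max.attempts (4 is the default)
    conf.setMaxReduceAttempts(4);  // mapred.reduce.max.attempts
    System.out.println("max map attempts = " + conf.getMaxMapAttempts());
  }
}

So a transient failure costs you a re-run of one split, not lost lines; you
only lose output if the same task keeps failing and the whole job is failed.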
I'm a little confused about what you're trying to do with the log parsing. You
should consider Scribe or Chukwa, though Chukwa isn't ready for use yet.
Learn more here:
Chukwa:
<http://wiki.apache.org/hadoop/Chukwa>
<http://issues.apache.org/jira/browse/HADOOP-3719>
Scribe:
<http://www.cloudera.com/blog/2008/10/28/installing-scribe-for-log-collection/>
<http://www.cloudera.com/blog/2008/11/02/configuring-and-using-scribe-for-hadoop-log-collection/>
Alex
On Tue, Nov 4, 2008 at 11:51 AM, shahab mehmandoust <sh...@gmail.com> wrote:
> Hello,
>
> I want to parse the lines of an access log, line by line, with map/reduce. I
> want to know: once my access log is in HDFS, am I guaranteed that every
> line will be processed and the results will end up in the output dir? In other
> words, if a job fails, does Hadoop know where it failed, and can it
> recover from that point so no data is lost?
>
> Thanks,
> Shahab
>