Posted to user@crunch.apache.org by Joseph Adler <jo...@gmail.com> on 2013/06/06 18:36:22 UTC

testing if a pipeline was successful

Hi guys,

I'm confused about how to tell whether a Crunch Pipeline has succeeded. I
suspect that the current implementation is incorrect, but I want to verify
that before I file a Jira and submit a patch.

Here's what I'm trying to do: I'd like to wrap a Crunch job inside another
Java program (so that I can execute it in a larger scheduling/workflow tool
like Azkaban or Oozie). So, I'd like to build and execute a Crunch
pipeline, then test whether it succeeded.
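
For context, here is a minimal sketch of the wrapper side. It assumes the
scheduler treats a non-zero exit code as a failed job. PipelineLauncher,
exitCodeFor, and the Callable<Boolean> stand-in for "run the pipeline and
report success" are hypothetical names used for illustration, not Crunch API.

```java
import java.util.concurrent.Callable;

final class PipelineLauncher {
    // Hypothetical helper: runs the job and maps its outcome to an exit code.
    // Returns 0 when the job reports success, 1 when it reports failure
    // or throws.
    static int exitCodeFor(Callable<Boolean> job) {
        try {
            return job.call() ? 0 : 1;
        } catch (Exception e) {
            System.err.println("Job failed: " + e);
            return 1;
        }
    }

    public static void main(String[] args) {
        // Placeholder job; a real launcher would build and run the
        // Crunch pipeline here and return whether it succeeded.
        Callable<Boolean> job = () -> true;
        System.exit(exitCodeFor(job));
    }
}
```

The wrapper is only as reliable as the success flag it is given, which is
why the check on the pipeline result matters.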

I'd like to do something like:

    PipelineResult result = pipeline.run();
    if (!result.succeeded())
    {
        throw new Exception("Job failed: " + result.toString() + "\n");
    }
    return 0;


Unfortunately, this doesn't work. Looking at the implementation of
PipelineResult, here is how the succeeded method is implemented:

  public boolean succeeded() {
    return !stageResults.isEmpty();
  }

Looking at where the PipelineResult object is created (in MRExecutor), it
looks like the list of stage results will be non-empty regardless of
whether the job succeeded or not:

    ...
    result = new PipelineResult(stages);
    if (killSignal.getCount() != 0) {
      status.set(Status.KILLED);
    } else {
      status.set(result.succeeded() ? Status.SUCCEEDED : Status.FAILED);
    }
    ...

So, even if one of the stages fails, the result of executing the pipeline
will be "succeeded."
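
To make the difference concrete, here is a small self-contained sketch that
compares the non-empty check with a stricter per-stage check. It assumes each
stage records whether it succeeded. StageOutcome, nonEmptyCheck, and
allStagesSucceeded are hypothetical names for illustration; this is not the
Crunch code and not necessarily what a patch would look like.

```java
import java.util.Arrays;
import java.util.List;

final class StageCheckSketch {
    // Hypothetical per-stage record; not a Crunch class.
    static final class StageOutcome {
        final String name;
        final boolean successful;

        StageOutcome(String name, boolean successful) {
            this.name = name;
            this.successful = successful;
        }
    }

    // Mirrors the check quoted above: true whenever any stage was recorded,
    // regardless of how the stages finished.
    static boolean nonEmptyCheck(List<StageOutcome> stages) {
        return !stages.isEmpty();
    }

    // Stricter alternative: at least one stage ran and every stage succeeded.
    static boolean allStagesSucceeded(List<StageOutcome> stages) {
        if (stages.isEmpty()) {
            return false;
        }
        for (StageOutcome stage : stages) {
            if (!stage.successful) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<StageOutcome> stages = Arrays.asList(
            new StageOutcome("stage-1", true),
            new StageOutcome("stage-2", false));
        System.out.println(nonEmptyCheck(stages));      // prints "true"
        System.out.println(allStagesSucceeded(stages)); // prints "false"
    }
}
```

With one failed stage in the list, the non-empty check still reports
success, while the per-stage check reports failure.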

I can't find an alternative way to test if a failure occurred during a
pipeline execution. Is there something that I'm missing, or is the
implementation incorrect?

-- Joe

Re: testing if a pipeline was successful

Posted by Josh Wills <jo...@gmail.com>.
Yeah, I'm not sure how I ever thought that would work. If you have a patch,
let's get it in.

J


On Fri, Jun 7, 2013 at 9:42 PM, Joseph Adler <jo...@gmail.com> wrote:

> Thanks Josh.
>
> Let me know what you think. I have a patch for this (so pipelines don't
> succeed if one of the stages fails).
>
> -- Joe
>
>
> On Thu, Jun 6, 2013 at 1:49 PM, Josh Wills <jw...@cloudera.com> wrote:
>
>> Definitely sounds like a bug. Will take a look when I get on this plane.
>

Re: testing if a pipeline was successful

Posted by Joseph Adler <jo...@gmail.com>.
Thanks Josh.

Let me know what you think. I have a patch for this (so pipelines don't
succeed if one of the stages fails).

-- Joe


On Thu, Jun 6, 2013 at 1:49 PM, Josh Wills <jw...@cloudera.com> wrote:

> Definitely sounds like a bug. Will take a look when I get on this plane.
>

Re: testing if a pipeline was successful

Posted by Josh Wills <jw...@cloudera.com>.
Definitely sounds like a bug. Will take a look when I get on this plane.