Posted to dev@gobblin.apache.org by Jay Sen <ja...@apache.org> on 2019/07/15 01:09:35 UTC

MR mode wikipedia example fails

Hi Dev Team,

The PullFromWikipedia example fails in Gobblin MR mode.

The issue I have found so far is that even after the MR job completes
successfully and Gobblin sees it as complete, Gobblin explicitly marks it as
FAILED because "workunit.working.state" is missing from the WorkUnitState
(in the SafeDatasetCommit#finalizeDatasetStateBeforeCommit method).

This is how I believe the states are structured, just for reference:
JobState -> DatasetState -> TaskState -> WorkUnitState

Since it is missing from the WorkUnitState, the working state defaults to
"PENDING" (via taskState.getWorkingState()), and
finalizeDatasetStateBeforeCommit sets any state other than "SUCCESSFUL" to
FAILED.
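
To make the sequence above concrete, here is a minimal, self-contained Java
sketch of what I think is happening. It is not Gobblin's code: the class,
enum, and method names (WorkingStateSketch, finalizeBeforeCommit, and so on)
are simplified stand-ins I made up for illustration. Only the property key
and the PENDING default come from what I observed.

```java
import java.util.Properties;

public class WorkingStateSketch {
    // Hypothetical stand-in for the working-state values; not copied from Gobblin.
    public enum WorkingState { PENDING, RUNNING, SUCCESSFUL, COMMITTED, FAILED }

    // The property key that was missing from the WorkUnitState in my run.
    public static final String WORKING_STATE_KEY = "workunit.working.state";

    // Simplified task state backed by plain properties (an assumption for this sketch).
    public static class TaskState {
        private final Properties props;

        public TaskState(Properties props) {
            this.props = props;
        }

        // Falls back to PENDING when the key is absent.
        public WorkingState getWorkingState() {
            String raw = props.getProperty(WORKING_STATE_KEY, WorkingState.PENDING.name());
            return WorkingState.valueOf(raw);
        }

        public void setWorkingState(WorkingState state) {
            props.setProperty(WORKING_STATE_KEY, state.name());
        }
    }

    // Mimics the behaviour described above: anything not SUCCESSFUL becomes FAILED.
    public static void finalizeBeforeCommit(TaskState taskState) {
        if (taskState.getWorkingState() != WorkingState.SUCCESSFUL) {
            taskState.setWorkingState(WorkingState.FAILED);
        }
    }

    public static void main(String[] args) {
        // Case 1: the key is missing, so the state defaults to PENDING and is then failed.
        TaskState missing = new TaskState(new Properties());
        finalizeBeforeCommit(missing);
        System.out.println("missing key -> " + missing.getWorkingState());

        // Case 2: the key is present and SUCCESSFUL, so the state is left alone.
        Properties ok = new Properties();
        ok.setProperty(WORKING_STATE_KEY, WorkingState.SUCCESSFUL.name());
        TaskState present = new TaskState(ok);
        finalizeBeforeCommit(present);
        System.out.println("key present -> " + present.getWorkingState());
    }
}
```

In this sketch the task with the missing key ends up FAILED even though
nothing actually went wrong, which matches what I see after the MR job
finishes.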

Now, I am not sure whether this is a bug or whether I am missing some config,
such as JobCommitPolicy, or a setting that tells Gobblin not to commit at the
job level at all.

Can someone please take a look and comment?

Thanks
Jay

Re: MR mode wikipedia example fails

Posted by Jay Sen <ja...@apache.org>.
I forgot to mention that I used Hadoop 2.7.7 for this. I wonder what version
LinkedIn is using. Thanks
