Posted to dev@pig.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2017/03/27 14:07:41 UTC

[jira] [Updated] (PIG-5198) streaming job stuck with script failure when combined with split

     [ https://issues.apache.org/jira/browse/PIG-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5198:
------------------------------
    Attachment: pig-5198-v01.patch

The hang itself is the same as in PIG-4976, but unlike that jira, here POStream is properly returning {{POStatus.STATUS_ERR}}.  It's just that POSplit is converting that status to RESULT_EMPTY.

{code:title=POSplit.java}
        return (res.returnStatus == POStatus.STATUS_OK) ? res : RESULT_EMPTY;
{code}
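
To illustrate why that ternary hides the failure, here is a minimal, self-contained sketch. It does not use the real Pig classes: {{Status}}, {{Result}}, {{collapseNonOk}} and {{propagateErr}} are hypothetical stand-ins, and modelling RESULT_EMPTY as a NULL-status result is an assumption. It is not the content of the attached patch, only one possible shape of "pass the error through".

```java
// Hypothetical stand-ins for Pig's POStatus / Result; NOT the real classes.
public class SplitStatusSketch {
    enum Status { OK, NULL, ERR, EOP }

    static final class Result {
        final Status returnStatus;
        final Object result;

        Result(Status returnStatus, Object result) {
            this.returnStatus = returnStatus;
            this.result = result;
        }
    }

    // Assumption: an "empty" result carries a NULL status and no payload.
    static final Result RESULT_EMPTY = new Result(Status.NULL, null);

    // Same shape as the quoted POSplit line: every non-OK status,
    // including ERR, is replaced by the empty result.
    static Result collapseNonOk(Result res) {
        return (res.returnStatus == Status.OK) ? res : RESULT_EMPTY;
    }

    // Hypothetical alternative: let ERR reach the caller unchanged,
    // and collapse only the remaining non-OK statuses.
    static Result propagateErr(Result res) {
        if (res.returnStatus == Status.OK || res.returnStatus == Status.ERR) {
            return res;
        }
        return RESULT_EMPTY;
    }

    public static void main(String[] args) {
        Result err = new Result(Status.ERR, "streaming script failed");
        System.out.println(collapseNonOk(err).returnStatus); // prints NULL
        System.out.println(propagateErr(err).returnStatus);  // prints ERR
    }
}
```

With the first method the caller never sees the error status, which fits the hang described here; with the second the caller can see the failure and stop.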

Attaching a patch that returns STATUS_ERR in such cases.
I couldn't reproduce this in a small local unit test.
The reason is that when the input is too small, the following check comes into play:

{code:title=POSplit.java}
    public Result getNextTuple() throws ExecException {
        if (this.parentPlan.endOfAllInput) {
            return getStreamCloseResult();
        }
{code}
{{endOfAllInput}} becomes true, so the code takes a different path and does not hit this bug.  For now, I added an e2e test that reproduces the hang.

> streaming job stuck with script failure when combined with split
> ----------------------------------------------------------------
>
>                 Key: PIG-5198
>                 URL: https://issues.apache.org/jira/browse/PIG-5198
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>         Attachments: pig-5198-v01.patch
>
>
> {code:title=test.pig} 
> DEFINE myawk `./test.awk` ship('./test.awk');
> DEFINE mypy `python my.py` ship ('./my.py');
> A = load 'input.txt';
> B =  stream A through myawk ;
> BB =  stream A through mypy ;
> store B into '$output/abc';
> store BB into '$output/bcd';
> {code} 
> This script hangs when my.py fails with a syntax error.
> (input.txt has to be large.)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)