Posted to dev@pig.apache.org by "Rekha (JIRA)" <ji...@apache.org> on 2009/07/01 13:36:47 UTC

[jira] Created: (PIG-869) Pig scripts should be able to handle scenario where input datasets not present/or empty before running

Pig scripts should be able to handle scenario where input datasets not present/or empty before running
------------------------------------------------------------------------------------------------------

                 Key: PIG-869
                 URL: https://issues.apache.org/jira/browse/PIG-869
             Project: Pig
          Issue Type: Improvement
          Components: impl
    Affects Versions: 0.2.0
         Environment: grid environment testing of pig 2.2
            Reporter: Rekha
            Priority: Minor
             Fix For: 0.2.0


Pig 2.2 does not handle situations where an input dataset is not present (the file is missing) or the file is empty.

It would be great if Pig could enforce some data checks within scripts.
This could be any simple command, like the one below, that can easily be wrapped around all input sources:

if (datapath_valid && data_present && file_not_empty) {
    run the rest of the script
} else {
    throw an exception/error code
    -- this should be an easily trappable value/code in the logs
}

This improvement would benefit our DQ checks.
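
Until something like this exists in the language, the same check can be approximated outside the script. The following is only a minimal sketch in Python, not part of Pig: it assumes the inputs are paths on a locally visible filesystem, and the names check_input, check_inputs, EXIT_MISSING and EXIT_EMPTY are hypothetical, chosen for illustration. It returns a distinct non-zero code for "missing" and for "empty" so that the failure is easy to trap in logs.

```python
import os
import sys

# Hypothetical exit codes, chosen so a wrapper or log scraper can trap them.
EXIT_MISSING = 2
EXIT_EMPTY = 3


def check_input(path):
    """Return 0 if path holds data, else a distinct non-zero code."""
    if not os.path.exists(path):
        return EXIT_MISSING
    if os.path.isdir(path):
        # A directory counts as non-empty if any file under it has bytes.
        sizes = [
            os.path.getsize(os.path.join(root, name))
            for root, _, names in os.walk(path)
            for name in names
        ]
        return 0 if any(size > 0 for size in sizes) else EXIT_EMPTY
    return 0 if os.path.getsize(path) > 0 else EXIT_EMPTY


def check_inputs(paths):
    """Return the first failing (path, code) pair, or None if all pass."""
    for path in paths:
        code = check_input(path)
        if code != 0:
            return path, code
    return None


if __name__ == "__main__":
    failure = check_inputs(sys.argv[1:])
    if failure:
        path, code = failure
        # One fixed-format line on stderr keeps the failure easy to grep for.
        print(f"INPUT_CHECK_FAILED code={code} path={path}", file=sys.stderr)
        sys.exit(code)
```

A wrapper would run this with the input paths as arguments and start the Pig script only when it exits with 0. For data on HDFS the local os.path calls would not apply; the check would have to go through the Hadoop filesystem instead (for example its existence and zero-length tests), which this sketch does not attempt.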

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-869) Pig scripts should be able to handle scenario where input datasets not present/or empty before running

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726288#action_12726288 ] 

Olga Natkovich commented on PIG-869:
------------------------------------

Support for conditionals in scripts is a nice feature, but it is not easy to implement. Also, before we start adding constructs like this to the language, we need a bigger picture of where we are taking the language.

It is not in scope for the next few months, but we might be able to revisit this sometime next year.

> Pig scripts should be able to handle scenario where input datasets not present/or empty before running
> ------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-869
>                 URL: https://issues.apache.org/jira/browse/PIG-869
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.2.0
>         Environment: grid environment testing of pig 2.2
>            Reporter: Rekha
>            Priority: Minor
>             Fix For: 0.2.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Pig 2.2 does not handle situations where dataset is not present, as in file missing, or empty file.
> It would be great if Pig would within scripts enforce some data checks.
> It can be any simple command like below that can be easily wrapped around all input sources--
> if ( datapath_valid && data_present && file_not_empty)  {
>            run the rest of the script 
> } 
> else {
>             throw an exception/error code  
>           --this should be easily trappable valuecode in logs
> }
> This improvement can be beneficial for our DQ check.
