Posted to dev@pig.apache.org by "Richard Ding (JIRA)" <ji...@apache.org> on 2009/11/04 02:11:32 UTC

[jira] Created: (PIG-1071) Support comma separated file/directory names in load statements

Support comma separated file/directory names in load statements
---------------------------------------------------------------

                 Key: PIG-1071
                 URL: https://issues.apache.org/jira/browse/PIG-1071
             Project: Pig
          Issue Type: New Feature
            Reporter: Richard Ding


Currently, Pig Latin supports the following LOAD syntax:

{code}
LOAD 'data' [USING loader function] [AS schema];      
{code}

where data is the name of the file or directory, including files specified with Hadoop-supported globbing syntax. This name is passed to the loader function.

This feature supports loaders that can load multiple files from different directories, and allows users to pass in the file names as a comma-separated string.

For example, these will be valid load statements:

{code}
LOAD '/usr/pig/test1/a,/usr/pig/test2/b' USING someloader();
{code}

and 

{code}
LOAD '/usr/pig/test1/{a,c},/usr/pig/test2/b' USING someloader();
{code}

This comma-separated string is passed to the loader.
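
Because the second example contains a comma inside a glob group ({a,c}), a loader that receives this string cannot simply split on every comma. The following is a minimal sketch of one way a loader might separate the paths. It is an illustration only, not the code in the attached patch; the function name split_locations is hypothetical, and backslash-escaped characters are not handled.

```python
from typing import List


def split_locations(location: str) -> List[str]:
    """Split a location string on top-level commas only.

    Commas inside a {...} glob group are kept as part of the path.
    (Hypothetical helper, not part of Pig.)
    """
    paths = []
    current = []
    depth = 0  # nesting depth of '{' ... '}' glob groups
    for ch in location:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside any glob group ends the current path.
            paths.append("".join(current))
            current = []
        else:
            current.append(ch)
    paths.append("".join(current))
    # Drop empty entries, e.g. those caused by a trailing comma.
    return [p for p in paths if p]


for path in split_locations("/usr/pig/test1/{a,c},/usr/pig/test2/b"):
    print(path)
```

Each resulting path can still contain glob syntax, so expanding the globs is left to the file system layer.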

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1071:
------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1071:
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

+1, patch committed, Thanks Richard!


[jira] Commented: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774061#action_12774061 ] 

Hadoop QA commented on PIG-1071:
--------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424056/PIG-1071.patch
  against trunk revision 832804.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/140/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/140/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/140/console

This message is automatically generated.


[jira] Updated: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1071:
------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1071:
------------------------------

    Status: Open  (was: Patch Available)


[jira] Updated: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1071:
------------------------------

    Attachment: PIG-1071.patch

This patch implements the feature.


[jira] Updated: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1071:
------------------------------

    Attachment: PIG-1071.patch

Added two more test cases.


[jira] Commented: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773750#action_12773750 ] 

Hadoop QA commented on PIG-1071:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424056/PIG-1071.patch
  against trunk revision 832804.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/139/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/139/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/139/console

This message is automatically generated.


[jira] Assigned: (PIG-1071) Support comma separated file/directory names in load statements

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding reassigned PIG-1071:
---------------------------------

    Assignee: Richard Ding
