Posted to dev@pig.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2008/06/05 16:21:45 UTC

[jira] Commented: (PIG-252) Allow multiple paths in the load statement

    [ https://issues.apache.org/jira/browse/PIG-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602656#action_12602656 ] 

Tom White commented on PIG-252:
-------------------------------

By making globs more powerful (HADOOP-3498), we would be able to say:

{code}
x = LOAD '{2008/05/{26,27,28,29,30,31},2008/06/{1,2}}'
{code}
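
To make the intended expansion concrete, here is a minimal sketch of how a nested brace pattern like the one above could resolve into individual paths. It only illustrates the expansion semantics and is not the Hadoop globbing code; the function name expand_braces is hypothetical, and other glob characters such as * and ? are deliberately left untouched.

```python
def expand_braces(pattern):
    """Expand nested {a,b} alternatives into a list of plain strings.

    Illustrative only: this hypothetical helper is not Hadoop's glob
    implementation and does not handle escapes, '*', '?' or '[...]'.
    """
    # Find the first opening brace; with none left, the pattern is literal.
    start = pattern.find("{")
    if start == -1:
        return [pattern]

    # Walk forward to the matching close brace, noting top-level commas.
    depth = 0
    commas = []
    for i in range(start, len(pattern)):
        ch = pattern[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                end = i
                break
        elif ch == "," and depth == 1:
            commas.append(i)
    else:
        raise ValueError("unbalanced braces in " + pattern)

    # Split the group body at its top-level commas.
    bounds = [start] + commas + [end]
    options = [pattern[bounds[k] + 1:bounds[k + 1]]
               for k in range(len(bounds) - 1)]

    # Substitute each alternative and expand whatever braces remain.
    prefix, suffix = pattern[:start], pattern[end + 1:]
    results = []
    for option in options:
        results.extend(expand_braces(prefix + option + suffix))
    return results


if __name__ == "__main__":
    for path in expand_braces("{2008/05/{26,27,28,29,30,31},2008/06/{1,2}}"):
        print(path)
```

Under these assumptions the outer group supplies the two month prefixes and each inner group supplies the days, so a single LOAD string would cover both date ranges without a UNION.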

> Allow multiple paths in the load statement
> ------------------------------------------
>
>                 Key: PIG-252
>                 URL: https://issues.apache.org/jira/browse/PIG-252
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>
> From Tom White:
> I'm having a problem loading data from multiple paths in Pig. What I'm trying to do is load data from a range of dates, so I would like to specify an input of two globbed paths:
> x = LOAD '2008/05/{26,27,28,29,30,31},2008/06/{1,2}'
> Pig doesn't seem to like this, though, as it tries to interpret it as a single path. The best I can do is to use UNION:
> x1 = LOAD '2008/05/{26,27,28,29,30,31}'
> x2 = LOAD '2008/06/{1,2}'
> x = UNION x1, x2
> The downside to this is that I want to parameterize my paths, and having a separate script for each number of paths in the input is cumbersome.
> Is there a better way of doing this? Are there any plans to support multiple paths, and/or PathFilters?
