Posted to user@pig.apache.org by prasenjit mukherjee <pr...@gmail.com> on 2009/09/24 19:52:56 UTC

using s3 and local hdfs in same pig script

I want to read from S3 and write to local HDFS. But when I set
-Dfs... it applies to all LOAD/STORE Pig Latin statements. Is
there a way I can instruct Pig to read from an S3 filesystem (in LOAD)
and write to a local HDFS filesystem (in STORE)?

-Thanks,
Prasen
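
For context, a minimal sketch of the situation being described, with hypothetical paths: with a single default filesystem (set via -Dfs...), plain paths in both statements resolve against that one filesystem, so LOAD and STORE cannot point at different filesystems unless each URI carries its own scheme.

  -- Both plain paths below resolve against whatever default filesystem the
  -- job was started with, so they cannot point at S3 and HDFS at the same time.
  raw = LOAD 'input/logs' AS (line:chararray);
  STORE raw INTO 'output-dir-name';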

Re: using s3 and local hdfs in same pig script

Posted by Benjamin Reed <br...@yahoo-inc.com>.
If you want to use the default namenode in your HDFS URI, just use a
single slash:

hdfs:/mnt/output-dir-name/output-filename
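
A minimal sketch of how that single-slash form might look in a script, assuming a hypothetical s3n:// input bucket and hypothetical paths:

  -- LOAD from S3 (bucket/path made up for illustration),
  -- STORE to the default HDFS namenode via the single-slash form.
  data = LOAD 's3n://my-bucket/input/part-*' AS (line:chararray);
  STORE data INTO 'hdfs:/mnt/output-dir-name/output-filename';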

prasenjit mukherjee wrote:
> Any help?
>
> On Fri, Sep 25, 2009 at 9:03 AM, prasenjit mukherjee
> <pr...@gmail.com> wrote:
>   
>> Thanks for the pointer; I'd definitely like to try it out.
>>
>> I am running on EC2 and would like to use a namenode-independent
>> hdfs:// naming scheme in the Pig script. Can I just use
>> hdfs://mnt/output-dir-name/output-filename without specifying the
>> namenode host?
>>
>> -Prasen
>>
>> On Fri, Sep 25, 2009 at 12:00 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:
>>     
>>> I believe the Hadoop APIs we are using support URIs, so you should be able
>>> to specify the URI directly in the LOAD and STORE statements, something like
>>> hdfs://nn1:port/path. S3 has a URI scheme as well, but I'm not familiar with it.
>>>
>>> Unfortunately, I've never had an opportunity to try it. Do you mind giving
>>> it a try and posting the results for the rest of us?
>>>
>>> thanx
>>> ben
>>>
>>> prasenjit mukherjee wrote:
>>>       
>>>> I want to read from S3 and write to local HDFS. But when I set
>>>> -Dfs... it applies to all LOAD/STORE Pig Latin statements. Is
>>>> there a way I can instruct Pig to read from an S3 filesystem (in LOAD)
>>>> and write to a local HDFS filesystem (in STORE)?
>>>>
>>>> -Thanks,
>>>> Prasen
>>>>
>>>>         
>>>       


Re: using s3 and local hdfs in same pig script

Posted by prasenjit mukherjee <pr...@gmail.com>.
Any help?

On Fri, Sep 25, 2009 at 9:03 AM, prasenjit mukherjee
<pr...@gmail.com> wrote:
> Thanks for the pointer; I'd definitely like to try it out.
>
> I am running on EC2 and would like to use a namenode-independent
> hdfs:// naming scheme in the Pig script. Can I just use
> hdfs://mnt/output-dir-name/output-filename without specifying the
> namenode host?
>
> -Prasen
>
> On Fri, Sep 25, 2009 at 12:00 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:
>> I believe the Hadoop APIs we are using support URIs, so you should be able
>> to specify the URI directly in the LOAD and STORE statements, something like
>> hdfs://nn1:port/path. S3 has a URI scheme as well, but I'm not familiar with it.
>>
>> Unfortunately, I've never had an opportunity to try it. Do you mind giving
>> it a try and posting the results for the rest of us?
>>
>> thanx
>> ben
>>
>> prasenjit mukherjee wrote:
>>>
>>> I want to read from S3 and write to local HDFS. But when I set
>>> -Dfs... it applies to all LOAD/STORE Pig Latin statements. Is
>>> there a way I can instruct Pig to read from an S3 filesystem (in LOAD)
>>> and write to a local HDFS filesystem (in STORE)?
>>>
>>> -Thanks,
>>> Prasen
>>>
>>
>>
>

Re: using s3 and local hdfs in same pig script

Posted by prasenjit mukherjee <pr...@gmail.com>.
Thanks for the pointer; I'd definitely like to try it out.

I am running on EC2 and would like to use a namenode-independent
hdfs:// naming scheme in the Pig script. Can I just use
hdfs://mnt/output-dir-name/output-filename without specifying the
namenode host?

-Prasen

On Fri, Sep 25, 2009 at 12:00 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:
> I believe the Hadoop APIs we are using support URIs, so you should be able
> to specify the URI directly in the LOAD and STORE statements, something like
> hdfs://nn1:port/path. S3 has a URI scheme as well, but I'm not familiar with it.
>
> Unfortunately, I've never had an opportunity to try it. Do you mind giving
> it a try and posting the results for the rest of us?
>
> thanx
> ben
>
> prasenjit mukherjee wrote:
>>
>> I want to read from S3 and write to local HDFS. But when I set
>> -Dfs... it applies to all LOAD/STORE Pig Latin statements. Is
>> there a way I can instruct Pig to read from an S3 filesystem (in LOAD)
>> and write to a local HDFS filesystem (in STORE)?
>>
>> -Thanks,
>> Prasen
>>
>
>

Re: using s3 and local hdfs in same pig script

Posted by Benjamin Reed <br...@yahoo-inc.com>.
I believe the Hadoop APIs we are using support URIs, so you should be able
to specify the URI directly in the LOAD and STORE statements, something like
hdfs://nn1:port/path. S3 has a URI scheme as well, but I'm not familiar with it.
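
A minimal sketch of that per-statement URI approach, with a hypothetical bucket, host, port, and paths (the s3n:// native scheme is assumed here, since the exact S3 scheme isn't confirmed in this thread):

  -- S3 URI directly in LOAD (bucket/path hypothetical).
  src = LOAD 's3n://my-bucket/input' AS (line:chararray);
  -- Explicit namenode URI directly in STORE (host/port/path hypothetical).
  STORE src INTO 'hdfs://nn1:9000/user/output';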

Unfortunately, I've never had an opportunity to try it. Do you mind giving
it a try and posting the results for the rest of us?

thanx
ben

prasenjit mukherjee wrote:
> I want to read from S3 and write to local HDFS. But when I set
> -Dfs... it applies to all LOAD/STORE Pig Latin statements. Is
> there a way I can instruct Pig to read from an S3 filesystem (in LOAD)
> and write to a local HDFS filesystem (in STORE)?
>
> -Thanks,
> Prasen
>