Posted to user@pig.apache.org by Tanton Gibbs <ta...@gmail.com> on 2008/05/21 23:48:53 UTC
Reading file from HDFS
How do I get Pig to process a file that is already loaded on the
Hadoop file system?
Right now, from Grunt, I can do an ls, but it shows the local file
system. I've also tried
A = load 'myfile' using PigStorage()
A = load 'file:/myfile' using PigStorage()
A = load 'file://myfile' using PigStorage()
A = load 'file://user/tgibbs/myfile' using PigStorage()
A = load 'hdfs:/myfile' using PigStorage()
All of the above fail in various ways.
Also, when Pig starts up, it displays
1 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
I'm using Hadoop 0.16.3 and the latest Pig from svn.
Anybody have any ideas?
Thanks!
Tanton
Re: Reading file from HDFS
Posted by pi song <pi...@gmail.com>.
We changed the folder and configuration structure last month.
Now you can run Pig from {pig-home} with ./bin/pig,
and you can set up all of the configuration, including the location of HDFS,
in ./conf/pig.properties.
Pi
On 5/22/08, Tanton Gibbs <ta...@gmail.com> wrote:
>
> On a whim, I just tried running java -cp pig:$HADOOPSITECONFIG
> org.apache.pig.Main
>
> That worked correctly and found my hadoop cluster.
>
> thanks for the tip!
>
> Tanton
>
> On Wed, May 21, 2008 at 5:06 PM, Tanton Gibbs <ta...@gmail.com> wrote:
> > I ran the pig script in the bin directory.
> >
> > I looked for pig.pl (mentioned in the wiki) but couldn't find it.
> >
> > I set HADOOPSITECONFIG and HADOOP_HOME, but apparently that isn't enough :)
> >
> > On Wed, May 21, 2008 at 4:55 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
> >> This means that pig is not connected to your hadoop cluster. What
> >> command did you use to start pig?
> >>
> >> Olga
> >>
> >>> -----Original Message-----
> >>> From: Tanton Gibbs [mailto:tanton.gibbs@gmail.com]
> >>> Sent: Wednesday, May 21, 2008 2:49 PM
> >>> To: pig-user@incubator.apache.org
> >>> Subject: Reading file from HDFS
> >>>
> >>> How do I get Pig to process a file that is already loaded on
> >>> the Hadoop file system?
> >>>
> >>> Right now, from Grunt, I can do an ls, but it shows the local
> >>> file system. I've also tried
> >>>
> >>> A = load 'myfile' using PigStorage()
> >>> A = load 'file:/myfile' using PigStorage()
> >>> A = load 'file://myfile' using PigStorage()
> >>> A = load 'file://user/tgibbs/myfile' using PigStorage()
> >>> A = load 'hdfs:/myfile' using PigStorage()
> >>>
> >>> All of the above fail in various ways.
> >>>
> >>> Also, when pig loads it displays
> >>> 1 [main] INFO
> >>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
> >>> - Connecting to hadoop file system at: file:///
> >>>
> >>> I'm using hadoop v. 0.16.3 and the latest pig from svn.
> >>>
> >>> Anybody have any ideas?
> >>>
> >>> Thanks!
> >>> Tanton
> >>>
> >>
> >
>
Re: Reading file from HDFS
Posted by Tanton Gibbs <ta...@gmail.com>.
On a whim, I just tried running
java -cp pig:$HADOOPSITECONFIG org.apache.pig.Main
That worked correctly and found my Hadoop cluster.
Thanks for the tip!
Tanton
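For anyone who wants to repeat this, here is a minimal sketch that builds the command above and checks the config directory first. The function name pig_launch_cmd and the PIG_CP variable are hypothetical, and the check for hadoop-site.xml assumes the Hadoop config directory holds its site settings in a file of that name. The classpath entry "pig" is copied from the message as written. The function only prints the command; it does not start Pig.

```shell
#!/bin/sh
# Hypothetical helper: print the launch command from the thread for a given
# Hadoop config directory. PIG_CP is an assumed placeholder for the Pig
# classpath entry; it defaults to "pig", as written in the message above.
pig_launch_cmd() {
  conf_dir="$1"
  pig_cp="${PIG_CP:-pig}"

  # Refuse to build a command for a directory that does not exist.
  if [ ! -d "$conf_dir" ]; then
    echo "error: config directory '$conf_dir' not found" >&2
    return 1
  fi

  # Assumption: the cluster settings live in hadoop-site.xml in this directory.
  if [ ! -f "$conf_dir/hadoop-site.xml" ]; then
    echo "warning: no hadoop-site.xml in '$conf_dir'" >&2
  fi

  echo "java -cp $pig_cp:$conf_dir org.apache.pig.Main"
}
```

Calling pig_launch_cmd "$HADOOPSITECONFIG" prints the command so it can be inspected before running it. If the startup log still reports file:/// afterwards, the config directory on the classpath is the first thing to re-check.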
Re: Reading file from HDFS
Posted by Tanton Gibbs <ta...@gmail.com>.
I ran the pig script in the bin directory.
I looked for pig.pl (mentioned in the wiki) but couldn't find it.
I set HADOOPSITECONFIG and HADOOP_HOME, but apparently that isn't enough :)
RE: Reading file from HDFS
Posted by Olga Natkovich <ol...@yahoo-inc.com>.
That startup message (file system at: file:///) means that Pig is not
connected to your Hadoop cluster and is using the local file system
instead. What command did you use to start Pig?
Olga
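The startup line quoted in the original post names the file system Pig attached to, and a file: URI refers to the local file system rather than HDFS. As a rough illustration, the sketch below pulls the URI out of that log line and classifies it by scheme. The function names fs_kind and fs_uri_from_log are hypothetical helpers, not part of Pig or Hadoop, and the log format is assumed to match the line in the original post.

```shell
#!/bin/sh
# Hypothetical helper: classify a file system URI by its scheme.
fs_kind() {
  case "$1" in
    file:*) echo "local" ;;
    hdfs:*) echo "hdfs" ;;
    *)      echo "unknown" ;;
  esac
}

# Hypothetical helper: return whatever follows "file system at: " in a
# startup log line (format assumed from the message in the original post).
fs_uri_from_log() {
  printf '%s\n' "${1##*file system at: }"
}

log_line='1 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///'
uri=$(fs_uri_from_log "$log_line")
echo "Pig is using a $(fs_kind "$uri") file system ($uri)"
```

With the log line from the original post this reports a local file system, which is consistent with ls in Grunt listing local files.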