Posted to user@zookeeper.apache.org by Christophe Bisciglia <ch...@cloudera.com> on 2009/06/25 02:02:45 UTC

Next Bay Area Hadoop User Group - Focus on Hadoop 0.20 and Core Project Split

Bay Area Hadoop Fans,

We're excited to hold our first Hadoop User Group at Cloudera's office
in Burlingame (just south of SFO). We pushed the start time back 30
minutes to allow a little extra time to drive further north, and we
hope the mid-way location brings more users from San Francisco.

Since meetup.com seems to be the norm for HUGs around the country, we
created a meetup group for the Bay Area
(http://www.meetup.com/Bay-Area-Hadoop-User-Group-HUG). Join this
group to stay up to date with additional meetings and locations.
We're hoping to move the location around, potentially alternating
between north bay and south bay.

We've scheduled the next meetup for July 15th at 6:30 PM. Our office
isn't huge, but we do have room for 40 friendly people:
http://www.meetup.com/Bay-Area-Hadoop-User-Group-HUG/calendar/10728923/

We'll focus this meeting on Hadoop 0.20 and the split of "core" into
mapreduce, hdfs and common projects. Specifically, we'll go over new
features, API changes, upgrade experiences and more. If you'd like to
present about your experience, please let me know. If you'd like to
present about something else altogether, also let me know, and we'll
see what we can do at this or a later meetup.

We'll provide beer, drinks and snacks, and if there are any board game
fans in the house, we won't kick you out afterwards :-) On a more
serious note, after the meetup is a great opportunity to meet
Cloudera's engineering team and get advice about any headaches you
might be having.

We'll post the agenda to the meetup group as soon as we hear from
potential presenters and nail things down.

Christophe

-- 
get hadoop: cloudera.com/hadoop
online training: cloudera.com/hadoop-training
blog: cloudera.com/blog
twitter: twitter.com/cloudera

Re: Error while running pig in hadoop mode

Posted by zhangjiayin <zh...@360quan.com>.

You must put the directory and files into HDFS. In hadoop mode, Pig
resolves the LOAD path against HDFS, not the local file system.

Use

hadoop fs -ls /mnt/fileprocessor/hadoop/pig-0.2.0/nmstest/nms.csv

to check whether the file exists in HDFS.
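
If the file is missing, the copy step could look roughly like the
sketch below. It assumes you want the file at the same absolute path
in HDFS that the script's LOAD statement uses. stage_to_hdfs is a
hypothetical helper name, not something shipped with Hadoop or Pig;
only the hadoop fs -mkdir, -put and -ls subcommands come from Hadoop
itself.

```shell
#!/bin/sh
# Sketch only: stage_to_hdfs is a hypothetical helper, not part of Hadoop or Pig.
# It copies a local file to the same absolute path inside HDFS, then lists it.
stage_to_hdfs() {
    src="$1"                       # absolute path of the local input file
    if [ ! -f "$src" ]; then
        echo "local file not found: $src" >&2
        return 1
    fi
    dest_dir=$(dirname "$src")     # reuse the same directory path in HDFS

    # Create the target directory. This may report an error if the directory
    # already exists, and newer Hadoop releases need "-mkdir -p" to create
    # missing parent directories, so its result is not treated as fatal here.
    hadoop fs -mkdir "$dest_dir"

    # Copy the local file into HDFS under the same path.
    hadoop fs -put "$src" "$src" || return 1

    # Confirm that the file is now visible in HDFS.
    hadoop fs -ls "$src"
}
```

For the script in this thread, the call would be

stage_to_hdfs /mnt/fileprocessor/hadoop/pig-0.2.0/nmstest/nms.csv

after which the same pig script can be run again in hadoop mode.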


Error while running pig in hadoop mode

Posted by "baburaj.S" <ba...@onmobile.com>.

I was successful in setting up pig in hadoop mode. I didn't do anything
special. I installed and configured Hadoop and pig again and it
started working. But now I have a new problem. When I run a script in
local mode, it works fine. But when I try to run the same script in
hadoop mode, it gives an error saying that it cannot find the file
which I am loading in the pig script. The following is my pig
script:


nms = LOAD '/mnt/fileprocessor/hadoop/pig-0.2.0/nmstest/nms.csv' USING PigStorage(',');
grpd = GROUP nms BY $4;
cntd = FOREACH grpd GENERATE group,SUM(nms.$9),COUNT(nms);
DUMP cntd;


And this is the error that I am getting in the log file


ERROR 2998: Unhandled internal error.
org.apache.pig.backend.executionengine.ExecException: ERROR 2100: /mnt/fileprocessor/hadoop/pig-0.2.0/nmstest/nms.csv does not exist.
        at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:120)
        at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
        at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:214)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:619)

java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 2100: /mnt/fileprocessor/hadoop/pig-0.2.0/nmstest/nms.csv does not exist.
        at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:120)
        at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
        at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:214)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:619)

        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:131)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:133)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:261)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:695)
        at org.apache.pig.PigServer.store(PigServer.java:498)
        at org.apache.pig.PigServer.store(PigServer.java:465)
        at org.apache.pig.PigServer.openIterator(PigServer.java:426)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:193)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:82)
        at org.apache.pig.Main.main(Main.java:354)
ERROR 2100: /mnt/fileprocessor/hadoop/pig-0.2.0/nmstest/nms.csv does not exist.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias cntd
        at org.apache.pig.PigServer.openIterator(PigServer.java:438)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:193)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:82)
        at org.apache.pig.Main.main(Main.java:354)


Please help me in solving this.

Regards
Babu