Posted to user@pig.apache.org by Matteo Nasi <ma...@gmail.com> on 2010/01/04 22:42:44 UTC

Re: Problem in running Pig in hadoop mode

Hi Anastasia,

I'm new to this technology too, but I had problems like yours, and now I
can connect to the cluster in MapReduce mode.
Reading your post, it seems you didn't set up Pig correctly. There are a couple of
ways of doing it (I guess they're both equivalent):

- pig.properties
- hadoop cluster xml files (core, mapred, dfs)

They must all be placed in the conf subdirectory of $PIG_HOME (the place where
you unpacked your Pig distribution).
I suggest using the xml files directly from your hadoop conf dir; doing it that
way you can't get it wrong. Anyway, pig.properties should look something like this
(matching what you have in the xml conf files in hadoop):

fs.default.name=hdfs://master/
mapred.job.tracker=master:54311
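
If you go the xml route instead, the same two values live in core-site.xml and
mapred-site.xml on the hadoop side (0.20.x file names; the /usr/local/hadoop path
below is just a guess, use wherever your hadoop conf dir actually is):

  core-site.xml:
    <property>
      <name>fs.default.name</name>
      <value>hdfs://master/</value>
    </property>

  mapred-site.xml:
    <property>
      <name>mapred.job.tracker</name>
      <value>master:54311</value>
    </property>

and then copy them next to pig.properties:

  cp /usr/local/hadoop/conf/core-site.xml \
     /usr/local/hadoop/conf/mapred-site.xml \
     /usr/local/hadoop/conf/hdfs-site.xml \
     $PIG_HOME/conf/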

After that, just run $PIG_HOME/bin/pig without options to run in MapReduce
mode (it's the default mode), and try a few commands interactively
(ls, etc.) to see if you can reach HDFS.
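
Just as a quick check: when the conf is picked up correctly, the startup log
should mention your HDFS URI instead of file:/// (I'm quoting from memory, the
exact wording may differ a bit):

  $ $PIG_HOME/bin/pig
  ... INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
      Connecting to hadoop file system at: hdfs://master/
  grunt> ls
  grunt> cat /user/hadoop/id.out

If it still says "Connecting to hadoop file system at: file:///" like in your
log, Pig hasn't found the settings and grunt will keep showing your local files.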

I'm using hadoop 0.20.1 and pig 0.5.0. Hope this helps, and maybe someone more
expert can be more precise ;-)

ciao Matteo

2009/12/30 Anastasia Theodouli <sa...@hotmail.com>

>
> Dear users,
>
> I'm new to hadoop and pig and I really feel I need some help...
>
> I managed to set up a hadoop cluster on two Ubuntu boxes. All hadoop
> daemons start without any problems.
> I can also successfully copy files to the HDFS from the local filesystem.
>
> The problem is that I can't run pig in mapreduce mode. I can do it in local
> mode though...
>
> Every time I try to run an example script (from the Pig wiki examples), I
> get this:
> 2009-12-30 20:50:08,872 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to hadoop file system at: file:///
> 2009-12-30 20:50:09,037 [main] INFO
>  org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with
> processName=JobTracker, sessionId=
>
> Furthermore, from grunt shell I seem to be connected to the local
> filesystem
> grunt> ls shows me all the local files and not the HDFS ones.
>
> I don't know how to make the settings in the file
> <PIG_HOME>/conf/pig.properties. Please note also that I have created this
> file manually. Which environment variables (PIG_CLASSPATH, HADOOPDIR,
> others?) should I set there? Should it be this way:
> <property><name>....</name><value>....</value></property>
> or this way: export <variable_name>=value ?
> Any example concerning this file would be highly appreciated, as I haven't
> found any so far.
>
> I also tried to change the pig execution mode using the command 'pig -x
> mapreduce', but I got this message in bash: pig: invalid option -- 'x'
>
> You can find below the full error stack I got when I tried to access a
> file on HDFS through the grunt shell.
>
> My commands in the grunt shell
>
> grunt> A= load 'hdfs://master:54310/id.out';
> grunt> dump A;
>
> (I got the same error below by using this command with the full
> path to the file in HDFS.
>
> grunt> A= load 'hdfs://master:54310/user/hadoop/id.out';
> grunt> dump A; )
>
> The error stack
>
>  2009-12-30 21:11:15,266 [main] INFO
>  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
> with processName=JobTracker, sessionId= - already initialized
> 2009-12-30 21:11:15,268 [Thread-21] WARN
>  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing
> the arguments. Applications should implement Tool for the same.
> 2009-12-30 21:11:20,267 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 0% complete
> 2009-12-30 21:11:20,267 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Map reduce job failed
> 2009-12-30 21:11:20,267 [main] ERROR
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - java.io.IOException: Call failed on local exception
>    at org.apache.hadoop.ipc.Client.call(Client.java:718)
>    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>    at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
>    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
>    at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:103)
>    at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:173)
>    at
> org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:67)
>    at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
>    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
>    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
>    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
>    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
>    at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:189)
>    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>    at
> org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>    at java.lang.Thread.run(Thread.java:636)
> Caused by: java.io.EOFException
>    at java.io.DataInputStream.readInt(DataInputStream.java:392)
>    at
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:499)
>    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:441)
>
> 2009-12-30 21:11:20,268 [main] ERROR org.apache.pig.tools.grunt.GruntParser
> - java.io.IOException: Unable to open iterator for alias: A [Job terminated
> with anomalous status FAILED]
>    at org.apache.pig.PigServer.openIterator(PigServer.java:410)
>    at
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
>    at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
>    at
> org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94)
>    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
>    at org.apache.pig.Main.main(Main.java:282)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>    ... 6 more
>
> 2009-12-30 21:11:20,268 [main] ERROR org.apache.pig.tools.grunt.GruntParser
> - Unable to open iterator for alias: A [Job terminated with anomalous status
> FAILED]
> 2009-12-30 21:11:20,268 [main] ERROR org.apache.pig.tools.grunt.GruntParser
> - java.io.IOException: Unable to open iterator for alias: A [Job terminated
> with anomalous status FAILED]
>
> If any more experienced user can figure out what the problem(s) is, I would
> be grateful!
>
> Regards,
>
> Anastasia Th.
>
>