Posted to user@oozie.apache.org by Miguel Angel Martin junquera <mi...@gmail.com> on 2013/12/10 16:00:26 UTC
Setting the PIG_INPUT_INITIAL_ADDRESS environment variable in Oozie for Cassandra?
Hi,
I am getting an error from a Pig action in Oozie 4.0.0 when using CassandraStorage
(Cassandra 1.2.10).
I can run Pig scripts against Cassandra correctly, but when I try to use
CassandraStorage to load data in the Oozie Pig action, I get this error:
Run pig script using PigRunner.run() for Pig version 0.8+
Apache Pig version 0.10.0 (r1328203)
compiled Apr 20 2012, 00:33:25
2013-12-10 12:24:39,084 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0 (r1328203) compiled Apr 20 2012, 00:33:25
2013-12-10 12:24:39,095 [main] INFO org.apache.pig.Main - Logging error messages to: /tmp/hadoop-ec2-user/mapred/local/taskTracker/ec2-user/jobcache/job_201312100858_0007/attempt_201312100858_0007_m_000000_0/work/pig-job_201312100858_0007.log
2013-12-10 12:24:39,501 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.228.243.18:9000
2013-12-10 12:24:39,510 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.228.243.18:9001
2013-12-10 12:24:40,505 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245:
<file testCassandra.pig, line 7, column 7> Cannot get schema from loadFunc org.apache.cassandra.hadoop.pig.CassandraStorage
2013-12-10 12:24:40,505 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245:
<file testCassandra.pig, line 7, column 7> Cannot get schema from loadFunc org.apache.cassandra.hadoop.pig.CassandraStorage
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:155)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110)
    at org.apache.pig.newplan.logical.relational.LOStore.getSchema(LOStore.java:68)
    at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.validate(SchemaAliasVisitor.java:60)
    at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:84)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1617)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1611)
    at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1334)
    at org.apache.pig.PigServer.execute(PigServer.java:1239)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:362)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:132)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:430)
    at org.apache.pig.PigRunner.run(PigRunner.java:49)
    at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:283)
    at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:223)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
    at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: PIG_INPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
    at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:314)
    at org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:358)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
    ... 35 more
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
Oozie Launcher failed, finishing Hadoop job gracefully
-----------------------------------------------------------------------
I am using EC2 instances. I have a Hadoop cluster with Cassandra on
all nodes, where the pig_cassandra script runs correctly, and I configured
Oozie on a separate instance, alongside the NameNode.
I set the variable in my .bash_profile file like this:
...
export PIG_INITIAL_ADDRESS=${seed}
...
How can I set this environment variable in Oozie, or in Pig when it runs under Oozie?
Another question:
2. I configured the Oozie installation to use Pig 0.11.1 in the pom.xml,
following the Oozie installation instructions, but the logs show that
Oozie is using Pig 0.10.
...
Run pig script using PigRunner.run() for Pig version 0.8+
Apache Pig version 0.10.0 (r1328203)
compiled Apr 20 2012, 00:33:25
...
I uploaded the share lib to HDFS and checked that the Pig library there is version 0.11.1.
How can I change or configure this?
Thanks in advance; any help will be appreciated.
Regards
Re: Setting the PIG_INPUT_INITIAL_ADDRESS environment variable in Oozie for Cassandra?
Posted by Miguel Angel Martin junquera <mi...@gmail.com>.
Eureka!
At last!
The trick?
As well as putting the jar dependencies in the sharelib folder in
HDFS, the environment variables have to be visible to Hadoop itself.
I had defined this and the other environment variables in .bash_profile,
which works fine when I launch Pig scripts from the command-line shell.
For Oozie, I also had to define these
variables: PIG_INITIAL_ADDRESS, PIG_CONF_DIR, PIG_RPC_PORT, PIG_PARTITIONER
...
in hadoop-env.sh, and now it works fine.
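As a rough sketch of what those definitions might look like in Hadoop's environment script (conf/hadoop-env.sh in Hadoop 1.x): every value below is a placeholder and not taken from this thread. The address and the path are hypothetical, 9160 is Cassandra's default Thrift RPC port, and Murmur3Partitioner is the default partitioner for new Cassandra 1.2 clusters, so use whatever your cluster is really configured with.

```shell
# Lines one might append to conf/hadoop-env.sh on each Hadoop node.
# All values are placeholders (assumptions), not settings from the thread.
export PIG_INITIAL_ADDRESS="10.0.0.1"    # hypothetical Cassandra seed node
export PIG_RPC_PORT="9160"               # Cassandra's default Thrift RPC port
export PIG_PARTITIONER="org.apache.cassandra.dht.Murmur3Partitioner"  # must match the cluster's partitioner
export PIG_CONF_DIR="/path/to/pig/conf"  # hypothetical path to the Pig conf directory
```

The Oozie launcher runs as a Hadoop map task (see LauncherMapper.map and Child.main in the stack trace), so it most likely inherits its environment from the Hadoop daemons and not from a login shell. If so, the daemons would need a restart after this edit before the task JVMs see the new variables.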
The other exceptions were things like null exceptions in Cassandra's classes, etc.
Now it is working with Pig 0.10:
...
Run pig script using PigRunner.run() for Pig version 0.8+
Apache Pig version 0.10.0 (r1328203)
...
This is despite configuring Oozie to use Pig 0.11 and putting that Pig jar version in the
Oozie sharelib, pom, classpath, etc.
Also, I cannot run Pig scripts through shell actions in Oozie, but that is another
story!
Regards
2013/12/12 Aaron Morton <aa...@thelastpickle.com>
> > Caused by: java.io.IOException: PIG_INPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
> > at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:314)
> > at org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:358)
> > at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
> > ... 35 more
>
> Have you checked these are set ?
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
Re: Setting the PIG_INPUT_INITIAL_ADDRESS environment variable in Oozie for Cassandra?
Posted by Aaron Morton <aa...@thelastpickle.com>.
> Caused by: java.io.IOException: PIG_INPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
> at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:314)
> at org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:358)
> at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
> ... 35 more
Have you checked that these are set?
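One way to check, as a minimal sketch: the shell function below mimics the lookup implied by the error message. The function name resolve_initial_address is made up for illustration, and the order (PIG_INPUT_INITIAL_ADDRESS first, then PIG_INITIAL_ADDRESS) is an assumption based only on the wording of the message, not on the CassandraStorage source.

```shell
# Hypothetical helper: print the address CassandraStorage would presumably use,
# or fail with the same message as the IOException in the stack trace.
resolve_initial_address() {
  if [ -n "${PIG_INPUT_INITIAL_ADDRESS:-}" ]; then
    # Assumed to take precedence when both variables are set.
    printf '%s\n' "$PIG_INPUT_INITIAL_ADDRESS"
  elif [ -n "${PIG_INITIAL_ADDRESS:-}" ]; then
    printf '%s\n' "$PIG_INITIAL_ADDRESS"
  else
    echo "PIG_INPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set" >&2
    return 1
  fi
}
```

Calling resolve_initial_address from an interactive login shell only shows what that shell sees. The failing code runs inside a Hadoop map task started by the Oozie launcher, so the check is only meaningful in an environment that matches the task's.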
Cheers
-----------------
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com