Posted to user@cassandra.apache.org by pob <pe...@gmail.com> on 2011/04/20 01:28:22 UTC

pig + hadoop

Hello,

I configured the cluster following
http://wiki.apache.org/cassandra/HadoopSupport. When I run
pig example-script.pig -x local, everything is fine and I get correct results.

The problem occurs with -x mapreduce.

I'm getting these errors:


2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR: java.lang.NumberFormatException: null
2011-04-20 01:24:21,792 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics:

Input(s):
Failed to read data from "cassandra://Keyspace1/Standard1"

Output(s):
Failed to produce result in "hdfs://ip:54310/tmp/temp-1383865669/tmp-1895601791"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201104200056_0005   ->      null,
null    ->      null,
null


2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2011-04-20 01:24:21,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias topnames. Backend error : java.lang.NumberFormatException: null



====
This is from the job tasks web management page, the error reported by the task directly:

java.lang.RuntimeException: java.lang.NumberFormatException: null
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:123)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:176)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:417)
at java.lang.Integer.parseInt(Integer.java:499)
at org.apache.cassandra.hadoop.ConfigHelper.getRpcPort(ConfigHelper.java:233)
at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:105)
... 5 more
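
[Editor's sketch] The "Caused by" frames show Integer.parseInt being reached from ConfigHelper.getRpcPort with a null string, which is what one would expect if the lookup for the port setting returned nothing. The Java sketch below reproduces only that JDK behaviour. The class name NullPortDemo, the map standing in for the job configuration and both helper methods are hypothetical, and this is not Cassandra's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a job configuration; not Cassandra's ConfigHelper.
final class NullPortDemo {

    // Parses whatever the lookup returns, without checking for a missing value.
    static int naivePort(Map<String, String> conf) {
        // Map.get returns null for a missing key, and Integer.parseInt(null)
        // throws NumberFormatException.
        return Integer.parseInt(conf.get("cassandra.thrift.port"));
    }

    // Returns true if parsing the port failed with NumberFormatException.
    static boolean failsToParse(Map<String, String> conf) {
        try {
            naivePort(conf);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("port unset, parse fails: " + failsToParse(conf));
        conf.put("cassandra.thrift.port", "9160");
        System.out.println("port set, parsed value: " + naivePort(conf));
    }
}
```

If the sketch matches what happens on the cluster, the fix is to make the port setting visible to the task, not to change the script.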



Any suggestions as to where the problem might be?

Thanks,

Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

My mistake,

please ignore my last post.



Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

Hi,

Everything works fine with Cassandra 0.7.5, but when I tried 0.7.3
other errors showed up. Strangely, the task still finished successfully.


2011-04-20 11:45:40,674 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201104201139_0004_m_000000_3: Error: java.lang.ClassNotFoundException: org.apache.thrift.TException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:426)
        at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:456)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:153)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:105)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)



2011-04-20 11:45:43,629 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201104201139_0004_m_000001_3: org.apache.pig.backend.executionengine.ExecException: ERROR 2044: The type null cannot be collected as a Key type
        at org.apache.pig.backend.hadoop.HDataType.getWritableComparableTypes(HDataType.java:143)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:105)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


2011-04-20 11:42:49,498 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201104201139_0001_m_000000_1: Error: java.lang.ClassNotFoundException: org.apache.commons.lang.ArrayUtils
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at org.apache.cassandra.utils.ByteBufferUtil.<clinit>(ByteBufferUtil.java:75)
        at org.apache.cassandra.hadoop.pig.CassandraStorage.<clinit>(Unknown Source)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:426)
        at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:456)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:153)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:105)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
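
[Editor's sketch] The two ClassNotFoundException traces name classes (org.apache.thrift.TException and org.apache.commons.lang.ArrayUtils) that the task JVM could not load. That generally suggests the jars providing them were not on that JVM's classpath. The helper below is a hypothetical diagnostic, not part of Pig, Hadoop or Cassandra. It only reports whether given class names can be located by the JVM it runs in, so it would have to be run with the same classpath as the failing task to be meaningful.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Pig, Hadoop or Cassandra.
final class ClasspathCheck {

    // Returns true if the named class can be found by this JVM's class loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of names that could not be loaded, in input order.
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isLoadable(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Class names taken from the traces above.
        System.out.println("Missing: " + missing(
            "org.apache.thrift.TException",
            "org.apache.commons.lang.ArrayUtils"));
    }
}
```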






Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

Hi,

That was the problem! Thanks. You should put that into your
documentation.


Thanks for the help!


Best,
P

2011/4/20 Jeremy Hanna <je...@gmail.com>

> Just as an example:
>
>  <property>
>    <name>cassandra.thrift.address</name>
>    <value>10.12.34.56</value>
>  </property>
>  <property>
>    <name>cassandra.thrift.port</name>
>    <value>9160</value>
>  </property>
>  <property>
>    <name>cassandra.partitioner.class</name>
>    <value>org.apache.cassandra.dht.RandomPartitioner</value>
>  </property>
>
>

Re: pig + hadoop

Posted by Jeremy Hanna <je...@gmail.com>.

Just as an example:

  <property>
    <name>cassandra.thrift.address</name>
    <value>10.12.34.56</value>
  </property>
  <property>
    <name>cassandra.thrift.port</name>
    <value>9160</value>
  </property>
  <property>
    <name>cassandra.partitioner.class</name>
    <value>org.apache.cassandra.dht.RandomPartitioner</value>
  </property>



Re: pig + hadoop

Posted by Jeremy Hanna <je...@gmail.com>.

Oh yeah, that's what's going on. On the machine that I run the pig script from, I set the PIG_CONF variable to my HADOOP_HOME/conf directory, and in the mapred-site.xml file found there I set the three variables.

I don't use environment variables when I run against a cluster.
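
[Editor's sketch] The thread names two ways to supply each of the three settings: a Hadoop configuration property (cassandra.thrift.address, cassandra.thrift.port, cassandra.partitioner.class) or an environment variable (PIG_INITIAL_ADDRESS, PIG_RPC_PORT, PIG_PARTITIONER). The Java sketch below shows one way such a lookup could fail with a readable message when neither is set. The class SettingResolver, the plain maps standing in for the job configuration and the environment, and the lookup order (property first, then environment) are all assumptions for illustration, not the behaviour of Cassandra's Pig integration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver; the lookup order is an assumption for illustration.
final class SettingResolver {

    // Returns the Hadoop-style property if present, otherwise the environment
    // variable, otherwise fails with a message naming both.
    static String resolve(Map<String, String> conf, Map<String, String> env,
                          String property, String envName) {
        String value = conf.get(property);
        if (value == null) {
            value = env.get(envName);
        }
        if (value == null) {
            throw new IllegalStateException(
                "Set " + property + " in the job configuration or export " + envName);
        }
        return value;
    }

    public static void main(String[] args) {
        // Values copied from the example properties posted in this thread.
        Map<String, String> conf = new HashMap<>();
        conf.put("cassandra.thrift.address", "10.12.34.56");
        conf.put("cassandra.thrift.port", "9160");
        conf.put("cassandra.partitioner.class",
                 "org.apache.cassandra.dht.RandomPartitioner");
        Map<String, String> env = System.getenv();

        System.out.println(resolve(conf, env, "cassandra.thrift.address", "PIG_INITIAL_ADDRESS"));
        System.out.println(Integer.parseInt(
            resolve(conf, env, "cassandra.thrift.port", "PIG_RPC_PORT")));
        System.out.println(resolve(conf, env, "cassandra.partitioner.class", "PIG_PARTITIONER"));
    }
}
```

Checking for the missing value before parsing turns the bare "NumberFormatException: null" into a message that names the setting to fix.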

On Apr 19, 2011, at 9:54 PM, Jeffrey Wang wrote:

> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a while before I added that.
>  
> -Jeffrey
>  
> From: pob [mailto:peterob333@gmail.com] 
> Sent: Tuesday, April 19, 2011 6:42 PM
> To: user@cassandra.apache.org
> Subject: Re: pig + hadoop
>  
> Hey Aaron,
>  
> I read it, and all three env variables were exported. The results are the same.
>  
> Best,
> P
> 
> 2011/4/20 aaron morton <aa...@thelastpickle.com>
> I'm guessing, but here goes: it looks like the Cassandra RPC port is not set. Did you follow these steps in contrib/pig/README.txt?
>  
> Finally, set the following as environment variables (uppercase,
> underscored), or as Hadoop configuration variables (lowercase, dotted):
> * PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on 
> * PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to connect to
> * PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner
>  
> Hope that helps. 
> Aaron
>  
>  

RE: pig + hadoop

Posted by Jeffrey Wang <jw...@palantir.com>.

Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a while before I added that.

-Jeffrey
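
For reference, a minimal sketch of the kind of hadoop-env.sh addition mentioned above. The value 9160 is an assumption (it is Cassandra's default Thrift rpc_port); use the port from your own cassandra.yaml. The same line would presumably be needed on every node that runs tasks, and the Hadoop daemons likely have to be restarted before they pick it up.

```shell
# conf/hadoop-env.sh (illustrative addition, not taken from the original post)
# Assumed value: 9160 is Cassandra's default Thrift port.
export PIG_RPC_PORT=9160

# Quick sanity check that the variable is set and numeric
case "$PIG_RPC_PORT" in
  ''|*[!0-9]*) echo "PIG_RPC_PORT is missing or not a number" >&2 ;;
  *)           echo "PIG_RPC_PORT=$PIG_RPC_PORT" ;;
esac
```

If the variable is unset where the task runs, the port lookup has nothing to parse, which matches the "NumberFormatException: null" raised from ConfigHelper.getRpcPort in the traces in this thread.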


Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

Hey Aaron,

I read it, and all three env variables were exported. The results are the same.

Best,
P


Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

And one more thing, from the tasktracker log:

2011-04-20 04:09:23,412 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201104200406_0001/attempt_201104200406_0001_m_000002_0/output/file.out in any of the configured local directories



Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

That's from the jobtracker:


2011-04-20 03:36:39,519 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_201104200331_0002_m_000000
2011-04-20 03:36:42,521 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201104200331_0002_m_000000_3: java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:417)
        at java.lang.Integer.parseInt(Integer.java:499)
        at org.apache.cassandra.hadoop.ConfigHelper.getRpcPort(ConfigHelper.java:250)
        at org.apache.cassandra.hadoop.pig.CassandraStorage.setConnectionInformation(Unknown Source)
        at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(Unknown Source)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:133)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:111)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


And from the tasktracker:

2011-04-20 03:33:10,942 INFO org.apache.hadoop.mapred.TaskTracker:  Using MemoryCalculatorPlugin : org.apache.hadoop.util.LinuxMemoryCalculatorPlugin@3c1fc1a6
2011-04-20 03:33:10,945 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
2011-04-20 03:33:10,946 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
2011-04-20 03:33:11,069 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201104200331_0001_m_000000_1 task's state:UNASSIGNED
2011-04-20 03:33:11,072 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201104200331_0001_m_000000_1
2011-04-20 03:33:11,072 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201104200331_0001_m_000000_1
2011-04-20 03:33:11,986 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201104200331_0001_m_-926908110
2011-04-20 03:33:11,986 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201104200331_0001_m_-926908110 spawned.
2011-04-20 03:33:12,400 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201104200331_0001_m_-926908110 given task: attempt_201104200331_0001_m_000000_1
2011-04-20 03:33:12,895 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201104200331_0001_m_000000_1 0.0%
2011-04-20 03:33:12,918 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201104200331_0001_m_-926908110 exited. Number of tasks it ran: 0
2011-04-20 03:33:15,919 INFO org.apache.hadoop.mapred.TaskRunner: attempt_201104200331_0001_m_000000_1 done; removing files.
2011-04-20 03:33:15,920 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2011-04-20 03:33:38,090 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201104200331_0001
2011-04-20 03:36:32,199 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201104200331_0002_m_000000_2 task's state:UNASSIGNED
2011-04-20 03:36:32,199 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201104200331_0002_m_000000_2
2011-04-20 03:36:32,199 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201104200331_0002_m_000000_2
2011-04-20 03:36:32,813 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201104200331_0002_m_-134007035
2011-04-20 03:36:32,814 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201104200331_0002_m_-134007035 spawned.
2011-04-20 03:36:33,214 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201104200331_0002_m_-134007035 given task: attempt_201104200331_0002_m_000000_2
2011-04-20 03:36:33,711 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201104200331_0002_m_000000_2 0.0%
2011-04-20 03:36:33,731 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201104200331_0002_m_-134007035 exited. Number of tasks it ran: 0
2011-04-20 03:36:36,732 INFO org.apache.hadoop.mapred.TaskRunner: attempt_201104200331_0002_m_000000_2 done; removing files.
2011-04-20 03:36:36,733 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2011-04-20 03:36:50,210 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201104200331_0002





Re: pig + hadoop

Posted by pob <pe...@gmail.com>.

Ad 2: it works with -x local, so the problem can't be between Pig and the DB (Cassandra).

I'm using pig-0.8 and hadoop-0.20.2, both from the official sites.

Thanks



Re: pig + hadoop

Posted by aaron morton <aa...@thelastpickle.com>.

I'm guessing, but here goes: it looks like the Cassandra RPC port is not set. Did you follow these steps in contrib/pig/README.txt?

Finally, set the following as environment variables (uppercase,
underscored), or as Hadoop configuration variables (lowercase, dotted):
* PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on 
* PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to connect to
* PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner

Hope that helps. 
Aaron
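
As an illustration of the README steps quoted above, here is a rough sketch that exports the three variables and checks that none is empty before a job is launched. All three values are assumptions for a single-node test setup (default Thrift port, local node, RandomPartitioner); substitute the ones from your own cluster.

```shell
# Assumed example values, not taken from the thread
export PIG_RPC_PORT=9160
export PIG_INITIAL_ADDRESS=localhost
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner

# Report each variable, flagging any that is empty
for name in PIG_RPC_PORT PIG_INITIAL_ADDRESS PIG_PARTITIONER; do
  eval "value=\${$name}"
  if [ -z "$value" ]; then
    echo "missing: $name" >&2
  else
    echo "$name=$value"
  fi
done

# After this check one would run, for example:
#   pig -x mapreduce example-script.pig
```

Whether exporting the variables in the launching shell is enough for -x mapreduce depends on the setup; elsewhere in this thread PIG_RPC_PORT had to be added to hadoop-env.sh before the error went away.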

