Posted to user@cassandra.apache.org by Thamizh <tc...@yahoo.co.in> on 2011/08/23 11:51:40 UTC
multi-node cassandra config doubt
Hi All,
This is a question about multi-node cluster configuration.
I have configured a 3-node cluster using Cassandra 0.8.4, and I get an error when I run a Map/Reduce job that uploads records from HDFS to Cassandra.
Here are the cassandra.yaml settings for the 3 nodes:
node01:
seeds: "node01,node02,node03"
auto_bootstrap: false
listen_address: 192.168.0.1
rpc_address: 192.168.0.1
node02:
seeds: "node01,node02,node03"
auto_bootstrap: true
listen_address: 192.168.0.2
rpc_address: 192.168.0.2
node03:
seeds: "node01,node02,node03"
auto_bootstrap: true
listen_address: 192.168.0.3
rpc_address: 192.168.0.3
When I run the M/R program, I get the error below:
11/08/23 04:37:00 INFO mapred.JobClient: map 100% reduce 11%
11/08/23 04:37:06 INFO mapred.JobClient: map 100% reduce 22%
11/08/23 04:37:09 INFO mapred.JobClient: map 100% reduce 33%
11/08/23 04:37:14 INFO mapred.JobClient: Task Id : attempt_201104211044_0719_r_000000_0, Status : FAILED
java.lang.NullPointerException
at org.apache.cassandra.client.RingCache.getRange(RingCache.java:130)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:125)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:60)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:90)
at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Is anything wrong in my cassandra.yaml file?
I followed http://wiki.apache.org/cassandra/MultinodeCluster for cluster configuration.
Regards,
Thamizhannal
Re: multi-node cassandra config doubt
Posted by Thamizh <tc...@yahoo.co.in>.
Hi All,
It looks like this is a known issue with Cassandra 0.8.4. So I either have to wait until 0.8.5 is released or switch to 0.7.8, if the issue is resolved there.
Ref: https://issues.apache.org/jira/browse/CASSANDRA-3044
Regards,
Thamizhannal P
Re: multi-node cassandra config doubt
Posted by Thamizh <tc...@yahoo.co.in>.
Hi Aaron,
Thanks a lot for your suggestions. I am exhausted by the error below. It would be great if you could point out what is wrong with my approach.
I want to install Cassandra 0.8.4 on 3 nodes and run a Map/Reduce job that uploads data from HDFS to Cassandra.
I have installed Cassandra on 3 nodes, lab02 (199.168.0.2), lab03 (199.168.0.3) and lab04 (199.168.0.4). I can create a keyspace and a column family, and they get distributed across the cluster.
When I run my Map/Reduce program, it ends with an "UnknownHostException". The same Map/Reduce program works well on a single-node cluster.
Here are the steps I have followed.
1. cassandra.yaml details
lab02(199.168.0.2): (seed node)
auto_bootstrap: false
seeds: "199.168.0.2"
listen_address: 199.168.0.2
rpc_address: 199.168.0.2
lab03(199.168.0.3):
auto_bootstrap: true
seeds: "199.168.0.2"
listen_address: 199.168.0.3
rpc_address: 199.168.0.3
lab04(199.168.0.4):
auto_bootstrap: true
seeds: "199.168.0.2"
listen_address: 199.168.0.4
rpc_address: 199.168.0.4
2. Output of bin/cassandra:
------
------
INFO 11:59:40,602 Node /199.168.0.2 is now part of the cluster
INFO 11:59:40,604 InetAddress /199.168.0.2 is now UP
INFO 11:59:55,667 Node /199.168.0.4 is now part of the cluster
INFO 11:59:55,669 InetAddress /199.168.0.4 is now UP
INFO 12:01:08,389 Joining: getting bootstrap token
INFO 12:01:08,410 New token will be 43083119672609054510947312506340649252 to assume load from /199.168.0.2
INFO 12:01:08,412 Enqueuing flush of Memtable-LocationInfo@6824966(123/153 serialized/live bytes, 4 ops)
INFO 12:01:08,413 Writing Memtable-LocationInfo@6824966(123/153 serialized/live bytes, 4 ops)
INFO 12:01:08,461 Completed flushing /var/lib/cassandra/data/system/LocationInfo-g-2-Data.db (287 bytes)
INFO 12:01:08,477 Node /199.168.0.3 state jump to normal
INFO 12:01:08,480 Enqueuing flush of Memtable-LocationInfo@10141941(53/66 serialized/live bytes, 2 ops)
INFO 12:01:08,482 Writing Memtable-LocationInfo@10141941(53/66 serialized/live bytes, 2 ops)
INFO 12:01:08,514 Completed flushing /var/lib/cassandra/data/system/LocationInfo-g-3-Data.db (163 bytes)
INFO 12:01:08,527 Node /199.168.0.3 state jump to normal
INFO 12:01:08,652 mx4j successfuly loaded
HttpAdaptor version 3.0.1 started on port 8081
3. When I run my Map/Reduce program, it ends with an "UnknownHostException":
Error: java.net.UnknownHostException: /199.168.0.2
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1200)
at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
at java.net.InetAddress.getAllByName(InetAddress.java:1083)
at java.net.InetAddress.getAllByName(InetAddress.java:1019)
at java.net.InetAddress.getByName(InetAddress.java:969)
at org.apache.cassandra.client.RingCache.refreshEndpointMap(RingCache.java:93)
at org.apache.cassandra.client.RingCache.<init>(RingCache.java:67)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:98)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:92)
at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:132)
at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:62)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Here are the config lines for the Map/Reduce job:
job4.setReducerClass(TblUploadReducer.class );
job4.setOutputKeyClass(ByteBuffer.class);
job4.setOutputValueClass(List.class);
job4.setOutputFormatClass(ColumnFamilyOutputFormat.class);
ConfigHelper.setOutputColumnFamily(job4.getConfiguration(), args[1],args[3] );
ConfigHelper.setRpcPort(job4.getConfiguration(), args[7]); // 9160
ConfigHelper.setInitialAddress(job4.getConfiguration(), args[9]); // 199.168.0.2
ConfigHelper.setPartitioner(job4.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");
Steps I have verified:
1. Passwordless ssh has been configured between lab02, lab03 and lab04. All the nodes can ping each other without any issues.
2. When I run "InetAddress.getLocalHost()" from a Java program on lab02, it prints "lab02/199.168.0.2".
3. When I look at the output of bin/cassandra, it prints a couple of messages with "/199.168.0.3" etc. in the InetAddress field. It does not print "hostname/IP" there. Is that a problem?
Kindly help me.
Regards,
Thamizhannal
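On the question in step 3 above: a slash with nothing in front of it is how `java.net.InetAddress.toString()` prints an address that has no hostname attached. The format is `hostname/literal-IP`, and the hostname part is empty when the object was built from a raw IP. That is harmless in a log line, but a string such as `/199.168.0.2` is not an IP literal, so passing it to `InetAddress.getByName` sends it to the name service, which is consistent with the top frames of the stack trace. Below is a minimal sketch of that behaviour. The names `SlashAddressDemo`, `printedForm` and `ipPart` are hypothetical, and whether this is exactly what happens inside `RingCache` in 0.8.4 is an assumption, not something this message confirms.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; not part of Cassandra or Hadoop.
class SlashAddressDemo {

    // Builds an IPv4 address from raw bytes (no hostname, no DNS lookup)
    // and returns its toString() form.
    static String printedForm(int a, int b, int c, int d) {
        try {
            byte[] raw = {(byte) a, (byte) b, (byte) c, (byte) d};
            return InetAddress.getByAddress(raw).toString();
        } catch (UnknownHostException e) {
            // Only thrown for an illegal byte-array length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    // Keeps only the part after the first '/', i.e. the literal IP
    // of a "hostname/IP" or "/IP" string.
    static String ipPart(String printed) {
        int slash = printed.indexOf('/');
        return slash >= 0 ? printed.substring(slash + 1) : printed;
    }

    public static void main(String[] args) throws UnknownHostException {
        String printed = printedForm(199, 168, 0, 2);
        System.out.println(printed);                 // prints "/199.168.0.2"
        String literal = ipPart(printed);
        // A bare IP literal is parsed directly, without a name-service lookup.
        System.out.println(InetAddress.getByName(literal).getHostAddress()); // prints "199.168.0.2"
    }
}
```

If a client receives endpoints in the printed form, using `getHostAddress()` on the sending side or stripping the prefix as in `ipPart` on the receiving side would avoid the lookup.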
Re: multi-node cassandra config doubt
Posted by aaron morton <aa...@thelastpickle.com>.
Jump on the machine that raised the error and see if you can ssh to node01.
Or try using the IP addresses to see if they work.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
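The same check can also be run from Java on the machine whose task failed, since the job resolves names through the JVM rather than through ssh. This is a minimal sketch under that assumption; `ResolveCheck` and `describe` are hypothetical names, not part of Hadoop or Cassandra, and the hosts to test are passed on the command line.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper; reports how the local JVM resolves each given host.
class ResolveCheck {

    // Returns "host -> ip" when the JVM can resolve the host,
    // and "host -> UNRESOLVED" when the lookup fails.
    static String describe(String host) {
        try {
            return host + " -> " + InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return host + " -> UNRESOLVED";
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java ResolveCheck <host> [<host> ...]");
            return;
        }
        for (String host : args) {
            System.out.println(describe(host));
        }
    }
}
```

For example, `java ResolveCheck node01 node02 node03` on each Hadoop slave would show whether the names in /etc/hosts are visible to the JVM there.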
Re: multi-node cassandra config doubt
Posted by Thamizh <tc...@yahoo.co.in>.
Hi Aaron,
This is yet to be resolved.
I have set up a multi-node Cassandra cluster and am having trouble pushing HDFS data to Cassandra. When I run the Map/Reduce program, I get an UnknownHostException.
In Hadoop (0.20.1), I have configured node01 as master and node01, node02 and node03 as slaves.
In Cassandra (0.8.4), installation and configuration are done. When I issue the nodetool ring command I can see the ring, and the keyspaces and column families have been distributed.
Output of nodetool:
$ bin/nodetool -h node02 ring
Address      DC           Rack   Status  State   Load       Owns    Token
                                                                    161930152162677484001961360738128229499
198.168.0.1  datacenter1  rack1  Up      Normal  132.28 MB  12.48%  13027320554261208311902766005835168982
198.168.0.2  datacenter1  rack1  Up      Normal  99.34 MB   75.07%  140745249930211229277235689500208693608
198.168.0.3  datacenter1  rack1  Up      Normal  66.21 KB   12.45%  161930152162677484001961360738128229499
nutch@lab02:/code/apache-cassandra-0.8.4$
Here is the Hadoop job configuration:
job4.setOutputFormatClass(ColumnFamilyOutputFormat.class);
ConfigHelper.setOutputColumnFamily(job4.getConfiguration(), KEYSPACE, COLUMN_FAMILY);
ConfigHelper.setRpcPort(job4.getConfiguration(), "9160");
ConfigHelper.setInitialAddress(job4.getConfiguration(), "node01");
ConfigHelper.setPartitioner(job4.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");
Below is the exception message:
Error: java.net.UnknownHostException: /198.168.0.3
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1200)
at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
at java.net.InetAddress.getAllByName(InetAddress.java:1083)
at java.net.InetAddress.getAllByName(InetAddress.java:1019)
at java.net.InetAddress.getByName(InetAddress.java:969)
at org.apache.cassandra.client.RingCache.refreshEndpointMap(RingCache.java:93)
at org.apache.cassandra.client.RingCache.<init>(RingCache.java:67)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:98)
at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:92)
at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:132)
at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:62)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Note: the same /etc/hosts file is used on all the nodes.
Kindly help me resolve this issue.
Regards,
Thamizhannal P
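As a side note on the nodetool ring output in this message, the Owns column can be reproduced from the Token column. The sketch below assumes RandomPartitioner with a token space of 2^127 and that each node owns the range from the previous node's token up to its own, wrapping around the ring. `RingOwnership` and `owns` are hypothetical names used only for this illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical helper; recomputes the "Owns" percentages from ring tokens.
class RingOwnership {

    // Assumed size of the RandomPartitioner token space: 2^127.
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    // Share of the ring between prevToken and token, as a percentage with
    // two decimals. mod(RING) handles the wrap-around for the first node.
    static String owns(String prevToken, String token) {
        BigInteger width = new BigInteger(token)
                .subtract(new BigInteger(prevToken))
                .mod(RING);
        BigDecimal percent = new BigDecimal(width.multiply(BigInteger.valueOf(100)))
                .divide(new BigDecimal(RING), 2, RoundingMode.HALF_UP);
        return percent.toPlainString() + "%";
    }

    public static void main(String[] args) {
        // Tokens copied from the nodetool ring output above.
        String t1 = "13027320554261208311902766005835168982";
        String t2 = "140745249930211229277235689500208693608";
        String t3 = "161930152162677484001961360738128229499";
        System.out.println("198.168.0.1 " + owns(t3, t1)); // prints "198.168.0.1 12.48%"
        System.out.println("198.168.0.2 " + owns(t1, t2)); // prints "198.168.0.2 75.07%"
        System.out.println("198.168.0.3 " + owns(t2, t3)); // prints "198.168.0.3 12.45%"
    }
}
```

The results match the Owns column, which shows that the uneven split comes from the token positions rather than from the amount of data loaded.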
--- On Wed, 24/8/11, aaron morton <aa...@thelastpickle.com> wrote:
From: aaron morton <aa...@thelastpickle.com>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Wednesday, 24 August, 2011, 2:40 PM
Did you get this sorted ?
At a guess I would say there are no nodes listed in the Hadoop JobConf.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
Re: multi-node cassandra config doubt
Posted by aaron morton <aa...@thelastpickle.com>.
Did you get this sorted?
At a guess I would say there are no nodes listed in the Hadoop JobConf.
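One way to act on that guess is to check, before submitting the job, that the connection settings are actually present. The sketch below is only an illustration under stated assumptions: it uses a plain Map as a stand-in for the Hadoop job configuration, and the key names "initialAddress", "rpcPort" and "partitioner" are hypothetical labels, not the real property names that ConfigHelper writes. The class JobSettingsCheck is likewise made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-submit check; a Map stands in for the Hadoop job configuration.
class JobSettingsCheck {
    /** Returns the required keys whose values are absent or blank, in the order given. */
    static List<String> missing(Map<String, String> settings, String... required) {
        List<String> out = new ArrayList<>();
        for (String key : required) {
            String value = settings.get(key);
            if (value == null || value.trim().isEmpty()) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed, illustrative key names; the values are the ones used in this thread.
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("initialAddress", "node01");
        settings.put("rpcPort", "9160");
        settings.put("partitioner", "org.apache.cassandra.dht.RandomPartitioner");

        List<String> absent = missing(settings, "initialAddress", "rpcPort", "partitioner");
        if (!absent.isEmpty()) {
            throw new IllegalStateException("Job settings not set: " + absent);
        }
        System.out.println("All required job settings are present.");
    }
}
```

In a real job the same idea would be applied to the values set through the ConfigHelper calls shown earlier in the thread, so a missing node address fails at submit time rather than inside a reduce task.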
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 23/08/2011, at 9:51 PM, Thamizh wrote:
> Hi All,
>
> This is regarding multi-node cluster configuration doubt.
>
> I have configured 3 nodes of cluster using Cassandra-0.8.4 and getting error when I ran Map/Reduce job which uploads records from HDFS to Cassandra.
>
> Here are my 3 nodes cluster config file (cassandra.yaml) for Cassandra:
>
> node01:
> seeds: "node01,node02,node03"
> auto_bootstrap: false
> listen_address: 192.168.0.1
> rpc_address: 192.168.0.1
>
>
> node02:
>
> seeds: "node01,node02,node03"
> auto_bootstrap: true
> listen_address: 192.168.0.2
> rpc_address: 192.168.0.2
>
>
> node03:
> seeds: "node01,node02,node03"
> auto_bootstrap: true
> listen_address: 192.168.0.3
> rpc_address: 192.168.0.3
>
> When I ran M/R program, I am getting below error
> 11/08/23 04:37:00 INFO mapred.JobClient: map 100% reduce 11%
> 11/08/23 04:37:06 INFO mapred.JobClient: map 100% reduce 22%
> 11/08/23 04:37:09 INFO mapred.JobClient: map 100% reduce 33%
> 11/08/23 04:37:14 INFO mapred.JobClient: Task Id : attempt_201104211044_0719_r_000000_0, Status : FAILED
> java.lang.NullPointerException
> at org.apache.cassandra.client.RingCache.getRange(RingCache.java:130)
> at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:125)
> at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:60)
> at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:90)
> at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
> Is anything wrong on my cassandra.yaml file?
>
> I followed http://wiki.apache.org/cassandra/MultinodeCluster for cluster configuration.
>
> Regards,
> Thamizhannal