Posted to commits@cassandra.apache.org by "Patrik Modesto (Created) (JIRA)" <ji...@apache.org> on 2012/01/30 08:31:10 UTC

[jira] [Created] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Empty rpc_address prevents running MapReduce job outside a cluster
------------------------------------------------------------------

                 Key: CASSANDRA-3811
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3811
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
    Affects Versions: 0.8.9
         Environment: Debian Stable,
Cassandra 0.8.9,
Java(TM) SE Runtime Environment (build 1.6.0_26-b03),
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
            Reporter: Patrik Modesto


Setting rpc_address to empty to make Cassandra listen on all network interfaces breaks running a MapReduce job from outside the cluster. The jobs won't even start, showing these messages:

{noformat}
12/01/26 11:15:21 DEBUG  hadoop.ColumnFamilyInputFormat: failed
connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
       at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException:
java.net.ConnectException: Connection refused
       at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
       at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
       at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
       ... 9 more
Caused by: java.net.ConnectException: Connection refused
       at java.net.PlainSocketImpl.socketConnect(Native Method)
       at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
       at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
       at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
       at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
       at java.net.Socket.connect(Socket.java:529)
       at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
       ... 11 more

...

Caused by: java.util.concurrent.ExecutionException:
java.io.IOException: failed connecting to all endpoints
10.0.18.129,10.0.18.99,10.0.18.98
       at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
       at java.util.concurrent.FutureTask.get(FutureTask.java:83)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:156)
       ... 19 more
Caused by: java.io.IOException: failed connecting to all endpoints
10.0.18.129,10.0.18.99,10.0.18.98
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:241)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
       at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:662)
{noformat}

Describe ring returns:

{noformat}
describe_ring returns:
endpoints: 10.0.18.129,10.0.18.99,10.0.18.98
rpc_endpoints: 0.0.0.0,0.0.0.0,0.0.0.0
{noformat}
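
For anyone who wants to reproduce the describe_ring output above, here is a minimal Thrift client sketch; the host, port and keyspace name ("Keyspace1") are placeholders for your own cluster, not values taken from this ticket:

{noformat}
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.TokenRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class DescribeRingSketch
{
    public static void main(String[] args) throws Exception
    {
        // Connect to any live node's Thrift port (9160 by default).
        TTransport transport = new TFramedTransport(new TSocket("10.0.18.129", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        // describe_ring returns one TokenRange per ring segment,
        // each with parallel lists of gossip endpoints and rpc_endpoints.
        List<TokenRange> ring = client.describe_ring("Keyspace1");
        for (TokenRange range : ring)
            System.out.println("endpoints: " + range.endpoints
                               + " rpc_endpoints: " + range.rpc_endpoints);

        transport.close();
    }
}
{noformat}

With an empty rpc_address every rpc_endpoints entry prints as 0.0.0.0, which is exactly what the job client later tries, and fails, to connect to.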

[Michael Frisch|http://www.mail-archive.com/user@cassandra.apache.org/msg20180.html] found a possible bug in the Cassandra source:

{quote}
If the code in the 0.8 branch is reflective of what is actually included in Cassandra 0.8.9 (here: http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java) then the problem is that line 202 is doing an == comparison on strings.  The correct way to compare would be endpoint_address.equals("0.0.0.0") instead.

- Mike
{quote}
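
For readers less familiar with Java string semantics, a tiny stand-alone sketch of the pitfall Mike describes; this is an illustration only, not the actual ColumnFamilyInputFormat code:

{noformat}
public class StringCompareSketch
{
    public static void main(String[] args)
    {
        // Simulates an address that arrives over Thrift rather than as a compile-time literal.
        String endpointAddress = new String("0.0.0.0");

        // Reference comparison: prints false, because the two String objects are distinct.
        System.out.println(endpointAddress == "0.0.0.0");

        // Value comparison: prints true, which is what the endpoint check needs.
        System.out.println(endpointAddress.equals("0.0.0.0"));
    }
}
{noformat}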

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Patrik Modesto (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230511#comment-13230511 ] 

Patrik Modesto commented on CASSANDRA-3811:
-------------------------------------------

I can see there is a misunderstanding about our Hadoop cluster setup. We use a quite common setup, which is why I made this ticket "critical".

Our setup looks like this:
||node||Components||
|node1|Datanode, Tasktracker, Cassandra|
|node2|Datanode, Tasktracker, Cassandra|
|node3|Datanode, Tasktracker, Cassandra|
|node4|Datanode, Tasktracker, Cassandra|
|node5|Namenode, Jobtracker, our mapreduce jobs|

All with rpc_endpoints: 0.0.0.0

I'd say this is a quite reasonable, clean setup.

The problem is that with this setup we can't use Cassandra 0.8.8 and above, because of the problem I described earlier today. With this exact setup our jobs just fail to start because there is no Cassandra on node5. Moving our jobs to, for example, node1 lets them run, but CFIF gets wrong split sizes (it asks just 0.0.0.0 for all of the key ranges) and some tasks show thousands of percent of progress. Please carefully read my earlier post about describe_splits().

We use rpc_endpoint: 0.0.0.0 because we have other non-Hadoop components that connect to Cassandra for data, and they are on different interfaces.

I hope I've explained the setup well enough that you can understand why, from my point of view, it is critical. With Cassandra 0.8.8 and above our Hadoop jobs either fail to start or fail to complete the work.

We have quite wide rows (even tens of thousands of columns) and a write-heavy cluster, so we use batch.size=512 and split.size=8196 for our Hadoop jobs (see the sketch below). That may or may not be connected to the wrong key ranges.
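
For context, a minimal sketch of how those two sizes are typically wired into a job configuration. It assumes the ConfigHelper setters of that era (setRangeBatchSize and setInputSplitSize), so double-check the names against your Cassandra version; the job name is a placeholder:

{noformat}
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobSizingSketch
{
    public static void main(String[] args) throws Exception
    {
        Job job = new Job(new Configuration(), "wide-row-job");
        Configuration conf = job.getConfiguration();

        // Rows requested per get_range_slices call; kept small because rows are wide.
        ConfigHelper.setRangeBatchSize(conf, 512);

        // Keys per input split, i.e. how much work each mapper receives.
        ConfigHelper.setInputSplitSize(conf, 8196);

        // Keyspace, column family, initial address, rpc port etc. are configured as
        // usual for ColumnFamilyInputFormat; omitted here because the setter names
        // vary between versions.
    }
}
{noformat}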

Regards,
Patrik

                

        

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Patrik Modesto (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195980#comment-13195980 ] 

Patrik Modesto commented on CASSANDRA-3811:
-------------------------------------------

Complete output of the job with the string comparison fixed. It still doesn't work.

{noformat}
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 DEBUG  hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
	... 11 more
12/01/30 09:05:28 INFO  mapred.JobClient: Cleaning up the staging area hdfs://product-hadoop-namenode:3900/www/product/hadoop/tmp/mapred/staging/mapred/.staging/job_201201161511_9053
Exception in thread "main" java.io.IOException: Could not get input splits
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:160)
	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
	at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
	at cz.company.product.context.contextindexer.ContextIndexer.run(ContextIndexer.java:1124)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at cz.company.product.context.contextindexer.ContextIndexer.main(ContextIndexer.java:124)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 0.0.0.0,0.0.0.0,0.0.0.0
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:156)
	... 19 more
Caused by: java.io.IOException: failed connecting to all endpoints 0.0.0.0,0.0.0.0,0.0.0.0
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:241)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
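
The final exception above shows why the client-side equals() change alone isn't enough: describe_ring still reports 0.0.0.0 for every rpc_endpoint, so the split assigner has nothing routable to connect to. Below is a minimal sketch, with hypothetical names and not the actual ColumnFamilyInputFormat code, of the kind of fallback that would help: pair each rpc_endpoint with the gossip endpoint at the same index and substitute it whenever the rpc_endpoint is unconnectable.

{noformat}
import java.util.ArrayList;
import java.util.List;

public class EndpointFallbackSketch
{
    // describe_ring returns, per token range, parallel lists of gossip endpoints
    // and rpc_endpoints. When rpc_address is empty the rpc_endpoint comes back as
    // 0.0.0.0, which a remote client cannot dial, so fall back to the gossip
    // (listen) address at the same index.
    static List<String> connectableEndpoints(List<String> endpoints, List<String> rpcEndpoints)
    {
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < rpcEndpoints.size(); i++)
        {
            String rpc = rpcEndpoints.get(i);
            result.add("0.0.0.0".equals(rpc) ? endpoints.get(i) : rpc);
        }
        return result;
    }
}
{noformat}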
                

        

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Brandon Williams (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230424#comment-13230424 ] 

Brandon Williams commented on CASSANDRA-3811:
---------------------------------------------

It's an edge case because most people run Hadoop colocated with Cassandra.  Why?  Because Hadoop is about moving computation to data, not the other way around, and without colocation moving the data to the computation is exactly what you're doing.

That said, we understand this is a problem that needs to be addressed, but it is hardly critical.
                

        

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Patrik Modesto (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230251#comment-13230251 ] 

Patrik Modesto commented on CASSANDRA-3811:
-------------------------------------------

I don't agree that it's an edge case. Why do you force people to install a job on a cluster node? It doesn't need to be there, and there is no reason to force people to do that. To run a Hadoop job, you just need to *point* it to a Namenode and Jobtracker, that's all, no special placement required. It's the same as with Hadoop Namenode+Datanodes or Jobtracker+Tasktrackers: does Hadoop force you to have the Jobtracker on the same node as a Tasktracker? No.

So instead of fixing the bug you declare it an edge case? Great!

You force us to stay with the quite old version 0.8.7, the last one that works for us, and to start looking for alternatives. That's sad.

Patrik
                

        

[jira] [Updated] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3811:
--------------------------------------

    Priority: Minor  (was: Critical)

Changing to Minor.

I don't like to argue about priorities, but "Critical" means "things are badly broken;" either it doesn't work AT ALL in the common case, or in edge cases it can fail catastrophically (data loss or cascading failure).

This is not the case here; we have a problem with an edge case that we barely support (jobs from outside the cluster) that does not affect more normal setups.  That's minor for the project as a whole.

                

[jira] [Updated] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Patrik Modesto (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrik Modesto updated CASSANDRA-3811:
--------------------------------------

             Priority: Critical  (was: Major)
    Affects Version/s: 0.8.10

Changed to Critical because Cassandra-Hadoop doesn't work.

The problem really is the rpc_endpoint of 0.0.0.0; CFIF can't handle that.

What happens when you set up a Cassandra cluster with an empty rpc_address (so describe_ring reports rpc_endpoint 0.0.0.0):

A) you run a MapReduce job from outside the cluster, where there is no Cassandra server on localhost; the job fails because it can't connect to localhost to get the splits.

B) you run a MapReduce job from inside the cluster, where there is a Cassandra server on localhost:
   1) CFIF calls describe_ring, which returns something like this:
{noformat}
148873535527910577765226390751398592512 - 21267647932558653966460912964485513216 [10.0.18.129,10.0.18.99,10.0.18.98] [0.0.0.0,0.0.0.0,0.0.0.0]
106338239662793269832304564822427566080 - 148873535527910577765226390751398592512 [10.0.18.87,10.0.18.129,10.0.18.99] [0.0.0.0,0.0.0.0,0.0.0.0]
63802943797675961899382738893456539648 - 106338239662793269832304564822427566080 [10.0.18.98,10.0.18.87,10.0.18.129] [0.0.0.0,0.0.0.0,0.0.0.0]
21267647932558653966460912964485513216 - 63802943797675961899382738893456539648 [10.0.18.99,10.0.18.98,10.0.18.87] [0.0.0.0,0.0.0.0,0.0.0.0]
{noformat}
      Note the 0.0.0.0 IPs returned as rpc_endpoints.
  2) CFIF.getSplits then asks for the splits of each key range at its rpc_endpoint, which is 0.0.0.0, i.e. localhost, instead of a node that really owns the key range
  3) localhost of course owns just its own key range; for that range it correctly returns the splits, but for the other key ranges it returns just start_key:end_key, which is wrong
  4) Hadoop then uses these wrong splits to calculate work for tasks etc., and such tasks never finish and eventually get killed

Here is the output of my simple test utility:
{noformat}
$ ./describe.py rfTest2
describe_ring
148873535527910577765226390751398592512 - 21267647932558653966460912964485513216 [10.0.18.129,10.0.18.99] [0.0.0.0,0.0.0.0]
106338239662793269832304564822427566080 - 148873535527910577765226390751398592512 [10.0.18.87,10.0.18.129] [0.0.0.0,0.0.0.0]
63802943797675961899382738893456539648 - 106338239662793269832304564822427566080 [10.0.18.98,10.0.18.87] [0.0.0.0,0.0.0.0]
21267647932558653966460912964485513216 - 63802943797675961899382738893456539648 [10.0.18.99,10.0.18.98] [0.0.0.0,0.0.0.0]
10.0.18.98:  ['148873535527910577765226390751398592512', '21267647932558653966460912964485513216']
10.0.18.98:  ['106338239662793269832304564822427566080', '148873535527910577765226390751398592512']
10.0.18.98:  ['63802943797675961899382738893456539648', '68793533432627989494832763003260446472', '74819769657966890059528779911565558455', '80567991868944382942831588469855825734', '87891603877459256288845990379651315512', '93924679813695495884062398757642798961', '100192950219560445380847254251687782801', '106338239662793269832304564822427566080']
10.0.18.98:  ['21267647932558653966460912964485513216', '26244106837171755875962953279096666742', '32201975146808227304585609407713826911', '38824800339023975211549544003547061559', '45039424797795217820051587252107982434', '50205785598336646901229997590646295071', '57012896007316411899806797335411421637', '63802943797675961899382738893456539648']
{noformat}

To explain the output: I have a 4-node test cluster, and keyspace rfTest2 has RF=2. The utility calls describe_ring to get the node list, then calls describe_splits for each key range but always asks the same node, the same way CFIF does. You can see that a node which doesn't own a key range returns just start_key:end_key.
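
For reference, here is a minimal Java sketch of the same probe, assuming the Thrift-generated client from the 0.8 tree (org.apache.cassandra.thrift.Cassandra) and libthrift on the classpath; the node address, keyspace and column family name ("TestCF") are placeholders matching the cluster above, and the keys-per-split value is arbitrary:

{noformat}
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.TokenRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class DescribeRingProbe
{
    public static void main(String[] args) throws Exception
    {
        // Connect to one fixed node, the same way the utility above (and CFIF) does.
        TTransport transport = new TFramedTransport(new TSocket("10.0.18.98", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        client.set_keyspace("rfTest2");

        for (TokenRange range : client.describe_ring("rfTest2"))
        {
            // With an empty rpc_address every rpc_endpoint comes back as 0.0.0.0.
            System.out.println(range.start_token + " - " + range.end_token
                               + " " + range.endpoints + " " + range.rpc_endpoints);

            // Asking the same node for every range: ranges it does not own come
            // back as just [start_token, end_token], i.e. a single useless split.
            List<String> splits = client.describe_splits("TestCF", range.start_token,
                                                         range.end_token, 64 * 1024);
            System.out.println("10.0.18.98: " + splits);
        }

        transport.close();
    }
}
{noformat}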

Solutions:
A) never return 0.0.0.0 from describe_ring
B) fix CFIF to fall back to the endpoint when the rpc_endpoint is 0.0.0.0 (localhost)

Regards,
Patrik
                

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Patrik Modesto (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197908#comment-13197908 ] 

Patrik Modesto commented on CASSANDRA-3811:
-------------------------------------------

I didn't rebuild the nodes. I just recompiled the Maven artifacts, then recompiled the Hadoop job and ran it. IMHO that should be a sufficient test.
                

[jira] [Issue Comment Edited] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Janne Jalkanen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204996#comment-13204996 ] 

Janne Jalkanen edited comment on CASSANDRA-3811 at 2/9/12 11:02 PM:
--------------------------------------------------------------------

Tried the following patch against cassandra-0.8, but it doesn't seem to work - same symptoms. Rebuilt cassandra_storage.jar too.

{noformat}
diff --git a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
index d89f285..af0765b 100644
--- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
@@ -199,9 +199,9 @@ public class ColumnFamilyInputFormat extends InputFormat<ByteBuffer, SortedMap<B
             for (String endpoint: range.rpc_endpoints)
             {
                 String endpoint_address = endpoint;
-                       if(endpoint_address == null || endpoint_address == "0.0.0.0")
-                               endpoint_address = range.endpoints.get(endpointIndex);
-                       endpoints[endpointIndex++] = InetAddress.getByName(endpoint_address).getHostName();
+                if(endpoint_address == null || endpoint_address.equals( "0.0.0.0" ) )
+                    endpoint_address = range.endpoints.get(endpointIndex);
+                endpoints[endpointIndex++] = InetAddress.getByName(endpoint_address).getHostName();
             }
 
             for (int i = 1; i < tokens.size(); i++)
{noformat}

Though I think that this patch is useful anyway, since it removes one clearly wrong line (and corrects an errant indentation).
                

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Janne Jalkanen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205808#comment-13205808 ] 

Janne Jalkanen commented on CASSANDRA-3811:
-------------------------------------------

BTW, I cannot replicate this on local OS X 10.6, but I *can* replicate it reliably on Ubuntu 10.04 LTS running on Amazon EC2. So this might be specific to Linux and/or EC2.
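
One platform-dependent step in this code path is the reverse lookup that CFIF performs on every endpoint (InetAddress.getByName(...).getHostName()). A tiny probe, plain JDK and nothing Cassandra-specific, that prints what 0.0.0.0 resolves to on a given box may help narrow down the OS X vs. EC2 difference:

{noformat}
import java.net.InetAddress;

public class ResolveProbe
{
    public static void main(String[] args) throws Exception
    {
        InetAddress addr = InetAddress.getByName("0.0.0.0");
        // CFIF calls getHostName() on every rpc_endpoint; what this prints
        // (e.g. "0.0.0.0" vs. some local hostname) can differ per OS/resolver.
        System.out.println(addr.getHostAddress() + " -> " + addr.getHostName());
    }
}
{noformat}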
                

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Janne Jalkanen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204996#comment-13204996 ] 

Janne Jalkanen commented on CASSANDRA-3811:
-------------------------------------------

Tried the following patch against cassandra-0.8, but it doesn't seem to work - same symptoms. Rebuilt cassandra_storage.jar too.

{noformat}
diff --git a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
index d89f285..af0765b 100644
--- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
@@ -199,9 +199,9 @@ public class ColumnFamilyInputFormat extends InputFormat<ByteBuffer, SortedMap<B
             for (String endpoint: range.rpc_endpoints)
             {
                 String endpoint_address = endpoint;
-                       if(endpoint_address == null || endpoint_address == "0.0.0.0")
-                               endpoint_address = range.endpoints.get(endpointIndex);
-                       endpoints[endpointIndex++] = InetAddress.getByName(endpoint_address).getHostName();
+                if(endpoint_address == null || endpoint_address.equals( "0.0.0.0" ) )
+                    endpoint_address = range.endpoints.get(endpointIndex);
+                endpoints[endpointIndex++] = InetAddress.getByName(endpoint_address).getHostName();
             }
 
             for (int i = 1; i < tokens.size(); i++)
{noformat}
                

[jira] [Commented] (CASSANDRA-3811) Empty rpc_address prevents running MapReduce job outside a cluster

Posted by "Michael Frisch (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197149#comment-13197149 ] 

Michael Frisch commented on CASSANDRA-3811:
-------------------------------------------

Assuming you built this into only your Cassandra nodes, you need to rebuild the cassandra_storage.jar as well and have your Hadoop job use that.
                