Posted to commits@cassandra.apache.org by "Utku Can Topcu (JIRA)" <ji...@apache.org> on 2011/01/03 02:15:46 UTC

[jira] Created: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Hadoop Integration doesn't work when one node is down
-----------------------------------------------------

                 Key: CASSANDRA-1927
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1927
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
    Affects Versions: 0.7.0 rc 2
            Reporter: Utku Can Topcu


Using the same directives as in the sample code:

When I start the CFInputFormat to read a CF in a keyspace with RF=3 on a 4-node cluster:
- If all the nodes are up, everything works fine and I have no problems walking through all the data in the CF. However,
- If a node is down, the Hadoop job does not even start; it just dies without any errors or exceptions.

So I'm really sorry for not being able to post any errors or exceptions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Reviewer: stuhood

Setting Stu as reviewer, since he reviewed CASSANDRA-342 (under which the TODO comment in question was added).

> Hadoop Integration doesn't work when one node is down
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1927
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1927
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.7.0 rc 2
>            Reporter: Utku Can Topcu
>            Assignee: Mck SembWever
>             Fix For: 0.7.1
>
>         Attachments: CASSANDRA-1927.patch
>
>
> using the same directives in the sample code:
> When I start the CFInputFormat to read a CF in a keyspace of RF=3 on a 4-node cluster:
> - If all the nodes are all up, everything works fine and I don't have any problems walking through the all data in the CF, however
> - If there's a node down, the hadoop job does not even start, just dies without any errors or exceptions.
> So I'm really sorry for not being able to post any errors or exceptions, though it's really easy to reproduce. Just startup a cluster and take one node down and you're there :)


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976644#action_12976644 ] 

Mck SembWever commented on CASSANDRA-1927:
------------------------------------------

There's a TODO comment in ColumnFamilyInputFormat:
 // TODO handle failure of range replicas & retry

Line 198 only tries the first endpoint. A loop that catches the TException and tries the next endpoint is needed.
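
A minimal sketch of the kind of retry loop meant here, not the attached patch. It avoids the Thrift classes so that it runs on its own: EndpointFailover, Connector and connectToAny are hypothetical names, and the lambda in main stands in for opening a connection to one replica. The final error message is the one a later comment in this thread says the patched code reports.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;

public class EndpointFailover {

    /** Hypothetical stand-in for opening a connection to a single endpoint. */
    public interface Connector<T> {
        T connect(String endpoint) throws Exception;
    }

    /**
     * Tries each replica endpoint in order and returns the first successful result.
     * An IOException is thrown only after every endpoint has failed; the last
     * failure is kept as the cause so the stack trace still shows why.
     */
    public static <T> T connectToAny(List<String> endpoints, Connector<T> connector)
            throws IOException {
        Exception last = null;
        for (String endpoint : endpoints) {
            try {
                return connector.connect(endpoint);
            } catch (Exception e) {
                // This replica is unreachable: remember why and move on to the next one.
                last = e;
            }
        }
        throw new IOException("failed connecting to all endpoints " + endpoints, last);
    }

    public static void main(String[] args) throws IOException {
        // Made-up addresses; the first replica is simulated as being down.
        List<String> replicas = List.of("10.0.0.1", "10.0.0.2", "10.0.0.3");
        String used = connectToAny(replicas, endpoint -> {
            if (endpoint.equals("10.0.0.1")) {
                throw new ConnectException("Connection refused");
            }
            return endpoint;
        });
        System.out.println("connected via " + used);
    }
}
```

With RF=3 each token range has three replica endpoints, so one node being down should leave two candidates for the loop to fall back on.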


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976698#action_12976698 ] 

Mck SembWever commented on CASSANDRA-1927:
------------------------------------------

Sent DM. If it doesn't work you should at minimum see the job's IOException stacktrace change from "unable to connect to server" to "failed connecting to all endpoints".


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976832#action_12976832 ] 

Jonathan Ellis commented on CASSANDRA-1927:
-------------------------------------------

It looks like this patch includes the code from CASSANDRA-1921, which is causing conflicts because it's already applied on 0.7 and trunk. Can you create a patch for 1927 only?


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976839#action_12976839 ] 

Mck SembWever commented on CASSANDRA-1927:
------------------------------------------

Yeah, the patch had a lot of crap in it. sorry. will re-apply.


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Utku Can Topcu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976695#action_12976695 ] 

Utku Can Topcu commented on CASSANDRA-1927:
-------------------------------------------

Mck: Right now I can't access our compilation server. However, I can replace the running binaries and test them if I have the patched rc4. Can you somehow provide me with the compiled package?


[jira] Assigned: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever reassigned CASSANDRA-1927:
----------------------------------------

    Assignee: Mck SembWever


[jira] Issue Comment Edited: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976683#action_12976683 ] 

Mck SembWever edited comment on CASSANDRA-1927 at 1/3/11 5:52 AM:
------------------------------------------------------------------

Utku: are you able to test this patch?

It does not work for me because I'm using ByteOrderedPartitioner, which doesn't return multiple endpoints for each TokenRange returned by:

client.describe_ring(..) <-- storageService.getRangeToEndpointMap(..) <-- getRangeToAddressMap(..) <-- getRangeToAddressMap(..) <-- constructRangeToEndpointMap(..) <-- replicationStrategy.getNaturalEndpoints(..)

(Or maybe endpoints are not supposed to reference available replicas. Stu?)

      was (Author: michaelsembwever):
    Utku: are you able to test this patch?

It does not work for me because I'm using ByteOrderedPartitioner, which doesn't return multiple endpoints for each TokenRange returned by client.describe_ring(..)
(Or maybe endpoints are not supposed to reference available replicas. Stu?)
  

[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment: CASSANDRA-1927.patch

Third time lucky. Removed an unnecessary import.


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment: CASSANDRA-1927.patch

correct patch & license grant


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment: CASSANDRA-1927.patch

correct patch


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment:     (was: CASSANDRA-1927.patch)


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Utku Can Topcu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976741#action_12976741 ] 

Utku Can Topcu commented on CASSANDRA-1927:
-------------------------------------------

Something urgent came up, so I'll be testing it in a few hours. I'll post the results here.


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Utku Can Topcu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976847#action_12976847 ] 

Utku Can Topcu commented on CASSANDRA-1927:
-------------------------------------------

I've tested against the rc4+patch and it works.


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976643#action_12976643 ] 

Mck SembWever commented on CASSANDRA-1927:
------------------------------------------

Client side (hadoop job):

java.io.IOException: Could not get input splits
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:127)
	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
	at no.finntech.countstats.reduce.FakeAdCounterTableReduce.run(FakeAdCounterTableReduce.java:421)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at no.finntech.countstats.reduce.FakeAdCounterTableReduce.main(FakeAdCounterTableReduce.java:75)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: unable to connect to server
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
	at java.util.concurrent.FutureTask.get(FutureTask.java:83)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:123)
	... 13 more
Caused by: java.io.IOException: unable to connect to server
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.createConnection(ColumnFamilyInputFormat.java:212)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:187)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:74)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:160)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:145)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
	at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
	at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.createConnection(ColumnFamilyInputFormat.java:208)
	... 9 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:525)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
	... 11 more




[jira] Issue Comment Edited: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976683#action_12976683 ] 

Mck SembWever edited comment on CASSANDRA-1927 at 1/3/11 9:18 AM:
------------------------------------------------------------------

Utku: are you able to test this patch?

(It didn't work for me because RF was never really set to 3. Running "describe keyspace xxx" in cassandra-cli reported "Replication Factor: 1".)  :-$



      was (Author: michaelsembwever):
    Utku: are you able to test this patch?

It does not work for me because I'm using ByteOrderedPartitioner, which doesn't return multiple endpoints for each TokenRange returned by:

client.describe_ring(..) <-- storageService.getRangeToEndpointMap(..) <-- getRangeToAddressMap(..) <-- getRangeToAddressMap(..) <-- constructRangeToEndpointMap(..) <-- replicationStrategy.getNaturalEndpoints(..)

(Or maybe endpoints are not supposed to reference available replicas. Stu?)
  
> Hadoop Integration doesn't work when one node is down
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1927
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1927
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.7.0 rc 2
>            Reporter: Utku Can Topcu
>            Assignee: Mck SembWever
>             Fix For: 0.7.1
>
>         Attachments: CASSANDRA-1927.patch
>
>
> using the same directives in the sample code:
> When I start the CFInputFormat to read a CF in a keyspace of RF=3 on a 4-node cluster:
> - If all the nodes are all up, everything works fine and I don't have any problems walking through the all data in the CF, however
> - If there's a node down, the hadoop job does not even start, just dies without any errors or exceptions.
> So I'm really sorry for not being able to post any errors or exceptions, though it's really easy to reproduce. Just startup a cluster and take one node down and you're there :)


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment:     (was: CASSANDRA-1927.patch)


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment:     (was: CASSANDRA-1927.patch)


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Comment: was deleted

(was: correct patch)


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Utku Can Topcu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Utku Can Topcu updated CASSANDRA-1927:
--------------------------------------

    Description: 
Using the same directives as in the sample code:

When I start the CFInputFormat to read a CF in a keyspace of RF=3 on a 4-node cluster:
- If all the nodes are up, everything works fine and I don't have any problems walking through all the data in the CF; however,
- If there's a node down, the Hadoop job does not even start; it just dies without any errors or exceptions.

So I'm really sorry for not being able to post any errors or exceptions, though it's really easy to reproduce. Just start up a cluster and take one node down and you're there :)

  was:
Using the same directives as in the sample code:

When I start the CFInputFormat to read a CF in a keyspace of RF=3 on a 4-node cluster:
- If all the nodes are up, everything works fine and I don't have any problems walking through all the data in the CF; however,
- If there's a node down, the Hadoop job does not even start; it just dies without any errors or exceptions.

So I'm really sorry for not being able to post any errors or exceptions.


> Hadoop Integration doesn't work when one node is down
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1927
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1927
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.7.0 rc 2
>            Reporter: Utku Can Topcu
>
> using the same directives in the sample code:
> When I start the CFInputFormat to read a CF in a keyspace of RF=3 on a 4-node cluster:
> - If all the nodes are all up, everything works fine and I don't have any problems walking through the all data in the CF, however
> - If there's a node down, the hadoop job does not even start, just dies without any errors or exceptions.
> So I'm really sorry for not being able to post any errors or exceptions, though it's really easy to reproduce. Just startup a cluster and take one node down and you're there :)


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976936#action_12976936 ] 

Hudson commented on CASSANDRA-1927:
-----------------------------------

Integrated in Cassandra-0.7 #142 (See [https://hudson.apache.org/hudson/job/Cassandra-0.7/142/])
    retry hadoop split requests on connection failure
patch by mck; reviewed by jbellis for CASSANDRA-1927
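
For readers without the patch at hand, the idea of retrying a split request on connection failure can be sketched roughly as below. This is an illustrative sketch only, not the code from CASSANDRA-1927.patch: the `SplitFetcher` interface, the `fetchWithRetry` method, and the IP addresses are hypothetical stand-ins for "connect to a replica endpoint and ask it for splits".

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the code from CASSANDRA-1927.patch.
class SplitRetrySketch {

    /** Hypothetical stand-in for opening a connection to a node and requesting its splits. */
    interface SplitFetcher {
        List<String> fetch(String endpoint) throws IOException;
    }

    /**
     * Tries each replica endpoint of a token range in order and returns the
     * result from the first one that answers. Throws only if every endpoint fails.
     */
    static List<String> fetchWithRetry(List<String> endpoints, SplitFetcher fetcher)
            throws IOException {
        IOException last = null;
        for (String endpoint : endpoints) {
            try {
                return fetcher.fetch(endpoint);
            } catch (IOException e) {
                last = e; // remember the failure and move on to the next replica
            }
        }
        throw new IOException("failed to fetch splits from any of " + endpoints, last);
    }

    public static void main(String[] args) throws IOException {
        // Simulate a cluster where the first replica is down.
        SplitFetcher fetcher = endpoint -> {
            if (endpoint.equals("10.0.0.1")) {
                throw new IOException("connection refused: " + endpoint);
            }
            return Arrays.asList(endpoint + "/split-0", endpoint + "/split-1");
        };
        List<String> replicas = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        System.out.println(fetchWithRetry(replicas, fetcher));
    }
}
```

Surfacing an exception when every endpoint fails also avoids the silent job death described in the report, although whether the actual patch does this is not stated in this thread.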



[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1927:
--------------------------------------

    Fix Version/s: 0.7.1

> Hadoop Integration doesn't work when one node is down
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1927
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1927
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.7.0 rc 2
>            Reporter: Utku Can Topcu
>             Fix For: 0.7.1
>
>
> using the same directives in the sample code:
> When I start the CFInputFormat to read a CF in a keyspace of RF=3 on a 4-node cluster:
> - If all the nodes are all up, everything works fine and I don't have any problems walking through the all data in the CF, however
> - If there's a node down, the hadoop job does not even start, just dies without any errors or exceptions.
> So I'm really sorry for not being able to post any errors or exceptions, though it's really easy to reproduce. Just startup a cluster and take one node down and you're there :)


[jira] Updated: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-1927:
-------------------------------------

    Attachment: CASSANDRA-1927.patch

Utku: are you able to test this patch?

It does not work for me because I'm using ByteOrderedPartitioner, which doesn't return multiple endpoints for each TokenRange returned by client.describe_ring(..)
(Or maybe endpoints are not supposed to reference available replicas. Stu?)


[jira] Commented: (CASSANDRA-1927) Hadoop Integration doesn't work when one node is down

Posted by "Mck SembWever (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976787#action_12976787 ] 

Mck SembWever commented on CASSANDRA-1927:
------------------------------------------

After fixing my local RF problem this patch works for me.
