Posted to user@cassandra.apache.org by AN...@homedepot.com on 2012/12/13 21:57:19 UTC

BulkOutputFormat error - org.apache.thrift.transport.TTransportException

Hi

I am new to Cassandra. I was trying out some sample (word count) code with BulkOutputFormat and got stuck on an error.

What I am trying to do is migrate all Hive tables (from a Hadoop cluster) to Cassandra column families.
My MR program is configured to run on a Hadoop cluster v0.20.2 (cdh3u3) by pointing the job config params 'fs.default.name' and 'mapred.job.tracker' appropriately.
The output is pointed to my local Cassandra v1.1.7.
I have set the following params for writing to Cassandra:
       conf.set("cassandra.output.keyspace", "Customer");
       conf.set("cassandra.output.columnfamily", "words");
       conf.set("cassandra.output.partitioner.class", "org.apache.cassandra.dht.RandomPartitioner");
       conf.set("cassandra.output.thrift.port", "9160");    // default
       conf.set("cassandra.output.thrift.address", "localhost");
       conf.set("mapreduce.output.bulkoutputformat.streamthrottlembits", "10");

But the program fails with the error below:
12/12/13 15:32:55 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Cassandra thrift address   :      localhost
Cassandra thrift port      :      9160
12/12/13 15:32:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/12/13 15:34:21 INFO input.FileInputFormat: Total input paths to process : 1
12/12/13 15:34:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/12/13 15:34:21 WARN snappy.LoadSnappy: Snappy native library not loaded
12/12/13 15:34:22 INFO mapred.JobClient: Running job: job_201212111101_4622
12/12/13 15:34:23 INFO mapred.JobClient:  map 0% reduce 0%
12/12/13 15:34:28 INFO mapred.JobClient:  map 100% reduce 0%
12/12/13 15:34:37 INFO mapred.JobClient:  map 100% reduce 33%
12/12/13 15:34:39 INFO mapred.JobClient: Task Id : attempt_201212111101_4622_r_000000_0, Status : FAILED
java.lang.RuntimeException: Could not retrieve endpoint ranges:
       at org.apache.cassandra.hadoop.BulkRecordWriter$ExternalClient.init(BulkRecordWriter.java:328)
       at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:116)
       at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:111)
       at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:223)
       at org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:208)
       at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:573)
       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
       at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectE

Please help me understand the problem.

Regards
Anand B

________________________________

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.

RE: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

Posted by AN...@homedepot.com.
The problem was a version compatibility issue: I was using an older version of the Cassandra jar files. BulkOutputFormat now works fine.
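
For anyone hitting a similar jar mismatch, one way to see which jar a class is actually loaded from is to print its code source. This is only a minimal sketch, not something from the original thread: the class name JarOrigin and the method describe are hypothetical, and the two default class names are simply taken from the stack trace above. Run it with the same classpath the job uses.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where a class was loaded from, to help spot
// an unexpected (for example, older) jar on the classpath.
final class JarOrigin {

    /** Describes the location a class was loaded from, or why it is unknown. */
    static String describe(String className) {
        try {
            // 'false' loads the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, JarOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return className + " -> no code source (likely a JDK class)";
            }
            return className + " -> " + source.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not found on classpath";
        }
    }

    public static void main(String[] args) {
        // Defaults are the classes named in the stack trace; pass others as arguments.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.cassandra.hadoop.BulkRecordWriter",
            "org.apache.thrift.transport.TTransportException"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

The jar file name in the printed location usually includes the version, which can then be compared with the version of the Cassandra server.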

-----Original Message-----
From: ANAND_BALARAMAN@homedepot.com [mailto:ANAND_BALARAMAN@homedepot.com]
Sent: Friday, December 14, 2012 12:37 AM
To: user@cassandra.apache.org
Subject: RE: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

Aaron
The rpc_address in the cassandra.yaml file and the address in the job configuration are the same (localhost).
I will try connecting to a different Cassandra cluster and test it again.

-----Original Message-----
From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Thursday, December 13, 2012 9:03 PM
To: user@cassandra.apache.org
Subject: Re: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

Looks like it cannot connect to the server

>        conf.set("cassandra.output.thrift.address", "localhost");
Is this the same address as the rpc_address in the Cassandra config?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com


RE: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

Posted by AN...@homedepot.com.
Aaron
The rpc_address in the cassandra.yaml file and the address in the job configuration are the same (localhost).
I will try connecting to a different Cassandra cluster and test it again.


Re: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

Posted by aaron morton <aa...@thelastpickle.com>.
Looks like it cannot connect to the server

>        conf.set("cassandra.output.thrift.address", "localhost");
Is this the same address as the rpc_address in the Cassandra config?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com
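
A quick way to check the "cannot connect" theory is to try a plain TCP connection to the Thrift address and port from the machine where the failing task runs. The sketch below is only an illustration under assumptions: the class name ThriftPortCheck and the method canConnect are hypothetical, and the defaults are the "localhost" and 9160 values from the job configuration above. Note that "localhost" always resolves to the machine running the code, so on a remote Hadoop node it refers to that node and not to the machine that submitted the job.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class ThriftPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the job configuration; override with <host> <port>.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

A successful connection only shows that something is listening on the port; it does not prove that the client and server versions are compatible.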
