Posted to user@cassandra.apache.org by Lanny Ripple <la...@spotright.com> on 2013/04/09 17:16:54 UTC

Thrift message length exceeded

Hello,

We recently upgraded from Cass 1.1.5 to Cass 1.2.3.  We ran sstable upgrades, got the ring back on its feet, and are now seeing a new issue.

When we run MapReduce jobs against practically any table, we see the following errors:

2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
	at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: org.apache.thrift.TException: Message length exceeded: 106
	at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
	at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
	at org.apache.cassandra.thrift.Column.read(Column.java:528)
	at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
	at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
	... 16 more
2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

The message length reported by each failed job differs (it's not always 106).  Jobs that used to run fine now fail when compiled against cass 1.2.3 (and still work fine if compiled against 1.1.5 and run against the 1.2.3 servers in production).  I'm using the following setup to configure the job:

  import org.apache.cassandra.hadoop.ConfigHelper
  import org.apache.cassandra.thrift.{SlicePredicate, SliceRange}
  import org.apache.hadoop.mapreduce.Job

  def cassConfig(job: Job) {
    val conf = job.getConfiguration()

    // Where the job's tasks find Cassandra.
    ConfigHelper.setInputRpcPort(conf, "9160")
    ConfigHelper.setInputInitialAddress(conf, Config.hostip)

    ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
    ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)

    // Ask for every column of every row: an empty start/finish is an
    // unbounded slice, and count caps the columns returned per row.
    val pred = {
      val range = new SliceRange()
        .setStart("".getBytes("UTF-8"))
        .setFinish("".getBytes("UTF-8"))
        .setReversed(false)
        .setCount(4096 * 1000)

      new SlicePredicate().setSlice_range(range)
    }

    ConfigHelper.setInputSlicePredicate(conf, pred)
  }

The job consists only of a mapper that increments counters for each row and its associated columns, so all I'm really doing is exercising ColumnFamilyRecordReader.
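
For reference, here's a minimal sketch of that kind of mapper (the class and counter names are made up, and I'm assuming the key/value types that ColumnFamilyInputFormat hands to mappers in 1.2.x):

  import java.nio.ByteBuffer
  import java.util.SortedMap
  import org.apache.cassandra.db.IColumn
  import org.apache.hadoop.mapreduce.Mapper

  class CountingMapper
      extends Mapper[ByteBuffer, SortedMap[ByteBuffer, IColumn], ByteBuffer, ByteBuffer] {

    type Ctx = Mapper[ByteBuffer, SortedMap[ByteBuffer, IColumn], ByteBuffer, ByteBuffer]#Context

    // One counter bump per row, one per column it came back with; nothing is
    // emitted, so the job does no more than drive ColumnFamilyRecordReader.
    override def map(key: ByteBuffer, columns: SortedMap[ByteBuffer, IColumn], context: Ctx) {
      context.getCounter("cass", "rows").increment(1L)
      context.getCounter("cass", "columns").increment(columns.size.toLong)
    }
  }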

Has anyone else seen this?  Is there a workaround/fix to get our jobs running?

Thanks

Re: Thrift message length exceeded

Posted by Lanny Ripple <la...@spotright.com>.
Good catch, since that bug would also have shut us down.

The original problem is that, prior to Cass 1.1.10, it looks like the cassandra.yaml values

  * thrift_framed_transport_size_in_mb
  * thrift_max_message_length_in_mb

were ignored (effectively leaving no limits in place).  We went from 1.1.5 to 1.2.3, so these limits were suddenly enforced for us (and were way too low for our data).
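
Raising them in cassandra.yaml looks something like this (the numbers are illustrative; size them to your largest responses, and keep the max message length a bit larger than the frame size):

  thrift_framed_transport_size_in_mb: 60
  thrift_max_message_length_in_mb: 64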

I've also confirmed that your supplied patch2 works for us.

  -ljr

> On Thu, Apr 18, 2013 at 12:34 AM, Lanny Ripple <la...@spotright.com> wrote:
> It's slow going finding the time to do so but I'm working on that.
> 
> We do have another table that has one or sometimes two columns per row.  We can run jobs on it without issue.  I looked through org.apache.cassandra.hadoop code and don't see anything that's really changed since 1.1.5 (which also used thrift-0.7), so it's something of a puzzler what's going on.
> 
> 
> On Apr 17, 2013, at 2:47 PM, aaron morton <aa...@thelastpickle.com> wrote:
> 
> > Can you reproduce this in a simple way ?
> >
> > Cheers
> >
> > -----------------
> > Aaron Morton
> > Freelance Cassandra Consultant
> > New Zealand
> >
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 18/04/2013, at 5:50 AM, Lanny Ripple <la...@spotright.com> wrote:
> >
> >> That was our first thought.  Using maven's dependency tree info we verified that we're using the expected (cass 1.2.3) jars:
> >>
> >> $ mvn dependency:tree | grep thrift
> >> [INFO] |  +- org.apache.thrift:libthrift:jar:0.7.0:compile
> >> [INFO] |  \- org.apache.cassandra:cassandra-thrift:jar:1.2.3:compile
> >>
> >> I've also dumped the final command run by the hadoop we use (CDH3u5) and verified it's not sneaking thrift in on us.
> >>
> >>
> >> On Tue, Apr 16, 2013 at 4:36 PM, aaron morton <aa...@thelastpickle.com> wrote:
> >> Can you confirm that you are using the same thrift version that ships with 1.2.3?
> >>
> >> Cheers
> >>
> >> -----------------
> >> Aaron Morton
> >> Freelance Cassandra Consultant
> >> New Zealand
> >>
> >> @aaronmorton
> >> http://www.thelastpickle.com
> >>
> >> On 16/04/2013, at 10:17 AM, Lanny Ripple <la...@spotright.com> wrote:
> >>
> >>> A bump to say I found this
> >>>
> >>>  http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded
> >>>
> >>> so others are seeing similar behavior.
> >>>
> >>> From what I can see of org.apache.cassandra.hadoop, nothing has changed since 1.1.5, when we didn't see such things, but it sure looks like a bug has slipped in (or been uncovered) somewhere.  I'll try to narrow it down to a dataset and code that can reproduce it.
> >>>
> >>> On Apr 10, 2013, at 6:29 PM, Lanny Ripple <la...@spotright.com> wrote:
> >>>
> >>>> We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.
> >>>>
> >>>> We do have some extremely large rows but we went from everything working with 1.1.5 to almost everything carping with 1.2.3.  Something has changed.  Perhaps we were doing something wrong earlier that 1.2.3 exposed but surprises are never welcome in production.
> >>>>
> >>>> On Apr 10, 2013, at 8:10 AM, <mo...@barclays.com> wrote:
> >>>>
> >>>>> I also saw this when upgrading from C* 1.0 to 1.2.2, and from hector 0.6 to 0.8
> >>>>> Turns out the Thrift message really was too long.
> >>>>> The mystery to me: Why no complaints in previous versions? Were some checks added in Thrift or Hector?


Re: Thrift message length exceeded

Posted by Oleksandr Petrov <ol...@gmail.com>.
I've submitted a patch that fixes the issue for 1.2.3:
https://issues.apache.org/jira/browse/CASSANDRA-5504

Maybe someone knows a better way to fix it, but it helped me in the meantime.


-- 
alex p

Re: Thrift message length exceeded

Posted by Oleksandr Petrov <ol...@gmail.com>.
If you're using Cassandra 1.2.3 and the new Hadoop interface, which makes a call to next(), you'll get an eternal loop that reads the same rows over and over again from your cassandra nodes (you can see it if you enable debug output).

next() clears key(), which is required for wide row iteration.

Setting the key back fixed the issue for me.


-- 
alex p

Re: Thrift message length exceeded

Posted by Oleksandr Petrov <ol...@gmail.com>.
I tried to isolate the issue in a testing environment.  Here's what I currently have.

The setup for the test:
CREATE KEYSPACE cascading_cassandra
  WITH replication = {'class' : 'SimpleStrategy', 'replication_factor' : 1};

USE cascading_cassandra;

CREATE TABLE libraries (
  emitted_at timestamp,
  additional_info varchar,
  environment varchar,
  application varchar,
  type varchar,
  PRIMARY KEY (application, environment, type, emitted_at)
) WITH COMPACT STORAGE;

Next, insert some test data, just for example:

INSERT INTO libraries (application, environment, type, additional_info, emitted_at)
  VALUES (?, ?, ?, ?, ?);
-- bound to [app env type 0 #inst "2013-04-20T13:01:04.935-00:00"]

If the keys (e.g. "app" "env" "type") are all the same across the dataset, it works correctly.
As soon as I start varying the keys, e.g. "app1", "app2", "app3", I get the Message Length Exceeded error.
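
In other words, a dataset along these lines (values made up for illustration) is enough to trigger it:

INSERT INTO libraries (application, environment, type, additional_info, emitted_at)
  VALUES ('app1', 'env', 'type', '0', '2013-04-20 13:01:04');
INSERT INTO libraries (application, environment, type, additional_info, emitted_at)
  VALUES ('app2', 'env', 'type', '0', '2013-04-20 13:01:05');
INSERT INTO libraries (application, environment, type, additional_info, emitted_at)
  VALUES ('app3', 'env', 'type', '0', '2013-04-20 13:01:06');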

Does anyone have some ideas?
Thanks for help!


-- 
alex p

Re: Thrift message length exceeded

Posted by Oleksandr Petrov <ol...@gmail.com>.
I can confirm I'm hitting the same problem.

I tried ConfigHelper.setThriftMaxMessageLengthInMb(), tuned things on the server side, and reduced/increased the batch size.

Here's the stack trace from Hadoop/Cassandra; maybe it gives a hint:

Caused by: org.apache.thrift.protocol.TProtocolException: Message length exceeded: 8
	at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
	at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
	at org.apache.cassandra.thrift.Column.read(Column.java:528)
	at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
	at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
	at org.apache.cassandra.thrift.Cassandra$get_paged_slice_result.read(Cassandra.java:14157)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_paged_slice(Cassandra.java:769)
	at org.apache.cassandra.thrift.Cassandra$Client.get_paged_slice(Cassandra.java:753)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$WideRowIterator.maybeInit(ColumnFamilyRecordReader.java:438)
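
For reference, here's roughly how I had wired those knobs up (a sketch only, against the Cassandra 1.2.x ConfigHelper; the 128 MB and 1024-row values are just what I experimented with, not recommendations):

import org.apache.hadoop.conf.Configuration
import org.apache.cassandra.hadoop.ConfigHelper

def tuneThrift(conf: Configuration) {
  // Raise the client-side Thrift frame limit (in MB). The matching
  // thrift_max_message_length_in_mb in cassandra.yaml has to be raised
  // too, or the server side still caps the message.
  ConfigHelper.setThriftMaxMessageLengthInMb(conf, 128)

  // Fewer rows per get_range_slices call keeps each response smaller.
  ConfigHelper.setRangeBatchSize(conf, 1024)
}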


On Thu, Apr 18, 2013 at 12:34 AM, Lanny Ripple <la...@spotright.com> wrote:

> It's slow going finding the time to do so but I'm working on that.
>


-- 
alex p

Re: Thrift message length exceeded

Posted by Lanny Ripple <la...@spotright.com>.
It's slow going finding the time to do so but I'm working on that.

We do have another table with one or sometimes two columns per row, and we can run jobs on it without issue.  I looked through the org.apache.cassandra.hadoop code and don't see anything that's really changed since 1.1.5 (which was also using thrift-0.7), so it's something of a puzzler what's going on.


On Apr 17, 2013, at 2:47 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Can you reproduce this in a simple way?


Re: Thrift message length exceeded

Posted by aaron morton <aa...@thelastpickle.com>.
Can you reproduce this in a simple way?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/04/2013, at 5:50 AM, Lanny Ripple <la...@spotright.com> wrote:

> That was our first thought.  Using maven's dependency tree info we verified that we're using the expected (cass 1.2.3) jars
> 
> $ mvn dependency:tree | grep thrift
> [INFO] |  +- org.apache.thrift:libthrift:jar:0.7.0:compile
> [INFO] |  \- org.apache.cassandra:cassandra-thrift:jar:1.2.3:compile
> 
> I've also dumped the final command run by the hadoop we use (CDH3u5) and verified it's not sneaking thrift in on us.
> 
> 
> On Tue, Apr 16, 2013 at 4:36 PM, aaron morton <aa...@thelastpickle.com> wrote:
> Can you confirm the you are using the same thrift version that ships 1.2.3 ? 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 16/04/2013, at 10:17 AM, Lanny Ripple <la...@spotright.com> wrote:
> 
>> A bump to say I found this
>> 
>>  http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded
>> 
>> so others are seeing similar behavior.
>> 
>> From what I can see of org.apache.cassandra.hadoop nothing has changed since 1.1.5 when we didn't see such things but sure looks like there's a bug that's slipped in (or been uncovered) somewhere.  I'll try to narrow down to a dataset and code that can reproduce.
>> 
>> On Apr 10, 2013, at 6:29 PM, Lanny Ripple <la...@spotright.com> wrote:
>> 
>>> We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.
>>> 
>>> We do have some extremely large rows but we went from everything working with 1.1.5 to almost everything carping with 1.2.3.  Something has changed.  Perhaps we were doing something wrong earlier that 1.2.3 exposed but surprises are never welcome in production.
>>> 
>>> On Apr 10, 2013, at 8:10 AM, <mo...@barclays.com> wrote:
>>> 
>>>> I also saw this when upgrading from C* 1.0 to 1.2.2, and from hector 0.6 to 0.8
>>>> Turns out the Thrift message really was too long.
>>>> The mystery to me: Why no complaints in previous versions? Were some checks added in Thrift or Hector?
>>>> 
>>>> -----Original Message-----
>>>> From: Lanny Ripple [mailto:lanny@spotright.com] 
>>>> Sent: Tuesday, April 09, 2013 6:17 PM
>>>> To: user@cassandra.apache.org
>>>> Subject: Thrift message length exceeded
>>>> 
>>>> Hello,
>>>> 
>>>> We have recently upgraded to Cass 1.2.3 from Cass 1.1.5.  We ran sstableupgrades and got the ring on its feet and we are now seeing a new issue.
>>>> 
>>>> When we run MapReduce jobs against practically any table we find the following errors:
>>>> 
>>>> 2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
>>>> 2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>> 2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
>>>> 2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
>>>> java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
>>>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
>>>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
>>>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
>>>> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>>>> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>>>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
>>>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
>>>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
>>>> 	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>>>> 	at java.security.AccessController.doPrivileged(Native Method)
>>>> 	at javax.security.auth.Subject.doAs(Subject.java:396)
>>>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>>>> 	at org.apache.hadoop.mapred.Child.main(Child.java:260)
>>>> Caused by: org.apache.thrift.TException: Message length exceeded: 106
>>>> 	at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>>>> 	at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>>>> 	at org.apache.cassandra.thrift.Column.read(Column.java:528)
>>>> 	at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>>>> 	at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>>>> 	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
>>>> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>>>> 	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
>>>> 	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
>>>> 	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
>>>> 	... 16 more
>>>> 2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>>>> 
>>>> The message length listed on each failed job differs (not always 106).  Jobs that used to run fine now fail with code compiled against cass 1.2.3 (and work fine if compiled against 1.1.5 and run against the 1.2.3 servers in production).  I'm using the following setup to configure the job:
>>>> 
>>>> def cassConfig(job: Job) {
>>>>  val conf = job.getConfiguration()
>>>> 
>>>>  ConfigHelper.setInputRpcPort(conf, "" + 9160)
>>>>  ConfigHelper.setInputInitialAddress(conf, Config.hostip)
>>>> 
>>>>  ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
>>>>  ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)
>>>> 
>>>>  val pred = {
>>>>    val range = new SliceRange()
>>>>      .setStart("".getBytes("UTF-8"))
>>>>      .setFinish("".getBytes("UTF-8"))
>>>>      .setReversed(false)
>>>>      .setCount(4096 * 1000)
>>>> 
>>>>    new SlicePredicate().setSlice_range(range)
>>>>  }
>>>> 
>>>>  ConfigHelper.setInputSlicePredicate(conf, pred)
>>>> }
>>>> 
>>>> The job consists only of a mapper that increments counters for each row and associated columns so all I'm really doing is exercising ColumnFamilyRecordReader.
>>>> 
>>>> Has anyone else seen this?  Is there a workaround/fix to get our jobs running?
>>>> 
>>>> Thanks
>>>> _______________________________________________
>>>> 
>>>> This message may contain information that is confidential or privileged. If you are not an intended recipient of this message, please delete it and any attachments, and notify the sender that you have received it in error. Unless specifically stated in the message or otherwise indicated, you may not duplicate, redistribute or forward this message or any portion thereof, including any attachments, by any means to any other person, including any retail investor or customer. This message is not a recommendation, advice, offer or solicitation, to buy/sell any product or service, and is not an official confirmation of any transaction. Any opinions presented are solely those of the author and do not necessarily represent those of Barclays.
>>>> 
>>>> This message is subject to terms available at: www.barclays.com/emaildisclaimer and, if received from Barclays' Sales or Trading desk, the terms available at: www.barclays.com/salesandtradingdisclaimer/. By messaging with Barclays you consent to the foregoing. Barclays Bank PLC is a company registered in England (number 1026167) with its registered office at 1 Churchill Place, London, E14 5HP. This email may relate to or be sent from other members of the Barclays group.
>>>> 
>>>> _______________________________________________
>>> 
>> 
> 
> 


Re: Thrift message length exceeded

Posted by Lanny Ripple <la...@spotright.com>.
That was our first thought.  Using maven's dependency tree info we verified that we're using the expected (cass 1.2.3) jars:

$ mvn dependency:tree | grep thrift
[INFO] |  +- org.apache.thrift:libthrift:jar:0.7.0:compile
[INFO] |  \- org.apache.cassandra:cassandra-thrift:jar:1.2.3:compile

I've also dumped the final command run by the hadoop we use (CDH3u5) and
verified it's not sneaking thrift in on us.


On Tue, Apr 16, 2013 at 4:36 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Can you confirm that you are using the same thrift version that ships with 1.2.3?

Re: Thrift message length exceeded

Posted by aaron morton <aa...@thelastpickle.com>.
Can you confirm that you are using the same thrift version that ships with 1.2.3?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/04/2013, at 10:17 AM, Lanny Ripple <la...@spotright.com> wrote:

> A bump to say I found this
> 
>  http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded
> 
> so others are seeing similar behavior.


Re: Thrift message length exceeded

Posted by Lanny Ripple <la...@spotright.com>.
A bump to say I found this

  http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded

so others are seeing similar behavior.

From what I can see of org.apache.cassandra.hadoop, nothing has changed since 1.1.5, when we didn't see such things, but it sure looks like a bug has slipped in (or been uncovered) somewhere.  I'll try to narrow it down to a dataset and code that can reproduce it.
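
In the meantime, here's the skeleton I'm paring the repro down to, in case anyone wants to poke at it (a sketch only; KeyCountMapper is a stand-in name for our counting mapper, and cassConfig is the setup method from my first mail):

import org.apache.hadoop.mapreduce.Job
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat

def buildJob(): Job = {
  val job = new Job()
  job.setJarByClass(classOf[KeyCountMapper])   // stand-in mapper that only bumps counters
  job.setMapperClass(classOf[KeyCountMapper])
  job.setNumReduceTasks(0)                     // map-only, so the only moving part is ColumnFamilyRecordReader
  job.setInputFormatClass(classOf[ColumnFamilyInputFormat])
  cassConfig(job)                              // sets partitioner, rpc port, slice predicate, etc.
  job
}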

On Apr 10, 2013, at 6:29 PM, Lanny Ripple <la...@spotright.com> wrote:

> We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.


Re: Thrift message length exceeded

Posted by Edward Capriolo <ed...@gmail.com>.
Maybe you should enable the wide-row support, which uses get_paged_slice
instead of get_range_slices and so might not hit the same issue.
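
Roughly like this (a sketch only, against the 1.2.x ConfigHelper; the last flag of the four-argument setInputColumnFamily overload is what turns on wide-row paging):

import org.apache.hadoop.conf.Configuration
import org.apache.cassandra.hadoop.ConfigHelper

// With widerows = true, ColumnFamilyRecordReader uses its WideRowIterator,
// which pages through a row with get_paged_slice instead of pulling whole
// slices via get_range_slices.
def enableWideRows(conf: Configuration, keyspace: String, columnFamily: String) {
  ConfigHelper.setInputColumnFamily(conf, keyspace, columnFamily, true)
}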


On Wed, Apr 10, 2013 at 7:29 PM, Lanny Ripple <la...@spotright.com> wrote:

> We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.
>
> We do have some extremely large rows but we went from everything working with 1.1.5 to almost everything carping with 1.2.3.  Something has changed.  Perhaps we were doing something wrong earlier that 1.2.3 exposed but surprises are never welcome in production.
>
> On Apr 10, 2013, at 8:10 AM, <mo...@barclays.com> wrote:
>
> > I also saw this when upgrading from C* 1.0 to 1.2.2, and from hector 0.6 to 0.8.
> > Turns out the Thrift message really was too long.
> > The mystery to me: Why no complaints in previous versions? Were some checks added in Thrift or Hector?
> > -----Original Message-----
> > From: Lanny Ripple [mailto:lanny@spotright.com]
> > Sent: Tuesday, April 09, 2013 6:17 PM
> > To: user@cassandra.apache.org
> > Subject: Thrift message length exceeded
> >
> > [original message snipped]
>

Re: Thrift message length exceeded

Posted by Lanny Ripple <la...@spotright.com>.
We are using Astyanax in production but I cut back to just Hadoop and Cassandra to confirm it's a Cassandra (or our use of Cassandra) problem.

We do have some extremely large rows but we went from everything working with 1.1.5 to almost everything carping with 1.2.3.  Something has changed.  Perhaps we were doing something wrong earlier that 1.2.3 exposed, but surprises are never welcome in production.
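
For scale: the slice predicate in the job setup asks for setCount(4096 * 1000) = 4,096,000 columns per row, so on one of those very large rows even modest ~100-byte columns would put a single get_range_slices response on the order of 400 MB, far beyond the ~15-16 MB Thrift frame/message-length defaults that 1.2 ships with.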

On Apr 10, 2013, at 8:10 AM, <mo...@barclays.com> wrote:

> I also saw this when upgrading from C* 1.0 to 1.2.2, and from Hector 0.6 to 0.8.
> Turns out the Thrift message really was too long.
> The mystery to me: Why no complaints in previous versions? Were some checks added in Thrift or Hector?
> 
> -----Original Message-----
> From: Lanny Ripple [mailto:lanny@spotright.com] 
> Sent: Tuesday, April 09, 2013 6:17 PM
> To: user@cassandra.apache.org
> Subject: Thrift message length exceeded
> 
> [original message snipped]


RE: Thrift message length exceeded

Posted by mo...@barclays.com.
I also saw this when upgrading from C* 1.0 to 1.2.2, and from Hector 0.6 to 0.8.
Turns out the Thrift message really was too long.
The mystery to me: Why no complaints in previous versions? Were some checks added in Thrift or Hector?
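
The check that fires is the client-side one (TBinaryProtocol.checkReadLength in the stack trace), and the Hadoop input format sizes it from the job configuration, so if the responses really are that large, one workaround is to raise the client-side caps. A minimal sketch, assuming the ConfigHelper setters present in the 1.1/1.2 hadoop package and an arbitrarily chosen 64 MB budget; the server-side thrift_framed_transport_size_in_mb / thrift_max_message_length_in_mb in cassandra.yaml would need to be raised to match:

  // Hedged sketch: lift the client-side Thrift frame and message caps
  // for the Hadoop input format (values are in MB). Pick a budget that
  // covers the largest expected get_range_slices response.
  ConfigHelper.setThriftFramedTransportSizeInMb(conf, 64)
  ConfigHelper.setThriftMaxMessageLengthInMb(conf, 64)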

-----Original Message-----
From: Lanny Ripple [mailto:lanny@spotright.com] 
Sent: Tuesday, April 09, 2013 6:17 PM
To: user@cassandra.apache.org
Subject: Thrift message length exceeded

Hello,

We have recently upgraded to Cass 1.2.3 from Cass 1.1.5.  We ran sstableupgrades and got the ring on its feet and we are now seeing a new issue.

When we run MapReduce jobs against practically any table we find the following errors:

2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
	at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: org.apache.thrift.TException: Message length exceeded: 106
	at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
	at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
	at org.apache.cassandra.thrift.Column.read(Column.java:528)
	at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
	at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
	... 16 more
2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

The message length listed on each failed job differs (not always 106).  Jobs that used to run fine now fail with code compiled against cass 1.2.3 (and work fine if compiled against 1.1.5 and run against the 1.2.3 servers in production).  I'm using the following setup to configure the job:

  def cassConfig(job: Job) {
    val conf = job.getConfiguration()

    ConfigHelper.setInputRpcPort(conf, "" + 9160)
    ConfigHelper.setInputInitialAddress(conf, Config.hostip)

    ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
    ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)

    val pred = {
      val range = new SliceRange()
        .setStart("".getBytes("UTF-8"))
        .setFinish("".getBytes("UTF-8"))
        .setReversed(false)
        .setCount(4096 * 1000)

      new SlicePredicate().setSlice_range(range)
    }

    ConfigHelper.setInputSlicePredicate(conf, pred)
  }

The job consists only of a mapper that increments counters for each row and associated columns so all I'm really doing is exercising ColumnFamilyRecordReader.

Has anyone else seen this?  Is there a workaround/fix to get our jobs running?

Thanks