Posted to user@cassandra.apache.org by Janne Jalkanen <Ja...@ecyrd.com> on 2013/09/25 09:51:56 UTC

Mystery PIG issue with 1.2.10

Heya!

I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle integer values.

Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for testing this. 

First a table:

> CREATE TABLE testc (
  key text PRIMARY KEY,
  ivalue int,
  svalue text,
  value bigint
) WITH COMPACT STORAGE;

> insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> select * from testc;

 key | ivalue | svalue | value
-----+--------+--------+-------
 foo |     10 |    bar |     65

For my Pig setup, I then use libraries from different C* versions to actually talk to my database (which stays on 1.2.10 all the time).

Cassandra 1.0.12 (using cassandra_storage.jar):

> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
(foo,(svalue,bar),(ivalue,10),(value,65),{})

Cassandra 1.1.10:

> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
(foo,(svalue,bar),(ivalue,10),(value,65),{})

Cassandra 1.2.10:

> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
(foo,{(ivalue,
),(svalue,bar),(value,A)})


To me it appears that ints and bigints are interpreted as ASCII values in Cassandra 1.2.10 (10 comes out as a newline and 65 as 'A').  Did something change for CassandraStorage, is there a regression, or am I doing something wrong?  A quick perusal of JIRA didn't reveal anything that I could directly pin on this.
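
To illustrate the symptom outside of Pig, here is a minimal Python sketch. It assumes only that Cassandra stores int as 4 big-endian bytes and bigint as 8 big-endian bytes; the helper names are made up for this example and are not part of Cassandra or Pig. Reading those bytes as text instead of as numbers gives the same newline and 'A' as in the dump above.

```python
import struct

# Hypothetical helper: serialize a value the way a 4-byte int or
# 8-byte bigint column stores it (big-endian, signed).
def as_cassandra_bytes(value, cql_type):
    fmt = {"int": ">i", "bigint": ">q"}[cql_type]
    return struct.pack(fmt, value)

# Hypothetical helper: what a reader sees if it treats the raw column
# bytes as ASCII text instead of decoding them as a number.
def misread_as_text(raw):
    return raw.decode("ascii")

ivalue_text = misread_as_text(as_cassandra_bytes(10, "int"))
value_text = misread_as_text(as_cassandra_bytes(65, "bigint"))

# The leading NUL bytes are invisible in most terminals, so only the
# final byte shows up: a line break for 10 and the letter A for 65.
print(repr(ivalue_text))
print(repr(value_text))
```

Byte 0x0A is the ASCII newline and byte 0x41 is 'A', which would explain both the line break after "(ivalue," and the "(value,A)" in the 1.2.10 output.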

Note that using compact storage does not seem to affect the issue, though it obviously changes the resulting pig format.

In addition, I tried using Pygmalion:

> tf = foreach testc generate key, flatten(FromCassandraBag('ivalue,svalue,value',columns)) as (ivalue:int,svalue:chararray,lvalue:long);
> dump tf

(foo,
,bar,A)

So no help there. Explicitly casting the values to (long) or (int) just results in a ClassCastException.
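
Since a plain cast does not help, one possible workaround (not something proposed in this thread, and shown here in Python rather than as a real Pig UDF) would be to reinterpret the mangled text as big-endian bytes. The sketch assumes the raw bytes survive intact inside the string, which may not hold for byte values above 0x7F; recover_integer is a hypothetical name.

```python
# Hypothetical helper: turn text that is really the raw bytes of an
# int (4 bytes) or bigint (8 bytes) back into the original number.
def recover_integer(misread_text):
    # latin-1 maps code points 0..255 one-to-one onto byte values.
    raw = misread_text.encode("latin-1")
    if len(raw) not in (4, 8):
        raise ValueError("expected 4 bytes (int) or 8 bytes (bigint)")
    return int.from_bytes(raw, byteorder="big", signed=True)

print(recover_integer("\x00\x00\x00\n"))    # the mangled ivalue
print(recover_integer("\x00" * 7 + "A"))    # the mangled value
```

With the two values from the table this gives back 10 and 65.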

/Janne

Re: Mystery PIG issue with 1.2.10

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
Sorry, got sidetracked :)

https://issues.apache.org/jira/browse/CASSANDRA-6102

/Janne

On Sep 26, 2013, at 20:04 , Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Sep 26, 2013 at 1:00 AM, Janne Jalkanen <ja...@ecyrd.com> wrote:
> 
> Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear answers appeared, I'm going to assume that this is a regression and file a JIRA ticket on this.
> 
> Could you let the list know the ticket number, when you do? :)
> 
> =Rob


Re: Mystery PIG issue with 1.2.10

Posted by Chad Johnston <cj...@megatome.com>.
The OP was using a Thrift table and CassandraStorage. I verified that the
problem does not exist with a CQL3 table and CqlStorage.

Chad


On Thu, Sep 26, 2013 at 7:05 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

>> Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear
>> answers appeared, I'm going to assume that this is a regression and file a
>> JIRA ticket on this.
>
> Could you explain that a little more?
>
> You tried using the CqlStorage read with a CQL 3 table and it did not work?
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 27/09/2013, at 5:04 AM, Robert Coli <rc...@eventbrite.com> wrote:
>
> On Thu, Sep 26, 2013 at 1:00 AM, Janne Jalkanen <ja...@ecyrd.com> wrote:
>
>>
>> Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear
>> answers appeared, I'm going to assume that this is a regression and file a
>> JIRA ticket on this.
>>
>
> Could you let the list know the ticket number, when you do? :)
>
> =Rob
>
>
>

Re: Mystery PIG issue with 1.2.10

Posted by Aaron Morton <aa...@thelastpickle.com>.
> Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear answers appeared, I'm going to assume that this is a regression and file a JIRA ticket on this.

Could you explain that a little more?

You tried using the CqlStorage read with a CQL 3 table and it did not work?

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 27/09/2013, at 5:04 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Sep 26, 2013 at 1:00 AM, Janne Jalkanen <ja...@ecyrd.com> wrote:
> 
> Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear answers appeared, I'm going to assume that this is a regression and file a JIRA ticket on this.
> 
> Could you let the list know the ticket number, when you do? :)
> 
> =Rob


Re: Mystery PIG issue with 1.2.10

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Sep 26, 2013 at 1:00 AM, Janne Jalkanen <ja...@ecyrd.com> wrote:

>
> Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear
> answers appeared, I'm going to assume that this is a regression and file a
> JIRA ticket on this.
>

Could you let the list know the ticket number, when you do? :)

=Rob

Re: Mystery PIG issue with 1.2.10

Posted by Janne Jalkanen <ja...@ecyrd.com>.
Unfortunately no, as I have a dozen legacy columnfamilies… Since no clear answers appeared, I'm going to assume that this is a regression and file a JIRA ticket on this.

/Janne

On 26 Sep 2013, at 08:00, Aaron Morton <aa...@thelastpickle.com> wrote:

>> > testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
>> > dump testc
>> (foo,{(ivalue,
>> ),(svalue,bar),(value,A)})
> 
> 
> 
> If the CQL 3 data ye wish to read, CqlStorage be the driver of your success. 
> 
> (btw there is a ticket out to update the example if you get excited https://issues.apache.org/jira/browse/CASSANDRA-5709)
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 26/09/2013, at 3:57 AM, Chad Johnston <cj...@megatome.com> wrote:
> 
>> As an FYI, creating the table without the "WITH COMPACT STORAGE" and using CqlStorage works just fine in 1.2.10.
>> 
>> I know that CqlStorage and AbstractCassandraStorage got changed for 1.2.10 - maybe there's a regression with the existing CassandraStorage?
>> 
>> Chad
>> 
>> 
> 


Re: Mystery PIG issue with 1.2.10

Posted by Aaron Morton <aa...@thelastpickle.com>.
> > testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> > dump testc
> (foo,{(ivalue,
> ),(svalue,bar),(value,A)})



If the CQL 3 data ye wish to read, CqlStorage be the driver of your success. 

(btw there is a ticket out to update the example if you get excited https://issues.apache.org/jira/browse/CASSANDRA-5709)

Cheers


-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/09/2013, at 3:57 AM, Chad Johnston <cj...@megatome.com> wrote:

> As an FYI, creating the table without the "WITH COMPACT STORAGE" and using CqlStorage works just fine in 1.2.10.
> 
> I know that CqlStorage and AbstractCassandraStorage got changed for 1.2.10 - maybe there's a regression with the existing CassandraStorage?
> 
> Chad
> 
> 


Re: Mystery PIG issue with 1.2.10

Posted by Chad Johnston <cj...@megatome.com>.
As an FYI, creating the table without the "WITH COMPACT STORAGE" and using
CqlStorage works just fine in 1.2.10.

I know that CqlStorage and AbstractCassandraStorage got changed for 1.2.10
- maybe there's a regression with the existing CassandraStorage?

Chad

