Posted to dev@apex.apache.org by Russell Spitzer <ru...@datastax.com> on 2015/12/10 23:47:24 UTC

Cassandra Operator Code Review

Hi, I'm Russell, a Software Engineer at DataStax, and I work on the Spark
Cassandra Connector. I am excited about Apex as a great streaming solution,
so I took a look at the integration with C* and I have a few comments.

https://github.com/apache/incubator-apex-malhar/blob/master/contrib/src/main/java/com/datatorrent/contrib/cassandra/AbstractCassandraTransactionableOutputOperator.java

This behavior is a bit scary to me: building large batches like this
(especially ones that are not partition-specific) can lead to stability
problems over time. Hint buildup can be a concern as well, since before
C* 3.0 hints are stored in C* itself. Originally the Spark C* connector used
batches of 64 KB, but this caused a large number of problems on clusters
with a high RF or poorly provisioned setups. Some method to cap the total
batch size may be useful.
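
One way the batch-size cap could look is sketched below. This is only an
illustration under stated assumptions: BatchSplitter, Write and
maxBatchBytes are hypothetical names that exist neither in Malhar nor in
the Cassandra driver, and the UTF-8 length of the CQL text is used as a
rough stand-in for the real serialized size of a statement. Writes are
grouped by partition key first, so each batch touches a single partition.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Malhar or the Cassandra driver. */
final class BatchSplitter {

    /** A pending write: its partition key plus the CQL text to execute. */
    static final class Write {
        final String partitionKey;
        final String cql;

        Write(String partitionKey, String cql) {
            this.partitionKey = partitionKey;
            this.cql = cql;
        }

        /** Assumption: CQL text length approximates the statement size. */
        int sizeBytes() {
            return cql.getBytes(StandardCharsets.UTF_8).length;
        }
    }

    /**
     * Groups writes by partition key, then cuts each group into batches
     * whose summed size stays at or below maxBatchBytes. A single write
     * larger than the cap is emitted as a batch of its own.
     */
    static List<List<Write>> split(List<Write> writes, int maxBatchBytes) {
        Map<String, List<Write>> byPartition = new LinkedHashMap<>();
        for (Write w : writes) {
            byPartition.computeIfAbsent(w.partitionKey, k -> new ArrayList<>()).add(w);
        }

        List<List<Write>> batches = new ArrayList<>();
        for (List<Write> group : byPartition.values()) {
            List<Write> current = new ArrayList<>();
            int currentBytes = 0;
            for (Write w : group) {
                int size = w.sizeBytes();
                // Close the current batch before it would exceed the cap.
                if (!current.isEmpty() && currentBytes + size > maxBatchBytes) {
                    batches.add(current);
                    current = new ArrayList<>();
                    currentBytes = 0;
                }
                current.add(w);
                currentBytes += size;
            }
            if (!current.isEmpty()) {
                batches.add(current);
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Write> writes = new ArrayList<>();
        writes.add(new Write("user-1", "INSERT INTO ks.t (id, v) VALUES ('user-1', 1)"));
        writes.add(new Write("user-1", "INSERT INTO ks.t (id, v) VALUES ('user-1', 2)"));
        writes.add(new Write("user-2", "INSERT INTO ks.t (id, v) VALUES ('user-2', 3)"));
        for (List<Write> batch : split(writes, 64)) {
            System.out.println(batch.get(0).partitionKey + ": " + batch.size() + " statement(s)");
        }
    }
}
```

Each inner list would then be sent as its own batch. The right value for
the cap depends on the cluster, so it is probably best exposed as an
operator property rather than hard-coded.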

The other issue is that the "atomicity" of the batch is a point of serious
fights within the C* community. One of the biggest issues is that, due to
the nature of repair and entropy in the system, the atomicity of a batch
cannot be guaranteed in the way a traditional database would guarantee it.
The guarantee breaks completely in multi-DC environments, for example.

All this said, it is probably sufficient from a Data Loss perspective if
the CL is high enough and the batches are small enough.

https://issues.apache.org/jira/browse/CASSANDRA-10701

There are some other caveats to batches that you should also be aware of.
For example, a batch containing INSERT (1, 2, 1) and INSERT (1, 1, 2) will
treat these inserts as having occurred at the same timestamp (unless the
timestamps are manually adjusted). That ends up with a row (1, 2, 2),
because for colliding writes the greatest value in each column wins.
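
The tie-breaking can be illustrated with a toy model. The sketch below is
not Cassandra code: CellReconcileSketch and its methods are hypothetical,
and values are compared as plain ints, whereas C* compares the stored cell
values. It only shows why two writes with the same timestamp merge into
(1, 2, 2), and how giving the second write a later timestamp changes the
outcome.

```java
import java.util.Arrays;

/** Toy model of per-column conflict resolution; not Cassandra code. */
final class CellReconcileSketch {

    /** One column value together with its write timestamp. */
    static final class Cell {
        final int value;
        final long timestamp;

        Cell(int value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** The newer timestamp wins; on a tie, the greater value wins. */
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value >= b.value ? a : b;
    }

    /**
     * Merges two writes to the same row column by column. Index 0 is
     * treated as the key and is assumed to be identical in both writes.
     */
    static int[] mergeRows(int[] first, long firstTs, int[] second, long secondTs) {
        int[] merged = new int[first.length];
        merged[0] = first[0];
        for (int i = 1; i < first.length; i++) {
            Cell winner = reconcile(new Cell(first[i], firstTs), new Cell(second[i], secondTs));
            merged[i] = winner.value;
        }
        return merged;
    }

    public static void main(String[] args) {
        // Same timestamp: each column keeps its greater value.
        int[] tie = mergeRows(new int[] {1, 2, 1}, 100L, new int[] {1, 1, 2}, 100L);
        System.out.println(Arrays.toString(tie)); // prints [1, 2, 2]

        // Second write carries a later timestamp: it wins outright.
        int[] later = mergeRows(new int[] {1, 2, 1}, 100L, new int[] {1, 1, 2}, 101L);
        System.out.println(Arrays.toString(later)); // prints [1, 1, 2]
    }
}
```

In CQL the manual adjustment would correspond to setting USING TIMESTAMP on
the individual statements inside the batch.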

I also don't see a way to adjust Consistency Level here?

https://github.com/apache/incubator-apex-malhar/blob/master/contrib/src/main/java/com/datatorrent/contrib/cassandra/CassandraPOJOInputOperator.java

The metadata for any given table can be retrieved without running a query,
via the metadata object of the driver's Cluster. That may be better for
future-proofing?

This class may also want to allow the pushdown of projections to C* to
limit the columns retrieved or, if you are ambitious, the pushdown of
clustering column predicates.
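
A minimal sketch of what such a pushdown could look like on the query side
follows. SelectSketch and buildSelect are hypothetical names, not existing
operator API. The idea is only that the operator builds its SELECT from a
configured column list and an optional predicate on clustering columns,
instead of reading whole rows and filtering afterwards. Identifiers are not
quoted or validated here, and the predicate is passed through as-is with
bind markers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical query builder; the names are illustrative only. */
final class SelectSketch {

    /**
     * Builds "SELECT cols FROM keyspace.table [WHERE predicate]".
     * An empty column list falls back to "*"; a null or blank predicate
     * is omitted.
     */
    static String buildSelect(String keyspace, String table,
                              List<String> columns, String predicate) {
        String projection = columns.isEmpty() ? "*" : String.join(", ", columns);
        StringBuilder cql = new StringBuilder("SELECT ")
            .append(projection)
            .append(" FROM ").append(keyspace).append('.').append(table);
        if (predicate != null && !predicate.trim().isEmpty()) {
            cql.append(" WHERE ").append(predicate.trim());
        }
        return cql.toString();
    }

    public static void main(String[] args) {
        // Projection plus a clustering-column range, using bind markers.
        System.out.println(buildSelect("ks", "events",
            Arrays.asList("id", "ts", "payload"), "id = ? AND ts > ?"));

        // No projection or predicate configured: falls back to a full read.
        System.out.println(buildSelect("ks", "events",
            Collections.<String>emptyList(), null));
    }
}
```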



https://github.com/apache/incubator-apex-malhar/blob/master/contrib/src/main/java/com/datatorrent/contrib/cassandra/CassandraPOJOOutputOperator.java

The same metadata comment as for the Input operator applies here, as does
the CL comment.

https://github.com/apache/incubator-apex-malhar/blob/master/contrib/src/main/java/com/datatorrent/contrib/cassandra/CassandraStore.java


https://github.com/apache/incubator-apex-malhar/blob/master/contrib/src/main/java/com/datatorrent/contrib/cassandra/CassandraTransactionalStore.java


I still haven't gotten to the Transaction Store, but hopefully I can give it
a good read later.

Thanks for your time,
Russ

Re: Cassandra Operator Code Review

Posted by Priyanka Gugale <pr...@datatorrent.com>.

Hi Russell,

I have a couple of questions:
*About batch size*:
1. The operator we have operates on one keyspace and one table at a time.
Given that, is it still a problem if we have large batches?
2. What is the recommended size of a batch? We can easily introduce a max
batch size in the operator.

*Atomicity*:
One way to reduce the risk is to use a smaller batch size, right? Is there
any other way to make sure Cassandra does an atomic commit?

*Consistency Level*:
Should we provide an option to configure the CL at the application level, or
is it required at the record level?

Also, I couldn't understand what we can do about records that have the same
timestamp, where Cassandra takes the higher value from the colliding rows.

-Priyanka

On Fri, Dec 11, 2015 at 4:56 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Russell,
>
> Would you be interested in taking up the JIRA and implementing the
> improvements, as you have a deeper understanding of the inner workings
> of Cassandra, or would you be interested in helping the person who
> takes up this JIRA?
>
> Thanks

Re: Cassandra Operator Code Review

Posted by Pramod Immaneni <pr...@datatorrent.com>.

Russell,

Would you be interested in taking up the JIRA and implementing the
improvements, as you have a deeper understanding of the inner workings
of Cassandra, or would you be interested in helping the person who
takes up this JIRA?

Thanks

> On Dec 10, 2015, at 3:09 PM, Chandni Singh <ch...@datatorrent.com> wrote:
>
> Thanks Russell!
>
> We will make these fixes. Will get back to you in case we have more
> questions.
>
> Have created a JIRA:
> https://malhar.atlassian.net/browse/MLHR-1935
>
> Thanks
> Chandni

Re: Cassandra Operator Code Review

Posted by Chandni Singh <ch...@datatorrent.com>.

Thanks Russell!

We will make these fixes. Will get back to you in case we have more
questions.

Have created a JIRA:
https://malhar.atlassian.net/browse/MLHR-1935

Thanks
Chandni
