Posted to dev@nifi.apache.org by Michael Giroux <mi...@yahoo.com.INVALID> on 2017/02/01 12:58:49 UTC

Re: Thoughts on change to PutCassandraQL (nifi-cassandra-processors)

Hi Nifi Developers,
I've attached some artifacts associated with my proposed change:
NIFI-XXXX.patch - a git patch with the changes
testPutCassandra.xml - a simple NiFi flow (template) for testing
create_test_table.cql - a script for creating a keyspace and table in Cassandra
Summary of changes:  Added the ability to select whether the PutCassandraQL processor caches prepared statements.
Comments:  I expected to see faster performance; however, I did not measure a noticeable increase or decrease in performance.  BUT, it eliminates the following messages in the nifi-app.log:
2017-01-31 14:30:54,287 WARN [cluster1-worker-1] com.datastax.driver.core.Cluster Re-preparing already prepared query insert into test_table (id, timestamp, id1, timestamp1, id2, timestamp2, id3, timestamp3, id4, timestamp4, id5, timestamp5, id6, timestamp6) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);. Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.
The tests that I included with the code are not as comprehensive as I would like.  I ran things through the debugger to examine status, but that is obviously not an option for automated integration tests.  That's why I included the additional files for testing.
Got to admit I was a little disappointed not to see a performance increase.  Still, it does clean things up in what I believe is the most common use case (the same statement used many times).
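
For readers without the attachment, the idea of an optional prepared-statement cache can be sketched roughly as below. This is not the code from the patch: the class name StatementCache, the enabled flag, and the generic preparer function are all hypothetical names chosen for illustration. With the DataStax Java driver, the preparer would presumably be something like session::prepare, which is an assumption about how the processor would wire it in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper, NOT taken from the attached patch.
// S stands in for the driver's prepared-statement type.
final class StatementCache<S> {
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> preparer; // e.g. session::prepare (assumption)
    private final boolean enabled;              // mirrors the "cache or not" option

    StatementCache(Function<String, S> preparer, boolean enabled) {
        this.preparer = preparer;
        this.enabled = enabled;
    }

    // Returns a prepared statement for the given CQL text.
    // When caching is enabled, each distinct CQL string is prepared once.
    S get(String cql) {
        if (!enabled) {
            return preparer.apply(cql); // prepare on every call
        }
        return cache.computeIfAbsent(cql, preparer);
    }

    int size() {
        return cache.size();
    }
}

final class StatementCacheDemo {
    public static void main(String[] args) {
        AtomicInteger prepareCalls = new AtomicInteger();
        // A fake preparer that only counts how often it is invoked.
        StatementCache<String> cache = new StatementCache<>(cql -> {
            prepareCalls.incrementAndGet();
            return "prepared:" + cql;
        }, true);

        String cql = "insert into test_table (id, timestamp) values (?, ?)";
        for (int i = 0; i < 1000; i++) {
            cache.get(cql);
        }
        System.out.println("prepare calls: " + prepareCalls.get());
    }
}
```

With caching enabled, the demo prepares the statement once for the 1000 lookups. Keying on the raw CQL string is the simplest choice; a real processor would likely also need to bound the cache size and clear it when the session is closed.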


From: Joe Witt <jo...@gmail.com>
To: dev@nifi.apache.org; Michael Giroux <mi...@yahoo.com>
Sent: Tuesday, January 24, 2017 4:07 PM
Subject: Re: Thoughts on change to PutCassandraQL (nifi-cassandra-processors)
   
Michael

It certainly sounds interesting.  You might want to share a pointer to
your code in github or provide a patch for folks to look at.  If you
have some before/after results to share too, along with a sample case,
that could be valuable.

Thanks
Joe

On Tue, Jan 24, 2017 at 12:46 PM, Michael Giroux
<mi...@yahoo.com.invalid> wrote:
> Hi All,
> I'm currently using the PutCassandraQL processor and have implemented a feature that is (I believe) worthwhile.  I'd like to hear your thoughts and donate the code if you're interested.  I've implemented a cache for the CQL PreparedStatement.  In my specific use case I'm using the same set of PreparedStatements millions of times...  the cache should save a round trip to the database for all but the first of these calls.
> If you all are interested, let me know and I'll follow the process to get the code into the baseline.  Thanks!
>
>

   

Re: Thoughts on change to PutCassandraQL (nifi-cassandra-processors)

Posted by Matt Burgess <ma...@apache.org>.
Michael,

Thank you for your contribution! I took the liberty of writing up
NIFI-3425 [1], and adding your patch file.  I haven't had a chance to
take a look yet but will review soon.

Regards,
Matt

[1] https://issues.apache.org/jira/browse/NIFI-3425

