You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@nifi.apache.org by "Daniel Cave (JIRA)" <ji...@apache.org> on 2015/12/02 20:29:11 UTC
[jira] [Comment Edited] (NIFI-1201) Allow ExecuteSQL to run queries with that use a variable timestamp or sequence id

    [ https://issues.apache.org/jira/browse/NIFI-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036392#comment-15036392 ] 

Daniel Cave edited comment on NIFI-1201 at 12/2/15 7:28 PM:
------------------------------------------------------------

Maintaining in state with a persisted value would not guarantee that the previous read had been successful.  However, below is an example of how this can be done with the current ExecuteSQL/PutSQL processors.

This can be done through ExecuteSQL by subquery a data table to read the variable value to filter by as well as capture the current date for future use.

For example
select account_name,  
(select round((sysdate - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from dual) as current_read_date
from some_account_t
where created_date > (select round((last_read_date - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from last_read_t where application like '%Nifi%' and last_read_status='SUCCESS'); 

The only other processor needed is PutSQL to update the last read date and status done at flow end.

Using this method rather than a cache in ExecuteSQL itself guards against service or further processing interruptions that ExecuteSQL would not be aware of and that would make the cached value invalid.


was (Author: daniel cave):
Maintaining in state without a persisted value would not guarantee that the previous read had been successful.  However, below is an example of how this can be done with the current ExecuteSQL/PutSQL processors.

This can be done through ExecuteSQL by subquery a data table to read the variable value to filter by as well as capture the current date for future use.

For example
select account_name,  
(select round((sysdate - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from dual) as current_read_date
from some_account_t
where created_date > (select round((last_read_date - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from last_read_t where application like '%Nifi%' and last_read_status='SUCCESS'); 

The only other processor needed is PutSQL to update the last read date and status done at flow end.

Using this method rather than a cache in ExecuteSQL itself guards against service or further processing interruptions that ExecuteSQL would not be aware of and that would make the cached value invalid.

> Allow ExecuteSQL to run queries with that use a variable timestamp or sequence id
> ---------------------------------------------------------------------------------
>
>                 Key: NIFI-1201
>                 URL: https://issues.apache.org/jira/browse/NIFI-1201
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Randy Gelhausen
>            Priority: Minor
>
> Users are employing ExecuteSQL as a means to schedule periodic queries against remote databases. Other tools that do this type of task include the ability to maintain and automatically increment a sequence or timestamp used in query predicates.
> For example:
> select * from src_table where created_at > "2015-11-19 12:00:00"
> Then a minute later:
> select * from src_table where created_at > "2015-11-19 12:01:00"
> Or:
> insert into my_table values (${prev_id}+1, ${now()})
> Today users can implement the same logic with a series of processors, but much work could be saved by allowing ExecuteSQL to maintain these bits of state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)