You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@nifi.apache.org by "Daniel Cave (JIRA)" <ji...@apache.org> on 2015/12/02 20:29:11 UTC
[jira] [Comment Edited] (NIFI-1201) Allow ExecuteSQL to run queries
with that use a variable timestamp or sequence id
[ https://issues.apache.org/jira/browse/NIFI-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036392#comment-15036392 ]
Daniel Cave edited comment on NIFI-1201 at 12/2/15 7:28 PM:
------------------------------------------------------------
Maintaining in state with a persisted value would not guarantee that the previous read had been successful. However, below is an example of how this can be done with the current ExecuteSQL/PutSQL processors.
This can be done through ExecuteSQL by subquery a data table to read the variable value to filter by as well as capture the current date for future use.
For example
select account_name,
(select round((sysdate - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from dual) as current_read_date
from some_account_t
where created_date > (select round((last_read_date - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from last_read_t where application like '%Nifi%' and last_read_status='SUCCESS');
The only other processor needed is PutSQL to update the last read date and status done at flow end.
Using this method rather than a cache in ExecuteSQL itself guards against service or further processing interruptions that ExecuteSQL would not be aware of and that would make the cached value invalid.
was (Author: daniel cave):
Maintaining in state without a persisted value would not guarantee that the previous read had been successful. However, below is an example of how this can be done with the current ExecuteSQL/PutSQL processors.
This can be done through ExecuteSQL by subquery a data table to read the variable value to filter by as well as capture the current date for future use.
For example
select account_name,
(select round((sysdate - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from dual) as current_read_date
from some_account_t
where created_date > (select round((last_read_date - to_date('01-01-1970 00:00:00', 'DD-MM-YYYY HH24:MI:SS'))*24*60*60) from last_read_t where application like '%Nifi%' and last_read_status='SUCCESS');
The only other processor needed is PutSQL to update the last read date and status done at flow end.
Using this method rather than a cache in ExecuteSQL itself guards against service or further processing interruptions that ExecuteSQL would not be aware of and that would make the cached value invalid.
> Allow ExecuteSQL to run queries with that use a variable timestamp or sequence id
> ---------------------------------------------------------------------------------
>
> Key: NIFI-1201
> URL: https://issues.apache.org/jira/browse/NIFI-1201
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Randy Gelhausen
> Priority: Minor
>
> Users are employing ExecuteSQL as a means to schedule periodic queries against remote databases. Other tools that do this type of task include the ability to maintain and automatically increment a sequence or timestamp used in query predicates.
> For example:
> select * from src_table where created_at > "2015-11-19 12:00:00"
> Then a minute later:
> select * from src_table where created_at > "2015-11-19 12:01:00"
> Or:
> insert into my_table values (${prev_id}+1, ${now()})
> Today users can implement the same logic with a series of processors, but much work could be saved by allowing ExecuteSQL to maintain these bits of state.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)