
Posted to dev@nifi.apache.org by 黃黃彥周 <yj...@gmail.com> on 2018/04/23 10:36:57 UTC

Adjust the GenerateTableFetch processor for handling big tables.

Hello,


I have been using the GenerateTableFetch processor to handle incremental
record fetching for some time.

It works quite well for normal tables.


When it comes to big tables, however, it has some potential problems.

For example, we have a source table (Oracle) with more than 100 billion
records.

This causes the ‘COUNT’ and ‘ORDER BY’ operations to consume a lot of server
resources on the source database, which is not acceptable.


However, the GenerateTableFetch processor has a hard-coded ‘COUNT(*)’ and
possibly an ‘ORDER BY’ clause in the query statements it generates for each
batch.


For now we have decided to combine several processors so that we can fetch
records in batches, using a WHERE clause with an automatically advancing
time interval, to perform incremental fetches from big tables.
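
To make the idea concrete, here is a rough sketch of the time-interval
batching in Python. It is only an illustration, not NiFi code: the function
name, the table EVENTS and the column CREATED_AT are made up, and the
TIMESTAMP literal assumes an Oracle-style timestamp column. Each generated
statement only filters on a time window, so no COUNT(*) or ORDER BY is
needed to split the work.

```python
from datetime import datetime, timedelta


def time_window_queries(table, ts_column, start, end, step):
    """Yield one SELECT per half-open window [lo, hi) between start and end.

    All names here are hypothetical; this is not part of NiFi.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    lo = start
    while lo < end:
        # Clamp the last window so it never runs past `end`.
        hi = min(lo + step, end)
        yield (
            f"SELECT * FROM {table} "
            f"WHERE {ts_column} >= TIMESTAMP '{lo:%Y-%m-%d %H:%M:%S}' "
            f"AND {ts_column} < TIMESTAMP '{hi:%Y-%m-%d %H:%M:%S}'"
        )
        # The next window starts exactly where this one ended.
        lo = hi


if __name__ == "__main__":
    for query in time_window_queries(
        "EVENTS",
        "CREATED_AT",
        datetime(2018, 4, 23, 0, 0),
        datetime(2018, 4, 23, 2, 30),
        timedelta(hours=1),
    ):
        print(query)
```

The windows are half-open, so a record sitting exactly on a boundary is
picked up by exactly one query, and the end of the last window can be stored
as the starting point for the next incremental run.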



I just wonder whether the GenerateTableFetch processor could support another
mode, namely batch generation by time interval?

Actually, QueryDatabaseTable has a similar issue, which is under discussion:

https://issues.apache.org/jira/browse/NIFI-4385


Thanks for any feedback.



-Deon