Posted to dev@apex.apache.org by "Bhupesh Chawda (JIRA)" <ji...@apache.org> on 2016/07/15 06:06:20 UTC

[jira] [Resolved] (APEXMALHAR-2066) Add jdbc poller input operator

     [ https://issues.apache.org/jira/browse/APEXMALHAR-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhupesh Chawda resolved APEXMALHAR-2066.
----------------------------------------
       Resolution: Done
    Fix Version/s: 3.5.0

> Add jdbc poller input operator
> ------------------------------
>
>                 Key: APEXMALHAR-2066
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2066
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: Ashwin Chandra Putta
>            Assignee: devendra tagare
>             Fix For: 3.5.0
>
>
> Create a JDBC poller input operator that has the following features:
> 1. Polls from an external JDBC store asynchronously in the input operator.
> 2. Polling frequency and batch size are configurable.
> 3. Is idempotent.
> 4. Is partitionable.
> 5. Is capable of both batch reads and polling.
> Assumptions for idempotency and partitioning:
> 1. The user needs to provide tableName, dbConnection, setEmitColumnList, and a look-up key.
> 2. Optionally, batchSize, pollInterval, look-up key, and a where clause can be given.
> 3. This operator uses static partitioning to arrive at range queries for exactly-once reads.
> It creates a configured number of non-polling static partitions for fetching the existing data in the table, plus a single
> additional partition for polling additive data.
> 4. The assumption is that there is an ordered column from which range queries can be formed.
> The *key* column, on which polling is based, is any column that has ever-increasing values and supports greater-than and
> less-than operations in SQL.
> 5. If an emitColumnList is provided, please ensure that the keyColumn is the first column in the list.
> 6. Range queries are formed using the JdbcMetaDataUtility.
>    Output: a comma-separated list of the emit columns, e.g. columnA,columnB,columnC
> 7. Only newly added data with increasing ids will be fetched by the
>    polling JDBC partition.
> Per window, the first and the last key processed are saved using the FSWindowDataManager as (<lowerBound,upperBound>, operatorId, windowId). This (lowerBound, upperBound) pair is then used for recovery. The queries are constructed using the JDBCMetaDataUtility.
> JDBCMetaDataUtility
> A utility class used to retrieve the metadata for a given unique key of a SQL table. This class emits range queries based on the given primary index.
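
For illustration only, the range-query idea in the description above can be sketched roughly as follows. This is not the JdbcMetaDataUtility API; the class name, the method names, the table name "events", and the numeric key column "id" are all assumptions made for the example. It further assumes the key is a long, that the static partitions cover half-open ranges [lower, upper), and that the database accepts a LIMIT clause (which is dialect-specific).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of Apex Malhar.
class RangeQuerySketch {

    // Splits the key range [minKey, maxKey] into at most `partitions`
    // contiguous, non-overlapping range queries (one per static partition).
    static List<String> buildRangeQueries(String table, String keyColumn, String columns,
                                          long minKey, long maxKey, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        List<String> queries = new ArrayList<>();
        long total = maxKey - minKey + 1;                   // key values to cover
        long step = (total + partitions - 1) / partitions;  // ceiling division
        for (long lower = minKey; lower <= maxKey; lower += step) {
            long upper = Math.min(lower + step, maxKey + 1); // exclusive upper bound
            queries.add("SELECT " + columns + " FROM " + table
                + " WHERE " + keyColumn + " >= " + lower
                + " AND " + keyColumn + " < " + upper);
        }
        return queries;
    }

    // Query for the single polling partition: only rows whose key is greater
    // than the last key already processed, at most batchSize rows per poll.
    // LIMIT is an assumption; other dialects use different syntax.
    static String buildPollQuery(String table, String keyColumn, String columns,
                                 long lastKey, int batchSize) {
        return "SELECT " + columns + " FROM " + table
            + " WHERE " + keyColumn + " > " + lastKey
            + " ORDER BY " + keyColumn
            + " LIMIT " + batchSize;
    }

    public static void main(String[] args) {
        // Four static partitions over keys 1..100, then one polling query.
        for (String q : buildRangeQueries("events", "id", "id,columnB,columnC", 1, 100, 4)) {
            System.out.println(q);
        }
        System.out.println(buildPollQuery("events", "id", "id,columnB,columnC", 100, 50));
    }
}
```

Because each static range is bounded on both sides, replaying a partition after a failure reads the same rows again, which is what the saved (lowerBound, upperBound) pair per window relies on. A real operator would presumably bind the bounds through a PreparedStatement rather than concatenate them into the SQL string.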



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)