Posted to dev@gobblin.apache.org by "Alex Li (Jira)" <ji...@apache.org> on 2020/01/14 19:08:00 UTC

[jira] [Created] (GOBBLIN-1025) Add retry for PK-Chunking iterator

Alex Li created GOBBLIN-1025:
--------------------------------

             Summary: Add retry for PK-Chunking iterator
                 Key: GOBBLIN-1025
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1025
             Project: Apache Gobblin
          Issue Type: Improvement
            Reporter: Alex Li


In the SFDC connector, there is a class called `ResultIterator` (I will rename it to `SalesforceRecordIterator`).
It is currently used only by PK-Chunking. It wraps the fetching of a list of result files in a single record iterator.

However, csvReader.nextRecord() may throw a network IO exception. We should retry in this case.

When a result file has been only partly fetched and a network IO exception occurs, we are in a special situation: the first part of the file has already been fetched locally, but the rest of the file is still on the data source.
We need to
1. reopen the file stream
2. skip all the records that we have already fetched, moving the cursor to the first record we have not fetched yet.
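
The two steps above could be sketched roughly as follows. This is only an illustration, not the actual Gobblin code: `RecordSource`, `RecordReader`, and `RetryingRecordIterator` are hypothetical names, the retry limit is an assumed parameter, and the sketch assumes that reopening a result file yields the same records in the same order.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical: opens a fresh stream over one result file. */
interface RecordSource<T> {
  RecordReader<T> open() throws IOException;
}

/** Hypothetical: returns the next record, or null at end of file. */
interface RecordReader<T> {
  T nextRecord() throws IOException;
}

/** Iterator that reopens the stream and skips already delivered records after an IO failure. */
final class RetryingRecordIterator<T> implements Iterator<T> {
  private final RecordSource<T> source;
  private final int maxRetries;
  private RecordReader<T> reader;  // null before first use and after a failure
  private long delivered = 0;      // records already returned to the caller
  private T buffered;              // look-ahead record, null if none
  private boolean done = false;    // true once end of file was reached

  RetryingRecordIterator(RecordSource<T> source, int maxRetries) {
    if (maxRetries < 0) {
      throw new IllegalArgumentException("maxRetries must be >= 0");
    }
    this.source = source;
    this.maxRetries = maxRetries;
  }

  @Override
  public boolean hasNext() {
    if (buffered == null && !done) {
      buffered = fetchWithRetry();
      done = (buffered == null);
    }
    return buffered != null;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    T record = buffered;
    buffered = null;
    delivered++;
    return record;
  }

  private T fetchWithRetry() {
    IOException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        if (reader == null) {
          reader = reopenAndSkip();  // step 1 and step 2
        }
        return reader.nextRecord();
      } catch (IOException e) {
        last = e;
        reader = null;  // force a reopen on the next attempt
      }
    }
    throw new UncheckedIOException("giving up after " + maxRetries + " retries", last);
  }

  private RecordReader<T> reopenAndSkip() throws IOException {
    RecordReader<T> fresh = source.open();      // step 1: reopen the file stream
    for (long i = 0; i < delivered; i++) {      // step 2: skip records already fetched
      if (fresh.nextRecord() == null) {
        throw new IOException("result file is shorter than expected after reopen");
      }
    }
    return fresh;
  }
}
```

Counting delivered records keeps the sketch independent of byte offsets, at the cost of re-reading the skipped part of the file after each failure.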



--
This message was sent by Atlassian Jira
(v8.3.4#803005)