Posted to issues@nifi.apache.org by "Rostislav Koryakin (JIRA)" <ji...@apache.org> on 2017/03/19 18:08:41 UTC

[jira] [Created] (NIFI-3623) PutSQL can't insert multiple records in postgres if one causes an error

Rostislav Koryakin created NIFI-3623:
----------------------------------------

             Summary: PutSQL can't insert multiple records in postgres if one causes an error
                 Key: NIFI-3623
                 URL: https://issues.apache.org/jira/browse/NIFI-3623
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 1.2.0
            Reporter: Rostislav Koryakin


PutSQL configured as follows:
Concurrent tasks: 1
Batch size: 100
Obtain generated keys: false

A flow is configured to take a snapshot of a particular database table and put it into Postgres twice a day.
Assume there is a Postgres table "A" with fields date, id, and value, and primary key (date, id).
Sometimes, due to network issues or a Postgres restart, not all entries are inserted.
Example: "A" already contains entries with ids 1, 50, and 99.
PutSQL consumes 100 flowfiles with ids from 0 to 99 and tries to insert the entries. It starts a transaction, but the transaction is rolled back due to the constraint violations for keys 1, 50, and 99. On the next run the same situation repeats.
The PutSQL implementation expects the driver to report which statements succeeded and which failed. But with Postgres all statements fail, so all flowfiles go to "failure" instead of "retry".
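The per-statement reporting described above comes from the JDBC batch contract: on error, BatchUpdateException.getUpdateCounts() may either be full-length with Statement.EXECUTE_FAILED at the failed positions (driver continued past errors) or shorter than the batch (driver stopped at the first error, as pgjdbc does inside a transaction). A minimal sketch, not NiFi code, with a hypothetical helper name and sample count arrays:

```java
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class BatchFailureDemo {

    /**
     * Returns the indices of batch statements the driver marked as failed.
     * Per the JDBC spec, an update-counts array from BatchUpdateException
     * may be shorter than the batch (driver stopped at the first error; the
     * rest never ran) or full-length with Statement.EXECUTE_FAILED entries.
     */
    static List<Integer> failedIndices(int[] updateCounts, int batchSize) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            if (i >= updateCounts.length || updateCounts[i] == Statement.EXECUTE_FAILED) {
                failed.add(i);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Hypothetical counts from a driver that continues past errors:
        // statements 1 and 3 failed, the other three succeeded.
        int[] continuing = {1, Statement.EXECUTE_FAILED, 1, Statement.EXECUTE_FAILED, 1};
        System.out.println(failedIndices(continuing, 5)); // [1, 3]

        // Hypothetical counts from a driver that stops at the first error
        // (statement 1): everything from index 1 onward counts as failed.
        int[] stopping = {1};
        System.out.println(failedIndices(stopping, 5)); // [1, 2, 3, 4]
    }
}
```

Note that even in the "continuing" case, when the batch runs inside a single Postgres transaction, a rollback discards the statements that reported success, so positive update counts alone do not prove the rows were committed.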

There are several ways to solve it:
1) use batch size 1 (bad in terms of performance)
2) set obtain generated keys = true (there is no need for them)
3) address the issue somehow and move 97 of the 100 flowfiles to "retry" so they can be processed again.

Is the expected behaviour in this situation to get 97 flowfiles in "retry"? Or is it normal that all go to "failure"?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)