Posted to dev@oozie.apache.org by "Andras Piros (JIRA)" <ji...@apache.org> on 2016/09/21 14:38:20 UTC

[jira] [Commented] (OOZIE-2662) DB migration fails if DB is too big

    [ https://issues.apache.org/jira/browse/OOZIE-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510138#comment-15510138 ] 

Andras Piros commented on OOZIE-2662:
-------------------------------------

The [*current patch*|https://issues.apache.org/jira/secure/attachment/12829584/OOZIE-2662.002.wip.patch] addresses duplicate entries by skipping only the rows that violate the primary key constraint enforced by {{@Id}}. Please see {{TestDBLoadDump.testSecondImportDoesNotImportDuplicates()}} for details.

The only problem is that rows with any other type of constraint violation get the same treatment: only the violating rows are skipped, and the rest of the import proceeds. Please see {{TestDBLoadDump.testImportSkipsRowsContainingInvalidData()}} for details.

With OpenJPA we cannot distinguish between different types of constraint violations (caused by {{@Id}} or {{@Length}}, for example): OpenJPA wraps both in the very same {{RollbackException}}, using the very same mechanism.
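
To make the two options concrete, here is a minimal sketch of an import loop that either skips or halts on a rejected row. It is not taken from the patch: {{ImportPolicySketch}}, {{RowSink}}, {{RowRejectedException}} and {{OnViolation}} are hypothetical names, and {{RowRejectedException}} merely stands in for the single exception type the persistence layer reports, so the loop cannot see why a row was rejected.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: every name here is hypothetical, not an Oozie or OpenJPA class. */
final class ImportPolicySketch {

    /** Stand-in for the single exception type reported for any constraint violation. */
    static final class RowRejectedException extends RuntimeException {
        RowRejectedException(String message) {
            super(message);
        }
    }

    /** Hypothetical sink that commits one row or rejects it. */
    interface RowSink<T> {
        void commit(T row);
    }

    enum OnViolation { SKIP_ROW, HALT_IMPORT }

    /** Returns the skipped rows; rethrows the first rejection when the policy is HALT_IMPORT. */
    static <T> List<T> importRows(List<T> rows, RowSink<T> sink, OnViolation policy) {
        List<T> skipped = new ArrayList<>();
        for (T row : rows) {
            try {
                sink.commit(row);
            } catch (RowRejectedException e) {
                // The reason (duplicate key, oversized value, ...) is not visible here.
                if (policy == OnViolation.HALT_IMPORT) {
                    throw e;
                }
                skipped.add(row);
            }
        }
        return skipped;
    }
}
```

With {{SKIP_ROW}}, a duplicate row and a row with invalid data both end up in the returned list; with {{HALT_IMPORT}}, the first of either kind aborts the import.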

So the question is, *[~rkanter]* and *[~jaydeepvishwakarma]*, should we keep this behavior, or should we halt the whole import on both duplicate and otherwise violating rows instead of skipping them?

> DB migration fails if DB is too big
> -----------------------------------
>
>                 Key: OOZIE-2662
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2662
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Peter Cseh
>            Assignee: Andras Piros
>         Attachments: OOZIE-2662.001.patch, OOZIE-2662.002.wip.patch
>
>
> The initial version of the DB import tool commits all the workflows, actions etc. in one huge commit. If it does not fit into memory, an OOME is thrown.
> We should commit every 1k or 10k elements to prevent this.
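
The batching idea in the description can be sketched roughly as follows. This is an illustration under assumptions, not the attached patch: {{BatchCommitSketch}} and {{UnitOfWork}} are hypothetical names, with {{UnitOfWork}} standing in for whatever transactional persistence context the import tool uses.

```java
import java.util.List;

/** Sketch only: UnitOfWork is a hypothetical stand-in for a transactional persistence context. */
final class BatchCommitSketch {

    interface UnitOfWork<T> {
        void begin();
        void persist(T row);
        void commit();   // writes the pending rows to the database
        void clear();    // releases references to rows that are already committed
    }

    /** Imports rows, committing after every batchSize rows; returns the number of commits. */
    static <T> int importInBatches(List<T> rows, UnitOfWork<T> work, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        int pending = 0;
        for (T row : rows) {
            if (pending == 0) {
                work.begin();
            }
            work.persist(row);
            pending++;
            if (pending == batchSize) {
                work.commit();
                work.clear();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            // Commit the final, partially filled batch.
            work.commit();
            work.clear();
            commits++;
        }
        return commits;
    }
}
```

With 2,500 rows and a batch size of 1,000, this performs three commits and never holds more than 1,000 uncommitted rows at once.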



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)