Posted to dev@oozie.apache.org by András Piros via Review Board <no...@reviews.apache.org> on 2017/12/01 11:39:48 UTC

Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big


> On July 4, 2017, 11:24 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 95 (patched)
> > <https://reviews.apache.org/r/52782/diff/4/?file=1768922#file1768922line98>
> >
> >     Should we make this configurable?
> >     
> >     Can a bigger value possibly speed up the whole process?

It depends on the available heap and the average size of an entity. In practice I found a batch size of 1000 to be fast enough without consuming much heap.

Anyway, made this configurable.
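
For readers following the thread, here is a minimal sketch of the batching idea from the description quoted below: rows go in one transaction per batch, and a failed batch is retried row by row. It is not the code in `OozieDBImportCLI`. The names `BatchedImportSketch`, `TxWriter`, and `importAll` are hypothetical, and `TxWriter` is only a stand-in for "persist these rows in one `EntityTransaction`".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BatchedImportSketch {

    /**
     * Hypothetical stand-in for "persist these rows in a single transaction".
     * Assumed to throw a RuntimeException, and persist nothing, if any row is rejected.
     */
    interface TxWriter<T> {
        void writeInOneTransaction(List<T> rows);
    }

    /**
     * Imports rows in batches of batchSize and returns the rows that had to be skipped.
     * A larger batchSize means fewer transactions but more rows held in memory at once.
     */
    static <T> List<T> importAll(List<T> rows, int batchSize, TxWriter<T> writer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> skipped = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += batchSize) {
            List<T> batch = rows.subList(from, Math.min(from + batchSize, rows.size()));
            try {
                // Fast path: the whole batch in one transaction.
                writer.writeInOneTransaction(batch);
            } catch (RuntimeException batchFailure) {
                // Slow path: retry each row in its own transaction so that
                // only the offending rows are skipped, and keep going.
                for (T row : batch) {
                    try {
                        writer.writeInOneTransaction(Collections.singletonList(row));
                    } catch (RuntimeException rowFailure) {
                        skipped.add(row);
                    }
                }
            }
        }
        return skipped;
    }
}
```

Under this sketch, a caller would check the returned list after the run and, if it is non-empty, delete everything imported so far, which matches the cleanup step in the description.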


- András


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179561
-----------------------------------------------------------


On July 17, 2017, 4:05 p.m., András Piros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> -----------------------------------------------------------
> 
> (Updated July 17, 2017, 4:05 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> We import every 1000 rows in a separate JPA `EntityTransaction` to reduce heap usage. Furthermore, if at least one row inside that transaction fails, we retry the whole batch, with each row in its own `EntityTransaction`.
> 
> The following error handling is implemented:
> 
> 1. we check that all necessary tables are present and empty
> 2. rows are imported until the end, even if some rows are skipped along the way
> 3. if at least one row is skipped along the way because of a `ConstraintViolationException`, we delete all rows of all necessary tables. That lets the user see the log messages for all erroneous rows in one run, and the Oozie database is never left in an inconsistent state where some rows of an import are present and others are not
> 
> 
> Diffs
> -----
> 
>   core/src/main/resources/oozie-default.xml 832bbe14a4b027e198061527a956e2992cbec174 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/6/
> 
> 
> Testing
> -------
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>