Posted to dev@pig.apache.org by "Chris Leonard (JIRA)" <ji...@apache.org> on 2016/02/04 01:49:39 UTC

[jira] [Commented] (PIG-3770) Enhance DBStorage to make it more flexible (should batch statements?, rollback on job failure, support command line arguments etc.)

    [ https://issues.apache.org/jira/browse/PIG-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131467#comment-15131467 ] 

Chris Leonard commented on PIG-3770:
------------------------------------

Is there any update on this, or has it gained any traction yet? We are trying to transfer data out of Hadoop using Pig's STORE ... USING org.apache.pig.piggybank.storage.DBStorage command, and it is generating dozens of threads on the target MSSQL Server, all of which are bad database citizens. They begin one transaction and then issue singleton updates, each of which acquires locks and holds them until our server runs out of locks. Not good!
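
For illustration, here is a minimal sketch of the batching behaviour we would like to see, written against plain JDBC. It is not DBStorage's actual code: the class name BatchedInsertSketch, the helper batchCount, and the batchSize parameter are all hypothetical. The idea is to queue rows with addBatch(), send them with executeBatch(), and commit after every batch so that locks are released periodically instead of accumulating in one long transaction.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical sketch only; not part of Pig or piggybank.
public class BatchedInsertSketch {

    // Number of executeBatch()/commit() round trips needed for rowCount rows.
    static int batchCount(int rowCount, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (rowCount + batchSize - 1) / batchSize;
    }

    // Inserts rows in batches, committing after each batch so the locks
    // taken by that batch are released. Returns the number of batches sent.
    static int insertInBatches(Connection conn, String sql,
                               List<Object[]> rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        int flushes = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) { // send the final partial batch
                ps.executeBatch();
                conn.commit();
                flushes++;
            }
        } catch (SQLException e) {
            conn.rollback(); // undo only the uncommitted batch
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return flushes;
    }
}
```

Note the trade-off: committing per batch keeps the lock footprint bounded, but batches that were already committed are not undone if a later batch fails, so this is in tension with the rollback-on-job-failure item in the issue description.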

> Enhance DBStorage to make it more flexible (should batch statements?, rollback on job failure, support command line arguments etc.)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-3770
>                 URL: https://issues.apache.org/jira/browse/PIG-3770
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>            Reporter: Nezih Yigitbasi
>            Assignee: Nezih Yigitbasi
>
> First of all, the TestDBStorage unit test is *broken*: it doesn't even run the DBStorage store logic. I debugged it and added logging, and found that putNext is never called. The unit test doesn't fail only because the verification loop at the end of the testWriteToDB method, which traverses the result set, never executes: the result set is empty because the DBStorage store logic is not called at all. (If the loop did run, the test would fail, because the verification logic is also broken: the orders of the entries in expNames, expRations, and expDates do not even match.) This has to be fixed.
> I propose to improve DBStorage with the following changes:
> - fix the problems with the unit test described above so that it works, and make it more comprehensive (it currently inserts only three records)
> - use command line options in the constructor like other Pig store functions (PigStorage, HBaseStorage, etc.) to make DBStorage more flexible. With this change it would be easy to implement PIG-3597
> - DBStorage supports rollbacks on task failures, but *not* on job failures. Rollback on job failure is a nice-to-have feature that has been requested before; see PIG-1891
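
The vacuous pass described in the issue (a verification loop that never runs because the result set is empty) can be guarded against by checking the number of rows seen as well as their contents. The following is only a sketch of that idea, not the actual TestDBStorage code: the class NonVacuousVerification and the method verifyAll are hypothetical, and a plain Iterator stands in for the JDBC result set.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch only; not the real TestDBStorage verification code.
public class NonVacuousVerification {

    // Compares rows in order against the expected list and fails if the
    // number of rows differs, so an empty result cannot pass silently.
    static <T> int verifyAll(Iterator<T> actual, List<T> expected) {
        int seen = 0;
        while (actual.hasNext()) {
            T row = actual.next();
            if (seen >= expected.size()) {
                throw new AssertionError("unexpected extra row: " + row);
            }
            T want = expected.get(seen);
            if (!want.equals(row)) {
                throw new AssertionError(
                    "row " + seen + ": expected " + want + " but got " + row);
            }
            seen++;
        }
        if (seen != expected.size()) {
            throw new AssertionError(
                "expected " + expected.size() + " rows but saw " + seen);
        }
        return seen;
    }
}
```

With a check like this, a store function that never writes anything produces a failure ("expected 3 rows but saw 0") instead of a green test.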



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)