You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by "Shinichi Yamashita (Created) (JIRA)" <ji...@apache.org> on 2011/11/02 07:10:33 UTC

[jira] [Created] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Use TRUNCATE for PostgreSQLManager
----------------------------------

                 Key: SQOOP-387
                 URL: https://issues.apache.org/jira/browse/SQOOP-387
             Project: Sqoop
          Issue Type: Improvement
          Components: connectors/postgresql
            Reporter: Shinichi Yamashita


PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Posted by "Bilung Lee (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143501#comment-13143501 ] 

Bilung Lee commented on SQOOP-387:
----------------------------------

Thanks for your contribution!  One comment.  Instead of overriding the whole migrateData and deleteAllRecords methods, how about having a getDeleteQuery method to return the DELETE statement in SqlManager?  Then, use it in migrateData/deleteAllRecords methods of SqlManager and only override that in PostgreSQLManager to return TRUNCATE statement?

                
> Use TRUNCATE for PostgreSQLManager
> ----------------------------------
>
>                 Key: SQOOP-387
>                 URL: https://issues.apache.org/jira/browse/SQOOP-387
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/postgresql
>            Reporter: Shinichi Yamashita
>         Attachments: SQOOP-387.patch
>
>
> PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
> When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
> Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Posted by "Shinichi Yamashita (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232259#comment-13232259 ] 

Shinichi Yamashita commented on SQOOP-387:
------------------------------------------

Hi,

I forgot to tell that TRUNCATE of PostgreSQL did not return a value such as DELETE to you.
Therefore I think that I apply the patch which I attached first and use it.
If a count of the number of deletion is necessary, I want to make modifications to count it with the first patch.

Regards,
Shinichi
                
> Use TRUNCATE for PostgreSQLManager
> ----------------------------------
>
>                 Key: SQOOP-387
>                 URL: https://issues.apache.org/jira/browse/SQOOP-387
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/postgresql
>            Reporter: Shinichi Yamashita
>         Attachments: SQOOP-387.patch
>
>
> PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
> When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
> Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Posted by "Shinichi Yamashita (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SQOOP-387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shinichi Yamashita updated SQOOP-387:
-------------------------------------

    Attachment: SQOOP-387.patch

Override migrateData and deleteAllRecords methods to use in staging-table.
                
> Use TRUNCATE for PostgreSQLManager
> ----------------------------------
>
>                 Key: SQOOP-387
>                 URL: https://issues.apache.org/jira/browse/SQOOP-387
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/postgresql
>            Reporter: Shinichi Yamashita
>         Attachments: SQOOP-387.patch
>
>
> PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
> When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
> Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228817#comment-13228817 ] 

jiraposter@reviews.apache.org commented on SQOOP-387:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4272/#review5803
-----------------------------------------------------------


Seem that the TRUNCATE statement may not return the number of affected rows like the DELETE statement does.  You may want to consider removing dependency on the return count as pointed out below.

An alternative option is to factor out more code to be overridden if the dependency is to be kept for other databases.



http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java
<https://reviews.apache.org/r/4272/#comment12649>

    The return code seems to be 0 for a successful TRUNCATE.



http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java
<https://reviews.apache.org/r/4272/#comment12650>

    The return count won't make sense for TRUNCATE here.



http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java
<https://reviews.apache.org/r/4272/#comment12653>

    The return count would be 0 for a successful TRUNCATE.



http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java
<https://reviews.apache.org/r/4272/#comment12655>

    Due to the reason mentioned above, this checking would fail for PostgreSQL now.


- Bilung


On 2012-03-09 17:31:41, Shinichi Yamashita wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4272/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-09 17:31:41)
bq.  
bq.  
bq.  Review request for Sqoop and Bilung Lee.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  I fixed SQOOP-387 patch.
bq.  
bq.  1. I add getDeleteQuery method, which returns DELETE statement in SqlManager by default and returns TRUNCATE statement in PostgreSQLManager.
bq.  2. Modify deleteAllRecords method and migrateData method of SqlManager to use getDeleteQuery method.
bq.  
bq.  
bq.  This addresses bug SQOOP-387.
bq.      https://issues.apache.org/jira/browse/SQOOP-387
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/PostgresqlManager.java 1298894 
bq.    http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java 1298894 
bq.  
bq.  Diff: https://reviews.apache.org/r/4272/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  ant test
bq.  I confirmed "TRUNCATE TABLE" query was executed in PostgreSQL.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Shinichi
bq.  
bq.


                
> Use TRUNCATE for PostgreSQLManager
> ----------------------------------
>
>                 Key: SQOOP-387
>                 URL: https://issues.apache.org/jira/browse/SQOOP-387
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/postgresql
>            Reporter: Shinichi Yamashita
>         Attachments: SQOOP-387.patch
>
>
> PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
> When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
> Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226229#comment-13226229 ] 

jiraposter@reviews.apache.org commented on SQOOP-387:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4272/
-----------------------------------------------------------

Review request for Sqoop and Bilung Lee.


Summary
-------

I fixed SQOOP-387 patch.

1. I add getDeleteQuery method, which returns DELETE statement in SqlManager by default and returns TRUNCATE statement in PostgreSQLManager.
2. Modify deleteAllRecords method and migrateData method of SqlManager to use getDeleteQuery method.


This addresses bug SQOOP-387.
    https://issues.apache.org/jira/browse/SQOOP-387


Diffs
-----

  http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/PostgresqlManager.java 1298894 
  http://svn.apache.org/repos/asf/incubator/sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java 1298894 

Diff: https://reviews.apache.org/r/4272/diff


Testing
-------

ant test
I confirmed "TRUNCATE TABLE" query was executed in PostgreSQL.


Thanks,

Shinichi


                
> Use TRUNCATE for PostgreSQLManager
> ----------------------------------
>
>                 Key: SQOOP-387
>                 URL: https://issues.apache.org/jira/browse/SQOOP-387
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/postgresql
>            Reporter: Shinichi Yamashita
>         Attachments: SQOOP-387.patch
>
>
> PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
> When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
> Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (SQOOP-387) Use TRUNCATE for PostgreSQLManager

Posted by "Shinichi Yamashita (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SQOOP-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145700#comment-13145700 ] 

Shinichi Yamashita commented on SQOOP-387:
------------------------------------------

The reason that overridden only PostgreSQLManager (not SqlManager) is because PostgreSQL can handle truncate in transaction. 
In other words TRUNCATE of PostgreSQL can have rollback mechanism. 

In addition, PostgreSQL doesn't release disk space until it executes "VACUUM". 
Therefore the implementation of current PostgreSQL causes performance degradation in the long term.

                
> Use TRUNCATE for PostgreSQLManager
> ----------------------------------
>
>                 Key: SQOOP-387
>                 URL: https://issues.apache.org/jira/browse/SQOOP-387
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/postgresql
>            Reporter: Shinichi Yamashita
>         Attachments: SQOOP-387.patch
>
>
> PostgreSQLManager uses "DELETE" sentence for procesing in staging-table.
> When it does "EXPORT" many times, the processing time is prolonged in "DELETE".
> Therefore replace it with "TRUNCATE" sentence and improve performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira