You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by Masatake Iwasaki <iw...@nttdata.co.jp> on 2013/06/22 18:28:26 UTC

Review Request: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/
-----------------------------------------------------------

Review request for Sqoop.


Description
-------

Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?


This addresses bug SQOOP-999.
    https://issues.apache.org/jira/browse/SQOOP-999


Diffs
-----

  ivy.xml 750adfc 
  ivy/libraries.properties 430d554 
  src/java/org/apache/sqoop/manager/PostgreSQLCopyManager.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java PRE-CREATION 

Diff: https://reviews.apache.org/r/12048/diff/


Testing
-------


Thanks,

Masatake Iwasaki


Re: Review Request: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Masatake Iwasaki <iw...@nttdata.co.jp>.

> On June 23, 2013, 3:39 p.m., Jarek Cecho wrote:
> > Hi Masatake,
> > I believe that depending on the PostgreSQL JDBC driver should be fine as long as the licensing is correct. I've took a look and the JDBC Driver is licensed using BSD license that is compatible with ASL and therefore I believe that depending on PostgreSQL JDBC driver should be fine. Please note that this is not the case for majority of other JDBC drivers for other vendors.
> > 
> > I've just read the code (did not tried apply the patch nor run it) and it seems fine to me, so please do not hesitate and continue in the development process!

Hi Jarek, thanks for your comments. I got it. I am going to refine the code and add tests.


- Masatake


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22304
-----------------------------------------------------------


On June 22, 2013, 4:28 p.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated June 22, 2013, 4:28 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> This addresses bug SQOOP-999.
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/PostgreSQLCopyManager.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22304
-----------------------------------------------------------


Hi Masatake,
I believe that depending on the PostgreSQL JDBC driver should be fine as long as the licensing is correct. I've took a look and the JDBC Driver is licensed using BSD license that is compatible with ASL and therefore I believe that depending on PostgreSQL JDBC driver should be fine. Please note that this is not the case for majority of other JDBC drivers for other vendors.

I've just read the code (did not tried apply the patch nor run it) and it seems fine to me, so please do not hesitate and continue in the development process!


src/java/org/apache/sqoop/manager/PostgreSQLCopyManager.java
<https://reviews.apache.org/r/12048/#comment45782>

    Since the DirectPostgresqlManager do not implement the exportTable() method, do you think that it would make sense to merge those two connectors into one?



src/java/org/apache/sqoop/manager/PostgreSQLCopyManager.java
<https://reviews.apache.org/r/12048/#comment45781>

    Nit: Trailing whitespace.



src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java
<https://reviews.apache.org/r/12048/#comment45783>

    Nit: Trailing whitespace


Jarcec

- Jarek Cecho


On June 22, 2013, 4:28 p.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated June 22, 2013, 4:28 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> This addresses bug SQOOP-999.
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/PostgreSQLCopyManager.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22745
-----------------------------------------------------------

Ship it!


Looks good to me! Please upload the latest patch to the JIRA and I'll commit it.

- Jarek Cecho


On July 2, 2013, 9:28 p.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated July 2, 2013, 9:28 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-999
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
>   src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java PRE-CREATION 
>   src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/
-----------------------------------------------------------

(Updated July 2, 2013, 9:28 p.m.)


Review request for Sqoop.


Changes
-------

refactored package name.


Bugs: SQOOP-999
    https://issues.apache.org/jira/browse/SQOOP-999


Repository: sqoop-trunk


Description
-------

Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?


Diffs (updated)
-----

  ivy.xml 750adfc 
  ivy/libraries.properties 430d554 
  src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
  src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java PRE-CREATION 
  src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/12048/diff/


Testing
-------


Thanks,

Masatake Iwasaki


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.

> On July 2, 2013, 7:50 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportMapper.java, line 19
> > <https://reviews.apache.org/r/12048/diff/3/?file=314909#file314909line19>
> >
> >     Nit: I'm slowly trying to clean up the giant package org.apache.sqoop.mapreduce. Would you mind putting this PostgreSQL specific implementation file into package org.apache.sqoop.mapreduce.postgresql? There are already packages for MySQL, Netezza or Microsoft SQL Server.

I'll move pg_bulkload related codes later in SQOOP-1118.


- Masatake


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22678
-----------------------------------------------------------


On July 2, 2013, 9:28 p.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated July 2, 2013, 9:28 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-999
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
>   src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/postgresql/PostgreSQLCopyExportMapper.java PRE-CREATION 
>   src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22678
-----------------------------------------------------------


Hi Masatake,
thank you very much for incorporating all my feedback. I do have one last nit and we are ready to get it in!


src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportMapper.java
<https://reviews.apache.org/r/12048/#comment46362>

    Nit: I'm slowly trying to clean up the giant package org.apache.sqoop.mapreduce. Would you mind putting this PostgreSQL specific implementation file into package org.apache.sqoop.mapreduce.postgresql? There are already packages for MySQL, Netezza or Microsoft SQL Server.


Jarcec

- Jarek Cecho


On July 2, 2013, 9:05 a.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated July 2, 2013, 9:05 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-999
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportMapper.java PRE-CREATION 
>   src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/
-----------------------------------------------------------

(Updated July 2, 2013, 9:05 a.m.)


Review request for Sqoop.


Changes
-------

- throw Exception when HCatalog table name is specified.
- added debug print.
- made test database location and username configurable.
- refactored the name of mapper class.


Bugs: SQOOP-999
    https://issues.apache.org/jira/browse/SQOOP-999


Repository: sqoop-trunk


Description
-------

Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?


Diffs (updated)
-----

  ivy.xml 750adfc 
  ivy/libraries.properties 430d554 
  src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
  src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportMapper.java PRE-CREATION 
  src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/12048/diff/


Testing
-------


Thanks,

Masatake Iwasaki


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.

> On June 29, 2013, 10:27 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java, lines 63-65
> > <https://reviews.apache.org/r/12048/diff/2/?file=310625#file310625line63>
> >
> >     The HCatalog integration is not supported right?

I'll consider to create another JIRA issue for supporting HCatalog integration.


- Masatake


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22570
-----------------------------------------------------------


On July 2, 2013, 9:05 a.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated July 2, 2013, 9:05 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-999
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportMapper.java PRE-CREATION 
>   src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request 12048: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/#review22570
-----------------------------------------------------------


Hi Masatake,
thank you very much for all your effort! I do have couple of notes:


src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java
<https://reviews.apache.org/r/12048/#comment46290>

    The HCatalog integration is not supported right?



src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java
<https://reviews.apache.org/r/12048/#comment46288>

    I would suggest to log the query for easier debugging.



src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java
<https://reviews.apache.org/r/12048/#comment46289>

    I would suggest to parametrise the build-in JDBC path similarly as is done in the PostgresqlTest class:
    
    https://github.com/apache/sqoop/blob/branch-1.4.0/src/test/com/cloudera/sqoop/manager/PostgresqlTest.java#L87


Jarcec

- Jarek Cecho


On June 24, 2013, 11:29 p.m., Masatake Iwasaki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12048/
> -----------------------------------------------------------
> 
> (Updated June 24, 2013, 11:29 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-999
>     https://issues.apache.org/jira/browse/SQOOP-999
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
> Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?
> 
> 
> Diffs
> -----
> 
>   ivy.xml 750adfc 
>   ivy/libraries.properties 430d554 
>   src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java PRE-CREATION 
>   src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12048/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Masatake Iwasaki
> 
>


Re: Review Request: SQOOP-999: Support bulk load from HDFS to PostgreSQL using COPY ... FROM

Posted by Masatake Iwasaki <iw...@nttdata.co.jp>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12048/
-----------------------------------------------------------

(Updated June 24, 2013, 11:29 p.m.)


Review request for Sqoop.


Changes
-------

Manager logic is moved to DirectPostgresqlManager.
Tests added.
Trivial fixes based on checkstyle.


Description
-------

Export with PostgreSQL Copy API can be realized in a straitforward way. Attached file is not complete patch but sample implementation without tests and consideration for corner cases.
Though this code resolves dependency for PostgreSQL JDBC by ivy, is it acceptable for the project?


This addresses bug SQOOP-999.
    https://issues.apache.org/jira/browse/SQOOP-999


Diffs (updated)
-----

  ivy.xml 750adfc 
  ivy/libraries.properties 430d554 
  src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java c085218 
  src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyExportJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/PostgreSQLCopyMapper.java PRE-CREATION 
  src/test/com/cloudera/sqoop/manager/DirectPostgreSQLExportManualTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/12048/diff/


Testing
-------


Thanks,

Masatake Iwasaki