You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by Abraham Elmahrek <ab...@cloudera.com> on 2014/03/03 14:57:33 UTC

Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/
-----------------------------------------------------------

(Updated March 3, 2014, 1:57 p.m.)


Review request for Sqoop.


Bugs: SQOOP-777
    https://issues.apache.org/jira/browse/SQOOP-777


Repository: sqoop-sqoop2


Description
-------

Previous review from hari is in first iteration of this review only.

commit 1c5122611fbd4a46a629421f7c55746cfc14f136
Author: Hari Shreedharan <hs...@apache.org>
Date:   Fri Jan 10 15:07:08 2014 -0800

    SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
    
    Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
    
    I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
    
    I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.

:100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
:100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
:100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
:100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
:100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
:100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
:100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
:000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
:100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
:100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
:100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
:000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
:000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
:000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
:100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
:100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
:100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
:100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
:100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
:100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
:100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
:100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
:000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
:100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
:100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
:100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
:100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
:100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
:100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
:100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
:100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
:100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
:000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
:100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
:100644 100644 1e2f005... a722c74... M  pom.xml
:100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
:100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
:100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java


Diffs (updated)
-----

  common/pom.xml db11b5ba70093704bc3f7a0a98cf7d44172e2b7d 
  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7084cf9ef1b93c8146110c751c7de34376 
  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364ef81bc747437f5b78520bacec8ee2613a3 
  common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b28295700b204da9f5a948eaef106ca559b 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f77230fb836ef6ab36e3bd867d6cef553a 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdcbc41280a0dec338c8c2ab27b7cf76385b 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba27b25538643f9c5777ed8abebdf0eb1e9 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java PRE-CREATION 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java aa1c4ff7ff138de5efc5bc1bca5b0d869d214a1c 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6bab95c76c4928414e7850cc3e52a3ce4595 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa3631400c3382a72e6286b1efb3fca0c9cb9 
  connector/connector-sdk/pom.xml 4056e14978cb5e679233e9fdde11d53f1ca2756b 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
  core/src/main/java/org/apache/sqoop/framework/JobManager.java e0525846029bb394b2614a33595ebe6fcdd4aaf2 
  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d003980a172e4e0acf18630a3496909c17cb5c 
  execution/mapreduce/pom.xml f9a2a0e57b499cc785bdb6548a2d6113e7b867c0 
  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a02739a5f7598dc83ab82631b80410ab39213 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a017140fb91292a88e5a944ae549915c86e4 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec61e07beb6b1e75bc4da8c698c738ac4e4c 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c5111e9e037dab4dacb128a53918f61599495 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 462194238755e603dd60226a97cf357cf63e0f20 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8ab6f3cad13b644e1f033278d9012938956 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e1b7ffd3c4ce584e4d10c4e1f9af03c369 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6efa5f3145d0480e375d82bd5c6a9760a799 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9f12b11e3af5cf6ee5c81bbc377a04a4fd 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c518b65130ebe272faf721b77099a1e85a63 
  execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53abb1a3d71afbb2bc3aaf65b0cfc669a53 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b8be2aacb7463ab4c41cfce63cbb71cf57 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd47c80c776a508b07f0ee5152c1d3fd20e 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aaef10c3b995afbe6326353256affbceee67 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c68ae761fbaf91bc51ed312de70b9ba3fd3 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab75f1f8e6508fca905aafe7c67f6b5d8d9a 
  pom.xml 1e2f0057ca2e9cd635ccbce75e5d6e39eeabb163 
  spi/pom.xml 0b240e8a1146cc7a06a2c50c181fb73b11e6dd8c 
  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56364343f1cffdf91d9b17144666acbdd2e 
  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 6fc485b66ea8c33d09d172675a104954d570f3a7 

Diff: https://reviews.apache.org/r/16812/diff/


Testing
-------

mvn test


Thanks,

Abraham Elmahrek


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Gwen Shapira <gs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/#review47658
-----------------------------------------------------------

Ship it!


I think its ready as is.

Just check the newline+reducer case and open a jira if it breaks.

- Gwen Shapira


On June 5, 2014, 11:58 p.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16812/
> -----------------------------------------------------------
> 
> (Updated June 5, 2014, 11:58 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Previous review from hari is in first iteration of this review only.
> 
> commit 1c5122611fbd4a46a629421f7c55746cfc14f136
> Author: Hari Shreedharan <hs...@apache.org>
> Date:   Fri Jan 10 15:07:08 2014 -0800
> 
>     SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
>     
>     Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
>     
>     I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
>     
>     I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.
> 
> :100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
> :100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
> :100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
> :100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
> :100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
> :100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
> :100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
> :000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
> :100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
> :100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
> :100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
> :000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
> :000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
> :000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
> :100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
> :100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
> :100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
> :100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
> :100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
> :100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
> :100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
> :100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
> :000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
> :100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
> :100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
> :100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
> :100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
> :100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
> :100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
> :100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
> :100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
> :100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
> :000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
> :100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
> :100644 100644 1e2f005... a722c74... M  pom.xml
> :100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
> :100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
> :100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
> 
> 
> Diffs
> -----
> 
>   .gitignore 62bda436320440172ed737f6d7d257b8088c8441 
>   common/pom.xml db11b5ba70093704bc3f7a0a98cf7d44172e2b7d 
>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7084cf9ef1b93c8146110c751c7de34376 
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364ef81bc747437f5b78520bacec8ee2613a3 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b28295700b204da9f5a948eaef106ca559b 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f77230fb836ef6ab36e3bd867d6cef553a 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdcbc41280a0dec338c8c2ab27b7cf76385b 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba27b25538643f9c5777ed8abebdf0eb1e9 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java PRE-CREATION 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565aa1ece0a0a9f1da291b72f87887db79fd 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6bab95c76c4928414e7850cc3e52a3ce4595 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa3631400c3382a72e6286b1efb3fca0c9cb9 
>   connector/connector-sdk/pom.xml 4056e14978cb5e679233e9fdde11d53f1ca2756b 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java e0525846029bb394b2614a33595ebe6fcdd4aaf2 
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d003980a172e4e0acf18630a3496909c17cb5c 
>   execution/mapreduce/pom.xml f9a2a0e57b499cc785bdb6548a2d6113e7b867c0 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a02739a5f7598dc83ab82631b80410ab39213 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a017140fb91292a88e5a944ae549915c86e4 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec61e07beb6b1e75bc4da8c698c738ac4e4c 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c5111e9e037dab4dacb128a53918f61599495 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 462194238755e603dd60226a97cf357cf63e0f20 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8ab6f3cad13b644e1f033278d9012938956 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e1b7ffd3c4ce584e4d10c4e1f9af03c369 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6efa5f3145d0480e375d82bd5c6a9760a799 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9f12b11e3af5cf6ee5c81bbc377a04a4fd 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c518b65130ebe272faf721b77099a1e85a63 
>   execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53abb1a3d71afbb2bc3aaf65b0cfc669a53 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b8be2aacb7463ab4c41cfce63cbb71cf57 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd47c80c776a508b07f0ee5152c1d3fd20e 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aaef10c3b995afbe6326353256affbceee67 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c68ae761fbaf91bc51ed312de70b9ba3fd3 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab75f1f8e6508fca905aafe7c67f6b5d8d9a 
>   pom.xml 1e2f0057ca2e9cd635ccbce75e5d6e39eeabb163 
>   spi/pom.xml 0b240e8a1146cc7a06a2c50c181fb73b11e6dd8c 
>   spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56364343f1cffdf91d9b17144666acbdd2e 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 6fc485b66ea8c33d09d172675a104954d570f3a7 
> 
> Diff: https://reviews.apache.org/r/16812/diff/
> 
> 
> Testing
> -------
> 
> mvn test
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/
-----------------------------------------------------------

(Updated July 22, 2014, 7:50 a.m.)


Review request for Sqoop.


Bugs: SQOOP-777
    https://issues.apache.org/jira/browse/SQOOP-777


Repository: sqoop-sqoop2


Description
-------

Previous review from hari is in first iteration of this review only.

commit 1c5122611fbd4a46a629421f7c55746cfc14f136
Author: Hari Shreedharan <hs...@apache.org>
Date:   Fri Jan 10 15:07:08 2014 -0800

    SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
    
    Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
    
    I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
    
    I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.

:100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
:100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
:100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
:100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
:100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
:100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
:100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
:000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
:100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
:100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
:100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
:000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
:000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
:000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
:100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
:100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
:100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
:100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
:100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
:100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
:100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
:100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
:000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
:100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
:100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
:100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
:100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
:100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
:100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
:100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
:100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
:100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
:000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
:100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
:100644 100644 1e2f005... a722c74... M  pom.xml
:100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
:100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
:100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java


Diffs (updated)
-----

  common/pom.xml db11b5b 
  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7 
  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364e 
  common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b2 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnectorError.java 2b1a0ad 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdc 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6ba 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa36 
  connector/connector-sdk/pom.xml 4056e14 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
  core/src/main/java/org/apache/sqoop/framework/JobManager.java e052584 
  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java a138db5 
  execution/mapreduce/pom.xml f9a2a0e 
  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a027 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a01 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec6 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c511 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 4621942 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8a 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6ef 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c51 
  execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aae 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c6 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab7 
  pom.xml 1e2f005 
  spi/pom.xml 0b240e8 
  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56 
  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java bfc28ef 

Diff: https://reviews.apache.org/r/16812/diff/


Testing
-------

mvn test


Thanks,

Abraham Elmahrek


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Jarek Cecho <ja...@apache.org>.

> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java, line 27
> > <https://reviews.apache.org/r/16812/diff/5/?file=604300#file604300line27>
> >
> >     Didn't we wanted to use the Configurable interface to skip the string encoding here?
> 
> Abraham Elmahrek wrote:
>     Yessir, but we've decided to do this in a follow up Jira when Hadoop 2.5 is released.

Make sense, could you then create the follow up JIRA?


> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java, lines 72-76
> > <https://reviews.apache.org/r/16812/diff/5/?file=604291#file604291line72>
> >
> >     This do seem incorrect to me - we are expecting 6 columns, but 0 columns are given - this should lead to an exception right?
> 
> Abraham Elmahrek wrote:
>     I suppose the exception would be something along the lines of "incorrect number of columns"?
>     
>     Adding null handling as well.

Agreed.


> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java, line 132
> > <https://reviews.apache.org/r/16812/diff/5/?file=604288#file604288line132>
> >
> >     Wouldn't be efficient to find indexes and use substring method rather then build the value character by character?
> 
> Abraham Elmahrek wrote:
>     I'm not sure which is better in this case actually. Using "indexOf" and "substring" might be generally faster or it might not be. Using "indexOf" and "substring" would require some back tracking to catch all cases. Either way, could we get this in and optimize this in a follow up Jira?
>

Works for me.


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/#review48182
-----------------------------------------------------------


On July 22, 2014, 7:50 a.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16812/
> -----------------------------------------------------------
> 
> (Updated July 22, 2014, 7:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Previous review from hari is in first iteration of this review only.
> 
> commit 1c5122611fbd4a46a629421f7c55746cfc14f136
> Author: Hari Shreedharan <hs...@apache.org>
> Date:   Fri Jan 10 15:07:08 2014 -0800
> 
>     SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
>     
>     Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
>     
>     I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
>     
>     I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.
> 
> :100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
> :100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
> :100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
> :100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
> :100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
> :100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
> :100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
> :000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
> :100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
> :100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
> :100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
> :000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
> :000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
> :000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
> :100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
> :100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
> :100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
> :100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
> :100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
> :100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
> :100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
> :100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
> :000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
> :100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
> :100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
> :100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
> :100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
> :100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
> :100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
> :100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
> :100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
> :100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
> :000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
> :100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
> :100644 100644 1e2f005... a722c74... M  pom.xml
> :100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
> :100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
> :100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
> 
> 
> Diffs
> -----
> 
>   common/pom.xml db11b5b 
>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7 
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364e 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b2 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnectorError.java 2b1a0ad 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdc 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6ba 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa36 
>   connector/connector-sdk/pom.xml 4056e14 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java e052584 
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java a138db5 
>   execution/mapreduce/pom.xml f9a2a0e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a027 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a01 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec6 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c511 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 4621942 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8a 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6ef 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c51 
>   execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aae 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c6 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab7 
>   pom.xml 1e2f005 
>   spi/pom.xml 0b240e8 
>   spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java bfc28ef 
> 
> Diff: https://reviews.apache.org/r/16812/diff/
> 
> 
> Testing
> -------
> 
> mvn test
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java, line 132
> > <https://reviews.apache.org/r/16812/diff/5/?file=604288#file604288line132>
> >
> >     Wouldn't be efficient to find indexes and use substring method rather then build the value character by character?

I'm not sure which is better in this case actually. Using "indexOf" and "substring" might be generally faster or it might not be. Using "indexOf" and "substring" would require some back tracking to catch all cases. Either way, could we get this in and optimize this in a follow up Jira?


> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java, line 27
> > <https://reviews.apache.org/r/16812/diff/5/?file=604300#file604300line27>
> >
> >     Didn't we wanted to use the Configurable interface to skip the string encoding here?

Yessir, but we've decided to do this in a follow up Jira when Hadoop 2.5 is released.


> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java, lines 72-76
> > <https://reviews.apache.org/r/16812/diff/5/?file=604291#file604291line72>
> >
> >     This do seem incorrect to me - we are expecting 6 columns, but 0 columns are given - this should lead to an exception right?

I suppose the exception would be something along the lines of "incorrect number of columns"?

Adding null handling as well.


> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java, line 75
> > <https://reviews.apache.org/r/16812/diff/5/?file=604281#file604281line75>
> >
> >     I don't think that this method will work for "INSERT" statement.

Skipping this case for now. Will revisit in a follow up Jira.


> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java, line 19
> > <https://reviews.apache.org/r/16812/diff/5/?file=604283#file604283line19>
> >
> >     I think that connector sdk would be better home for this one?
> >     
> >     Also this seems quite unrelated change to this patch, can we please upload it separately? This patch is large enough already.

Indeed. Skipping for now.


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/#review48182
-----------------------------------------------------------


On June 5, 2014, 11:58 p.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16812/
> -----------------------------------------------------------
> 
> (Updated June 5, 2014, 11:58 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Previous review from hari is in first iteration of this review only.
> 
> commit 1c5122611fbd4a46a629421f7c55746cfc14f136
> Author: Hari Shreedharan <hs...@apache.org>
> Date:   Fri Jan 10 15:07:08 2014 -0800
> 
>     SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
>     
>     Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
>     
>     I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
>     
>     I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.
> 
> :100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
> :100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
> :100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
> :100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
> :100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
> :100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
> :100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
> :000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
> :100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
> :100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
> :100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
> :000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
> :000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
> :000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
> :100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
> :100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
> :100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
> :100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
> :100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
> :100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
> :100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
> :100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
> :000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
> :100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
> :100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
> :100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
> :100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
> :100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
> :100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
> :100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
> :100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
> :100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
> :000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
> :100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
> :100644 100644 1e2f005... a722c74... M  pom.xml
> :100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
> :100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
> :100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
> 
> 
> Diffs
> -----
> 
>   .gitignore 62bda436320440172ed737f6d7d257b8088c8441 
>   common/pom.xml db11b5ba70093704bc3f7a0a98cf7d44172e2b7d 
>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7084cf9ef1b93c8146110c751c7de34376 
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364ef81bc747437f5b78520bacec8ee2613a3 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b28295700b204da9f5a948eaef106ca559b 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f77230fb836ef6ab36e3bd867d6cef553a 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdcbc41280a0dec338c8c2ab27b7cf76385b 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba27b25538643f9c5777ed8abebdf0eb1e9 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java PRE-CREATION 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565aa1ece0a0a9f1da291b72f87887db79fd 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6bab95c76c4928414e7850cc3e52a3ce4595 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa3631400c3382a72e6286b1efb3fca0c9cb9 
>   connector/connector-sdk/pom.xml 4056e14978cb5e679233e9fdde11d53f1ca2756b 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java e0525846029bb394b2614a33595ebe6fcdd4aaf2 
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d003980a172e4e0acf18630a3496909c17cb5c 
>   execution/mapreduce/pom.xml f9a2a0e57b499cc785bdb6548a2d6113e7b867c0 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a02739a5f7598dc83ab82631b80410ab39213 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a017140fb91292a88e5a944ae549915c86e4 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec61e07beb6b1e75bc4da8c698c738ac4e4c 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c5111e9e037dab4dacb128a53918f61599495 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 462194238755e603dd60226a97cf357cf63e0f20 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8ab6f3cad13b644e1f033278d9012938956 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e1b7ffd3c4ce584e4d10c4e1f9af03c369 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6efa5f3145d0480e375d82bd5c6a9760a799 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9f12b11e3af5cf6ee5c81bbc377a04a4fd 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c518b65130ebe272faf721b77099a1e85a63 
>   execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53abb1a3d71afbb2bc3aaf65b0cfc669a53 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b8be2aacb7463ab4c41cfce63cbb71cf57 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd47c80c776a508b07f0ee5152c1d3fd20e 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aaef10c3b995afbe6326353256affbceee67 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c68ae761fbaf91bc51ed312de70b9ba3fd3 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab75f1f8e6508fca905aafe7c67f6b5d8d9a 
>   pom.xml 1e2f0057ca2e9cd635ccbce75e5d6e39eeabb163 
>   spi/pom.xml 0b240e8a1146cc7a06a2c50c181fb73b11e6dd8c 
>   spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56364343f1cffdf91d9b17144666acbdd2e 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 6fc485b66ea8c33d09d172675a104954d570f3a7 
> 
> Diff: https://reviews.apache.org/r/16812/diff/
> 
> 
> Testing
> -------
> 
> mvn test
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On July 19, 2014, 10:03 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java, line 27
> > <https://reviews.apache.org/r/16812/diff/5/?file=604300#file604300line27>
> >
> >     Didn't we wanted to use the Configurable interface to skip the string encoding here?
> 
> Abraham Elmahrek wrote:
>     Yessir, but we've decided to do this in a follow up Jira when Hadoop 2.5 is released.
> 
> Jarek Cecho wrote:
>     Make sense, could you then create the follow up JIRA?

https://issues.apache.org/jira/browse/SQOOP-1349


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/#review48182
-----------------------------------------------------------


On July 22, 2014, 7:50 a.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16812/
> -----------------------------------------------------------
> 
> (Updated July 22, 2014, 7:50 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Previous review from hari is in first iteration of this review only.
> 
> commit 1c5122611fbd4a46a629421f7c55746cfc14f136
> Author: Hari Shreedharan <hs...@apache.org>
> Date:   Fri Jan 10 15:07:08 2014 -0800
> 
>     SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
>     
>     Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
>     
>     I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
>     
>     I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.
> 
> :100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
> :100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
> :100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
> :100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
> :100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
> :100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
> :100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
> :000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
> :100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
> :100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
> :100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
> :000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
> :000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
> :000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
> :100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
> :100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
> :100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
> :100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
> :100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
> :100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
> :100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
> :100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
> :000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
> :100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
> :100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
> :100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
> :100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
> :100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
> :100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
> :100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
> :100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
> :100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
> :000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
> :100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
> :100644 100644 1e2f005... a722c74... M  pom.xml
> :100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
> :100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
> :100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
> 
> 
> Diffs
> -----
> 
>   common/pom.xml db11b5b 
>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7 
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364e 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b2 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnectorError.java 2b1a0ad 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdc 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6ba 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa36 
>   connector/connector-sdk/pom.xml 4056e14 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java e052584 
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java a138db5 
>   execution/mapreduce/pom.xml f9a2a0e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a027 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a01 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec6 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c511 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 4621942 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8a 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6ef 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c51 
>   execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aae 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c6 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab7 
>   pom.xml 1e2f005 
>   spi/pom.xml 0b240e8 
>   spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java bfc28ef 
> 
> Diff: https://reviews.apache.org/r/16812/diff/
> 
> 
> Testing
> -------
> 
> mvn test
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/#review48182
-----------------------------------------------------------


Seems to be almost ready!


.gitignore
<https://reviews.apache.org/r/16812/#comment84523>

    This change seems unrelated to this patch and is breaking on current head of sqoop2 branch.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
<https://reviews.apache.org/r/16812/#comment84524>

    Let's put this sensible default to SqoopConnector class?



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
<https://reviews.apache.org/r/16812/#comment84525>

    I don't think that this method will work for "INSERT" statement.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
<https://reviews.apache.org/r/16812/#comment84526>

    I think that connector sdk would be better home for this one?
    
    Also this seems quite unrelated change to this patch, can we please upload it separately? This patch is large enough already.



connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java
<https://reviews.apache.org/r/16812/#comment84528>

    Wouldn't be efficient to find indexes and use substring method rather then build the value character by character?



connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java
<https://reviews.apache.org/r/16812/#comment84529>

    Please don't use "*" import, always iterate over all required imports.



connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java
<https://reviews.apache.org/r/16812/#comment84530>

    Nit: Can we put each "addColumn()" call into separate line? This is quite hard to read.



connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java
<https://reviews.apache.org/r/16812/#comment84531>

    This do seem incorrect to me - we are expecting 6 columns, but 0 columns are given - this should lead to an exception right?



connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java
<https://reviews.apache.org/r/16812/#comment84532>

    The comment seems unrelated here?



connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java
<https://reviews.apache.org/r/16812/#comment84533>

    The comment seems unrelated here?



execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
<https://reviews.apache.org/r/16812/#comment84534>

    Didn't we wanted to use the Configurable interface to skip the string encoding here?


Jarcec

- Jarek Cecho


On June 5, 2014, 11:58 p.m., Abraham Elmahrek wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16812/
> -----------------------------------------------------------
> 
> (Updated June 5, 2014, 11:58 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-777
>     https://issues.apache.org/jira/browse/SQOOP-777
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Previous review from hari is in first iteration of this review only.
> 
> commit 1c5122611fbd4a46a629421f7c55746cfc14f136
> Author: Hari Shreedharan <hs...@apache.org>
> Date:   Fri Jan 10 15:07:08 2014 -0800
> 
>     SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
>     
>     Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
>     
>     I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
>     
>     I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.
> 
> :100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
> :100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
> :100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
> :100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
> :100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
> :100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
> :100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
> :000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
> :100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
> :100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
> :100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
> :000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
> :000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
> :000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
> :100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
> :100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
> :100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
> :100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
> :100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
> :100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
> :100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
> :100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
> :000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
> :100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
> :100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
> :100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
> :100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
> :100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
> :100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
> :100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
> :100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
> :100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
> :000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
> :100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
> :100644 100644 1e2f005... a722c74... M  pom.xml
> :100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
> :100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
> :100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
> 
> 
> Diffs
> -----
> 
>   .gitignore 62bda436320440172ed737f6d7d257b8088c8441 
>   common/pom.xml db11b5ba70093704bc3f7a0a98cf7d44172e2b7d 
>   common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7084cf9ef1b93c8146110c751c7de34376 
>   common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364ef81bc747437f5b78520bacec8ee2613a3 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b28295700b204da9f5a948eaef106ca559b 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f77230fb836ef6ab36e3bd867d6cef553a 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdcbc41280a0dec338c8c2ab27b7cf76385b 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba27b25538643f9c5777ed8abebdf0eb1e9 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java PRE-CREATION 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565aa1ece0a0a9f1da291b72f87887db79fd 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6bab95c76c4928414e7850cc3e52a3ce4595 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa3631400c3382a72e6286b1efb3fca0c9cb9 
>   connector/connector-sdk/pom.xml 4056e14978cb5e679233e9fdde11d53f1ca2756b 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
>   core/src/main/java/org/apache/sqoop/framework/JobManager.java e0525846029bb394b2614a33595ebe6fcdd4aaf2 
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d003980a172e4e0acf18630a3496909c17cb5c 
>   execution/mapreduce/pom.xml f9a2a0e57b499cc785bdb6548a2d6113e7b867c0 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a02739a5f7598dc83ab82631b80410ab39213 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a017140fb91292a88e5a944ae549915c86e4 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec61e07beb6b1e75bc4da8c698c738ac4e4c 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c5111e9e037dab4dacb128a53918f61599495 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 462194238755e603dd60226a97cf357cf63e0f20 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8ab6f3cad13b644e1f033278d9012938956 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e1b7ffd3c4ce584e4d10c4e1f9af03c369 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6efa5f3145d0480e375d82bd5c6a9760a799 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9f12b11e3af5cf6ee5c81bbc377a04a4fd 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c518b65130ebe272faf721b77099a1e85a63 
>   execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53abb1a3d71afbb2bc3aaf65b0cfc669a53 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b8be2aacb7463ab4c41cfce63cbb71cf57 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd47c80c776a508b07f0ee5152c1d3fd20e 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aaef10c3b995afbe6326353256affbceee67 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c68ae761fbaf91bc51ed312de70b9ba3fd3 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab75f1f8e6508fca905aafe7c67f6b5d8d9a 
>   pom.xml 1e2f0057ca2e9cd635ccbce75e5d6e39eeabb163 
>   spi/pom.xml 0b240e8a1146cc7a06a2c50c181fb73b11e6dd8c 
>   spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56364343f1cffdf91d9b17144666acbdd2e 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 6fc485b66ea8c33d09d172675a104954d570f3a7 
> 
> Diff: https://reviews.apache.org/r/16812/diff/
> 
> 
> Testing
> -------
> 
> mvn test
> 
> 
> Thanks,
> 
> Abraham Elmahrek
> 
>


Re: Review Request 16812: SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16812/
-----------------------------------------------------------

(Updated June 5, 2014, 11:58 p.m.)


Review request for Sqoop.


Bugs: SQOOP-777
    https://issues.apache.org/jira/browse/SQOOP-777


Repository: sqoop-sqoop2


Description
-------

Previous review from hari is in first iteration of this review only.

commit 1c5122611fbd4a46a629421f7c55746cfc14f136
Author: Hari Shreedharan <hs...@apache.org>
Date:   Fri Jan 10 15:07:08 2014 -0800

    SQOOP-777. Sqoop2: Pluggable Intermediate Data Format
    
    Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. Connectors can choose to implement and support a format that is more efficient for them. Also separated the SqoopWritable so th
    
    I ran a full build - all tests including integration tests pass. I have not added any new tests, yet. I will add unit tests for the new classes. Also, I have not tried running this on an actual cluster - so things may be broken. I'd like some initial feedback based on the c
    
    I also implemented escaping of characters. There is some work remaining to support binary format, but it is mostly integration, the basic implementation is in place.

:100644 100644 db11b5b... 9bfa07d... M  common/pom.xml
:100644 100644 3e1adc7... f971240... M  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
:100644 100644 d81364e... e547875... M  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
:100644 100644 8b630b2... 30c26a3... M  common/src/main/java/org/apache/sqoop/schema/type/Column.java
:100644 100644 e0da80f... 9c70db9... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
:100644 100644 ef39cdc... 075890f... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
:100644 100644 96818ba... 795ffdb... M  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
:000000 100644 0000000... 6c33423... A  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
:100644 100644 aa1c4ff... 7220018... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
:100644 100644 a7ed6ba... 28399f2... M  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
:100644 100644 4056e14... f54837d... M  connector/connector-sdk/pom.xml
:000000 100644 0000000... d4874f2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
:000000 100644 0000000... 63e14d2... A  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
:000000 100644 0000000... 6e5479f... A  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
:100644 100644 e052584... 3b2ef94... M  core/src/main/java/org/apache/sqoop/framework/JobManager.java
:100644 100644 53d0039... 9f5c47d... M  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
:100644 100644 f9a2a0e... 9754afd... M  execution/mapreduce/pom.xml
:100644 100644 5c0a027... 7680e33... M  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
:100644 100644 7fd9a01... 3604898... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
:100644 100644 1978ec6... 099cdd3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java
:100644 100644 a07c511... ee6bf39... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java
:100644 100644 4621942... 7b799ca... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java
:000000 100644 0000000... 71ae980... A  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java
:100644 100644 356ae8a... b495cc9... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java
:100644 100644 92de37e... 8164ffe... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java
:100644 100644 90de6ef... b3503da... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java
:100644 100644 7dedee9... 16e59d8... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java
:100644 100644 98a2c51... 5aaceb3... M  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java
:100644 100644 e21f15b... 4e9e6ea... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java
:100644 100644 b7079dd... ddda423... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
:100644 100644 f849aae... 2968411... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java
:100644 100644 7b264c6... 954990f... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java
:000000 100644 0000000... aea7de3... A  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java
:100644 100644 bee8ab7... 663dfb5... M  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java
:100644 100644 1e2f005... a722c74... M  pom.xml
:100644 100644 0b240e8... 43f17d4... M  spi/pom.xml
:100644 100644 2becc56... 298e8f5... M  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java
:100644 100644 6fc485b... 8d1a4e8... M  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java


Diffs (updated)
-----

  .gitignore 62bda436320440172ed737f6d7d257b8088c8441 
  common/pom.xml db11b5ba70093704bc3f7a0a98cf7d44172e2b7d 
  common/src/main/java/org/apache/sqoop/etl/io/DataReader.java 3e1adc7084cf9ef1b93c8146110c751c7de34376 
  common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java d81364ef81bc747437f5b78520bacec8ee2613a3 
  common/src/main/java/org/apache/sqoop/schema/type/Column.java 8b630b28295700b204da9f5a948eaef106ca559b 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java e0da80f77230fb836ef6ab36e3bd867d6cef553a 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java ef39cdcbc41280a0dec338c8c2ab27b7cf76385b 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java 96818ba27b25538643f9c5777ed8abebdf0eb1e9 
  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java PRE-CREATION 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java d4c4565aa1ece0a0a9f1da291b72f87887db79fd 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java a7ed6bab95c76c4928414e7850cc3e52a3ce4595 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportInitializer.java a33fa3631400c3382a72e6286b1efb3fca0c9cb9 
  connector/connector-sdk/pom.xml 4056e14978cb5e679233e9fdde11d53f1ca2756b 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java PRE-CREATION 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java PRE-CREATION 
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java PRE-CREATION 
  connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormatTest.java PRE-CREATION 
  core/src/main/java/org/apache/sqoop/framework/JobManager.java e0525846029bb394b2614a33595ebe6fcdd4aaf2 
  core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d003980a172e4e0acf18630a3496909c17cb5c 
  execution/mapreduce/pom.xml f9a2a0e57b499cc785bdb6548a2d6113e7b867c0 
  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a02739a5f7598dc83ab82631b80410ab39213 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a017140fb91292a88e5a944ae549915c86e4 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportExtractor.java 1978ec61e07beb6b1e75bc4da8c698c738ac4e4c 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceImportLoader.java a07c5111e9e037dab4dacb128a53918f61599495 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextImportLoader.java 462194238755e603dd60226a97cf357cf63e0f20 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/io/SqoopWritable.java PRE-CREATION 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopFileOutputFormat.java 356ae8ab6f3cad13b644e1f033278d9012938956 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 92de37e1b7ffd3c4ce584e4d10c4e1f9af03c369 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopNullOutputFormat.java 90de6efa5f3145d0480e375d82bd5c6a9760a799 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopOutputFormatLoadExecutor.java 7dedee9f12b11e3af5cf6ee5c81bbc377a04a4fd 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java 98a2c518b65130ebe272faf721b77099a1e85a63 
  execution/mapreduce/src/test/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngineTest.java 39d1b53abb1a3d71afbb2bc3aaf65b0cfc669a53 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/JobUtils.java e21f15b8be2aacb7463ab4c41cfce63cbb71cf57 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd47c80c776a508b07f0ee5152c1d3fd20e 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsLoad.java f849aaef10c3b995afbe6326353256affbceee67 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestMapReduce.java 7b264c68ae761fbaf91bc51ed312de70b9ba3fd3 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/io/SqoopWritableTest.java PRE-CREATION 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/mr/TestSqoopOutputFormatLoadExecutor.java bee8ab75f1f8e6508fca905aafe7c67f6b5d8d9a 
  pom.xml 1e2f0057ca2e9cd635ccbce75e5d6e39eeabb163 
  spi/pom.xml 0b240e8a1146cc7a06a2c50c181fb73b11e6dd8c 
  spi/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 2becc56364343f1cffdf91d9b17144666acbdd2e 
  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 6fc485b66ea8c33d09d172675a104954d570f3a7 

Diff: https://reviews.apache.org/r/16812/diff/


Testing
-------

mvn test


Thanks,

Abraham Elmahrek