You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by rj...@gmail.com on 2013/06/01 14:17:28 UTC

Review Request: Sqoop2: Export sub directories option

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/
-----------------------------------------------------------

Review request for Sqoop.


Description
-------


Introducing new option for exporting the sub directories of the given input directory.

Added new input 
Input configuration

Input directory: /input
Recursive: 
  0 : NO
  1 : YES
Choose: 0


This addresses bug SQOOP-1063.
    https://issues.apache.org/jira/browse/SQOOP-1063


Diffs
-----

  core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
  core/src/main/resources/framework-resources.properties cebc90e 
  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 767080c 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java e2b3ce8 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b3e6050 

Diff: https://reviews.apache.org/r/11588/diff/


Testing
-------

Done


Thanks,

vasanthkumar


Re: Review Request 11588: Sqoop2: Export sub directories option

Posted by Jarek Cecho <ja...@apache.org>.

> On June 9, 2013, 3:39 p.m., Jarek Cecho wrote:
> > core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java, line 31
> > <https://reviews.apache.org/r/11588/diff/3/?file=300448#file300448line31>
> >
> >     It seems to me that using Enum for simple boolean input is unnecessary overhead. We are adding support for Boolen model type in SQOOP-1076, but the patch is not yet committed. Do you think that it would make sense to wait and use that instead of the Enum?

Hi Vasanth,
SQOOP-1076 is already committed, so feel free to take advantage of it!

Jarcec


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/#review21503
-----------------------------------------------------------


On June 4, 2013, 11:58 a.m., vasanthkumar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11588/
> -----------------------------------------------------------
> 
> (Updated June 4, 2013, 11:58 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1063
>     https://issues.apache.org/jira/browse/SQOOP-1063
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> 
> Introducing new option for exporting the sub directories of the given input directory.
> 
> Added new input 
> Input configuration
> 
> Input directory: /input
> Recursive: 
>   0 : NO
>   1 : YES
> Choose: 0
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
>   core/src/main/resources/framework-resources.properties cebc90e 
>   docs/src/site/sphinx/ClientAPI.rst 4f3fda6 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b3e6050 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 0e8c9f7 
> 
> Diff: https://reviews.apache.org/r/11588/diff/
> 
> 
> Testing
> -------
> 
> Done
> 
> 
> Thanks,
> 
> vasanthkumar
> 
>


Re: Review Request: Sqoop2: Export sub directories option

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/#review21503
-----------------------------------------------------------


Hi Vasanth,
thank you very much for working on this! I do have couple of additional notes:

I do have a small concerns about the test case TestHdfsExtract that seems to be written in a way that the context objects are not passed into the job conf. I think that this is artifact from early days that we should fix in a nice manner as is very likely that we will have to add more inputs in the future.


core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java
<https://reviews.apache.org/r/11588/#comment44503>

    It seems to me that using Enum for simple boolean input is unnecessary overhead. We are adding support for Boolen model type in SQOOP-1076, but the patch is not yet committed. Do you think that it would make sense to wait and use that instead of the Enum?



execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
<https://reviews.apache.org/r/11588/#comment44504>

    It seems that boolean is more appropriate data type for this variable.



execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
<https://reviews.apache.org/r/11588/#comment44694>

    It seems that the code here will firstly recursively traverse through the file system to find all directories, store them in memory and then iterate over the stored directories to get list of files. Wouldn't be faster and less memory intensive to do both passes at the same time? E.g. traverse though the filesystem and build directly the list of files?



execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
<https://reviews.apache.org/r/11588/#comment44695>

    Similarly as above, would be simple and more effective to calculate the size of all files directly rather than building list of all directories in memory first?


Jarcec

- Jarek Cecho


On June 4, 2013, 11:58 a.m., vasanthkumar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11588/
> -----------------------------------------------------------
> 
> (Updated June 4, 2013, 11:58 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> 
> Introducing new option for exporting the sub directories of the given input directory.
> 
> Added new input 
> Input configuration
> 
> Input directory: /input
> Recursive: 
>   0 : NO
>   1 : YES
> Choose: 0
> 
> 
> This addresses bug SQOOP-1063.
>     https://issues.apache.org/jira/browse/SQOOP-1063
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
>   core/src/main/resources/framework-resources.properties cebc90e 
>   docs/src/site/sphinx/ClientAPI.rst 4f3fda6 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b3e6050 
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 0e8c9f7 
> 
> Diff: https://reviews.apache.org/r/11588/diff/
> 
> 
> Testing
> -------
> 
> Done
> 
> 
> Thanks,
> 
> vasanthkumar
> 
>


Re: Review Request 11588: Sqoop2: Export sub directories option

Posted by Vasanth kumar RJ <rj...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/
-----------------------------------------------------------

(Updated Aug. 16, 2014, 2:43 a.m.)


Review request for Sqoop.


Changes
-------

updated patch.


Bugs: SQOOP-1063
    https://issues.apache.org/jira/browse/SQOOP-1063


Repository: sqoop-sqoop2


Description
-------

Introducing new option for exporting the sub directories of the given input directory.

Added new input 
Input configuration

Input directory: /input
Recursive: true


Diffs (updated)
-----

  core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
  core/src/main/resources/framework-resources.properties 7ecb9ae 
  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 84f6213 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java b2fa15d 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java 8061c78 

Diff: https://reviews.apache.org/r/11588/diff/


Testing
-------

Test case written and executed.


Thanks,

Vasanth kumar RJ


Re: Review Request 11588: Sqoop2: Export sub directories option

Posted by Vasanth kumar RJ <rj...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/
-----------------------------------------------------------

(Updated Nov. 21, 2013, 10:53 p.m.)


Review request for Sqoop.


Changes
-------

Implemented review comments.


Bugs: SQOOP-1063
    https://issues.apache.org/jira/browse/SQOOP-1063


Repository: sqoop-sqoop2


Description (updated)
-------

Introducing new option for exporting the sub directories of the given input directory.

Added new input 
Input configuration

Input directory: /input
Recursive: true


Diffs (updated)
-----

  core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
  core/src/main/resources/framework-resources.properties 7ecb9ae 
  execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 5c0a027 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java 7fd9a01 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b7079dd 

Diff: https://reviews.apache.org/r/11588/diff/


Testing (updated)
-------

Test case written and executed.


Thanks,

Vasanth kumar RJ


Re: Review Request: Sqoop2: Export sub directories option

Posted by rj...@gmail.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/
-----------------------------------------------------------

(Updated June 4, 2013, 11:58 a.m.)


Review request for Sqoop.


Changes
-------

Implemented Jarcec suggestions. 

Framework job configuration (ExportJobConfiguration) value are not setting and patch fixes a bug in MapreduceSubmissionEngine.java.

Kindly review.


Description
-------


Introducing new option for exporting the sub directories of the given input directory.

Added new input 
Input configuration

Input directory: /input
Recursive: 
  0 : NO
  1 : YES
Choose: 0


This addresses bug SQOOP-1063.
    https://issues.apache.org/jira/browse/SQOOP-1063


Diffs (updated)
-----

  core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
  core/src/main/resources/framework-resources.properties cebc90e 
  docs/src/site/sphinx/ClientAPI.rst 4f3fda6 
  execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
  execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b3e6050 
  submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 0e8c9f7 

Diff: https://reviews.apache.org/r/11588/diff/


Testing
-------

Done


Thanks,

vasanthkumar


Re: Review Request: Sqoop2: Export sub directories option

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/#review21303
-----------------------------------------------------------


Hi Vasanth,
thank you very much for taking this one!


core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java
<https://reviews.apache.org/r/11588/#comment44201>

    Size parameter for EnumInput is not used, so I would recommend removing it from the annotation.



execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
<https://reviews.apache.org/r/11588/#comment44203>

    I think that there is no reason to serialize this information into the Mapreduce config object. It should be retrieved from the Configuration object itself now where we are correctly passing the objects through entire mapreduce execution, right?



execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
<https://reviews.apache.org/r/11588/#comment44202>

    Shouldn't this condition use logical or instead of and?


Jarcec

- Jarek Cecho


On June 1, 2013, 12:17 p.m., vasanthkumar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11588/
> -----------------------------------------------------------
> 
> (Updated June 1, 2013, 12:17 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> 
> Introducing new option for exporting the sub directories of the given input directory.
> 
> Added new input 
> Input configuration
> 
> Input directory: /input
> Recursive: 
>   0 : NO
>   1 : YES
> Choose: 0
> 
> 
> This addresses bug SQOOP-1063.
>     https://issues.apache.org/jira/browse/SQOOP-1063
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java d5cbeec 
>   core/src/main/resources/framework-resources.properties cebc90e 
>   execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java 767080c 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java e2b3ce8 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java b3590dc 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java b3e6050 
> 
> Diff: https://reviews.apache.org/r/11588/diff/
> 
> 
> Testing
> -------
> 
> Done
> 
> 
> Thanks,
> 
> vasanthkumar
> 
>