Posted to dev@sqoop.apache.org by Jarek Cecho <ja...@apache.org> on 2013/05/20 15:48:10 UTC

Re: Review Request: Export dir to support subdirectories

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10646/#review20764
-----------------------------------------------------------


Hi Vasanth,
thank you very much for working on this patch, it is greatly appreciated! Would you mind adding a test case that covers the newly introduced functionality?


src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10646/#comment42860>

    I don't feel entirely comfortable with this, as it will change the behavior of the default input format, which skips certain names. For example, files and directories starting with a dot or underscore are normally skipped.
    
    Perhaps we could introduce a new parameter, such as --recursive-export, that would be properly documented?


Jarcec

- Jarek Cecho


On April 19, 2013, 12:02 p.m., vasanthkumar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10646/
> -----------------------------------------------------------
> 
> (Updated April 19, 2013, 12:02 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> Export dir to support subdirectories
> 
> 
> This addresses bug SQOOP-951.
>     https://issues.apache.org/jira/browse/SQOOP-951
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/mapreduce/CombineFileInputFormat.java 7d2be38 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
> 
> Diff: https://reviews.apache.org/r/10646/diff/
> 
> 
> Testing
> -------
> 
> Done
> 
> 
> Thanks,
> 
> vasanthkumar
> 
>


Re: Review Request: Export dir to support subdirectories

Posted by rj...@gmail.com.

> On May 20, 2013, 1:48 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, lines 213-226
> > <https://reviews.apache.org/r/10646/diff/1/?file=282673#file282673line213>
> >
> >     I don't feel entirely comfortable with this, as it will change the behavior of the default input format, which skips certain names. For example, files and directories starting with a dot or underscore are normally skipped.
> >     
> >     Perhaps we could introduce a new parameter, such as --recursive-export, that would be properly documented?

Hi Jarcec,

Currently Sqoop skips files starting with a dot or underscore, and this patch still skips them. It only adds input paths.
So do you want files starting with a dot or underscore to be included?

Please advise.


- vasanthkumar




Re: Review Request: Export dir to support subdirectories

Posted by Jarek Cecho <ja...@apache.org>.

> On May 20, 2013, 1:48 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, lines 213-226
> > <https://reviews.apache.org/r/10646/diff/1/?file=282673#file282673line213>
> >
> >     I don't feel entirely comfortable with this, as it will change the behavior of the default input format, which skips certain names. For example, files and directories starting with a dot or underscore are normally skipped.
> >     
> >     Perhaps we could introduce a new parameter, such as --recursive-export, that would be properly documented?
> 
> vasanthkumar wrote:
>     Hi Jarcec,
>     
>     Currently Sqoop skips files starting with a dot or underscore, and this patch still skips them. It only adds input paths.
>     So do you want files starting with a dot or underscore to be included?
>     
>     Please advise.

I have not tried it myself yet, but I believe that with this patch Sqoop will try to add the contents of directories starting with a dot or underscore, such as "_logs" or others that a MapReduce job might generate automatically. If that is indeed the case, then I'm afraid this patch might break current customer deployments, hence my concern about backward compatibility.


- Jarek
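
To make the concern concrete, here is a minimal sketch of a directory walk that keeps the usual convention of skipping names starting with a dot or underscore and only descends into subdirectories when asked to. It is not the code from the patch under review: it uses java.nio.file instead of the Hadoop FileSystem API so that it runs on its own, and the class name RecursiveExportSketch, the methods isHidden and collectInputFiles, and the boolean "recursive" parameter (standing in for the proposed --recursive-export option) are all hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not taken from the patch under review. */
public class RecursiveExportSketch {

  /** Treats names starting with '_' or '.' as hidden, e.g. "_logs". */
  public static boolean isHidden(Path path) {
    String name = path.getFileName().toString();
    return name.startsWith("_") || name.startsWith(".");
  }

  /**
   * Collects data files under dir. Hidden entries are always skipped.
   * Subdirectories are only descended into when recursive is true, which
   * models the opt-in behavior of the proposed --recursive-export option.
   */
  public static List<Path> collectInputFiles(Path dir, boolean recursive)
      throws IOException {
    List<Path> result = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path entry : entries) {
        if (isHidden(entry)) {
          continue; // skip hidden files and whole hidden directories
        }
        if (Files.isDirectory(entry)) {
          if (recursive) {
            result.addAll(collectInputFiles(entry, true));
          }
        } else {
          result.add(entry);
        }
      }
    }
    Collections.sort(result); // stable ordering for easier inspection
    return result;
  }

  public static void main(String[] args) throws IOException {
    // Build a small export directory: one data file, one hidden
    // directory with a file inside, and one regular subdirectory.
    Path root = Files.createTempDirectory("export-dir");
    Files.createFile(root.resolve("part-00000"));
    Files.createDirectories(root.resolve("_logs"));
    Files.createFile(root.resolve("_logs").resolve("history"));
    Files.createDirectories(root.resolve("sub"));
    Files.createFile(root.resolve("sub").resolve("part-00001"));

    System.out.println(collectInputFiles(root, false)); // top-level files only
    System.out.println(collectInputFiles(root, true));  // includes sub/, never _logs/
  }
}
```

With the flag off, only part-00000 is picked up, which matches the existing non-recursive behavior. With it on, sub/part-00001 is added while _logs/history is still ignored, because the hidden-name check is applied to directories before descending into them.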




Re: Review Request: Export dir to support subdirectories

Posted by Venkat Ranganathan <n....@live.com>.

On May 20, 2013, 1:48 p.m., vasanthkumar wrote:
> > Jarcec

With the introduction of HCat support, we will be able to move entire Hive tables with their partitions (one of the use cases for this patch), and I would assume that data in HDFS subdirectories destined for a single table would typically be a Hive table. Still, this can help in some scenarios. I agree with Jarcec's comment that a --recursive-export option would be needed to explicitly request the changed behavior.


- Venkat

