Posted to dev@sqoop.apache.org by Erzsebet Szilagyi <er...@gmail.com> on 2016/10/19 15:16:53 UTC

Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/
-----------------------------------------------------------

(Updated Oct. 19, 2016, 5:16 p.m.)


Review request for Sqoop, Boglarka Egyed, Chris Teoh, Attila Szabo, Anna Szonyi, and Szabolcs Vasas.


Changes
-------

A missing check caused some tests to fail - added the check back.
Changed a proposed test case slightly to make more sense.


Bugs: SQOOP-3013
    https://issues.apache.org/jira/browse/SQOOP-3013


Repository: sqoop-trunk


Description
-------

When setting job configurations and adding files to "tmpjars", Sqoop does not strip empty strings from the list.
Sqoop should remove empty strings before starting the MR job and raise a warning if an empty string was found.

The proposed changes check "tmpjars" for empty strings, remove any that are found, and raise a warning.
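
As an illustration only (this is not the code from the patch), the clean-up described above might look roughly like the sketch below. The class name TmpJarsSanitizer, the method name sanitize, the warning text, and the use of java.util.logging are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper; the real change lives in JobBase and may differ.
class TmpJarsSanitizer {
  private static final Logger LOG =
      Logger.getLogger(TmpJarsSanitizer.class.getName());

  /** Drops empty entries from a comma-separated list; null and "" are returned as-is. */
  static String sanitize(String tmpjars) {
    if (tmpjars == null || tmpjars.isEmpty()) {
      return tmpjars;
    }
    List<String> kept = new ArrayList<>();
    boolean foundEmpty = false;
    // A limit of -1 keeps trailing empty strings so they are detected too.
    for (String entry : tmpjars.split(",", -1)) {
      if (entry.isEmpty()) {
        foundEmpty = true;
      } else {
        kept.add(entry);
      }
    }
    if (foundEmpty) {
      LOG.warning("Empty entries were found in tmpjars and removed.");
    }
    return String.join(",", kept);
  }

  public static void main(String[] args) {
    // Same input as the live test listed under "Testing".
    System.out.println(sanitize(",,valid,,,validother,,,")); // prints valid,validother
  }
}
```

Splitting with a negative limit matters here: a plain split(",") silently discards trailing empty strings, so the warning would not fire for input such as "valid,,".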


Diffs (updated)
-----

  src/java/org/apache/sqoop/mapreduce/JobBase.java 7ed2684 
  src/test/org/apache/sqoop/mapreduce/TestJobBase.java PRE-CREATION 

Diff: https://reviews.apache.org/r/52212/diff/


Testing
-------

- Live test with command including: "tmpjars=,,valid,,,validother,,,"
- Unit tests


Thanks,

Erzsebet Szilagyi


Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Szabolcs Vasas <va...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/#review153360
-----------------------------------------------------------




src/java/org/apache/sqoop/mapreduce/JobBase.java (line 202)
<https://reviews.apache.org/r/52212/#comment222658>

    I think we could use the org.apache.commons.lang.StringUtils#isEmpty method instead of (null == tmpjars || (null != tmpjars && tmpjars.length() == 0)).
    It does the same thing but makes the code simpler.
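
A small sketch of why the two forms are interchangeable. To keep the example free of external dependencies, isEmpty below is a local stand-in written for this sketch with the same null-or-zero-length semantics as StringUtils#isEmpty; the class and method names are hypothetical.

```java
// Hypothetical comparison class; not part of the patch.
class EmptyCheckSketch {
  /** The condition exactly as quoted in the review comment. */
  static boolean originalCheck(String tmpjars) {
    return null == tmpjars || (null != tmpjars && tmpjars.length() == 0);
  }

  /** Local stand-in for StringUtils#isEmpty: true for null or "". */
  static boolean isEmpty(String s) {
    return s == null || s.length() == 0;
  }

  public static void main(String[] args) {
    String[] samples = {null, "", " ", "a.jar"};
    for (String s : samples) {
      // Both checks agree on every sample, including the whitespace-only one.
      System.out.println(originalCheck(s) == isEmpty(s));
    }
  }
}
```

The inner null != tmpjars test is redundant because the right-hand side of || is only evaluated when the left-hand side is false.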



src/java/org/apache/sqoop/mapreduce/JobBase.java (line 226)
<https://reviews.apache.org/r/52212/#comment222659>

    I know this is not your code, but this patch may be a good opportunity to replace the "tmpjars" literals in this method with org.apache.sqoop.config.ConfigurationConstants#MAPRED_DISTCACHE_CONF_PARAM.


Hi Liz,

I have applied the patch on my machine and all the tests pass. Thank you for adding a new test class for JobBase. I have just one question regarding the test cases: can the tmpjarsInput contain whitespace? For example, is ",,valid,\t,  ,validother,," a valid tmpjarsInput that we need to be able to sanitize?

- Szabolcs Vasas




Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Erzsebet Szilagyi <er...@gmail.com>.

> On Oct. 20, 2016, 11:56 a.m., Anna Szonyi wrote:
> > src/java/org/apache/sqoop/mapreduce/JobBase.java, line 213
> > <https://reviews.apache.org/r/52212/diff/3/?file=1541532#file1541532line213>
> >
> >     Do we only want to check for empty strings? Would it make sense to check for whitespace using .isBlank() instead?

According to org.apache.hadoop.fs.Path.checkPathArg, only null and "" are unacceptable.
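
To make the difference between the two checks concrete, here is a rough sketch that filters the whitespace example raised earlier in the thread both ways. The helpers isEmpty and isBlank are local stand-ins written for this sketch (modelled on the commons-lang StringUtils semantics), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical comparison class; not part of the patch.
class BlankVersusEmptySketch {
  static boolean isEmpty(String s) {
    return s == null || s.length() == 0;
  }

  /** True for null, "", or a string consisting only of whitespace. */
  static boolean isBlank(String s) {
    if (s == null) {
      return true;
    }
    for (int i = 0; i < s.length(); i++) {
      if (!Character.isWhitespace(s.charAt(i))) {
        return false;
      }
    }
    return true;
  }

  /** Keeps every entry that is not empty (whitespace-only entries survive). */
  static List<String> keepNonEmpty(String csv) {
    List<String> kept = new ArrayList<>();
    for (String entry : csv.split(",", -1)) {
      if (!isEmpty(entry)) {
        kept.add(entry);
      }
    }
    return kept;
  }

  /** Keeps every entry that is not blank (whitespace-only entries are dropped). */
  static List<String> keepNonBlank(String csv) {
    List<String> kept = new ArrayList<>();
    for (String entry : csv.split(",", -1)) {
      if (!isBlank(entry)) {
        kept.add(entry);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    String input = ",,valid,\t,  ,validother,,";
    System.out.println(keepNonEmpty(input).size()); // prints 4
    System.out.println(keepNonBlank(input));        // prints [valid, validother]
  }
}
```

With an empty-only check, the "\t" and "  " entries are passed through unchanged, which is the behaviour the reply above argues is sufficient.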


- Erzsebet


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/#review153371
-----------------------------------------------------------




Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Anna Szonyi <sz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/#review153371
-----------------------------------------------------------




src/java/org/apache/sqoop/mapreduce/JobBase.java (line 209)
<https://reviews.apache.org/r/52212/#comment222664>

    Do we only want to check for empty strings? Would it make sense to check for whitespace using .isBlank() instead?



src/test/org/apache/sqoop/mapreduce/TestJobBase.java (lines 38 - 41)
<https://reviews.apache.org/r/52212/#comment222662>

    We should remove this header.



src/test/org/apache/sqoop/mapreduce/TestJobBase.java (line 45)
<https://reviews.apache.org/r/52212/#comment222666>

    Instead of using a boolean, I might separate this method into two methods, one that verifies the error case and one that verifies the "success" case, to make it more obvious.



src/test/org/apache/sqoop/mapreduce/TestJobBase.java (line 53)
<https://reviews.apache.org/r/52212/#comment222667>

    Would it make sense to refactor this into several methods based on the comments to make it more obvious which parts are the setup steps and which are the validations/assertions?



src/test/org/apache/sqoop/mapreduce/TestJobBase.java (lines 70 - 74)
<https://reviews.apache.org/r/52212/#comment222665>

    This part could maybe be simplified by extracting the warning logic and potentially verifying its parameters.
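
One possible reading of this last suggestion, sketched with hypothetical names (Job, RecordingJob, clean, warnEmptyEntries) and without any mocking library: the warning is moved into a single overridable method, and the test substitutes a subclass that records the arguments instead of logging them. The actual test in the patch may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the patch.
class WarningExtractionSketch {
  /** Stand-in for the job class, with the warning isolated in one method. */
  static class Job {
    String clean(String tmpjars) {
      StringBuilder kept = new StringBuilder();
      int removed = 0;
      for (String entry : tmpjars.split(",", -1)) {
        if (entry.isEmpty()) {
          removed++;
          continue;
        }
        if (kept.length() > 0) {
          kept.append(',');
        }
        kept.append(entry);
      }
      if (removed > 0) {
        warnEmptyEntries(removed);
      }
      return kept.toString();
    }

    /** The extracted warning logic; production code would log here. */
    void warnEmptyEntries(int count) {
      System.err.println("Removed " + count + " empty tmpjars entries");
    }
  }

  /** Hand-rolled spy: records the warning's parameters for later assertions. */
  static class RecordingJob extends Job {
    final List<Integer> warnings = new ArrayList<>();

    @Override
    void warnEmptyEntries(int count) {
      warnings.add(count);
    }
  }
}
```

A test can then call clean on a RecordingJob and assert both the returned string and the contents of warnings, which keeps the setup and the assertions clearly separated.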


Hey Liz,

Thanks for the patch, this is great. I just have a few minor comments :) Please check and see if it makes sense to incorporate them.

/Anna

- Anna Szonyi




Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Szabolcs Vasas <va...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/#review154889
-----------------------------------------------------------


Ship it!




Hi Liz,

It is a nice solution to use a spy to verify that logging happened! I have just one hair-splitting comment: I can see that an org.apache.accumulo.core.util.StringUtil import was added, but it is not actually used in JobBase, so we could remove it.

Thanks,
Szabolcs

- Szabolcs Vasas




Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Attila Szabo <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/#review155576
-----------------------------------------------------------


Ship it!




Nice job Liz!

Especially from testing POV. Way to go!

- Attila Szabo




Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Erzsebet Szilagyi <er...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/
-----------------------------------------------------------

(Updated Nov. 4, 2016, 1:33 p.m.)


Review request for Sqoop, Boglarka Egyed, Chris Teoh, Attila Szabo, Anna Szonyi, and Szabolcs Vasas.


Changes
-------

removing unused import


Bugs: SQOOP-3013
    https://issues.apache.org/jira/browse/SQOOP-3013


Repository: sqoop-trunk


Description
-------

When setting job configurations and adding files to "tmpjars", Sqoop does not strip empty strings from the list.
Sqoop should remove empty strings before starting the MR job and raise a warning if an empty string was found.

The proposed changes check "tmpjars" for empty strings, remove any that are found, and raise a warning.


Diffs (updated)
-----

  src/java/org/apache/sqoop/mapreduce/JobBase.java 7ed2684 
  src/test/org/apache/sqoop/mapreduce/TestJobBase.java PRE-CREATION 

Diff: https://reviews.apache.org/r/52212/diff/


Testing
-------

- Live test with command including: "tmpjars=,,valid,,,validother,,,"
- Unit tests


Thanks,

Erzsebet Szilagyi


Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Erzsebet Szilagyi <er...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/
-----------------------------------------------------------

(Updated Nov. 4, 2016, 9:29 a.m.)


Review request for Sqoop, Boglarka Egyed, Chris Teoh, Attila Szabo, Anna Szonyi, and Szabolcs Vasas.


Changes
-------

(renaming methods)


Bugs: SQOOP-3013
    https://issues.apache.org/jira/browse/SQOOP-3013


Repository: sqoop-trunk


Description
-------

When setting job configurations and adding files to "tmpjars", Sqoop does not strip empty strings from the list.
Sqoop should remove empty strings before starting the MR job and raise a warning if an empty string was found.

The proposed changes check "tmpjars" for empty strings, remove any that are found, and raise a warning.


Diffs (updated)
-----

  src/java/org/apache/sqoop/mapreduce/JobBase.java 7ed2684 
  src/test/org/apache/sqoop/mapreduce/TestJobBase.java PRE-CREATION 

Diff: https://reviews.apache.org/r/52212/diff/


Testing
-------

- Live test with command including: "tmpjars=,,valid,,,validother,,,"
- Unit tests


Thanks,

Erzsebet Szilagyi


Re: Review Request 52212: SQOOP-3013: Configuration "tmpjars" is not checked for empty strings before passing to MR

Posted by Erzsebet Szilagyi <er...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52212/
-----------------------------------------------------------

(Updated Nov. 2, 2016, 11:57 a.m.)


Review request for Sqoop, Boglarka Egyed, Chris Teoh, Attila Szabo, Anna Szonyi, and Szabolcs Vasas.


Bugs: SQOOP-3013
    https://issues.apache.org/jira/browse/SQOOP-3013


Repository: sqoop-trunk


Description
-------

When setting job configurations and adding files to "tmpjars", Sqoop does not strip empty strings from the list.
Sqoop should remove empty strings before starting the MR job and raise a warning if an empty string was found.

The proposed changes check "tmpjars" for empty strings, remove any that are found, and raise a warning.


Diffs (updated)
-----

  src/java/org/apache/sqoop/mapreduce/JobBase.java 7ed2684 
  src/test/org/apache/sqoop/mapreduce/TestJobBase.java PRE-CREATION 

Diff: https://reviews.apache.org/r/52212/diff/


Testing
-------

- Live test with command including: "tmpjars=,,valid,,,validother,,,"
- Unit tests


Thanks,

Erzsebet Szilagyi