You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by Venkat Ranganathan <n....@live.com> on 2013/05/30 08:07:19 UTC

Review Request: Fix for SQOOP-1013

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/
-----------------------------------------------------------

Review request for Sqoop.


Description
-------

This addresses Boolean, date, time, and timestamp splitters.

THis also disallows char type splitters as discussed in SQOOP-976


Diffs
-----

  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 

Diff: https://reviews.apache.org/r/11537/diff/


Testing
-------

Introduced new unit tests to test new functionality
All tests pass


Thanks,

Venkat Ranganathan


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.

> On June 13, 2013, 1:41 a.m., Abraham Elmahrek wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, line 513
> > <https://reviews.apache.org/r/11537/diff/2-4/?file=299088#file299088line513>
> >
> >     UTF-16 should support 0 to 0x10FFFF code points.

Good point Abrahama.  Actually the codepoint handling was onething I was not comfortable with that I discussed earlier with Jarek offline.   Using a large base will cause more rounding errors and repeating fractions towards higher fractional positions.   But I decided not to stay with 2^16 but as you noted, it does not cover the supplementary characters.   As you know, Java stores UTF16 as one or two char values depending on whether is in BMP plane or supplementary plane.  But iterating through the Java strings is in char boundaries.   I found another issue in that area.

I fixed that issue also, but we have to enhance this more in the future.


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21827
-----------------------------------------------------------


On June 12, 2013, 5:23 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated June 12, 2013, 5:23 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21827
-----------------------------------------------------------


I completely agree with you guys. A future jira to improve these algorithms seem perfect.

This seems good except for the minor issues stated below. So... I'm +1.


connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment45094>

    UTF-16 should support 0 to 0x10FFFF code points.


- Abraham Elmahrek


On June 12, 2013, 5:23 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated June 12, 2013, 5:23 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21879
-----------------------------------------------------------

Ship it!


Ship It!

- Jarek Cecho


On June 13, 2013, 5:46 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated June 13, 2013, 5:46 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21867
-----------------------------------------------------------

Ship it!


Looks good to me!


connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment45167>

    Seems right. Nit: 2097152 can be represented as a hex literal: 0x200000. Might be easier to see.


- Abraham Elmahrek


On June 13, 2013, 5:46 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated June 13, 2013, 5:46 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21877
-----------------------------------------------------------



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment45173>

    Thanks Abraham.   Good point.  Even though the comments mention the choice of the value, it can be more prominent.   When we revisit this as part of the follow on JIRA, we can do that


- Venkat Ranganathan


On June 13, 2013, 5:46 a.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated June 13, 2013, 5:46 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/
-----------------------------------------------------------

(Updated June 13, 2013, 5:46 a.m.)


Review request for Sqoop.


Changes
-------

Fixed the base in coverstion to a large enough value to allow for all possible Unicode code points.   Fixed iterating through the String to properly handle codepoints


Description
-------

This addresses Boolean, date, time, and timestamp splitters.

THis also disallows char type splitters as discussed in SQOOP-976


Diffs (updated)
-----

  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 

Diff: https://reviews.apache.org/r/11537/diff/


Testing
-------

Introduced new unit tests to test new functionality
All tests pass


Thanks,

Venkat Ranganathan


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/
-----------------------------------------------------------

(Updated June 12, 2013, 5:23 p.m.)


Review request for Sqoop.


Changes
-------

Updated diff.  Cleaned up text partitioning facility for full import usage.


Description
-------

This addresses Boolean, date, time, and timestamp splitters.

THis also disallows char type splitters as discussed in SQOOP-976


Diffs (updated)
-----

  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 

Diff: https://reviews.apache.org/r/11537/diff/


Testing
-------

Introduced new unit tests to test new functionality
All tests pass


Thanks,

Venkat Ranganathan


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/
-----------------------------------------------------------

(Updated June 11, 2013, 6:41 p.m.)


Review request for Sqoop.


Changes
-------

Updated with review comments.   Text partitioning will be followed in a separate patch


Description
-------

This addresses Boolean, date, time, and timestamp splitters.

THis also disallows char type splitters as discussed in SQOOP-976


Diffs (updated)
-----

  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 

Diff: https://reviews.apache.org/r/11537/diff/


Testing
-------

Introduced new unit tests to test new functionality
All tests pass


Thanks,

Venkat Ranganathan


Re: Review Request: Fix for SQOOP-1013

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On May 31, 2013, 8:30 p.m., Abraham Elmahrek wrote:
> > Seems good except for a few comments and nit picks.
> > 
> > The datetime split logic is when certain intervals are "hotter" than others. IE: 1000000 rows out of 20000000 exist between the date range of december 1st to december 31st, but a user is importing the entire years data, with 12 node. Basically, 1 machine will extract 1000000 rows, while the others will extract 1000000/11 rows. In the future we could probably add some upfront analysis of the data to improve distribution.

Sorry... my wording above wasn't very concise... The datetime split logic doesn't seem to handle "hot" partitions is what I meant.


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21237
-----------------------------------------------------------


On May 30, 2013, 6:25 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated May 30, 2013, 6:25 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.

> On May 31, 2013, 8:30 p.m., Abraham Elmahrek wrote:
> > Seems good except for a few comments and nit picks.
> > 
> > The datetime split logic is when certain intervals are "hotter" than others. IE: 1000000 rows out of 20000000 exist between the date range of december 1st to december 31st, but a user is importing the entire years data, with 12 node. Basically, 1 machine will extract 1000000 rows, while the others will extract 1000000/11 rows. In the future we could probably add some upfront analysis of the data to improve distribution.
> 
> Abraham Elmahrek wrote:
>     Sorry... my wording above wasn't very concise... The datetime split logic doesn't seem to handle "hot" partitions is what I meant.

Good point Abraham - even though this skew can happen to all datatypes that are used for splitting, with dates, there is more chance of such skew.  Typically the solution for this as also to efficiently manage the data life-cycle is dependent on the database facilities, so we may have to do database specific functionality like partitioning etc.   A generic solution can be attempted, but support for grouping and ordering based on expressions varies between different databases.   But something to keep in our mind


> On May 31, 2013, 8:30 p.m., Abraham Elmahrek wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, lines 164-180
> > <https://reviews.apache.org/r/11537/diff/2/?file=299088#file299088line164>
> >
> >     Nit: This seems like it can be merged above.

I am assuming you are talking about the end condition.   Please see above


> On May 31, 2013, 8:30 p.m., Abraham Elmahrek wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, lines 139-163
> > <https://reviews.apache.org/r/11537/diff/2/?file=299088#file299088line139>
> >
> >     Nit: This loop can probably handle all partitions by using constructDateConditions(objLB, objUB, false) to constructDateConditions(objLB, objUB, upperBound == maxDateValue). Also, upperBound += (i < remainder) ? 1 : 0;

That is true, but we have the pattern to add the last partition separately for all partitioning types.   It is better to be consistent, I thought.   Do you agree?


> On May 31, 2013, 8:30 p.m., Abraham Elmahrek wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, line 190
> > <https://reviews.apache.org/r/11537/diff/2/?file=299088#file299088line190>
> >
> >     There doesn't seem to be any thing stopping partitionMaxValue from being null. So, it makes sense to add some null checking here as well.

min and max being null is handled for all datatypes.   NULL in the presence of all values will be treated as having a lower value and so it is impossible to get MAX to null while MIN is not null.

In fact, when one row has a non-null value for a column with all the other rows in the table having the split by column as NULL will return the single non-null value for both min and max.

So, the whole null handling thing has to be cleaned up.   I will file a follow on JIRA for that


> On May 31, 2013, 8:30 p.m., Abraham Elmahrek wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, line 232
> > <https://reviews.apache.org/r/11537/diff/2/?file=299088#file299088line232>
> >
> >     There doesn't seem to be any thing stopping partitionMaxValue from being null. So, it makes sense to add some null checking here as well.

Please see above


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21237
-----------------------------------------------------------


On May 30, 2013, 6:25 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated May 30, 2013, 6:25 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21237
-----------------------------------------------------------


Seems good except for a few comments and nit picks.

The datetime split logic is when certain intervals are "hotter" than others. IE: 1000000 rows out of 20000000 exist between the date range of december 1st to december 31st, but a user is importing the entire years data, with 12 node. Basically, 1 machine will extract 1000000 rows, while the others will extract 1000000/11 rows. In the future we could probably add some upfront analysis of the data to improve distribution.


connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44096>

    Nit: This loop can probably handle all partitions by using constructDateConditions(objLB, objUB, false) to constructDateConditions(objLB, objUB, upperBound == maxDateValue). Also, upperBound += (i < remainder) ? 1 : 0;



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44094>

    Nit: This seems like it can be merged above.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44099>

    There doesn't seem to be any thing stopping partitionMaxValue from being null. So, it makes sense to add some null checking here as well.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44098>

    There doesn't seem to be any thing stopping partitionMaxValue from being null. So, it makes sense to add some null checking here as well.


- Abraham Elmahrek


On May 30, 2013, 6:25 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated May 30, 2013, 6:25 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.

> On June 10, 2013, 3:36 p.m., Jarek Cecho wrote:
> > Hi Abe and Venkat,
> > thank you very much for working on this patch and supplying very valid comments! I definitely agree that current splitters have a lot room for improvement, especially for skewed values. However as this topic itself is very big, I would suggest to get this rather simpler implementation in and tweak it later. What do you think?

I agree.  I will file a follow on JIRA!


> On June 10, 2013, 3:36 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, lines 92-93
> > <https://reviews.apache.org/r/11537/diff/2/?file=299088#file299088line92>
> >
> >     Whereas splitting on text based column is not the most optimal way how to use Sqoop for importing data I would argue that in some situations it might be necessary. The SQOOP-976 is more about using text based column as a base for incremental import than as a general split by column. The main issue why we've decided not to support text based columns for incremental import is that it's very easy to insert value "in the middle" and thus make entire incremental import invalid. However this issue is not the case when doing normal one time import.

I agree.  I was reviewing SQOOP-976 and I decided to disallow it as I thought it was not useful in all situations.   But it is better to allow it for complete imports as splitting by text is more common for complete imports.   Good catch.  Let me update the diff


> On June 10, 2013, 3:36 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java, line 35
> > <https://reviews.apache.org/r/11537/diff/2/?file=299089#file299089line35>
> >
> >     Extreme Nit: Would you mind removing this unused import?

Will do.  Thanks


> On June 10, 2013, 3:36 p.m., Jarek Cecho wrote:
> > connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java, lines 415-421
> > <https://reviews.apache.org/r/11537/diff/2/?file=299088#file299088line415>
> >
> >     Nit: This method seems to be unused.

Yes.  Will remove it.  Thanks


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21652
-----------------------------------------------------------


On May 30, 2013, 6:25 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated May 30, 2013, 6:25 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/#review21652
-----------------------------------------------------------


Hi Abe and Venkat,
thank you very much for working on this patch and supplying very valid comments! I definitely agree that current splitters have a lot room for improvement, especially for skewed values. However as this topic itself is very big, I would suggest to get this rather simpler implementation in and tweak it later. What do you think?


connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44741>

    Whereas splitting on text based column is not the most optimal way how to use Sqoop for importing data I would argue that in some situations it might be necessary. The SQOOP-976 is more about using text based column as a base for incremental import than as a general split by column. The main issue why we've decided not to support text based columns for incremental import is that it's very easy to insert value "in the middle" and thus make entire incremental import invalid. However this issue is not the case when doing normal one time import.



connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44742>

    Nit: This method seems to be unused.



connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java
<https://reviews.apache.org/r/11537/#comment44743>

    Extreme Nit: Would you mind removing this unused import?


Jarcec

- Jarek Cecho


On May 30, 2013, 6:25 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11537/
> -----------------------------------------------------------
> 
> (Updated May 30, 2013, 6:25 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> This addresses Boolean, date, time, and timestamp splitters.
> 
> THis also disallows char type splitters as discussed in SQOOP-976
> 
> 
> Diffs
> -----
> 
>   connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
>   connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 
> 
> Diff: https://reviews.apache.org/r/11537/diff/
> 
> 
> Testing
> -------
> 
> Introduced new unit tests to test new functionality
> All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Re: Review Request: Fix for SQOOP-1013

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11537/
-----------------------------------------------------------

(Updated May 30, 2013, 6:25 p.m.)


Review request for Sqoop.


Changes
-------

Cleaned up unused imports and other code cleanup


Description
-------

This addresses Boolean, date, time, and timestamp splitters.

THis also disallows char type splitters as discussed in SQOOP-976


Diffs (updated)
-----

  connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java f80f30d 
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportPartitioner.java ee314d0 

Diff: https://reviews.apache.org/r/11537/diff/


Testing
-------

Introduced new unit tests to test new functionality
All tests pass


Thanks,

Venkat Ranganathan