You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by Jarek Cecho <ja...@apache.org> on 2013/06/25 23:42:49 UTC

Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/
-----------------------------------------------------------

Review request for Sqoop.


Bugs: SQOOP-1101
    https://issues.apache.org/jira/browse/SQOOP-1101


Repository: sqoop-sqoop2


Description
-------

This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.


Diffs
-----

  common/pom.xml 2921800 
  common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 

Diff: https://reviews.apache.org/r/12089/diff/


Testing
-------


Thanks,

Jarek Cecho


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Jarek Cecho <ja...@apache.org>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.

Hi Abe,
thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.


> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > common/src/main/java/org/apache/sqoop/schema/Schema.java, lines 87-89
> > <https://reviews.apache.org/r/12089/diff/1/?file=310883#file310883line87>
> >
> >     Doesn't seem necessary if this operation is idempotent.

The operation is not meant to be idempotent, the addColumn is suppose to be called iteratively to build a list of columns (where the order is important) and adding two columns with the same name should not be valid operation.


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Venkat Ranganathan <n....@live.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.

I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.

The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.


> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > common/src/main/java/org/apache/sqoop/schema/Schema.java, lines 87-89
> > <https://reviews.apache.org/r/12089/diff/1/?file=310883#file310883line87>
> >
> >     Doesn't seem necessary if this operation is idempotent.
> 
> Jarek Cecho wrote:
>     The operation is not meant to be idempotent, the addColumn is suppose to be called iteratively to build a list of columns (where the order is important) and adding two columns with the same name should not be valid operation.

Makes sense!


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?
> 
> Abraham Elmahrek wrote:
>     Venkat,
>     That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.
>     
>     Anyways, I don't think it matters too much. If you guys are fine with it. I am too.
> 
> Venkat Ranganathan wrote:
>     I see what you are saying.  But is it not that initializing result to a non-zero value is what is needed here.   That is while computing hash code, initialize result to, say 1. 
>     
>     result = 1
>     result = result * 31 + member1 
>     result = result * 31 + member2
>     
>     Thanks
>
> 
> Abraham Elmahrek wrote:
>     Ah yeah. Good point. Nevermind!
> 
> Jarek Cecho wrote:
>     Thank you Abe and Venkat for the discussion. Is my understanding that current implementation of hashCode() method is good enough correct? Or would you prefer if I would do some changes there?

Seems good enough if the range is large enough! IE: having string members.


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Jarek Cecho <ja...@apache.org>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?
> 
> Abraham Elmahrek wrote:
>     Venkat,
>     That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.
>     
>     Anyways, I don't think it matters too much. If you guys are fine with it. I am too.
> 
> Venkat Ranganathan wrote:
>     I see what you are saying.  But is it not that initializing result to a non-zero value is what is needed here.   That is while computing hash code, initialize result to, say 1. 
>     
>     result = 1
>     result = result * 31 + member1 
>     result = result * 31 + member2
>     
>     Thanks
>
> 
> Abraham Elmahrek wrote:
>     Ah yeah. Good point. Nevermind!
> 
> Jarek Cecho wrote:
>     Thank you Abe and Venkat for the discussion. Is my understanding that current implementation of hashCode() method is good enough correct? Or would you prefer if I would do some changes there?
> 
> Abraham Elmahrek wrote:
>     Seems good enough if the range is large enough! IE: having string members.
> 
> Venkat Ranganathan wrote:
>     I would suggest that we initialize the result to a non-zero integer - say 1 before calculating the hash code
>     
>     result = 1;
>     result = member1....
>     
>     Thanks
>     
>     Venkat

Hi Venkat,
thank you for the feedback, I've just updated the patch accordingly.


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 29, 2013, 9:13 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 29, 2013, 9:13 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Venkat Ranganathan <n....@live.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?
> 
> Abraham Elmahrek wrote:
>     Venkat,
>     That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.
>     
>     Anyways, I don't think it matters too much. If you guys are fine with it. I am too.

I see what you are saying.  But is it not that initializing result to a non-zero value is what is needed here.   That is while computing hash code, initialize result to, say 1. 

result = 1
result = result * 31 + member1 
result = result * 31 + member2

Thanks


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?

Venkat,
That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.

Anyways, I don't think it matters too much. If you guys are fine with it. I am too.


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Venkat Ranganathan <n....@live.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?
> 
> Abraham Elmahrek wrote:
>     Venkat,
>     That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.
>     
>     Anyways, I don't think it matters too much. If you guys are fine with it. I am too.
> 
> Venkat Ranganathan wrote:
>     I see what you are saying.  But is it not that initializing result to a non-zero value is what is needed here.   That is while computing hash code, initialize result to, say 1. 
>     
>     result = 1
>     result = result * 31 + member1 
>     result = result * 31 + member2
>     
>     Thanks
>
> 
> Abraham Elmahrek wrote:
>     Ah yeah. Good point. Nevermind!
> 
> Jarek Cecho wrote:
>     Thank you Abe and Venkat for the discussion. Is my understanding that current implementation of hashCode() method is good enough correct? Or would you prefer if I would do some changes there?
> 
> Abraham Elmahrek wrote:
>     Seems good enough if the range is large enough! IE: having string members.

I would suggest that we initialize the result to a non-zero integer - say 1 before calculating the hash code

result = 1;
result = member1....

Thanks

Venkat


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Jarek Cecho <ja...@apache.org>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?
> 
> Abraham Elmahrek wrote:
>     Venkat,
>     That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.
>     
>     Anyways, I don't think it matters too much. If you guys are fine with it. I am too.
> 
> Venkat Ranganathan wrote:
>     I see what you are saying.  But is it not that initializing result to a non-zero value is what is needed here.   That is while computing hash code, initialize result to, say 1. 
>     
>     result = 1
>     result = result * 31 + member1 
>     result = result * 31 + member2
>     
>     Thanks
>
> 
> Abraham Elmahrek wrote:
>     Ah yeah. Good point. Nevermind!

Thank you Abe and Venkat for the discussion. Is my understanding that current implementation of hashCode() method is good enough correct? Or would you prefer if I would do some changes there?


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Abraham Elmahrek <ab...@cloudera.com>.

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> 
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> 
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> 
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?
> 
> Abraham Elmahrek wrote:
>     Venkat,
>     That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.
>     
>     Anyways, I don't think it matters too much. If you guys are fine with it. I am too.
> 
> Venkat Ranganathan wrote:
>     I see what you are saying.  But is it not that initializing result to a non-zero value is what is needed here.   That is while computing hash code, initialize result to, say 1. 
>     
>     result = 1
>     result = result * 31 + member1 
>     result = result * 31 + member2
>     
>     Thanks
>

Ah yeah. Good point. Nevermind!


- Abraham


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Abraham Elmahrek <ab...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22380
-----------------------------------------------------------


Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.


common/src/main/java/org/apache/sqoop/schema/Schema.java
<https://reviews.apache.org/r/12089/#comment45913>

    Doesn't seem necessary if this operation is idempotent.



common/src/main/java/org/apache/sqoop/schema/type/DateTime.java
<https://reviews.apache.org/r/12089/#comment45907>

    Can we use a different base than 31 for this hashcode? DateTime's parent uses 31, which means there might be a common factor eventually.



common/src/main/java/org/apache/sqoop/schema/type/Decimal.java
<https://reviews.apache.org/r/12089/#comment45908>

    Same hash code comment as above.



common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java
<https://reviews.apache.org/r/12089/#comment45909>

    Same hash code comment as above



common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java
<https://reviews.apache.org/r/12089/#comment45910>

    Same hash code comment as above



common/src/main/java/org/apache/sqoop/schema/type/Map.java
<https://reviews.apache.org/r/12089/#comment45911>

    Same hash code comment as above



common/src/main/java/org/apache/sqoop/schema/type/Time.java
<https://reviews.apache.org/r/12089/#comment45912>

    Same hash code comment as above


- Abraham Elmahrek


On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 25, 2013, 9:42 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Venkat Ranganathan <n....@live.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/#review22586
-----------------------------------------------------------

Ship it!


Thanks Jarcec

- Venkat Ranganathan


On June 29, 2013, 9:13 p.m., Jarek Cecho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> 
> (Updated June 29, 2013, 9:13 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 2921800 
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
>   common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/12089/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jarek Cecho
> 
>


Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Posted by Jarek Cecho <ja...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12089/
-----------------------------------------------------------

(Updated June 29, 2013, 9:13 p.m.)


Review request for Sqoop.


Changes
-------

Initializing hashCode computation to nonzero value.


Bugs: SQOOP-1101
    https://issues.apache.org/jira/browse/SQOOP-1101


Repository: sqoop-sqoop2


Description
-------

This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.


Diffs (updated)
-----

  common/pom.xml 2921800 
  common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/FloatingPoint.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Map.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Set.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Text.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Time.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Type.java PRE-CREATION 
  common/src/main/java/org/apache/sqoop/schema/type/Unsupported.java PRE-CREATION 

Diff: https://reviews.apache.org/r/12089/diff/


Testing
-------


Thanks,

Jarek Cecho