You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Sachin Goyal <sg...@gmail.com> on 2014/08/12 18:35:23 UTC

Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/
-----------------------------------------------------------

Review request for hive.


Bugs: https://issues.apache.org/jira/browse/HIVE-7653
    https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653


Repository: hive-git


Description
-------

Hi,

I have submitted a patch for the following issue:
https://issues.apache.org/jira/browse/HIVE-7653

But the build is failing due to some other issue.
Its been failing for the past 70 builds or so and I don't think its related to my change.
Also, my local build for the same is passing.

Can someone please help me override/fix this test-failure?

Also, a code review of the above patch would be much appreciated.

Thanks
Sachin


Diffs
-----

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 688b072 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java 46cdb4f 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 2bd48ca 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 23e024f 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 

Diff: https://reviews.apache.org/r/24609/diff/


Testing
-------

All tests pass.
Also added a new unit-test for the patch.


Thanks,

Sachin Goyal


Re: Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

Posted by Sachin Goyal <sg...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/#review50727
-----------------------------------------------------------


Can someone please review this small patch?
It would be really helpful.

- Sachin Goyal


On Aug. 12, 2014, 4:35 p.m., Sachin Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24609/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 4:35 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/HIVE-7653
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Hi,
> 
> I have submitted a patch for the following issue:
> https://issues.apache.org/jira/browse/HIVE-7653
> 
> But the build is failing due to some other issue.
> Its been failing for the past 70 builds or so and I don't think its related to my change.
> Also, my local build for the same is passing.
> 
> Can someone please help me override/fix this test-failure?
> 
> Also, a code review of the above patch would be much appreciated.
> 
> Thanks
> Sachin
> 
> 
> Diffs
> -----
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 688b072 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java 46cdb4f 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 2bd48ca 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 23e024f 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 
> 
> Diff: https://reviews.apache.org/r/24609/diff/
> 
> 
> Testing
> -------
> 
> All tests pass.
> Also added a new unit-test for the patch.
> 
> 
> Thanks,
> 
> Sachin Goyal
> 
>


Re: Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

Posted by Sachin Goyal <sg...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/
-----------------------------------------------------------

(Updated Feb. 9, 2015, 8:13 p.m.)


Review request for hive.


Changes
-------

Thank you Szehon,

All comments for this diff have been applied.
Please review and help me apply the patch, if it looks ok.

Thanks
Sachin


Bugs: https://issues.apache.org/jira/browse/HIVE-7653
    https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653


Repository: hive-git


Description
-------

Hi,

I have submitted a patch for the following issue:
https://issues.apache.org/jira/browse/HIVE-7653

But the build is failing due to some other issue.
Its been failing for the past 70 builds or so and I don't think its related to my change.
Also, my local build for the same is passing.

Can someone please help me override/fix this test-failure?

Also, a code review of the above patch would be much appreciated.

Thanks
Sachin


Diffs (updated)
-----

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java a2558f2 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 809c2f2 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 3998737 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java b573f50 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 

Diff: https://reviews.apache.org/r/24609/diff/


Testing
-------

All tests pass.
Also added a new unit-test for the patch.


Thanks,

Sachin Goyal


Re: Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

Posted by Sachin Goyal <sg...@gmail.com>.

> On Nov. 11, 2014, 5:34 a.m., Szehon Ho wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java, line 170
> > <https://reviews.apache.org/r/24609/diff/1/?file=658578#file658578line170>
> >
> >     This is probably moot if we can use a Set (see below), but why IdentityHashMap?  Schema has hashcode defined.

Avro hashCode() for records takes all its elements hash-code.
For a circular schema, the hash-code will be different when cycle has been populated vs when the cycle has not been populated.
(Even when its the same object but hashcode is calculated at different times.)

Identity hash-map would be a little more efficient as well.


- Sachin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/#review60751
-----------------------------------------------------------


On Aug. 12, 2014, 4:35 p.m., Sachin Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24609/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 4:35 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/HIVE-7653
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Hi,
> 
> I have submitted a patch for the following issue:
> https://issues.apache.org/jira/browse/HIVE-7653
> 
> But the build is failing due to some other issue.
> Its been failing for the past 70 builds or so and I don't think its related to my change.
> Also, my local build for the same is passing.
> 
> Can someone please help me override/fix this test-failure?
> 
> Also, a code review of the above patch would be much appreciated.
> 
> Thanks
> Sachin
> 
> 
> Diffs
> -----
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 688b072 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java 46cdb4f 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 2bd48ca 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 23e024f 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 
> 
> Diff: https://reviews.apache.org/r/24609/diff/
> 
> 
> Testing
> -------
> 
> All tests pass.
> Also added a new unit-test for the patch.
> 
> 
> Thanks,
> 
> Sachin Goyal
> 
>


Re: Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

Posted by Szehon Ho <sz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/#review60751
-----------------------------------------------------------


Hi, I had a few comments.  Also do you mind adding an end to end test as well (qfile test) with actual avro file.  For example, see any of the avro*.q tests we have.

Also I admit not to know too much about this code, but if you can confirm my understanding, we are going to set VOID HiveType everytime we hit an avro schema we have visited already?  Does that work with Hive?


serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java
<https://reviews.apache.org/r/24609/#comment102158>

    Let's reduce code-churn by making a version that calls this one with (null) as second-argument.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102155>

    Let's reduce code-churn by making a version that calls this one with (null) as second-argument.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102154>

    The name 'seenSchemaMap' is more grammatical.  But maybe its better as a set, see comment below.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102153>

    We need appropriate coding format, ie, parens for content of if, else blocks, and a space before the parens, like:
    
    if {
      ...
    } else {
      ...
    }



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102156>

    This is probably moot if we can use a Set (see below), but why IdentityHashMap?  Schema has hashcode defined.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102152>

    If we are always going to return this (which I see resolves to VOID type?) , then can't we just use a Set instead of map?  If its in the set, return VOID.



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java
<https://reviews.apache.org/r/24609/#comment102157>

    Formatting is wrong for these classes, they need to be properly indented.  Also please add the propery class visiblity.


- Szehon Ho


On Aug. 12, 2014, 4:35 p.m., Sachin Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24609/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 4:35 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/HIVE-7653
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Hi,
> 
> I have submitted a patch for the following issue:
> https://issues.apache.org/jira/browse/HIVE-7653
> 
> But the build is failing due to some other issue.
> Its been failing for the past 70 builds or so and I don't think its related to my change.
> Also, my local build for the same is passing.
> 
> Can someone please help me override/fix this test-failure?
> 
> Also, a code review of the above patch would be much appreciated.
> 
> Thanks
> Sachin
> 
> 
> Diffs
> -----
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 688b072 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java 46cdb4f 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 2bd48ca 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 23e024f 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 
> 
> Diff: https://reviews.apache.org/r/24609/diff/
> 
> 
> Testing
> -------
> 
> All tests pass.
> Also added a new unit-test for the patch.
> 
> 
> Thanks,
> 
> Sachin Goyal
> 
>