You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Szehon Ho <sz...@cloudera.com> on 2014/11/11 06:34:42 UTC

Re: Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/#review60751
-----------------------------------------------------------


Hi, I had a few comments.  Also do you mind adding an end to end test as well (qfile test) with actual avro file.  For example, see any of the avro*.q tests we have.

Also I admit not to know too much about this code, but if you can confirm my understanding, we are going to set VOID HiveType everytime we hit an avro schema we have visited already?  Does that work with Hive?


serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java
<https://reviews.apache.org/r/24609/#comment102158>

    Let's reduce code-churn by making a version that calls this one with (null) as second-argument.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102155>

    Let's reduce code-churn by making a version that calls this one with (null) as second-argument.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102154>

    The name 'seenSchemaMap' is more grammatical.  But maybe its better as a set, see comment below.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102153>

    We need appropriate coding format, ie, parens for content of if, else blocks, and a space before the parens, like:
    
    if {
      ...
    } else {
      ...
    }



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102156>

    This is probably moot if we can use a Set (see below), but why IdentityHashMap?  Schema has hashcode defined.



serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java
<https://reviews.apache.org/r/24609/#comment102152>

    If we are always going to return this (which I see resolves to VOID type?) , then can't we just use a Set instead of map?  If its in the set, return VOID.



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java
<https://reviews.apache.org/r/24609/#comment102157>

    Formatting is wrong for these classes, they need to be properly indented.  Also please add the propery class visiblity.


- Szehon Ho


On Aug. 12, 2014, 4:35 p.m., Sachin Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24609/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 4:35 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/HIVE-7653
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Hi,
> 
> I have submitted a patch for the following issue:
> https://issues.apache.org/jira/browse/HIVE-7653
> 
> But the build is failing due to some other issue.
> Its been failing for the past 70 builds or so and I don't think its related to my change.
> Also, my local build for the same is passing.
> 
> Can someone please help me override/fix this test-failure?
> 
> Also, a code review of the above patch would be much appreciated.
> 
> Thanks
> Sachin
> 
> 
> Diffs
> -----
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 688b072 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java 46cdb4f 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 2bd48ca 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 23e024f 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 
> 
> Diff: https://reviews.apache.org/r/24609/diff/
> 
> 
> Testing
> -------
> 
> All tests pass.
> Also added a new unit-test for the patch.
> 
> 
> Thanks,
> 
> Sachin Goyal
> 
>


Re: Review Request 24609: Hive-7653: AvroSerDe does not support circular references in Schema

Posted by Sachin Goyal <sg...@gmail.com>.

> On Nov. 11, 2014, 5:34 a.m., Szehon Ho wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java, line 170
> > <https://reviews.apache.org/r/24609/diff/1/?file=658578#file658578line170>
> >
> >     This is probably moot if we can use a Set (see below), but why IdentityHashMap?  Schema has hashcode defined.

Avro hashCode() for records takes all its elements hash-code.
For a circular schema, the hash-code will be different when cycle has been populated vs when the cycle has not been populated.
(Even when its the same object but hashcode is calculated at different times.)

Identity hash-map would be a little more efficient as well.


- Sachin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24609/#review60751
-----------------------------------------------------------


On Aug. 12, 2014, 4:35 p.m., Sachin Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24609/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2014, 4:35 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/HIVE-7653
>     https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-7653
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Hi,
> 
> I have submitted a patch for the following issue:
> https://issues.apache.org/jira/browse/HIVE-7653
> 
> But the build is failing due to some other issue.
> Its been failing for the past 70 builds or so and I don't think its related to my change.
> Also, my local build for the same is passing.
> 
> Can someone please help me override/fix this test-failure?
> 
> Also, a code review of the above patch would be much appreciated.
> 
> Thanks
> Sachin
> 
> 
> Diffs
> -----
> 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 688b072 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroObjectInspectorGenerator.java 46cdb4f 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 2bd48ca 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/InstanceCache.java d848005 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java 23e024f 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java f8161da 
>   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestInstanceCache.java 1df88ee 
> 
> Diff: https://reviews.apache.org/r/24609/diff/
> 
> 
> Testing
> -------
> 
> All tests pass.
> Also added a new unit-test for the patch.
> 
> 
> Thanks,
> 
> Sachin Goyal
> 
>