You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@orc.apache.org by omalley <gi...@git.apache.org> on 2016/05/26 15:20:10 UTC

[GitHub] orc pull request: ORC-53. Make the complex types Comparable so the...

GitHub user omalley opened a pull request:

    https://github.com/apache/orc/pull/29

    ORC-53. Make the complex types Comparable so they can be put as keys in OrcMap

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/omalley/orc orc-53

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/29.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #29
    
----
commit 4fdcb0b88bfa17d88d7ee6a7eaff2399b669dcc2
Author: Owen O'Malley <om...@apache.org>
Date:   2016-05-26T15:17:29Z

    ORC-53. Make the complex types Comparable so they can be put as keys to the
    OrcMap.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #29: ORC-53. Make the complex types Comparable so they can be put ...

Posted by omalley <gi...@git.apache.org>.
Github user omalley commented on the issue:

    https://github.com/apache/orc/pull/29
  
    The compilation was working for me, but you had a good point that in the current version that everything should implement WritableComparable since obviously compareTo needs to recurse all the way down.
    
    Can you try the current version?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #29: ORC-53. Make the complex types Comparable so they can be put ...

Posted by wagnermarkd <gi...@git.apache.org>.
Github user wagnermarkd commented on the issue:

    https://github.com/apache/orc/pull/29
  
    Everything works on the orc-52 branch, which is based on orc-53, but with orc-53 I still see the previous error for TestOrcStruct and this new one for OrcRecordReader:
    
    ```
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project orc-mapreduce: Compilation failure: Compilation failure:
    [ERROR] /home/mwagner/git/orc/java/mapreduce/src/java/org/apache/orc/mapred/OrcRecordReader.java:[94,15] no suitable method found for setFieldValue(int,org.apache.hadoop.io.Writable)
    [ERROR] method org.apache.orc.mapred.OrcStruct.setFieldValue(int,org.apache.hadoop.io.WritableComparable) is not applicable
    [ERROR] (argument mismatch; org.apache.hadoop.io.Writable cannot be converted to org.apache.hadoop.io.WritableComparable)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #29: ORC-53. Make the complex types Comparable so they can ...

Posted by wagnermarkd <gi...@git.apache.org>.
Github user wagnermarkd commented on a diff in the pull request:

    https://github.com/apache/orc/pull/29#discussion_r65451326
  
    --- Diff: java/mapreduce/src/test/org/apache/orc/mapred/TestOrcStruct.java ---
    @@ -123,4 +123,18 @@ public void testCompare() {
         assertEquals(1 , left.compareTo(right));
         assertEquals(-1, right.compareTo(left));
       }
    +
    +  @Test
    +  public void testSchemaInCompare() {
    +    TypeDescription leftType = TypeDescription.fromString("struct<s:string,i:int>");
    +    TypeDescription rightType = TypeDescription.fromString("struct<s:string,j:bigint>");
    +    OrcStruct left = new OrcStruct(leftType);
    +    OrcStruct right = new OrcStruct(rightType);
    +    assertEquals(-1, left.compareTo(right));
    +    assertEquals(1, right.compareTo(left));
    +    left.setAllFields(new Text("123"), new IntWritable(123));
    --- End diff --
    
    I checked out your branch and got a compilation error for setAllFields, which was introduced for ORC-52.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request: ORC-53. Make the complex types Comparable so they can be p...

Posted by omalley <gi...@git.apache.org>.
Github user omalley commented on the pull request:

    https://github.com/apache/orc/pull/29
  
    Ok, this needed a relatively large change. To compare the schemas, I wanted to make TypeDescription implement Comparable, but the equals and hashCode were set up to match against the id, which would have largely screwed things up. So, I made SchemaEvolution use a HashMap<Integer, TypeDescription> and used the id as the key. That is more straightforward and let me change the semantics of equals and hashCode to be more reasonable. While I was in TypeDescription, I also made it Serializable since FindBugs was complaining about the OrcList and OrcMap having non-serializable fields.
    
    I also added a log4j.properties file into the test resources so that the log messages come out to the console during unit tests.
    
    I also changed the 4 complex ORC types to implement WritableComparable, which ORC-53 had done to better match with MapReduce's requirements for key types.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #29: ORC-53. Make the complex types Comparable so they can ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/orc/pull/29


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request: ORC-53. Make the complex types Comparable so they can be p...

Posted by wagnermarkd <gi...@git.apache.org>.
Github user wagnermarkd commented on a diff in the pull request:

    https://github.com/apache/orc/pull/29#discussion_r65266565
  
    --- Diff: java/mapreduce/src/java/org/apache/orc/mapred/OrcList.java ---
    @@ -71,4 +71,30 @@ public void readFields(DataInput input) throws IOException {
           }
         }
       }
    +
    +  @Override
    +  public int compareTo(OrcList<E> other) {
    +    if (other == null) {
    +      return -1;
    +    }
    +    int ourSize = size();
    +    int otherSize = other.size();
    +    for(int e=0; e < ourSize && e < otherSize; ++e) {
    +      Writable ours = get(e);
    +      Writable theirs = other.get(e);
    +      if (ours == null) {
    +        if (theirs != null) {
    +          return 1;
    +        }
    +      } else if (theirs == null) {
    +        return -1;
    +      } else {
    +        int val = ((Comparable<Writable>) ours).compareTo(theirs);
    --- End diff --
    
    Comparing two complex types with different nested types will cause trouble here. To avoid this, we need to first compare the schemas, then the values. This applies to all the complex types.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #29: ORC-53. Make the complex types Comparable so they can ...

Posted by wagnermarkd <gi...@git.apache.org>.
Github user wagnermarkd commented on a diff in the pull request:

    https://github.com/apache/orc/pull/29#discussion_r65451652
  
    --- Diff: java/mapreduce/src/java/org/apache/orc/mapred/OrcMap.java ---
    @@ -83,4 +83,52 @@ public void readFields(DataInput input) throws IOException {
           put(key, value);
         }
       }
    +
    +  @Override
    +  public int compareTo(OrcMap<K,V> other) {
    +    if (other == null) {
    +      return -1;
    +    }
    +    int result = keySchema.compareTo(other.keySchema);
    +    if (result != 0) {
    +      return result;
    +    }
    +    result = valueSchema.compareTo(other.valueSchema);
    +    if (result != 0) {
    +      return result;
    +    }
    +    Iterator<Map.Entry<K,V>> ourItr = entrySet().iterator();
    +    Iterator<Map.Entry<K,V>> theirItr = other.entrySet().iterator();
    +    while (ourItr.hasNext() && theirItr.hasNext()) {
    +      Map.Entry<K,V> ourItem = ourItr.next();
    +      Map.Entry<K,V> theirItem = theirItr.next();
    +      K ourKey = ourItem.getKey();
    +      K theirKey = theirItem.getKey();
    +      int val = ((Comparable<K>) ourKey).compareTo(theirKey);
    --- End diff --
    
    Changing the type of K to extend WritableComparable would remove the need for this cast


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---