Posted to dev@drill.apache.org by vvysotskyi <gi...@git.apache.org> on 2017/08/17 15:19:09 UTC

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

GitHub user vvysotskyi opened a pull request:

    https://github.com/apache/drill/pull/909

    DRILL-4264: Allow field names to include dots

    1. Removed the check that rejected field names containing dots.
    2. Replaced uses of the `SchemaPath.getAsUnescapedPath()` method with `SchemaPath.getRootSegmentPath()` and `SchemaPathUtil.getMaterializedFieldFromSchemaPath()` where needed (a short sketch follows this list).
    3. Replaced uses of the `MaterializedField.getPath()` and `MaterializedField.getLastName()` methods with the `MaterializedField.getName()` method and verified that the behaviour remains correct.
    4. Added tests.
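
    A minimal sketch of the distinction behind items 2 and 3 (illustrative only; the accessors are those discussed in this PR, and exact signatures may differ by Drill version):

        import org.apache.drill.common.expression.SchemaPath;

        public class DottedNameSketch {
          public static void main(String[] args) {
            // Column `b` nested inside map `a` vs. a single column literally named "a.b".
            SchemaPath nested = SchemaPath.getCompoundPath("a", "b");
            SchemaPath dotted = SchemaPath.getSimplePath("a.b");
            // The old getAsUnescapedPath() flattened both to the string "a.b";
            // getRootSegmentPath() keeps them distinct.
            System.out.println(nested.getRootSegmentPath()); // "a"
            System.out.println(dotted.getRootSegmentPath()); // "a.b"
          }
        }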

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vvysotskyi/drill DRILL-4264

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/909.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #909
    
----
commit 4ba59488a96fb79455b192ed960a728481ceaf93
Author: Volodymyr Vysotskyi <vv...@gmail.com>
Date:   2017-07-05T19:08:59Z

    DRILL-4264: Allow field names to include dots

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134782398
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/record/TestMaterializedField.java ---
    @@ -84,4 +89,22 @@ public void testClone() {
     
       }
     
    +  @Test // DRILL-4264
    +  public void testSchemaPathToMaterializedFieldConverting() {
    --- End diff --
    
    This test was designed to check the `SchemaPathUtil.getMaterializedFieldFromSchemaPath()` method. Since that method was removed, I removed this test as well.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134688368
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ---
    @@ -293,7 +294,7 @@ private StreamingAggregator createAggregatorInternal() throws SchemaChangeExcept
             continue;
           }
           keyExprs[i] = expr;
    -      final MaterializedField outputField = MaterializedField.create(ne.getRef().getAsUnescapedPath(), expr.getMajorType());
    +      final MaterializedField outputField = SchemaPathUtil.getMaterializedFieldFromSchemaPath(ne.getRef(), expr.getMajorType());
    --- End diff --
    
    Yes, it should. But the `MaterializedField` class is in the `vector` module, the `SchemaPath` class is in the `drill-logical` module, and the `vector` module does not have a dependency on the `drill-logical` module.
    Replaced this code with `MaterializedField.create(ne.getRef().getLastSegment().getNameSegment().getPath(), expr.getMajorType());`, since a simple name path is used here.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134098783
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/TupleAccessor.java ---
    @@ -48,9 +48,21 @@
     
         MaterializedField column(int index);
     
    -    MaterializedField column(String name);
    +    /**
    +     * Returns {@code MaterializedField} instance from schema using the name path specified in param.
    +     *
    +     * @param name full name path of the column in the schema
    +     * @return {@code MaterializedField} instance
    +     */
    +    MaterializedField column(String[] name);
    --- End diff --
    
    This is not right. As explained earlier, a tuple is a specialized map that allows both name and indexed access, but only to direct children. The name above should never be compound. Instead, the code must work the hierarchy itself (or we'd need to add a method to do so.)
    
    Note that this has been revised/replaced in PR #914.
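
    A minimal sketch of working the hierarchy one level at a time, using a hypothetical helper (not existing Drill API):

        import org.apache.drill.exec.record.MaterializedField;

        class HierarchyWalkSketch {
          // Resolve a multi-part name by descending through direct children only;
          // path[0] is assumed to name the root field itself.
          static MaterializedField resolve(MaterializedField root, String[] path) {
            MaterializedField current = root;
            for (int i = 1; i < path.length; i++) {
              MaterializedField next = null;
              for (MaterializedField child : current.getChildren()) {
                if (child.getName().equalsIgnoreCase(path[i])) {
                  next = child;
                  break;
                }
              }
              if (next == null) {
                return null; // segment not found at this level
              }
              current = next;
            }
            return current;
          }
        }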


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134539810
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public <T extends ValueVector> T addField(MaterializedField field,
                                                   Class<T> clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +            && (vector.getField().getType().getMinorType() != MinorType.MAP
    +                || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }
     
    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }
     
    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    --- End diff --
    
    I was suggesting that the work here could be done on the nested fields through the map.
    I agree with you that it is more correct to deal with the desired field directly, so thanks for pointing this out; I reverted the changes in this method.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r135872295
  
    --- Diff: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ---
    @@ -115,6 +112,33 @@ public static SchemaPath create(NamePart namePart) {
       }
     
       /**
    +   * Parses input string and returns {@code SchemaPath} instance.
    +   *
    +   * @param expr input string to be parsed
    +   * @return {@code SchemaPath} instance
    +   */
    +  public static SchemaPath parseFromString(String expr) {
    --- End diff --
    
    Thanks for the explanation. Can you copy it into the Javadoc for this method? It will help others in the future.
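
    For future readers, a minimal example of the behaviour being documented (assuming the back-quote syntax discussed in this thread):

        import org.apache.drill.common.expression.SchemaPath;

        class ParseFromStringSketch {
          public static void main(String[] args) {
            // Back-quoted names keep the dot as part of the column name;
            // unquoted dots are treated as nesting.
            SchemaPath dotted = SchemaPath.parseFromString("`a.b`"); // one column named "a.b"
            SchemaPath nested = SchemaPath.parseFromString("a.b");   // column b inside map a
            System.out.println(dotted.getRootSegmentPath()); // "a.b"
            System.out.println(nested.getRootSegmentPath()); // "a"
          }
        }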


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134061376
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public <T extends ValueVector> T addField(MaterializedField field,
                                                   Class<T> clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +            && (vector.getField().getType().getMinorType() != MinorType.MAP
    +                || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }
     
    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }
     
    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    +      // and adds child fields from the field to the vector
    +      else if (field.getType().getMinorType() == MinorType.MAP
    +                  && vector.getField().getType().getMinorType() == MinorType.MAP
    +                  && !field.getChildren().isEmpty()) {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        childVector = addNestedChildToMap((MapVector) vector, Iterables.getLast(field.getChildren()));
    +        schemaChanged = true;
    +      }
     
    -      return clazz.cast(v);
    +      return clazz.cast(childVector);
    +    }
    +
    +    /**
    +     * Finds and returns value vector which path corresponds to the specified field.
    +     * If required vector is nested in the map, gets and returns this vector from the map.
    +     *
    +     * @param valueVector vector that should be checked
    +     * @param field       field that corresponds to required vector
    +     * @return value vector whose path corresponds to the specified field
    +     *
    +     * @throws SchemaChangeException if the field does not correspond to the found vector
    +     */
    +    private ValueVector getChildVectorByField(ValueVector valueVector,
    +                                              MaterializedField field) throws SchemaChangeException {
    +      if (field.getChildren().isEmpty()) {
    +        if (valueVector.getField().equals(field)) {
    +          return valueVector;
    +        } else {
    +          throw new SchemaChangeException(
    +            String.format(
    +              "The field that was provided, %s, does not correspond to the "
    +                + "expected vector type of %s.",
    +              field, valueVector.getClass().getSimpleName()));
    +        }
    +      } else {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        MaterializedField childField = Iterables.getLast(field.getChildren());
    +        return getChildVectorByField(((MapVector) valueVector).getChild(childField.getName()), childField);
    +      }
    --- End diff --
    
    Not sure I understand this. I understand the idea that, if the vector is given, we want to check whether the vector schema equals the given field. (But note that, in general, `equals()` won't work: a nullable vector will include the `$bits$` field as a nested column, and a Varchar will include `$offsets$`. The schema coming from the client won't contain these hidden fields, so the two fields won't compare equal. I recently introduced an `isEquivalent()` method to handle these cases. Or, you can just check the `MajorType`s, though you then have to decide whether a Varchar of precision 10 is the same as a Varchar of precision 20, say...)
    
    Again, the new result set loader handles all this.
    
    The part I don't follow is if the field has children. Why do we expect that the new field must match the last of the children? And, do we really want to do a sequential search through field names for every value of every column of every row?
    
    Yet again, the new result set loader fixes all this.
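
    For illustration, a minimal sketch of the "just check the `MajorType`s" alternative (hypothetical helper, not existing Drill API; it deliberately ignores hidden children such as `$bits$` and `$offsets$`):

        import org.apache.drill.exec.record.MaterializedField;

        class TypeCheckSketch {
          // Compare only name and declared type, ignoring implicit child fields.
          static boolean sameDeclaredType(MaterializedField a, MaterializedField b) {
            return a.getName().equalsIgnoreCase(b.getName())
                && a.getType().equals(b.getType());
          }
        }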


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134787569
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
    @@ -83,7 +85,13 @@ private void updateStructure(int index, PhysicalSchema children) {
         public boolean isMap() { return mapSchema != null; }
         public PhysicalSchema mapSchema() { return mapSchema; }
         public MaterializedField field() { return field; }
    -    public String fullName() { return fullName; }
    +
    +    /**
    +     * Returns full name path of the column.
    --- End diff --
    
    I reverted these changes. Also, I commented out the test where this code is used with map fields.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134686740
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public <T extends ValueVector> T addField(MaterializedField field,
                                                   Class<T> clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +            && (vector.getField().getType().getMinorType() != MinorType.MAP
    +                || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }
     
    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }
     
    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    +      // and adds child fields from the field to the vector
    +      else if (field.getType().getMinorType() == MinorType.MAP
    +                  && vector.getField().getType().getMinorType() == MinorType.MAP
    +                  && !field.getChildren().isEmpty()) {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        childVector = addNestedChildToMap((MapVector) vector, Iterables.getLast(field.getChildren()));
    +        schemaChanged = true;
    +      }
     
    -      return clazz.cast(v);
    +      return clazz.cast(childVector);
    +    }
    +
    +    /**
    +     * Finds and returns value vector which path corresponds to the specified field.
    +     * If required vector is nested in the map, gets and returns this vector from the map.
    +     *
    +     * @param valueVector vector that should be checked
    +     * @param field       field that corresponds to required vector
    +     * @return value vector whose path corresponds to the specified field
    +     *
    +     * @throws SchemaChangeException if the field does not correspond to the found vector
    +     */
    +    private ValueVector getChildVectorByField(ValueVector valueVector,
    +                                              MaterializedField field) throws SchemaChangeException {
    +      if (field.getChildren().isEmpty()) {
    +        if (valueVector.getField().equals(field)) {
    +          return valueVector;
    +        } else {
    +          throw new SchemaChangeException(
    +            String.format(
    +              "The field that was provided, %s, does not correspond to the "
    +                + "expected vector type of %s.",
    +              field, valueVector.getClass().getSimpleName()));
    +        }
    +      } else {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        MaterializedField childField = Iterables.getLast(field.getChildren());
    +        return getChildVectorByField(((MapVector) valueVector).getChild(childField.getName()), childField);
    +      }
    +    }
    +
    +    /**
    +     * Adds new vector with the specified field to the specified map vector.
    +     * If a field has few levels of nesting, corresponding maps will be created.
    +     *
    +     * @param valueVector a map where new vector will be added
    +     * @param field       a field that corresponds to the vector which should be created
    +     * @return vector that was created
    +     *
    +     * @throws SchemaChangeException if the field has a child and its type is not map
    +     */
    +    private ValueVector addNestedChildToMap(MapVector valueVector,
    --- End diff --
    
    This method was designed to handle the case where we have a map inside another map and want to add a field to the nested map. But considering your second comment, this method is also not needed.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134778939
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/expr/TestSchemaPathMaterialization.java ---
    @@ -93,4 +93,23 @@ public void testProjectionMultipleFiles() throws Exception {
           .go();
       }
     
    +  @Test //DRILL-4264
    +  public void testFieldNameWithDot() throws Exception {
    --- End diff --
    
    Added more tests


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134764401
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/SchemaPathUtil.java ---
    @@ -0,0 +1,59 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to you under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +* http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +package org.apache.drill.exec.vector.complex;
    +
    +import org.apache.drill.common.expression.PathSegment;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.common.types.Types;
    +import org.apache.drill.exec.record.MaterializedField;
    +
    +public class SchemaPathUtil {
    --- End diff --
    
    Removed this class; all of the code where it was used now uses a simple name path, so the class is no longer needed.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r135753493
  
    --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/CompareFunctionsProcessor.java ---
    @@ -82,466 +45,37 @@ public static CompareFunctionsProcessor process(FunctionCall call, boolean nullC
             LogicalExpression swapArg = valueArg;
             valueArg = nameArg;
             nameArg = swapArg;
    -        evaluator.functionName = COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName);
    +        evaluator.setFunctionName(COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName));
           }
    -      evaluator.success = nameArg.accept(evaluator, valueArg);
    +      evaluator.setSuccess(nameArg.accept(evaluator, valueArg));
         } else if (nullComparatorSupported && call.args.get(0) instanceof SchemaPath) {
    -      evaluator.success = true;
    -      evaluator.path = (SchemaPath) nameArg;
    +      evaluator.setSuccess(true);
    +      evaluator.setPath((SchemaPath) nameArg);
         }
     
         return evaluator;
       }
     
    -  public CompareFunctionsProcessor(String functionName) {
    -    this.success = false;
    -    this.functionName = functionName;
    -    this.isEqualityFn = COMPARE_FUNCTIONS_TRANSPOSE_MAP.containsKey(functionName)
    -        && COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName).equals(functionName);
    -    this.isRowKeyPrefixComparison = false;
    -    this.sortOrderAscending = true;
    -  }
    -
    -  public byte[] getValue() {
    -    return value;
    -  }
    -
    -  public boolean isSuccess() {
    -    return success;
    -  }
    -
    -  public SchemaPath getPath() {
    -    return path;
    -  }
    -
    -  public String getFunctionName() {
    -    return functionName;
    -  }
    -
    -  public boolean isRowKeyPrefixComparison() {
    -	return isRowKeyPrefixComparison;
    -  }
    -
    -  public byte[] getRowKeyPrefixStartRow() {
    -    return rowKeyPrefixStartRow;
    -  }
    -
    -  public byte[] getRowKeyPrefixStopRow() {
    -  return rowKeyPrefixStopRow;
    -  }
    -
    -  public Filter getRowKeyPrefixFilter() {
    -  return rowKeyPrefixFilter;
    -  }
    -
    -  public boolean isSortOrderAscending() {
    -    return sortOrderAscending;
    -  }
    -
       @Override
    -  public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getInput() instanceof CastExpression || e.getInput() instanceof SchemaPath) {
    -      return e.getInput().accept(this, valueArg);
    -    }
    -    return false;
    -  }
    -
    -  @Override
    -  public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) {
    -
    -      String encodingType = e.getEncodingType();
    -      int prefixLength    = 0;
    -
    -      // Handle scan pruning in the following scenario:
    -      // The row-key is a composite key and the CONVERT_FROM() function has byte_substr() as input function which is
    -      // querying for the first few bytes of the row-key(start-offset 1)
    -      // Example WHERE clause:
    -      // CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-17'
    -      if (e.getInput() instanceof FunctionCall) {
    -
    -        // We can prune scan range only for big-endian encoded data
    -        if (encodingType.endsWith("_BE") == false) {
    -          return false;
    -        }
    -
    -        FunctionCall call = (FunctionCall)e.getInput();
    -        String functionName = call.getName();
    -        if (!functionName.equalsIgnoreCase("byte_substr")) {
    -          return false;
    -        }
    -
    -        LogicalExpression nameArg = call.args.get(0);
    -        LogicalExpression valueArg1 = call.args.size() >= 2 ? call.args.get(1) : null;
    -        LogicalExpression valueArg2 = call.args.size() >= 3 ? call.args.get(2) : null;
    -
    -        if (((nameArg instanceof SchemaPath) == false) ||
    -             (valueArg1 == null) || ((valueArg1 instanceof IntExpression) == false) ||
    -             (valueArg2 == null) || ((valueArg2 instanceof IntExpression) == false)) {
    -          return false;
    -        }
    -
    -        boolean isRowKey = ((SchemaPath)nameArg).getAsUnescapedPath().equals(DrillHBaseConstants.ROW_KEY);
    -        int offset = ((IntExpression)valueArg1).getInt();
    -
    -        if (!isRowKey || (offset != 1)) {
    -          return false;
    -        }
    -
    -        this.path    = (SchemaPath)nameArg;
    -        prefixLength = ((IntExpression)valueArg2).getInt();
    -        this.isRowKeyPrefixComparison = true;
    -        return visitRowKeyPrefixConvertExpression(e, prefixLength, valueArg);
    -      }
    -
    -      if (e.getInput() instanceof SchemaPath) {
    -        ByteBuf bb = null;
    -
    -        switch (encodingType) {
    -        case "INT_BE":
    -        case "INT":
    -        case "UINT_BE":
    -        case "UINT":
    -        case "UINT4_BE":
    -        case "UINT4":
    -          if (valueArg instanceof IntExpression
    -              && (isEqualityFn || encodingType.startsWith("U"))) {
    -            bb = newByteBuf(4, encodingType.endsWith("_BE"));
    -            bb.writeInt(((IntExpression)valueArg).getInt());
    -          }
    -          break;
    -        case "BIGINT_BE":
    -        case "BIGINT":
    -        case "UINT8_BE":
    -        case "UINT8":
    -          if (valueArg instanceof LongExpression
    -              && (isEqualityFn || encodingType.startsWith("U"))) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((LongExpression)valueArg).getLong());
    -          }
    -          break;
    -        case "FLOAT":
    -          if (valueArg instanceof FloatExpression && isEqualityFn) {
    -            bb = newByteBuf(4, true);
    -            bb.writeFloat(((FloatExpression)valueArg).getFloat());
    -          }
    -          break;
    -        case "DOUBLE":
    -          if (valueArg instanceof DoubleExpression && isEqualityFn) {
    -            bb = newByteBuf(8, true);
    -            bb.writeDouble(((DoubleExpression)valueArg).getDouble());
    -          }
    -          break;
    -        case "TIME_EPOCH":
    -        case "TIME_EPOCH_BE":
    -          if (valueArg instanceof TimeExpression) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((TimeExpression)valueArg).getTime());
    -          }
    -          break;
    -        case "DATE_EPOCH":
    -        case "DATE_EPOCH_BE":
    -          if (valueArg instanceof DateExpression) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((DateExpression)valueArg).getDate());
    -          }
    -          break;
    -        case "BOOLEAN_BYTE":
    -          if (valueArg instanceof BooleanExpression) {
    -            bb = newByteBuf(1, false /* does not matter */);
    -            bb.writeByte(((BooleanExpression)valueArg).getBoolean() ? 1 : 0);
    -          }
    -          break;
    -        case "DOUBLE_OB":
    -        case "DOUBLE_OBD":
    -          if (valueArg instanceof DoubleExpression) {
    -            bb = newByteBuf(9, true);
    -            PositionedByteRange br = new SimplePositionedMutableByteRange(bb.array(), 0, 9);
    -            if (encodingType.endsWith("_OBD")) {
    -              org.apache.hadoop.hbase.util.OrderedBytes.encodeFloat64(br,
    -                  ((DoubleExpression)valueArg).getDouble(), Order.DESCENDING);
    -              this.sortOrderAscending = false;
    -            } else {
    -              org.apache.hadoop.hbase.util.OrderedBytes.encodeFloat64(br,
    -                  ((DoubleExpression)valueArg).getDouble(), Order.ASCENDING);
    -            }
    +  protected ByteBuf getByteBuf(LogicalExpression valueArg, String encodingType) {
    --- End diff --
    
    The `org.apache.drill.exec.store.mapr.db.binary.CompareFunctionsProcessor` and `org.apache.drill.exec.store.hbase.CompareFunctionsProcessor` classes contain almost the same code, so I moved the shared code to the abstract class `AbstractCompareFunctionsProcessor`.
    The `visitConvertExpression()` method in `mapr.db.binary.CompareFunctionsProcessor` contains `UTF8_OB` and `UTF8_OBD` cases in its switch block, but the same method in the `hbase.CompareFunctionsProcessor` class does not, so I added a default case to the method in the abstract class that calls the `getByteBuf()` method, which accounts for this difference between the two implementations.
    
    As I understand it, this code is used for **scan pruning** when the `convert_from` function is used, which is why byte buffers are used there.
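
    A minimal, self-contained sketch of the refactoring described above (simplified names, with `byte[]` standing in for the real `ByteBuf`; not the actual Drill classes):

        abstract class AbstractCompareSketch {
          // Encodings shared by both processors live in the base class;
          // anything else is delegated to the subclass hook.
          byte[] encode(long value, String encodingType) {
            switch (encodingType) {
              case "BIGINT":
              case "BIGINT_BE":
                return java.nio.ByteBuffer.allocate(8).putLong(value).array();
              default:
                return encodeSpecific(value, encodingType); // e.g. UTF8_OB / UTF8_OBD in the MapR-DB subclass
            }
          }
          protected abstract byte[] encodeSpecific(long value, String encodingType);
        }

        class BinaryCompareSketch extends AbstractCompareSketch {
          @Override
          protected byte[] encodeSpecific(long value, String encodingType) {
            return new byte[0]; // subclass-only encodings would be handled here
          }
        }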


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134064098
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/SchemaPathUtil.java ---
    @@ -0,0 +1,59 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to you under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +* http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +package org.apache.drill.exec.vector.complex;
    +
    +import org.apache.drill.common.expression.PathSegment;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.common.types.Types;
    +import org.apache.drill.exec.record.MaterializedField;
    +
    +public class SchemaPathUtil {
    +
    +  /**
    +   * Creates {@code MaterializedField} instance with {@code SchemaPath path} name and children.
    +   *
    +   * @param path the source of field name and children
    +   * @param type field type
    +   * @return {@code MaterializedField} instance with name and children
    +   * from the specified {@code SchemaPath path}
    +   */
    +  public static MaterializedField getMaterializedFieldFromSchemaPath(SchemaPath path,
    +                                                                     TypeProtos.MajorType type) {
    +    return getMaterializedFieldFromPathSegment(path.getRootSegment(), type);
    +  }
    +
    +  /**
    +   * Creates {@code MaterializedField} instance with {@code SchemaPath.NameSegment path} name and children.
    +   *
    +   * @param path the source of field name and children
    +   * @param type field type
    +   * @return {@code MaterializedField} instance with name and children
    +   * from the specified {@code PathSegment.NameSegment path}
    +   */
    +  public static MaterializedField getMaterializedFieldFromPathSegment(PathSegment.NameSegment path,
    +                                                                      TypeProtos.MajorType type) {
    +    if (path.isLastPath()) {
    +      return MaterializedField.create(path.getPath(), type);
    +    } else {
    +      MaterializedField field = MaterializedField.create(path.getPath(),
    --- End diff --
    
    Not at all sure that we want to do this. Suppose I have names `a.b`, and `a.c`. Do I really want to have both create their own map fields for `a`? Not very likely. Instead, I'd want to create the field for the map `a`, then add to that fields for `b` and `c`.
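
    A minimal sketch of the alternative described here (hedged: signatures are from memory and may differ by Drill version): build one field for the map `a` and attach `b` and `c` as its children, instead of letting each dotted name create its own copy of the map.

        import org.apache.drill.common.types.TypeProtos.MinorType;
        import org.apache.drill.common.types.Types;
        import org.apache.drill.exec.record.MaterializedField;

        class SharedMapFieldSketch {
          static MaterializedField buildMapA() {
            MaterializedField a = MaterializedField.create("a", Types.required(MinorType.MAP));
            a.addChild(MaterializedField.create("b", Types.required(MinorType.INT)));
            a.addChild(MaterializedField.create("c", Types.required(MinorType.VARCHAR)));
            return a; // a single map `a` owning both children
          }
        }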


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134060544
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public <T extends ValueVector> T addField(MaterializedField field,
                                                   Class<T> clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +            && (vector.getField().getType().getMinorType() != MinorType.MAP
    +                || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }
     
    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }
     
    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    --- End diff --
    
    Does this ever actually happen? It would require that the client
    
    1. Have access to the chain of map fields (since maps can contain maps can contain maps...)
    2. Hold onto the root map.
    3. When needing a new field, first check the child list in the materialized field of the leaf map (there is no such check at present.)
    4. If the field is missing, add it to the materialized field for the map.
    5. Pass in the root map to this method.
    
    The above is very unlikely. And, indeed, this code won't work if we have nested maps (a map within a map.)
    
    Probably what actually happens is that this method is called only for columns in the root tuple (the row.) If I want to add a column to a child map (at any nesting depth), I'll call the `addOrGet` method on the map field.
    
    So, I think there is a faulty assumption here. I think we should assume that, if the field is a map, and it is new, then the child list must be empty. Map fields are then added to the map directly; they are not added via this method.
    
    Note that the new "result set loader" takes care of all this complexity...
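
    A minimal sketch of the `addOrGet` usage described above (hedged: exact signatures may differ by Drill version): a column of a child map is added through the map vector itself rather than through `Mutator.addField()`.

        import org.apache.drill.common.types.TypeProtos.MinorType;
        import org.apache.drill.common.types.Types;
        import org.apache.drill.exec.vector.IntVector;
        import org.apache.drill.exec.vector.complex.MapVector;

        class NestedColumnSketch {
          static IntVector addNestedInt(MapVector parentMap, String name) {
            // Returns the existing child vector if present, or creates it otherwise.
            return parentMap.addOrGet(name, Types.required(MinorType.INT), IntVector.class);
          }
        }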


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134063880
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/SchemaPathUtil.java ---
    @@ -0,0 +1,59 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to you under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +* http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +package org.apache.drill.exec.vector.complex;
    +
    +import org.apache.drill.common.expression.PathSegment;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.common.types.Types;
    +import org.apache.drill.exec.record.MaterializedField;
    +
    +public class SchemaPathUtil {
    --- End diff --
    
    Can these simply be static methods on `MaterializedField`? We can just call the method `create()`, since the argument signatures will make the methods unique.
    
    `SchemaPath` is defined in `common`, which is visible to `vector`, so it is fine for `MaterializedField` to have visibility to `SchemaPath`.
    
    We would want to make clear that the `MaterializedField` is always for the tail member of `SchemaPath`. (Or is it? A schema path can refer to an array element...)
    
    If we have a name `a.b.c.d`, would we need to create maps (or repeated maps) for `a`, `b`, and `c`, and then create an Int, say, for `d`? Or do we always have a context, so that `b` is always in the context of `a`, etc.?


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134098584
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/record/TestMaterializedField.java ---
    @@ -84,4 +89,22 @@ public void testClone() {
     
       }
     
    +  @Test // DRILL-4264
    +  public void testSchemaPathToMaterializedFieldConverting() {
    --- End diff --
    
    No test here for names with dots?
    
    Also, still not convinced of the semantics: not sure we want to create a tree of maps given a nested path. Instead, we need a way to refer to an existing tree and either navigate down that tree to find existing maps, or create a map as needed when a node is missing.
    
    Note that much work has been done in this area in PR #914.
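
    A hypothetical test sketch for the point above: a name containing a dot should remain a single, flat column rather than becoming a nested map.

        import static org.junit.Assert.assertEquals;
        import static org.junit.Assert.assertTrue;

        import org.apache.drill.common.types.TypeProtos.MinorType;
        import org.apache.drill.common.types.Types;
        import org.apache.drill.exec.record.MaterializedField;
        import org.junit.Test;

        public class DottedNameFieldTest {
          @Test // hypothetical, illustrative only
          public void testFieldNameWithDotStaysFlat() {
            MaterializedField field = MaterializedField.create("a.b", Types.required(MinorType.INT));
            assertEquals("a.b", field.getName());
            assertTrue(field.getChildren().isEmpty());
          }
        }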


---

[GitHub] drill issue #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/909
  
    Thanks much for the great work!
    +1


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134061527
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public <T extends ValueVector> T addField(MaterializedField field,
                                                   Class<T> clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +            && (vector.getField().getType().getMinorType() != MinorType.MAP
    +                || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }
     
    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }
     
    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    +      // and adds child fields from the field to the vector
    +      else if (field.getType().getMinorType() == MinorType.MAP
    +                  && vector.getField().getType().getMinorType() == MinorType.MAP
    +                  && !field.getChildren().isEmpty()) {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        childVector = addNestedChildToMap((MapVector) vector, Iterables.getLast(field.getChildren()));
    +        schemaChanged = true;
    +      }
     
    -      return clazz.cast(v);
    +      return clazz.cast(childVector);
    +    }
    +
    +    /**
    +     * Finds and returns value vector which path corresponds to the specified field.
    +     * If required vector is nested in the map, gets and returns this vector from the map.
    +     *
    +     * @param valueVector vector that should be checked
    +     * @param field       field that corresponds to required vector
    +     * @return value vector whose path corresponds to the specified field
    +     *
    +     * @throws SchemaChangeException if the field does not correspond to the found vector
    +     */
    +    private ValueVector getChildVectorByField(ValueVector valueVector,
    +                                              MaterializedField field) throws SchemaChangeException {
    +      if (field.getChildren().isEmpty()) {
    +        if (valueVector.getField().equals(field)) {
    +          return valueVector;
    +        } else {
    +          throw new SchemaChangeException(
    +            String.format(
    +              "The field that was provided, %s, does not correspond to the "
    +                + "expected vector type of %s.",
    +              field, valueVector.getClass().getSimpleName()));
    +        }
    +      } else {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        MaterializedField childField = Iterables.getLast(field.getChildren());
    +        return getChildVectorByField(((MapVector) valueVector).getChild(childField.getName()), childField);
    +      }
    +    }
    +
    +    /**
    +     * Adds new vector with the specified field to the specified map vector.
    +     * If a field has few levels of nesting, corresponding maps will be created.
    +     *
    +     * @param valueVector a map where new vector will be added
    +     * @param field       a field that corresponds to the vector which should be created
    +     * @return vector that was created
    +     *
    +     * @throws SchemaChangeException if the field has a child and its type is not map
    +     */
    +    private ValueVector addNestedChildToMap(MapVector valueVector,
    --- End diff --
    
    Why call this rather than telling the client to simply do the `addOrGet` on the map vector itself?


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134790779
  
    --- Diff: logical/src/main/java/org/apache/drill/common/expression/PathSegment.java ---
    @@ -17,11 +17,15 @@
      */
     package org.apache.drill.common.expression;
     
    -public abstract class PathSegment{
    +public abstract class PathSegment {
     
    -  PathSegment child;
    +  private PathSegment child;
    --- End diff --
    
    This field has a setter.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r136016798
  
    --- Diff: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ---
    @@ -115,6 +112,33 @@ public static SchemaPath create(NamePart namePart) {
       }
     
       /**
    +   * Parses input string and returns {@code SchemaPath} instance.
    +   *
    +   * @param expr input string to be parsed
    +   * @return {@code SchemaPath} instance
    +   */
    +  public static SchemaPath parseFromString(String expr) {
    --- End diff --
    
    Done


---

[GitHub] drill issue #909: DRILL-4264: Allow field names to include dots

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/909
  
    Merged in d105950a7a9fb2ff3acd072ee65a51ef1fca120e. @vvysotskyi please close the PR (for some reason GitHub is not showing me the option to close).


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r135359090
  
    --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/CompareFunctionsProcessor.java ---
    @@ -82,466 +45,37 @@ public static CompareFunctionsProcessor process(FunctionCall call, boolean nullC
             LogicalExpression swapArg = valueArg;
             valueArg = nameArg;
             nameArg = swapArg;
    -        evaluator.functionName = COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName);
    +        evaluator.setFunctionName(COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName));
           }
    -      evaluator.success = nameArg.accept(evaluator, valueArg);
    +      evaluator.setSuccess(nameArg.accept(evaluator, valueArg));
         } else if (nullComparatorSupported && call.args.get(0) instanceof SchemaPath) {
    -      evaluator.success = true;
    -      evaluator.path = (SchemaPath) nameArg;
    +      evaluator.setSuccess(true);
    +      evaluator.setPath((SchemaPath) nameArg);
         }
     
         return evaluator;
       }
     
    -  public CompareFunctionsProcessor(String functionName) {
    -    this.success = false;
    -    this.functionName = functionName;
    -    this.isEqualityFn = COMPARE_FUNCTIONS_TRANSPOSE_MAP.containsKey(functionName)
    -        && COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName).equals(functionName);
    -    this.isRowKeyPrefixComparison = false;
    -    this.sortOrderAscending = true;
    -  }
    -
    -  public byte[] getValue() {
    -    return value;
    -  }
    -
    -  public boolean isSuccess() {
    -    return success;
    -  }
    -
    -  public SchemaPath getPath() {
    -    return path;
    -  }
    -
    -  public String getFunctionName() {
    -    return functionName;
    -  }
    -
    -  public boolean isRowKeyPrefixComparison() {
    -	return isRowKeyPrefixComparison;
    -  }
    -
    -  public byte[] getRowKeyPrefixStartRow() {
    -    return rowKeyPrefixStartRow;
    -  }
    -
    -  public byte[] getRowKeyPrefixStopRow() {
    -  return rowKeyPrefixStopRow;
    -  }
    -
    -  public Filter getRowKeyPrefixFilter() {
    -  return rowKeyPrefixFilter;
    -  }
    -
    -  public boolean isSortOrderAscending() {
    -    return sortOrderAscending;
    -  }
    -
       @Override
    -  public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getInput() instanceof CastExpression || e.getInput() instanceof SchemaPath) {
    -      return e.getInput().accept(this, valueArg);
    -    }
    -    return false;
    -  }
    -
    -  @Override
    -  public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) {
    -
    -      String encodingType = e.getEncodingType();
    -      int prefixLength    = 0;
    -
    -      // Handle scan pruning in the following scenario:
    -      // The row-key is a composite key and the CONVERT_FROM() function has byte_substr() as input function which is
    -      // querying for the first few bytes of the row-key(start-offset 1)
    -      // Example WHERE clause:
    -      // CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-17'
    -      if (e.getInput() instanceof FunctionCall) {
    -
    -        // We can prune scan range only for big-endian encoded data
    -        if (encodingType.endsWith("_BE") == false) {
    -          return false;
    -        }
    -
    -        FunctionCall call = (FunctionCall)e.getInput();
    -        String functionName = call.getName();
    -        if (!functionName.equalsIgnoreCase("byte_substr")) {
    -          return false;
    -        }
    -
    -        LogicalExpression nameArg = call.args.get(0);
    -        LogicalExpression valueArg1 = call.args.size() >= 2 ? call.args.get(1) : null;
    -        LogicalExpression valueArg2 = call.args.size() >= 3 ? call.args.get(2) : null;
    -
    -        if (((nameArg instanceof SchemaPath) == false) ||
    -             (valueArg1 == null) || ((valueArg1 instanceof IntExpression) == false) ||
    -             (valueArg2 == null) || ((valueArg2 instanceof IntExpression) == false)) {
    -          return false;
    -        }
    -
    -        boolean isRowKey = ((SchemaPath)nameArg).getAsUnescapedPath().equals(DrillHBaseConstants.ROW_KEY);
    -        int offset = ((IntExpression)valueArg1).getInt();
    -
    -        if (!isRowKey || (offset != 1)) {
    -          return false;
    -        }
    -
    -        this.path    = (SchemaPath)nameArg;
    -        prefixLength = ((IntExpression)valueArg2).getInt();
    -        this.isRowKeyPrefixComparison = true;
    -        return visitRowKeyPrefixConvertExpression(e, prefixLength, valueArg);
    -      }
    -
    -      if (e.getInput() instanceof SchemaPath) {
    -        ByteBuf bb = null;
    -
    -        switch (encodingType) {
    -        case "INT_BE":
    -        case "INT":
    -        case "UINT_BE":
    -        case "UINT":
    -        case "UINT4_BE":
    -        case "UINT4":
    -          if (valueArg instanceof IntExpression
    -              && (isEqualityFn || encodingType.startsWith("U"))) {
    -            bb = newByteBuf(4, encodingType.endsWith("_BE"));
    -            bb.writeInt(((IntExpression)valueArg).getInt());
    -          }
    -          break;
    -        case "BIGINT_BE":
    -        case "BIGINT":
    -        case "UINT8_BE":
    -        case "UINT8":
    -          if (valueArg instanceof LongExpression
    -              && (isEqualityFn || encodingType.startsWith("U"))) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((LongExpression)valueArg).getLong());
    -          }
    -          break;
    -        case "FLOAT":
    -          if (valueArg instanceof FloatExpression && isEqualityFn) {
    -            bb = newByteBuf(4, true);
    -            bb.writeFloat(((FloatExpression)valueArg).getFloat());
    -          }
    -          break;
    -        case "DOUBLE":
    -          if (valueArg instanceof DoubleExpression && isEqualityFn) {
    -            bb = newByteBuf(8, true);
    -            bb.writeDouble(((DoubleExpression)valueArg).getDouble());
    -          }
    -          break;
    -        case "TIME_EPOCH":
    -        case "TIME_EPOCH_BE":
    -          if (valueArg instanceof TimeExpression) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((TimeExpression)valueArg).getTime());
    -          }
    -          break;
    -        case "DATE_EPOCH":
    -        case "DATE_EPOCH_BE":
    -          if (valueArg instanceof DateExpression) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((DateExpression)valueArg).getDate());
    -          }
    -          break;
    -        case "BOOLEAN_BYTE":
    -          if (valueArg instanceof BooleanExpression) {
    -            bb = newByteBuf(1, false /* does not matter */);
    -            bb.writeByte(((BooleanExpression)valueArg).getBoolean() ? 1 : 0);
    -          }
    -          break;
    -        case "DOUBLE_OB":
    -        case "DOUBLE_OBD":
    -          if (valueArg instanceof DoubleExpression) {
    -            bb = newByteBuf(9, true);
    -            PositionedByteRange br = new SimplePositionedMutableByteRange(bb.array(), 0, 9);
    -            if (encodingType.endsWith("_OBD")) {
    -              org.apache.hadoop.hbase.util.OrderedBytes.encodeFloat64(br,
    -                  ((DoubleExpression)valueArg).getDouble(), Order.DESCENDING);
    -              this.sortOrderAscending = false;
    -            } else {
    -              org.apache.hadoop.hbase.util.OrderedBytes.encodeFloat64(br,
    -                  ((DoubleExpression)valueArg).getDouble(), Order.ASCENDING);
    -            }
    +  protected ByteBuf getByteBuf(LogicalExpression valueArg, String encodingType) {
    --- End diff --
    
    Sorry, it is not at all clear why we deleted this big wad of code. Nor is it clear why, when resolving functions, we start looking at byte buffers. Resolving functions is a setup task; byte buffers are a runtime concern. Can you explain this a bit?


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134689741
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
    @@ -362,16 +363,16 @@ protected boolean setupNewSchema() throws SchemaChangeException {
                   final TransferPair tp = vvIn.makeTransferPair(vvOut);
                   transfers.add(tp);
                 }
    -          } else if (value != null && value.intValue() > 1) { // subsequent wildcards should do a copy of incoming valuevectors
    +          } else if (value != null && value > 1) { // subsequent wildcards should do a copy of incoming valuevectors
                 int k = 0;
                 for (final VectorWrapper<?> wrapper : incoming) {
                   final ValueVector vvIn = wrapper.getValueVector();
    -              final SchemaPath originalPath = SchemaPath.getSimplePath(vvIn.getField().getPath());
    -              if (k > result.outputNames.size()-1) {
    +              final SchemaPath originalPath = SchemaPath.getSimplePath(vvIn.getField().getName());
    +              if (k > result.outputNames.size() - 1) {
                     assert false;
                   }
                   final String name = result.outputNames.get(k++);  // get the renamed column names
    -              if (name == EMPTY_STRING) {
    +              if (EMPTY_STRING.equals(name)) {
    --- End diff --
    
    Thanks, I replaced it with `name.isEmpty()`, but `EMPTY_STRING` is used in other places, so I left it in the class.
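
    A minimal illustration of why the content check is safer than the identity comparison (plain Java, nothing Drill-specific):

        String a = "";
        String b = new String("");       // same content, different object
        System.out.println(a == b);      // false - reference comparison
        System.out.println(b.isEmpty()); // true  - content check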


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134788014
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
    @@ -94,12 +102,20 @@ private void updateStructure(int index, PhysicalSchema children) {
        */
     
       public static class NameSpace<T> {
    -    private final Map<String,Integer> nameSpace = new HashMap<>();
    +    private final Map<String, Integer> nameSpace = new HashMap<>();
         private final List<T> columns = new ArrayList<>();
     
    -    public int add(String key, T value) {
    +    /**
    +     * Adds column path with specified value to the columns list
    +     * and returns the index of the column in the list.
    +     *
    +     * @param key   full name path of the column in the schema
    +     * @param value value to be added to the list
    +     * @return index of the column in the list
    +     */
    +    public int add(String[] key, T value) {
           int index = columns.size();
    -      nameSpace.put(key, index);
    +      nameSpace.put(SchemaPath.getCompoundPath(key).toExpr(), index);
    --- End diff --
    
    Thanks for the explanation, reverted my changes.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134540459
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/CompareFunctionsProcessor.java ---
    @@ -147,10 +147,10 @@ public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg)
     
       @Override
       public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) {
    +    if (ConvertExpression.CONVERT_FROM.equals(e.getConvertFunction())) {
    --- End diff --
    
    Since both of these classes are almost the same, I moved the shared code into an abstract class.
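
    Roughly this shape (an illustrative sketch only; the class and method names below are made up, the real ones are the two `CompareFunctionsProcessor` variants in this PR):

        // shared logic moves into an abstract base class
        abstract class BaseProcessor {
          protected boolean isBigEndian(String encodingType) {
            return encodingType.endsWith("_BE");
          }
          protected abstract boolean supportsEncoding(String encodingType);
        }

        // each storage plugin keeps only its own specifics
        class HBaseSideProcessor extends BaseProcessor {
          @Override
          protected boolean supportsEncoding(String encodingType) {
            return isBigEndian(encodingType) || encodingType.startsWith("U");
          }
        }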


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134795741
  
    --- Diff: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ---
    @@ -115,6 +112,33 @@ public static SchemaPath create(NamePart namePart) {
       }
     
       /**
    +   * Parses input string and returns {@code SchemaPath} instance.
    +   *
    +   * @param expr input string to be parsed
    +   * @return {@code SchemaPath} instance
    +   */
    +  public static SchemaPath parseFromString(String expr) {
    --- End diff --
    
    It parses the string using the same rules that are applied to field names in a query. If the string contains a dot outside backticks, or contains no backticks at all, a `SchemaPath` is created whose root `NameSegment` has another `NameSegment` as its child, and so on for each dot. If the string contains [], an `ArraySegment` is created.
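
    For example (a hedged sketch of the behaviour described above; accessor names such as `getRootSegment()`, `getPath()` and `isLastPath()` are assumed from the existing `SchemaPath`/`PathSegment` API):

        // dot outside backticks -> nested segments
        SchemaPath nested = SchemaPath.parseFromString("a.b");
        nested.getRootSegment().getPath();     // "a"
        nested.getRootSegment().isLastPath();  // false - "b" is a child NameSegment

        // dots inside backticks -> a single column whose name contains a dot
        SchemaPath dotted = SchemaPath.parseFromString("`a.b`");
        dotted.getRootSegment().getPath();     // "a.b"
        dotted.getRootSegment().isLastPath();  // true

        // square brackets -> an ArraySegment child
        SchemaPath array = SchemaPath.parseFromString("a[0]");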


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134683465
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public <T extends ValueVector> T addField(MaterializedField field,
                                                   Class<T> clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +            && (vector.getField().getType().getMinorType() != MinorType.MAP
    +                || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }
     
    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }
     
    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    +      // and adds child fields from the field to the vector
    +      else if (field.getType().getMinorType() == MinorType.MAP
    +                  && vector.getField().getType().getMinorType() == MinorType.MAP
    +                  && !field.getChildren().isEmpty()) {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        childVector = addNestedChildToMap((MapVector) vector, Iterables.getLast(field.getChildren()));
    +        schemaChanged = true;
    +      }
     
    -      return clazz.cast(v);
    +      return clazz.cast(childVector);
    +    }
    +
    +    /**
    +     * Finds and returns value vector which path corresponds to the specified field.
    +     * If required vector is nested in the map, gets and returns this vector from the map.
    +     *
    +     * @param valueVector vector that should be checked
    +     * @param field       field that corresponds to required vector
    +     * @return value vector whose path corresponds to the specified field
    +     *
    +     * @throws SchemaChangeException if the field does not correspond to the found vector
    +     */
    +    private ValueVector getChildVectorByField(ValueVector valueVector,
    +                                              MaterializedField field) throws SchemaChangeException {
    +      if (field.getChildren().isEmpty()) {
    +        if (valueVector.getField().equals(field)) {
    +          return valueVector;
    +        } else {
    +          throw new SchemaChangeException(
    +            String.format(
    +              "The field that was provided, %s, does not correspond to the "
    +                + "expected vector type of %s.",
    +              field, valueVector.getClass().getSimpleName()));
    +        }
    +      } else {
    +        // an incoming field contains only single child since it determines
    +        // full name path of the field in the schema
    +        MaterializedField childField = Iterables.getLast(field.getChildren());
    +        return getChildVectorByField(((MapVector) valueVector).getChild(childField.getName()), childField);
    +      }
    --- End diff --
    
    This method was designed to be called only when the vector has just been created, so its field does not yet contain `$bits$` or `$offsets$` fields in its children list. Its goal was to return the innermost vector that corresponds to the last nested child of the field.
    But considering the previous comment, this method is also no longer needed.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134098834
  
    --- Diff: logical/src/main/java/org/apache/drill/common/expression/PathSegment.java ---
    @@ -17,11 +17,15 @@
      */
     package org.apache.drill.common.expression;
     
    -public abstract class PathSegment{
    +public abstract class PathSegment {
     
    -  PathSegment child;
    +  private PathSegment child;
    --- End diff --
    
    final?


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi closed the pull request at:

    https://github.com/apache/drill/pull/909


---

[GitHub] drill issue #909: DRILL-4264: Allow field names to include dots

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/909
  
    Thanks @vvysotskyi for the PR and @paul-rogers for reviewing the proposal and code.  


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134788919
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/TupleAccessor.java ---
    @@ -48,9 +48,21 @@
     
         MaterializedField column(int index);
     
    -    MaterializedField column(String name);
    +    /**
    +     * Returns {@code MaterializedField} instance from schema using the name path specified in param.
    +     *
    +     * @param name full name path of the column in the schema
    +     * @return {@code MaterializedField} instance
    +     */
    +    MaterializedField column(String[] name);
    --- End diff --
    
    Thanks, reverted my changes.


---

[GitHub] drill pull request #909: DRILL-4264: Allow field names to include dots

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r135868684
  
    --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/CompareFunctionsProcessor.java ---
    @@ -82,466 +45,37 @@ public static CompareFunctionsProcessor process(FunctionCall call, boolean nullC
             LogicalExpression swapArg = valueArg;
             valueArg = nameArg;
             nameArg = swapArg;
    -        evaluator.functionName = COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName);
    +        evaluator.setFunctionName(COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName));
           }
    -      evaluator.success = nameArg.accept(evaluator, valueArg);
    +      evaluator.setSuccess(nameArg.accept(evaluator, valueArg));
         } else if (nullComparatorSupported && call.args.get(0) instanceof SchemaPath) {
    -      evaluator.success = true;
    -      evaluator.path = (SchemaPath) nameArg;
    +      evaluator.setSuccess(true);
    +      evaluator.setPath((SchemaPath) nameArg);
         }
     
         return evaluator;
       }
     
    -  public CompareFunctionsProcessor(String functionName) {
    -    this.success = false;
    -    this.functionName = functionName;
    -    this.isEqualityFn = COMPARE_FUNCTIONS_TRANSPOSE_MAP.containsKey(functionName)
    -        && COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName).equals(functionName);
    -    this.isRowKeyPrefixComparison = false;
    -    this.sortOrderAscending = true;
    -  }
    -
    -  public byte[] getValue() {
    -    return value;
    -  }
    -
    -  public boolean isSuccess() {
    -    return success;
    -  }
    -
    -  public SchemaPath getPath() {
    -    return path;
    -  }
    -
    -  public String getFunctionName() {
    -    return functionName;
    -  }
    -
    -  public boolean isRowKeyPrefixComparison() {
    -	return isRowKeyPrefixComparison;
    -  }
    -
    -  public byte[] getRowKeyPrefixStartRow() {
    -    return rowKeyPrefixStartRow;
    -  }
    -
    -  public byte[] getRowKeyPrefixStopRow() {
    -  return rowKeyPrefixStopRow;
    -  }
    -
    -  public Filter getRowKeyPrefixFilter() {
    -  return rowKeyPrefixFilter;
    -  }
    -
    -  public boolean isSortOrderAscending() {
    -    return sortOrderAscending;
    -  }
    -
       @Override
    -  public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getInput() instanceof CastExpression || e.getInput() instanceof SchemaPath) {
    -      return e.getInput().accept(this, valueArg);
    -    }
    -    return false;
    -  }
    -
    -  @Override
    -  public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) {
    -
    -      String encodingType = e.getEncodingType();
    -      int prefixLength    = 0;
    -
    -      // Handle scan pruning in the following scenario:
    -      // The row-key is a composite key and the CONVERT_FROM() function has byte_substr() as input function which is
    -      // querying for the first few bytes of the row-key(start-offset 1)
    -      // Example WHERE clause:
    -      // CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-17'
    -      if (e.getInput() instanceof FunctionCall) {
    -
    -        // We can prune scan range only for big-endian encoded data
    -        if (encodingType.endsWith("_BE") == false) {
    -          return false;
    -        }
    -
    -        FunctionCall call = (FunctionCall)e.getInput();
    -        String functionName = call.getName();
    -        if (!functionName.equalsIgnoreCase("byte_substr")) {
    -          return false;
    -        }
    -
    -        LogicalExpression nameArg = call.args.get(0);
    -        LogicalExpression valueArg1 = call.args.size() >= 2 ? call.args.get(1) : null;
    -        LogicalExpression valueArg2 = call.args.size() >= 3 ? call.args.get(2) : null;
    -
    -        if (((nameArg instanceof SchemaPath) == false) ||
    -             (valueArg1 == null) || ((valueArg1 instanceof IntExpression) == false) ||
    -             (valueArg2 == null) || ((valueArg2 instanceof IntExpression) == false)) {
    -          return false;
    -        }
    -
    -        boolean isRowKey = ((SchemaPath)nameArg).getAsUnescapedPath().equals(DrillHBaseConstants.ROW_KEY);
    -        int offset = ((IntExpression)valueArg1).getInt();
    -
    -        if (!isRowKey || (offset != 1)) {
    -          return false;
    -        }
    -
    -        this.path    = (SchemaPath)nameArg;
    -        prefixLength = ((IntExpression)valueArg2).getInt();
    -        this.isRowKeyPrefixComparison = true;
    -        return visitRowKeyPrefixConvertExpression(e, prefixLength, valueArg);
    -      }
    -
    -      if (e.getInput() instanceof SchemaPath) {
    -        ByteBuf bb = null;
    -
    -        switch (encodingType) {
    -        case "INT_BE":
    -        case "INT":
    -        case "UINT_BE":
    -        case "UINT":
    -        case "UINT4_BE":
    -        case "UINT4":
    -          if (valueArg instanceof IntExpression
    -              && (isEqualityFn || encodingType.startsWith("U"))) {
    -            bb = newByteBuf(4, encodingType.endsWith("_BE"));
    -            bb.writeInt(((IntExpression)valueArg).getInt());
    -          }
    -          break;
    -        case "BIGINT_BE":
    -        case "BIGINT":
    -        case "UINT8_BE":
    -        case "UINT8":
    -          if (valueArg instanceof LongExpression
    -              && (isEqualityFn || encodingType.startsWith("U"))) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((LongExpression)valueArg).getLong());
    -          }
    -          break;
    -        case "FLOAT":
    -          if (valueArg instanceof FloatExpression && isEqualityFn) {
    -            bb = newByteBuf(4, true);
    -            bb.writeFloat(((FloatExpression)valueArg).getFloat());
    -          }
    -          break;
    -        case "DOUBLE":
    -          if (valueArg instanceof DoubleExpression && isEqualityFn) {
    -            bb = newByteBuf(8, true);
    -            bb.writeDouble(((DoubleExpression)valueArg).getDouble());
    -          }
    -          break;
    -        case "TIME_EPOCH":
    -        case "TIME_EPOCH_BE":
    -          if (valueArg instanceof TimeExpression) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((TimeExpression)valueArg).getTime());
    -          }
    -          break;
    -        case "DATE_EPOCH":
    -        case "DATE_EPOCH_BE":
    -          if (valueArg instanceof DateExpression) {
    -            bb = newByteBuf(8, encodingType.endsWith("_BE"));
    -            bb.writeLong(((DateExpression)valueArg).getDate());
    -          }
    -          break;
    -        case "BOOLEAN_BYTE":
    -          if (valueArg instanceof BooleanExpression) {
    -            bb = newByteBuf(1, false /* does not matter */);
    -            bb.writeByte(((BooleanExpression)valueArg).getBoolean() ? 1 : 0);
    -          }
    -          break;
    -        case "DOUBLE_OB":
    -        case "DOUBLE_OBD":
    -          if (valueArg instanceof DoubleExpression) {
    -            bb = newByteBuf(9, true);
    -            PositionedByteRange br = new SimplePositionedMutableByteRange(bb.array(), 0, 9);
    -            if (encodingType.endsWith("_OBD")) {
    -              org.apache.hadoop.hbase.util.OrderedBytes.encodeFloat64(br,
    -                  ((DoubleExpression)valueArg).getDouble(), Order.DESCENDING);
    -              this.sortOrderAscending = false;
    -            } else {
    -              org.apache.hadoop.hbase.util.OrderedBytes.encodeFloat64(br,
    -                  ((DoubleExpression)valueArg).getDouble(), Order.ASCENDING);
    -            }
    +  protected ByteBuf getByteBuf(LogicalExpression valueArg, String encodingType) {
    --- End diff --
    
    Thanks!


---