You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@drill.apache.org by paul-rogers <gi...@git.apache.org> on 2017/06/23 03:58:13 UTC

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes an improv...

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/860

    DRILL-5601: Rollup of external sort fixes an improvements

    - DRILL-5513: Managed External Sort : OOM error during the merge phase
    - DRILL-5519: Sort fails to spill and results in an OOM
    - DRILL-5522: OOM during the merge and spill process of the managed external sort
    - DRILL-5594: Excessive buffer reallocations during merge phase of external sort
    - DRILL-5597: Incorrect "bits" vector allocation in nullable vectors allocateNew()
    - DRILL-5602: Repeated List Vector fails to initialize the offset vector
    
    All of the bugs have to do with handling low-memory conditions, and with
    correctly estimating the sizes of vectors, even when those vectors come
    from the spill file or from an exchange. Hence, the changes for all of
    the above issues are interrelated.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-5601

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/860.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #860
    
----
commit 63faa99ba0ef2e51bb18ac5a91cd95101a54ae88
Author: Paul Rogers <pr...@maprtech.com>
Date:   2017-04-06T20:57:19Z

    DRILL-5601: Rollup of external sort fixes an improvements
    
    - DRILL-5513: Managed External Sort : OOM error during the merge phase
    - DRILL-5519: Sort fails to spill and results in an OOM
    - DRILL-5522: OOM during the merge and spill process of the managed external sort
    - DRILL-5594: Excessive buffer reallocations during merge phase of external sort
    - DRILL-5597: Incorrect "bits" vector allocation in nullable vectors allocateNew()
    - DRILL-5602: Repeated List Vector fails to initialize the offset vector
    
    All of the bugs have to do with handling low-memory conditions, and with
    correctly estimating the sizes of vectors, even when those vectors come
    from the spill file or from an exchange. Hence, the changes for all of
    the above issues are interrelated.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128126935
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    +      this.prefix = prefix;
    +      this.valueCount = valueCount;
    +      metadata = v.getField();
    +      boolean isMap = v.getField().getType().getMinorType() == MinorType.MAP;
     
    -      totalSize = vv.getAllocatedByteCount();
    +      // The amount of memory consumed by the payload: the actual
    +      // data stored in the vectors.
     
    -      // Capacity is the number of values that the vector could
    -      // contain. This is useful only for fixed-length vectors.
    +      int mapSize = 0;
    +      if (v.getField().getDataMode() == DataMode.REPEATED) {
     
    -      capacity = vv.getValueCapacity();
    +        // Repeated vectors are special: they have an associated offset vector
    +        // that changes the value count of the contained vectors.
     
    -      // The amount of memory consumed by the payload: the actual
    -      // data stored in the vectors.
    +        @SuppressWarnings("resource")
    +        UInt4Vector offsetVector = ((RepeatedValueVector) v).getOffsetVector();
    +        entryCount = offsetVector.getAccessor().get(valueCount);
    +        estElementCount = roundUp(entryCount, valueCount);
    +        if (isMap) {
     
    -      dataSize = vv.getPayloadByteCount();
    +          // For map, the only data associated with the map vector
    +          // itself is the offset vector, if any.
     
    -      // Determine "density" the number of rows compared to potential
    -      // capacity. Low-density batches occur at block boundaries, ends
    -      // of files and so on. Low-density batches throw off our estimates
    -      // for Varchar columns because we don't know the actual number of
    -      // bytes consumed (that information is hidden behind the Varchar
    -      // implementation where we can't get at it.)
    +          mapSize = offsetVector.getPayloadByteCount(valueCount);
    +        }
    +      } else {
    +        entryCount = valueCount;
    +        estElementCount = 1;
    +      }
     
    -      density = roundUp(dataSize * 100, totalSize);
    -      estSize = roundUp(dataSize, rowCount);
    -      variableWidth = vv instanceof VariableWidthVector ;
    +      if (isMap) {
    +        dataSize = mapSize;
    +        stdSize = 0;
    +     } else {
    +        dataSize = v.getPayloadByteCount(valueCount);
    +        stdSize = TypeHelper.getSize(metadata.getType());
    +     }
    +      estSize = roundUp(dataSize, valueCount);
    +      isVariableWidth = v instanceof VariableWidthVector ;
         }
     
         @Override
         public String toString() {
           StringBuilder buf = new StringBuilder()
    +          .append(prefix)
               .append(metadata.getName())
               .append("(type: ")
    +          .append(metadata.getType().getMode().name())
    +          .append(" ")
               .append(metadata.getType().getMinorType().name())
    -          .append(", std col. size: ")
    +          .append(", count: ")
    +          .append(valueCount);
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        buf.append(", total entries: ")
    +           .append(entryCount)
    +           .append(", per-array: ")
    +           .append(estElementCount);
    +      }
    +      buf .append(", std size: ")
               .append(stdSize)
    -          .append(", actual col. size: ")
    +          .append(", actual size: ")
               .append(estSize)
    -          .append(", total size: ")
    -          .append(totalSize)
               .append(", data size: ")
               .append(dataSize)
    -          .append(", row capacity: ")
    -          .append(capacity)
    -          .append(", density: ")
    -          .append(density)
               .append(")");
           return buf.toString();
         }
    +
    +    public void buildAllocHelper(SmartAllocationHelper allocHelper) {
    +      int width = 0;
    +      switch(metadata.getType().getMinorType()) {
    --- End diff --
    
    Would definitely be nicer. Unfortunately, MajorType and MinorType are protobuf-generated classes, so we can't muck about with them -- alas.
    
    The isVariableWidth (which, I believe, you added) handles a subset of the cases. It handles required and nullable modes, but does not consider repeated vectors to be variable width. (That is REQUIRED or OPTIONAL VARCHAR is variable width; REPEATED VARCHAR is not. Is this a bug?)
    
    What we need is a better metadata system that abstracts out all of this stuff.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128144906
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/AllocationHelper.java ---
    @@ -26,14 +26,16 @@ public static void allocate(ValueVector v, int valueCount, int bytesPerValue) {
         allocate(v, valueCount, bytesPerValue, 5);
       }
     
    -  public static void allocatePrecomputedChildCount(ValueVector v, int valueCount, int bytesPerValue, int childValCount){
    -    if(v instanceof FixedWidthVector) {
    +  public static void allocatePrecomputedChildCount(ValueVector v, int valueCount, int bytesPerValue, int childValCount) {
    --- End diff --
    
    Well, sure. But, notice that I didn't write this; I only added a space. But, I went ahead and renamed all the "v"s...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128143815
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClientFixture.java ---
    @@ -231,4 +236,120 @@ public void setControls(String controls) {
       public RowSetBuilder rowSetBuilder(BatchSchema schema) {
         return new RowSetBuilder(allocator(), schema);
       }
    +
    +  /**
    +   * Very simple parser for semi-colon separated lists of SQL statements which
    +   * handles quoted semicolons. Drill can execute only one statement at a time
    +   * (without a trailing semi-colon.) This parser breaks up a statement list
    +   * into single statements. Input:<code><pre>
    +   * USE a.b;
    +   * ALTER SESSION SET `foo` = ";";
    +   * SELECT * FROM bar WHERE x = "\";";
    +   * </pre><code>Output:
    +   * <ul>
    +   * <li><tt>USE a.b</tt></li>
    +   * <li><tt>ALTER SESSION SET `foo` = ";"</tt></li>
    +   * <li><tt>SELECT * FROM bar WHERE x = "\";"</tt></li>
    +   */
    +
    +  public static class StatementParser {
    +    private final Reader in;
    +    private StringBuilder buf;
    +
    +    public StatementParser(Reader in) {
    +      this.in = in;
    +    }
    +
    +    public String parseNext() throws IOException {
    +      boolean eof = false;
    +      buf = new StringBuilder();
    +      outer:
    +      for (;;) {
    +        int c = in.read();
    +        if (c == -1) {
    +          eof = true;
    +          break;
    +        }
    +        if (c == ';') {
    +          break;
    +        }
    +        buf.append((char) c);
    +        if (c == '"' || c == '\'' || c == '`') {
    +          int quote = c;
    +          boolean escape = false;
    +          for (;;) {
    +            c = in.read();
    +            if (c == -1) {
    +              break outer;
    --- End diff --
    
    OK, so this isn't the world's most robust parser. Is just just trying to do a bit better than String.split(";")... But fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127071648
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/SmartAllocationHelper.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.record;
    +
    +import java.util.HashMap;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.drill.exec.vector.AllocationHelper;
    +import org.apache.drill.exec.vector.ValueVector;
    +import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedMapVector;
    +
    +/**
    + * Prototype mechanism to allocate vectors based on expected
    + * data sizes. This version uses a name-based map of fields
    + * to sizes. Better to represent the batch structurally and
    + * simply iterate over the schema rather than doing a per-field
    + * lookup. But, the mechanisms needed to do the efficient solution
    + * don't exist yet.
    + */
    +
    +public class SmartAllocationHelper {
    +
    +  public static class AllocationHint {
    +    public final int entryWidth;
    +    public final int elementCount;
    +
    +    private AllocationHint(int width, int elements) {
    +      entryWidth = width;
    +      elementCount = elements;
    +    }
    +
    +    @Override
    +    public String toString() {
    +      StringBuilder buf = new StringBuilder()
    +          .append("{");
    +      boolean comma = false;
    +      if (entryWidth > 0) {
    +        buf.append("width=")
    +           .append(entryWidth);
    +        comma = true;
    +      }
    +      if (elementCount > 0) {
    +        if (comma) {
    +          buf.append(", ");
    +        }
    +        buf.append("elements=")
    +        .append(elementCount);
    +      }
    +      buf.append("}");
    +      return buf.toString();
    +    }
    +  }
    +
    +  private Map<String, AllocationHint> hints = new HashMap<>();
    +
    +  public void variableWidth(String name, int width) {
    +    hints.put(name, new AllocationHint(width, 1));
    +  }
    +
    +  public void fixedWidthArray(String name, int elements) {
    +    hints.put(name, new AllocationHint(0, elements));
    +  }
    +
    +  public void variableWidthArray(String name, int width, int elements) {
    +    hints.put(name, new AllocationHint(width, elements));
    +  }
    +
    +  public void allocateBatch(VectorAccessible va, int recordCount) {
    +    for (VectorWrapper<?> w: va) {
    +      allocateVector(w.getValueVector(), "", recordCount);
    +    }
    +  }
    +
    +  private void allocateVector(ValueVector vector, String prefix, int recordCount) {
    --- End diff --
    
    Can this method be renamed to **allocateVectorOrMap()** ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #860: DRILL-5601: Rollup of external sort fixes and improvements

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/860
  
    Fixed; slightly different than suggested, but similar. Didn't want to change this code too terribly much, so did the minimum change that would work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128143493
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/SmartAllocationHelper.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.record;
    +
    +import java.util.HashMap;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.drill.exec.vector.AllocationHelper;
    +import org.apache.drill.exec.vector.ValueVector;
    +import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedMapVector;
    +
    +/**
    + * Prototype mechanism to allocate vectors based on expected
    + * data sizes. This version uses a name-based map of fields
    + * to sizes. Better to represent the batch structurally and
    + * simply iterate over the schema rather than doing a per-field
    + * lookup. But, the mechanisms needed to do the efficient solution
    + * don't exist yet.
    + */
    +
    +public class SmartAllocationHelper {
    +
    +  public static class AllocationHint {
    +    public final int entryWidth;
    +    public final int elementCount;
    +
    +    private AllocationHint(int width, int elements) {
    +      entryWidth = width;
    +      elementCount = elements;
    +    }
    +
    +    @Override
    +    public String toString() {
    +      StringBuilder buf = new StringBuilder()
    --- End diff --
    
    Clever! Or I can leave it as it is:
    
    * Use of string builder minimizes string copies
    * Layout lends itself to adding/removing members (which happens a lot with toString() methods)
    * This is never used in production; only during debugging, so performance is not a concern (which negates my first point, I suppose...)
    
    Got rid of the flat (though at the cost of assigning to a string; which someone didn't like in some previous Drill code review...)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127560655
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggTemplate.java ---
    @@ -36,6 +36,9 @@
       private int previousIndex = -1;
       private int underlyingIndex = 0;
       private int currentIndex;
    +  /**
    +   * Number of records added to the current aggregation group. (?)
    --- End diff --
    
    Yes, this is correct. No need for the "(?)"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r123845357
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -177,7 +226,7 @@ public RecordBatchSizer(RecordBatch batch) {
       /**
        *  Count the nullable columns; used for memory estimation
        */
    -  public int numNullables;
    +  public int nullableCount;
       /**
        *
        * @param va
    --- End diff --
    
    Update this javadoc 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127345692
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/complex/ListVector.java ---
    @@ -323,12 +325,15 @@ public void setValueCount(int valueCount) {
       }
     
       @Override
    -  public int getAllocatedByteCount() {
    -    return offsets.getAllocatedByteCount() + bits.getAllocatedByteCount() + super.getAllocatedByteCount();
    +  public void getLedgers(Set<BufferLedger> ledgers) {
    --- End diff --
    
    Same comment - set vs. get



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128365478
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/MergeSortWrapper.java ---
    @@ -147,7 +147,7 @@ private MSorter createNewMSorter(List<Ordering> orderings, MappingSet mainMappin
         cg.plainJavaCapable(true);
     
         // Uncomment out this line to debug the generated code.
    -//  cg.saveCodeForDebugging(true);
    +    cg.saveCodeForDebugging(true);
    --- End diff --
    
    Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127067063
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/SmartAllocationHelper.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.record;
    +
    +import java.util.HashMap;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.drill.exec.vector.AllocationHelper;
    +import org.apache.drill.exec.vector.ValueVector;
    +import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedMapVector;
    +
    +/**
    + * Prototype mechanism to allocate vectors based on expected
    + * data sizes. This version uses a name-based map of fields
    + * to sizes. Better to represent the batch structurally and
    + * simply iterate over the schema rather than doing a per-field
    + * lookup. But, the mechanisms needed to do the efficient solution
    + * don't exist yet.
    + */
    +
    +public class SmartAllocationHelper {
    +
    +  public static class AllocationHint {
    +    public final int entryWidth;
    +    public final int elementCount;
    +
    +    private AllocationHint(int width, int elements) {
    +      entryWidth = width;
    +      elementCount = elements;
    +    }
    +
    +    @Override
    +    public String toString() {
    +      StringBuilder buf = new StringBuilder()
    --- End diff --
    
    This whole method can be written in only 4 lines of code \:
    
    if ( entryWidth > 0 && elementCount > 0 ) { return String.format("{width=%d, elements=%d}",entryWidth,entryCount); }
    if ( entryWidth > 0 ) { return String.format("{width=%d}",entryWidth); }
    if ( elementCount > 0 ) { return String.format("{elements=%d}",entryCount); }
    return String.format("{}");
    
    or
    
    String ew = ( entryWidth > 0 ) ? String.format("width=%d",entryWidth) : "";
    String comma = ( entryWidth > 0 && elementCount > 0 ) ? ", " : "";
    String ec = ( elementCount > 0 ) ? String.format("elements=%d",entryCount) : "";
    return "{" + ew + comma + ec + "}" ;
    
    The above can even be pushed into a single line !!! A la APL :-) 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r123838906
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    +      this.prefix = prefix;
    +      this.valueCount = valueCount;
    +      metadata = v.getField();
    +      boolean isMap = v.getField().getType().getMinorType() == MinorType.MAP;
     
    -      totalSize = vv.getAllocatedByteCount();
    +      // The amount of memory consumed by the payload: the actual
    +      // data stored in the vectors.
     
    -      // Capacity is the number of values that the vector could
    -      // contain. This is useful only for fixed-length vectors.
    +      int mapSize = 0;
    +      if (v.getField().getDataMode() == DataMode.REPEATED) {
     
    -      capacity = vv.getValueCapacity();
    +        // Repeated vectors are special: they have an associated offset vector
    +        // that changes the value count of the contained vectors.
     
    -      // The amount of memory consumed by the payload: the actual
    -      // data stored in the vectors.
    +        @SuppressWarnings("resource")
    +        UInt4Vector offsetVector = ((RepeatedValueVector) v).getOffsetVector();
    +        entryCount = offsetVector.getAccessor().get(valueCount);
    +        estElementCount = roundUp(entryCount, valueCount);
    +        if (isMap) {
     
    -      dataSize = vv.getPayloadByteCount();
    +          // For map, the only data associated with the map vector
    +          // itself is the offset vector, if any.
     
    -      // Determine "density" the number of rows compared to potential
    -      // capacity. Low-density batches occur at block boundaries, ends
    -      // of files and so on. Low-density batches throw off our estimates
    -      // for Varchar columns because we don't know the actual number of
    -      // bytes consumed (that information is hidden behind the Varchar
    -      // implementation where we can't get at it.)
    +          mapSize = offsetVector.getPayloadByteCount(valueCount);
    +        }
    +      } else {
    +        entryCount = valueCount;
    +        estElementCount = 1;
    +      }
     
    -      density = roundUp(dataSize * 100, totalSize);
    -      estSize = roundUp(dataSize, rowCount);
    -      variableWidth = vv instanceof VariableWidthVector ;
    +      if (isMap) {
    +        dataSize = mapSize;
    +        stdSize = 0;
    +     } else {
    +        dataSize = v.getPayloadByteCount(valueCount);
    +        stdSize = TypeHelper.getSize(metadata.getType());
    +     }
    +      estSize = roundUp(dataSize, valueCount);
    +      isVariableWidth = v instanceof VariableWidthVector ;
         }
     
         @Override
         public String toString() {
           StringBuilder buf = new StringBuilder()
    +          .append(prefix)
               .append(metadata.getName())
               .append("(type: ")
    +          .append(metadata.getType().getMode().name())
    +          .append(" ")
               .append(metadata.getType().getMinorType().name())
    -          .append(", std col. size: ")
    +          .append(", count: ")
    +          .append(valueCount);
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        buf.append(", total entries: ")
    +           .append(entryCount)
    +           .append(", per-array: ")
    +           .append(estElementCount);
    +      }
    +      buf .append(", std size: ")
               .append(stdSize)
    -          .append(", actual col. size: ")
    +          .append(", actual size: ")
               .append(estSize)
    -          .append(", total size: ")
    -          .append(totalSize)
               .append(", data size: ")
               .append(dataSize)
    -          .append(", row capacity: ")
    -          .append(capacity)
    -          .append(", density: ")
    -          .append(density)
               .append(")");
           return buf.toString();
         }
    +
    +    public void buildAllocHelper(SmartAllocationHelper allocHelper) {
    +      int width = 0;
    +      switch(metadata.getType().getMinorType()) {
    +      case VAR16CHAR:
    +      case VARBINARY:
    +      case VARCHAR:
    +
    +        // Subtract out the offset vector width
    +        width = estSize - 4;
    +
    +        // Subtract out the bits (is-set) vector width
    +        if (metadata.getDataMode() == DataMode.OPTIONAL) {
    +          width -= 1;
    +        }
    +        break;
    +      default:
    +        break;
    +      }
    +      String name = prefix + metadata.getName();
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        if (width > 0) {
    +          allocHelper.variableWidthArray(name, width, estElementCount);
    --- End diff --
    
    (1) The methods variableWidthArray() and variableWidth() can be combined into one (as the latter is called when estElementCount is 1, which makes the two do the same).
    (2) Can move this call (combined with the one below - on line 175) into the new if statement above that handles variable subtypes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128127882
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -189,30 +238,29 @@ public RecordBatchSizer(VectorAccessible va) {
       public RecordBatchSizer(VectorAccessible va, SelectionVector2 sv2) {
         rowCount = va.getRecordCount();
         for (VectorWrapper<?> vw : va) {
    -      int size = measureColumn(vw.getValueVector());
    -      if ( size > maxSize ) { maxSize = size; }
    -      if ( vw.getField().isNullable() ) { numNullables++; }
    +      measureColumn(vw.getValueVector(), "", rowCount);
    +    }
    +
    +    for (BufferLedger ledger : ledgers) {
    +      accountedMemorySize += ledger.getAccountedSize();
         }
     
         if (rowCount > 0) {
    -      grossRowWidth = roundUp(totalBatchSize, rowCount);
    +      grossRowWidth = roundUp(accountedMemorySize, rowCount);
         }
     
         if (sv2 != null) {
           sv2Size = sv2.getBuffer(false).capacity();
    -      grossRowWidth += roundUp(sv2Size, rowCount);
    -      netRowWidth += 2;
    +      accountedMemorySize += sv2Size;
         }
     
    -    int totalDensity = 0;
    -    int usableCount = 0;
    -    for (ColumnSize colSize : columnSizes) {
    -      if ( colSize.density > 0 ) {
    -        usableCount++;
    -      }
    -      totalDensity += colSize.density;
    -    }
    -    avgDensity = roundUp(totalDensity, usableCount);
    +    computeEstimates();
    +  }
    +
    +  private void computeEstimates() {
    +    grossRowWidth = roundUp(accountedMemorySize, rowCount);
    +    netRowWidth = roundUp(netBatchSize, rowCount);
    +    avgDensity = roundUp(netBatchSize * 100, accountedMemorySize);
       }
     
       public void applySv2() {
    --- End diff --
    
    Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127061135
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/SmartAllocationHelper.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.record;
    +
    +import java.util.HashMap;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.drill.exec.vector.AllocationHelper;
    +import org.apache.drill.exec.vector.ValueVector;
    +import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedMapVector;
    +
    +/**
    + * Prototype mechanism to allocate vectors based on expected
    + * data sizes. This version uses a name-based map of fields
    + * to sizes. Better to represent the batch structurally and
    + * simply iterate over the schema rather than doing a per-field
    + * lookup. But, the mechanisms needed to do the efficient solution
    + * don't exist yet.
    + */
    +
    +public class SmartAllocationHelper {
    +
    +  public static class AllocationHint {
    --- End diff --
    
    Should the class AllocationHint be made private ? It is only used internally by SmartAllocationHelper . 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128366036
  
    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -239,9 +239,7 @@ drill.exec: {
         external: {
           // Drill uses the managed External Sort Batch by default.
           // Set this to true to use the legacy, unmanaged version.
    -      // Disabled in the intial commit, to be enabled after
    -      // tests are committed.
    -      disable_managed: true,
    +      disable_managed: false,
    --- End diff --
    
    Done later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127537900
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ---
    @@ -265,7 +277,7 @@ private StreamingAggregator createAggregatorInternal() throws SchemaChangeExcept
             context.getFunctionRegistry(), context.getOptions());
         cg.getCodeGenerator().plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
    -//    cg.getCodeGenerator().saveCodeForDebugging(true);
    +    cg.getCodeGenerator().saveCodeForDebugging(true);
    --- End diff --
    
    Comment out  ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128124656
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ---
    @@ -265,7 +277,7 @@ private StreamingAggregator createAggregatorInternal() throws SchemaChangeExcept
             context.getFunctionRegistry(), context.getOptions());
         cg.getCodeGenerator().plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
    -//    cg.getCodeGenerator().saveCodeForDebugging(true);
    +    cg.getCodeGenerator().saveCodeForDebugging(true);
    --- End diff --
    
    Done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128129092
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -100,10 +102,12 @@ private PriorityQueueCopier newCopier(VectorAccessible batch) {
        * @param batchGroupList
        * @param outputContainer
        * @param targetRecordCount
    +   * @param allocHelper
        * @return
        */
    -  public BatchMerger startMerge(BatchSchema schema, List<? extends BatchGroup> batchGroupList, VectorContainer outputContainer, int targetRecordCount) {
    -    return new BatchMerger(this, schema, batchGroupList, outputContainer, targetRecordCount);
    +  public BatchMerger startMerge(BatchSchema schema, List<? extends BatchGroup> batchGroupList,
    --- End diff --
    
    Would probably do it that way if either a) it wasn't test code, or b) the allocation thing was not temporary.
    
    The sort code probably should actually create a vector allocator so we test what we ship. But, haven't got to that step yet.
    
    Also, hoping that the vector allocator (and sizer) disappear into a larger form of batch metadata.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127346284
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/complex/RepeatedMapVector.java ---
    @@ -598,7 +604,8 @@ public void clear() {
       }
     
       @Override
    -  public int getAllocatedByteCount() {
    -    return super.getAllocatedByteCount( ) + offsets.getAllocatedByteCount();
    +  public void getLedgers(Set<BufferLedger> ledgers) {
    --- End diff --
    
    again ....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128365255
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    +      this.prefix = prefix;
    +      this.valueCount = valueCount;
    +      metadata = v.getField();
    +      boolean isMap = v.getField().getType().getMinorType() == MinorType.MAP;
     
    -      totalSize = vv.getAllocatedByteCount();
    +      // The amount of memory consumed by the payload: the actual
    +      // data stored in the vectors.
     
    -      // Capacity is the number of values that the vector could
    -      // contain. This is useful only for fixed-length vectors.
    +      int mapSize = 0;
    +      if (v.getField().getDataMode() == DataMode.REPEATED) {
     
    -      capacity = vv.getValueCapacity();
    +        // Repeated vectors are special: they have an associated offset vector
    +        // that changes the value count of the contained vectors.
     
    -      // The amount of memory consumed by the payload: the actual
    -      // data stored in the vectors.
    +        @SuppressWarnings("resource")
    +        UInt4Vector offsetVector = ((RepeatedValueVector) v).getOffsetVector();
    +        entryCount = offsetVector.getAccessor().get(valueCount);
    +        estElementCount = roundUp(entryCount, valueCount);
    +        if (isMap) {
     
    -      dataSize = vv.getPayloadByteCount();
    +          // For map, the only data associated with the map vector
    +          // itself is the offset vector, if any.
     
    -      // Determine "density" the number of rows compared to potential
    -      // capacity. Low-density batches occur at block boundaries, ends
    -      // of files and so on. Low-density batches throw off our estimates
    -      // for Varchar columns because we don't know the actual number of
    -      // bytes consumed (that information is hidden behind the Varchar
    -      // implementation where we can't get at it.)
    +          mapSize = offsetVector.getPayloadByteCount(valueCount);
    +        }
    +      } else {
    +        entryCount = valueCount;
    +        estElementCount = 1;
    +      }
     
    -      density = roundUp(dataSize * 100, totalSize);
    -      estSize = roundUp(dataSize, rowCount);
    -      variableWidth = vv instanceof VariableWidthVector ;
    +      if (isMap) {
    +        dataSize = mapSize;
    +        stdSize = 0;
    +     } else {
    +        dataSize = v.getPayloadByteCount(valueCount);
    +        stdSize = TypeHelper.getSize(metadata.getType());
    +     }
    +      estSize = roundUp(dataSize, valueCount);
    +      isVariableWidth = v instanceof VariableWidthVector ;
         }
     
         @Override
         public String toString() {
           StringBuilder buf = new StringBuilder()
    +          .append(prefix)
               .append(metadata.getName())
               .append("(type: ")
    +          .append(metadata.getType().getMode().name())
    +          .append(" ")
               .append(metadata.getType().getMinorType().name())
    -          .append(", std col. size: ")
    +          .append(", count: ")
    +          .append(valueCount);
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        buf.append(", total entries: ")
    +           .append(entryCount)
    +           .append(", per-array: ")
    +           .append(estElementCount);
    +      }
    +      buf .append(", std size: ")
               .append(stdSize)
    -          .append(", actual col. size: ")
    +          .append(", actual size: ")
               .append(estSize)
    -          .append(", total size: ")
    -          .append(totalSize)
               .append(", data size: ")
               .append(dataSize)
    -          .append(", row capacity: ")
    -          .append(capacity)
    -          .append(", density: ")
    -          .append(density)
               .append(")");
           return buf.toString();
         }
    +
    +    public void buildAllocHelper(SmartAllocationHelper allocHelper) {
    +      int width = 0;
    +      switch(metadata.getType().getMinorType()) {
    +      case VAR16CHAR:
    +      case VARBINARY:
    +      case VARCHAR:
    +
    +        // Subtract out the offset vector width
    +        width = estSize - 4;
    +
    +        // Subtract out the bits (is-set) vector width
    +        if (metadata.getDataMode() == DataMode.OPTIONAL) {
    +          width -= 1;
    +        }
    +        break;
    +      default:
    +        break;
    +      }
    +      String name = prefix + metadata.getName();
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        if (width > 0) {
    +          allocHelper.variableWidthArray(name, width, estElementCount);
    --- End diff --
    
    This was an early draft; since revised.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124904394
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -391,84 +601,93 @@ public MergeTask(MergeAction action, int count) {
     
       public MergeTask consolidateBatches(long allocMemory, int inMemCount, int spilledRunsCount) {
    --- End diff --
    
    Add JavaDoc for consolidateBatches() 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/860


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124122190
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/MergeSortWrapper.java ---
    @@ -147,7 +147,7 @@ private MSorter createNewMSorter(List<Ordering> orderings, MappingSet mainMappin
         cg.plainJavaCapable(true);
     
         // Uncomment out this line to debug the generated code.
    -//  cg.saveCodeForDebugging(true);
    +    cg.saveCodeForDebugging(true);
    --- End diff --
    
    Need to restore the previous comment; i.e. no change for this file ....



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128142798
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SorterWrapper.java ---
    @@ -83,10 +83,9 @@ private SingleBatchSorter newSorter(VectorAccessible batch) {
         ClassGenerator<SingleBatchSorter> g = cg.getRoot();
         cg.plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
    -  cg.saveCodeForDebugging(true);
    +    cg.saveCodeForDebugging(true);
    --- End diff --
    
    Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127360560
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/SpillSet.java ---
    @@ -413,7 +419,12 @@ public SpillSet(DrillConfig config, FragmentHandle handle, PhysicalOperator popC
           fileManager = new HadoopFileManager(spillFs);
         }
     
    -    spillDirName = String.format("%s_%s_%s-%s_minor%s", QueryIdHelper.getQueryId(handle.getQueryId()),
    +    // If provided with a prefix to identify the Drillbit, prepend that to the
    +    // spill directory.
    +
    +    spillDirName = String.format("%s%s_%s_%s-%s-%s",
    +        ep == null ? "" : ep.getAddress() + "-" + ep.getUserPort() + "/",
    --- End diff --
    
    Any chance the address would be "" ?  Then the directory name would start with a "-" which could cause some trouble (probably only for access from the shell, thus OK).



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128135228
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -391,84 +601,93 @@ public MergeTask(MergeAction action, int count) {
     
       public MergeTask consolidateBatches(long allocMemory, int inMemCount, int spilledRunsCount) {
    --- End diff --
    
    Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127345763
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/complex/RepeatedListVector.java ---
    @@ -435,20 +438,18 @@ public void copyEntry(int toIndex, ValueVector from, int fromIndex) {
         copyFromSafe(fromIndex, toIndex, (RepeatedListVector) from);
       }
     
    -  @Override
    -  public int getAllocatedByteCount() {
    -    return delegate.getAllocatedByteCount();
    +  public void getLedgers(Set<BufferLedger> ledgers) {
    --- End diff --
    
    again ....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128134415
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -97,24 +215,43 @@
     
       private SortConfig config;
     
    -  private int estimatedInputSize;
    -
       private boolean potentialOverflow;
     
    -  public SortMemoryManager(SortConfig config, long memoryLimit) {
    +  private boolean isLowMemory;
    +
    +  private boolean performanceWarning;
    +
    +  public SortMemoryManager(SortConfig config, long opMemoryLimit) {
         this.config = config;
     
         // The maximum memory this operator can use as set by the
         // operator definition (propagated to the allocator.)
     
         if (config.maxMemory() > 0) {
    --- End diff --
    
    This is just a bit tricky. A config value of 0 means ignore the config value (the else condition.) If the config value is set, use it, unless it is greater than the operator-defined limit, in which case use the operator limit.
    
    Still, cleaned this up some.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128125799
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    +      this.prefix = prefix;
    +      this.valueCount = valueCount;
    +      metadata = v.getField();
    +      boolean isMap = v.getField().getType().getMinorType() == MinorType.MAP;
     
    -      totalSize = vv.getAllocatedByteCount();
    +      // The amount of memory consumed by the payload: the actual
    +      // data stored in the vectors.
     
    -      // Capacity is the number of values that the vector could
    -      // contain. This is useful only for fixed-length vectors.
    +      int mapSize = 0;
    +      if (v.getField().getDataMode() == DataMode.REPEATED) {
     
    -      capacity = vv.getValueCapacity();
    +        // Repeated vectors are special: they have an associated offset vector
    +        // that changes the value count of the contained vectors.
     
    -      // The amount of memory consumed by the payload: the actual
    -      // data stored in the vectors.
    +        @SuppressWarnings("resource")
    +        UInt4Vector offsetVector = ((RepeatedValueVector) v).getOffsetVector();
    +        entryCount = offsetVector.getAccessor().get(valueCount);
    +        estElementCount = roundUp(entryCount, valueCount);
    +        if (isMap) {
     
    -      dataSize = vv.getPayloadByteCount();
    +          // For map, the only data associated with the map vector
    +          // itself is the offset vector, if any.
     
    -      // Determine "density" the number of rows compared to potential
    -      // capacity. Low-density batches occur at block boundaries, ends
    -      // of files and so on. Low-density batches throw off our estimates
    -      // for Varchar columns because we don't know the actual number of
    -      // bytes consumed (that information is hidden behind the Varchar
    -      // implementation where we can't get at it.)
    +          mapSize = offsetVector.getPayloadByteCount(valueCount);
    +        }
    +      } else {
    +        entryCount = valueCount;
    +        estElementCount = 1;
    +      }
     
    -      density = roundUp(dataSize * 100, totalSize);
    -      estSize = roundUp(dataSize, rowCount);
    -      variableWidth = vv instanceof VariableWidthVector ;
    +      if (isMap) {
    +        dataSize = mapSize;
    +        stdSize = 0;
    +     } else {
    +        dataSize = v.getPayloadByteCount(valueCount);
    +        stdSize = TypeHelper.getSize(metadata.getType());
    +     }
    +      estSize = roundUp(dataSize, valueCount);
    +      isVariableWidth = v instanceof VariableWidthVector ;
         }
     
         @Override
         public String toString() {
           StringBuilder buf = new StringBuilder()
    +          .append(prefix)
               .append(metadata.getName())
               .append("(type: ")
    +          .append(metadata.getType().getMode().name())
    +          .append(" ")
               .append(metadata.getType().getMinorType().name())
    -          .append(", std col. size: ")
    +          .append(", count: ")
    +          .append(valueCount);
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        buf.append(", total entries: ")
    +           .append(entryCount)
    +           .append(", per-array: ")
    +           .append(estElementCount);
    +      }
    +      buf .append(", std size: ")
               .append(stdSize)
    -          .append(", actual col. size: ")
    +          .append(", actual size: ")
               .append(estSize)
    -          .append(", total size: ")
    -          .append(totalSize)
               .append(", data size: ")
               .append(dataSize)
    -          .append(", row capacity: ")
    -          .append(capacity)
    -          .append(", density: ")
    -          .append(density)
               .append(")");
           return buf.toString();
         }
    +
    +    public void buildAllocHelper(SmartAllocationHelper allocHelper) {
    --- End diff --
    
    Sorry. There is an existing class called "AllocationHelper" but it is dumb: it uses guesses for Varchar width, number of array elements -- and just plain ignores certain types. So, if the new "allocation helper" is "less dumb", then it is "smart..."
    
    (Note that this class also has a lame name: "RecordBatchSizer". It was supposed to be a temporary hack until we did the job right; hence the silly name...)
    
    But, since you want to review some additional code, I changed the name to "VectorInitializer." Eventually, just as with the "sizer" itself, this should be part of an expanded runtime metadata system wrapped around record batches.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128128116
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/SpillSet.java ---
    @@ -413,7 +419,12 @@ public SpillSet(DrillConfig config, FragmentHandle handle, PhysicalOperator popC
           fileManager = new HadoopFileManager(spillFs);
         }
     
    -    spillDirName = String.format("%s_%s_%s-%s_minor%s", QueryIdHelper.getQueryId(handle.getQueryId()),
    +    // If provided with a prefix to identify the Drillbit, prepend that to the
    +    // spill directory.
    +
    +    spillDirName = String.format("%s%s_%s_%s-%s-%s",
    +        ep == null ? "" : ep.getAddress() + "-" + ep.getUserPort() + "/",
    --- End diff --
    
    Will never be empty in production. Test code is required to provide a suitable value. Still, added an assert to ensure that this is true.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128142873
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/SmartAllocationHelper.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.record;
    +
    +import java.util.HashMap;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.drill.exec.vector.AllocationHelper;
    +import org.apache.drill.exec.vector.ValueVector;
    +import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedMapVector;
    +
    +/**
    + * Prototype mechanism to allocate vectors based on expected
    + * data sizes. This version uses a name-based map of fields
    + * to sizes. Better to represent the batch structurally and
    + * simply iterate over the schema rather than doing a per-field
    + * lookup. But, the mechanisms needed to do the efficient solution
    + * don't exist yet.
    + */
    +
    +public class SmartAllocationHelper {
    +
    +  public static class AllocationHint {
    --- End diff --
    
    Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128130180
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -19,7 +19,125 @@
     
     import com.google.common.annotations.VisibleForTesting;
     
    +/**
    + * Computes the memory needs for input batches, spill batches and merge
    + * batches. The key challenges that this code tries to overcome are:
    + * <ul>
    + * <li>Drill is not designed for the small memory allocations,
    + * but the planner may provide such allocations because the memory per
    + * query is divided among slices (minor fragments) and among buffering
    + * operators, leaving very little per operator.</li>
    + * <li>Drill does not provide the detailed memory information needed to
    + * carefully manage memory in tight constraints.</li>
    + * <li>But, Drill has a death penalty for going over the memory limit.</li>
    + * </ul>
    + * As a result, this class is a bit of a hack: it attempt to consider a
    + * number of ill-defined factors in order to divide up memory use in a
    + * way that prevents OOM errors.
    + * <p>
    + * First, it is necessary to differentiate two concepts:
    + * <ul>
    + * <li>The <i>data size</i> of a batch: the amount of memory needed to hold
    + * the data itself. The data size is constant for any given batch.</li>
    + * <li>The <i>buffer size</i> of the buffers that hold the data. The buffer
    + * size varies wildly depending on how the batch was produced.</li>
    + * </ul>
    + * The three kinds of buffer layouts seen to date include:
    + * <ul>
    + * <li>One buffer per vector component (data, offsets, null flags, etc.)
    + * &ndash; create by readers, project and other operators.</li>
    + * <li>One buffer for the entire batch, with each vector component using
    + * a slice of the overall buffer. &ndash; case for batches deserialized from
    + * exchanges.</li>
    + * <li>One buffer for each top-level vector, with component vectors
    + * using slices of the overall vector buffer &ndash; the result of reading
    + * spilled batches from disk.</li>
    + * </ul>
    + * In each case, buffer sizes are power-of-two rounded from the data size.
    + * But since the data is grouped differently in each case, the resulting buffer
    + * sizes vary considerably.
    + * <p>
    + * As a result, we can never be sure of the amount of memory needed for a
    + * batch. So, we have to estimate based on a number of factors:
    + * <ul>
    + * <li>Uses the {@link RecordBatchSizer} to estimate the data size and
    + * buffer size of each incoming batch.</li>
    + * <li>Estimates the internal fragmentation due to power-of-two rounding.</li>
    + * <li>Configured preferences for spill and output batches.</li>
    + * </ul>
    + * The code handles "normal" and "low" memory conditions.
    + * <ul>
    + * <li>In normal memory, we simply work out the number of preferred-size
    + * batches fit in memory (based on the predicted buffer size.)</li>
    + * <li>In low memory, we divide up the available memory to produce the
    + * spill and merge batch sizes. The sizes will be less than the configured
    --- End diff --
    
    Added more words; let me know if they help to clarify the situation. (The low-memory condition is not just something we dreamed up; it was required to cope with the silly-low memory conditions that QA kept creating...)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127077314
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -100,10 +102,12 @@ private PriorityQueueCopier newCopier(VectorAccessible batch) {
        * @param batchGroupList
        * @param outputContainer
        * @param targetRecordCount
    +   * @param allocHelper
        * @return
        */
    -  public BatchMerger startMerge(BatchSchema schema, List<? extends BatchGroup> batchGroupList, VectorContainer outputContainer, int targetRecordCount) {
    -    return new BatchMerger(this, schema, batchGroupList, outputContainer, targetRecordCount);
    +  public BatchMerger startMerge(BatchSchema schema, List<? extends BatchGroup> batchGroupList,
    --- End diff --
    
    How about creating another **startMerge(...)** without the last **allocHelper** parameter, which would then call **BatchMerger(......., null)** ; this new overloaded one would be used in places like TestCopier.java, instead of adding a last null parameter there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124106839
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -189,30 +238,29 @@ public RecordBatchSizer(VectorAccessible va) {
       public RecordBatchSizer(VectorAccessible va, SelectionVector2 sv2) {
         rowCount = va.getRecordCount();
         for (VectorWrapper<?> vw : va) {
    -      int size = measureColumn(vw.getValueVector());
    -      if ( size > maxSize ) { maxSize = size; }
    -      if ( vw.getField().isNullable() ) { numNullables++; }
    +      measureColumn(vw.getValueVector(), "", rowCount);
    +    }
    +
    +    for (BufferLedger ledger : ledgers) {
    +      accountedMemorySize += ledger.getAccountedSize();
         }
     
         if (rowCount > 0) {
    -      grossRowWidth = roundUp(totalBatchSize, rowCount);
    +      grossRowWidth = roundUp(accountedMemorySize, rowCount);
         }
     
         if (sv2 != null) {
           sv2Size = sv2.getBuffer(false).capacity();
    -      grossRowWidth += roundUp(sv2Size, rowCount);
    -      netRowWidth += 2;
    +      accountedMemorySize += sv2Size;
         }
     
    -    int totalDensity = 0;
    -    int usableCount = 0;
    -    for (ColumnSize colSize : columnSizes) {
    -      if ( colSize.density > 0 ) {
    -        usableCount++;
    -      }
    -      totalDensity += colSize.density;
    -    }
    -    avgDensity = roundUp(totalDensity, usableCount);
    +    computeEstimates();
    +  }
    +
    +  private void computeEstimates() {
    +    grossRowWidth = roundUp(accountedMemorySize, rowCount);
    +    netRowWidth = roundUp(netBatchSize, rowCount);
    +    avgDensity = roundUp(netBatchSize * 100, accountedMemorySize);
       }
     
       public void applySv2() {
    --- End diff --
    
    hasSv2 is never set, hence always 'false'. So when this method is called (from analyzeIncomingBatch() ), it increases the accountedMemorySize even though there is no SV2 !!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124909795
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SorterWrapper.java ---
    @@ -83,10 +83,9 @@ private SingleBatchSorter newSorter(VectorAccessible batch) {
         ClassGenerator<SingleBatchSorter> g = cg.getRoot();
         cg.plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
    -  cg.saveCodeForDebugging(true);
    +    cg.saveCodeForDebugging(true);
    --- End diff --
    
    comment out  ..


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127345445
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/complex/BaseRepeatedValueVector.java ---
    @@ -210,13 +212,15 @@ protected void replaceDataVector(ValueVector v) {
       }
     
       @Override
    -  public int getAllocatedByteCount() {
    -    return offsets.getAllocatedByteCount() + vector.getAllocatedByteCount();
    +  public void getLedgers(Set<BufferLedger> ledgers) {
    --- End diff --
    
    This method looks more like a "set" than a "get" .....



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128144743
  
    --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java ---
    @@ -247,27 +249,26 @@ public void copyEntry(int toIndex, ValueVector from, int fromIndex) {
       }
     
       @Override
    -  public int getAllocatedByteCount() {
    -    return offsetVector.getAllocatedByteCount() + super.getAllocatedByteCount();
    +  public void getLedgers(Set<BufferLedger> ledgers) {
    +    offsetVector.getLedgers(ledgers);
    +    super.getLedgers(ledgers);
       }
     
       @Override
    -  public int getPayloadByteCount() {
    -    UInt${type.width}Vector.Accessor a = offsetVector.getAccessor();
    -    int count = a.getValueCount();
    -    if (count == 0) {
    +  public int getPayloadByteCount(int valueCount) {
    +    if (valueCount == 0) {
           return 0;
    -    } else {
    -      // If 1 or more values, then the last value is set to
    -      // the offset of the next value, which is the same as
    -      // the length of existing values.
    -      // In addition to the actual data bytes, we must also
    -      // include the "overhead" bytes: the offset vector entries
    -      // that accompany each column value. Thus, total payload
    -      // size is consumed text bytes + consumed offset vector
    -      // bytes.
    -      return a.get(count-1) + offsetVector.getPayloadByteCount();
         }
    +    // If 1 or more values, then the last value is set to
    +    // the offset of the next value, which is the same as
    +    // the length of existing values.
    +    // In addition to the actual data bytes, we must also
    +    // include the "overhead" bytes: the offset vector entries
    +    // that accompany each column value. Thus, total payload
    +    // size is consumed text bytes + consumed offset vector
    +    // bytes.
    +    return offsetVector.getAccessor().get(valueCount) +
    --- End diff --
    
    Remember that offset vectors are 1 larger than the number of values. If we have three entries, "a", "bb", and "ccc", then we have four offsets: 0, 2, 4, 7. The number of bytes used by the values is the last offset, the one in position 3, or 7. Then, we need the number of bytes from the offset vector itself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124430031
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -19,7 +19,125 @@
     
     import com.google.common.annotations.VisibleForTesting;
     
    +/**
    + * Computes the memory needs for input batches, spill batches and merge
    + * batches. The key challenges that this code tries to overcome are:
    + * <ul>
    + * <li>Drill is not designed for the small memory allocations,
    + * but the planner may provide such allocations because the memory per
    + * query is divided among slices (minor fragments) and among buffering
    + * operators, leaving very little per operator.</li>
    + * <li>Drill does not provide the detailed memory information needed to
    + * carefully manage memory in tight constraints.</li>
    + * <li>But, Drill has a death penalty for going over the memory limit.</li>
    + * </ul>
    + * As a result, this class is a bit of a hack: it attempt to consider a
    + * number of ill-defined factors in order to divide up memory use in a
    + * way that prevents OOM errors.
    + * <p>
    + * First, it is necessary to differentiate two concepts:
    + * <ul>
    + * <li>The <i>data size</i> of a batch: the amount of memory needed to hold
    + * the data itself. The data size is constant for any given batch.</li>
    + * <li>The <i>buffer size</i> of the buffers that hold the data. The buffer
    + * size varies wildly depending on how the batch was produced.</li>
    + * </ul>
    + * The three kinds of buffer layouts seen to date include:
    + * <ul>
    + * <li>One buffer per vector component (data, offsets, null flags, etc.)
    + * &ndash; create by readers, project and other operators.</li>
    + * <li>One buffer for the entire batch, with each vector component using
    + * a slice of the overall buffer. &ndash; case for batches deserialized from
    + * exchanges.</li>
    + * <li>One buffer for each top-level vector, with component vectors
    + * using slices of the overall vector buffer &ndash; the result of reading
    + * spilled batches from disk.</li>
    + * </ul>
    + * In each case, buffer sizes are power-of-two rounded from the data size.
    + * But since the data is grouped differently in each case, the resulting buffer
    + * sizes vary considerably.
    + * <p>
    + * As a result, we can never be sure of the amount of memory needed for a
    + * batch. So, we have to estimate based on a number of factors:
    + * <ul>
    + * <li>Uses the {@link RecordBatchSizer} to estimate the data size and
    + * buffer size of each incoming batch.</li>
    + * <li>Estimates the internal fragmentation due to power-of-two rounding.</li>
    + * <li>Configured preferences for spill and output batches.</li>
    + * </ul>
    + * The code handles "normal" and "low" memory conditions.
    + * <ul>
    + * <li>In normal memory, we simply work out the number of preferred-size
    + * batches fit in memory (based on the predicted buffer size.)</li>
    + * <li>In low memory, we divide up the available memory to produce the
    + * spill and merge batch sizes. The sizes will be less than the configured
    + * preference.</li>
    + * </ul>
    + */
    +
     public class SortMemoryManager {
    +  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ExternalSortBatch.class);
    +
    +  /**
    +   * Estimate for typical internal fragmentation in a buffer due to power-of-two
    +   * rounding on vectors.
    +   * <p>
    +   * <p>
    +   * <pre>[____|__$__]</pre>
    +   * In the above, the brackets represent the whole vector. The
    +   * first half is always full. When the first half filled, the second
    +   * half was allocated. On average, the second half will be half full.
    +   * This means that, on average, 1/4 of the allocated space is
    +   * unused (the definition of internal fragmentation.)
    +   */
    +
    +  public static final double INTERNAL_FRAGMENTATION_ESTIMATE = 1.0/4.0;
    +
    +  /**
    +   * Given a buffer, this is the assumed amount of space
    +   * available for data. (Adding more will double the buffer
    +   * size half the time.)
    +   */
    +
    +  public static final double PAYLOAD_MULTIPLIER = 1 - INTERNAL_FRAGMENTATION_ESTIMATE;
    +
    +  /**
    +   * Given a data size, this is the multiplier to create the buffer
    +   * size estimate. (Note: since we work with aggregate batches, we
    +   * cannot simply round up to the next power of two: rounding is done
    +   * on a vector-by-vector basis. Here we need to estimate the aggregate
    +   * effect of rounding.
    +   */
    +
    +  public static final double BUFFER_MULTIPLIER = 1/PAYLOAD_MULTIPLIER;
    +  public static final double WORST_CASE_MULTIPLIER = 2.0;
    +
    +  public static final int MIN_ROWS_PER_SORT_BATCH = 100;
    --- End diff --
    
    Possibly make the min rows a configurable option, so could be tested with different values.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124121514
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/MSortTemplate.java ---
    @@ -92,8 +93,9 @@ public void setup(final FragmentContext context, final BufferAllocator allocator
        * @return
        */
       public static long memoryNeeded(final int recordCount) {
    -    // We need 4 bytes (SV4) for each record.
    -    return recordCount * 4;
    +    // We need 4 bytes (SV4) for each record, power of 2 rounded.
    --- End diff --
    
    Does the implementation of __memoryNeeded()__ in SortRecordBatchBuilder.java also need to be similarly modified ?
      And a minor comment: Would the name __memoryNeededForSV4()__ be more descriptive ?  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128322098
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    --- End diff --
    
    This appears to be a Drill standard: it appears all over the code. But, renamed the variable anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128130382
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -19,7 +19,125 @@
     
     import com.google.common.annotations.VisibleForTesting;
     
    +/**
    + * Computes the memory needs for input batches, spill batches and merge
    + * batches. The key challenges that this code tries to overcome are:
    + * <ul>
    + * <li>Drill is not designed for the small memory allocations,
    + * but the planner may provide such allocations because the memory per
    + * query is divided among slices (minor fragments) and among buffering
    + * operators, leaving very little per operator.</li>
    + * <li>Drill does not provide the detailed memory information needed to
    + * carefully manage memory in tight constraints.</li>
    + * <li>But, Drill has a death penalty for going over the memory limit.</li>
    + * </ul>
    + * As a result, this class is a bit of a hack: it attempt to consider a
    + * number of ill-defined factors in order to divide up memory use in a
    + * way that prevents OOM errors.
    + * <p>
    + * First, it is necessary to differentiate two concepts:
    + * <ul>
    + * <li>The <i>data size</i> of a batch: the amount of memory needed to hold
    + * the data itself. The data size is constant for any given batch.</li>
    + * <li>The <i>buffer size</i> of the buffers that hold the data. The buffer
    + * size varies wildly depending on how the batch was produced.</li>
    + * </ul>
    + * The three kinds of buffer layouts seen to date include:
    + * <ul>
    + * <li>One buffer per vector component (data, offsets, null flags, etc.)
    + * &ndash; create by readers, project and other operators.</li>
    + * <li>One buffer for the entire batch, with each vector component using
    + * a slice of the overall buffer. &ndash; case for batches deserialized from
    + * exchanges.</li>
    + * <li>One buffer for each top-level vector, with component vectors
    + * using slices of the overall vector buffer &ndash; the result of reading
    + * spilled batches from disk.</li>
    + * </ul>
    + * In each case, buffer sizes are power-of-two rounded from the data size.
    + * But since the data is grouped differently in each case, the resulting buffer
    + * sizes vary considerably.
    + * <p>
    + * As a result, we can never be sure of the amount of memory needed for a
    + * batch. So, we have to estimate based on a number of factors:
    + * <ul>
    + * <li>Uses the {@link RecordBatchSizer} to estimate the data size and
    + * buffer size of each incoming batch.</li>
    + * <li>Estimates the internal fragmentation due to power-of-two rounding.</li>
    + * <li>Configured preferences for spill and output batches.</li>
    + * </ul>
    + * The code handles "normal" and "low" memory conditions.
    + * <ul>
    + * <li>In normal memory, we simply work out the number of preferred-size
    + * batches fit in memory (based on the predicted buffer size.)</li>
    + * <li>In low memory, we divide up the available memory to produce the
    + * spill and merge batch sizes. The sizes will be less than the configured
    + * preference.</li>
    + * </ul>
    + */
    +
     public class SortMemoryManager {
    +  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ExternalSortBatch.class);
    +
    +  /**
    +   * Estimate for typical internal fragmentation in a buffer due to power-of-two
    +   * rounding on vectors.
    +   * <p>
    +   * <p>
    +   * <pre>[____|__$__]</pre>
    +   * In the above, the brackets represent the whole vector. The
    +   * first half is always full. When the first half filled, the second
    +   * half was allocated. On average, the second half will be half full.
    +   * This means that, on average, 1/4 of the allocated space is
    +   * unused (the definition of internal fragmentation.)
    +   */
    +
    +  public static final double INTERNAL_FRAGMENTATION_ESTIMATE = 1.0/4.0;
    +
    +  /**
    +   * Given a buffer, this is the assumed amount of space
    +   * available for data. (Adding more will double the buffer
    +   * size half the time.)
    +   */
    +
    +  public static final double PAYLOAD_MULTIPLIER = 1 - INTERNAL_FRAGMENTATION_ESTIMATE;
    +
    +  /**
    +   * Given a data size, this is the multiplier to create the buffer
    +   * size estimate. (Note: since we work with aggregate batches, we
    +   * cannot simply round up to the next power of two: rounding is done
    +   * on a vector-by-vector basis. Here we need to estimate the aggregate
    +   * effect of rounding.
    +   */
    +
    +  public static final double BUFFER_MULTIPLIER = 1/PAYLOAD_MULTIPLIER;
    +  public static final double WORST_CASE_MULTIPLIER = 2.0;
    +
    +  public static final int MIN_ROWS_PER_SORT_BATCH = 100;
    --- End diff --
    
    I could. But, this is used only when memory is squeezed as a feeble attempt to avoid creating tiny spill batches. Actually, I'm thinking to remove this later since, under low memory, we really don't have much leverage in setting batch sizes.
    
    Also used in logging to log a warning when the Drill "big data" system is forced to process "too few" records per batch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #860: DRILL-5601: Rollup of external sort fixes and improvements

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/860
  
    Rebased onto master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127357111
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClientFixture.java ---
    @@ -231,4 +236,120 @@ public void setControls(String controls) {
       public RowSetBuilder rowSetBuilder(BatchSchema schema) {
         return new RowSetBuilder(allocator(), schema);
       }
    +
    +  /**
    +   * Very simple parser for semi-colon separated lists of SQL statements which
    +   * handles quoted semicolons. Drill can execute only one statement at a time
    +   * (without a trailing semi-colon.) This parser breaks up a statement list
    +   * into single statements. Input:<code><pre>
    +   * USE a.b;
    +   * ALTER SESSION SET `foo` = ";";
    +   * SELECT * FROM bar WHERE x = "\";";
    +   * </pre><code>Output:
    +   * <ul>
    +   * <li><tt>USE a.b</tt></li>
    +   * <li><tt>ALTER SESSION SET `foo` = ";"</tt></li>
    +   * <li><tt>SELECT * FROM bar WHERE x = "\";"</tt></li>
    +   */
    +
    +  public static class StatementParser {
    +    private final Reader in;
    +    private StringBuilder buf;
    +
    +    public StatementParser(Reader in) {
    +      this.in = in;
    +    }
    +
    +    public String parseNext() throws IOException {
    +      boolean eof = false;
    +      buf = new StringBuilder();
    +      outer:
    +      for (;;) {
    +        int c = in.read();
    +        if (c == -1) {
    +          eof = true;
    +          break;
    +        }
    +        if (c == ';') {
    +          break;
    +        }
    +        buf.append((char) c);
    +        if (c == '"' || c == '\'' || c == '`') {
    +          int quote = c;
    +          boolean escape = false;
    +          for (;;) {
    +            c = in.read();
    +            if (c == -1) {
    +              break outer;
    --- End diff --
    
    Note: This code would return a truncated SQL Statement (non matched quote), as appears in the input.
    Option: Throw some user exception here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128367935
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java ---
    @@ -114,15 +127,25 @@ public SortConfig(DrillConfig config) {
         } else {
           mSortBatchSize = Character.MAX_VALUE;
         }
    +
    +    // Disable the single-batch shortcut; primarily for testing
    +
    +    if (config.hasPath(ExecConstants.EXTERNAL_SORT_OPTIMIZE_SINGLE_BATCH)) {
    +      enableSingleBatchShortcut = config.getBoolean(ExecConstants.EXTERNAL_SORT_OPTIMIZE_SINGLE_BATCH);
    +    } else {
    --- End diff --
    
    Yes. But the "final" provides useful information: the value can't change once set. Once we accept "final" then we're stuck with the awkward code to initialize the value only once.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128125131
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggTemplate.java ---
    @@ -36,6 +36,9 @@
       private int previousIndex = -1;
       private int underlyingIndex = 0;
       private int currentIndex;
    +  /**
    +   * Number of records added to the current aggregation group. (?)
    --- End diff --
    
    Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128127648
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -177,7 +226,7 @@ public RecordBatchSizer(RecordBatch batch) {
       /**
        *  Count the nullable columns; used for memory estimation
        */
    -  public int numNullables;
    +  public int nullableCount;
       /**
        *
        * @param va
    --- End diff --
    
    Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127351398
  
    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -239,9 +239,7 @@ drill.exec: {
         external: {
           // Drill uses the managed External Sort Batch by default.
           // Set this to true to use the legacy, unmanaged version.
    -      // Disabled in the intial commit, to be enabled after
    -      // tests are committed.
    -      disable_managed: true,
    +      disable_managed: false,
    --- End diff --
    
    Back to "true" ....



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r123834281
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    +      this.prefix = prefix;
    +      this.valueCount = valueCount;
    +      metadata = v.getField();
    +      boolean isMap = v.getField().getType().getMinorType() == MinorType.MAP;
     
    -      totalSize = vv.getAllocatedByteCount();
    +      // The amount of memory consumed by the payload: the actual
    +      // data stored in the vectors.
     
    -      // Capacity is the number of values that the vector could
    -      // contain. This is useful only for fixed-length vectors.
    +      int mapSize = 0;
    +      if (v.getField().getDataMode() == DataMode.REPEATED) {
     
    -      capacity = vv.getValueCapacity();
    +        // Repeated vectors are special: they have an associated offset vector
    +        // that changes the value count of the contained vectors.
     
    -      // The amount of memory consumed by the payload: the actual
    -      // data stored in the vectors.
    +        @SuppressWarnings("resource")
    +        UInt4Vector offsetVector = ((RepeatedValueVector) v).getOffsetVector();
    +        entryCount = offsetVector.getAccessor().get(valueCount);
    +        estElementCount = roundUp(entryCount, valueCount);
    +        if (isMap) {
     
    -      dataSize = vv.getPayloadByteCount();
    +          // For map, the only data associated with the map vector
    +          // itself is the offset vector, if any.
     
    -      // Determine "density" the number of rows compared to potential
    -      // capacity. Low-density batches occur at block boundaries, ends
    -      // of files and so on. Low-density batches throw off our estimates
    -      // for Varchar columns because we don't know the actual number of
    -      // bytes consumed (that information is hidden behind the Varchar
    -      // implementation where we can't get at it.)
    +          mapSize = offsetVector.getPayloadByteCount(valueCount);
    +        }
    +      } else {
    +        entryCount = valueCount;
    +        estElementCount = 1;
    +      }
     
    -      density = roundUp(dataSize * 100, totalSize);
    -      estSize = roundUp(dataSize, rowCount);
    -      variableWidth = vv instanceof VariableWidthVector ;
    +      if (isMap) {
    +        dataSize = mapSize;
    +        stdSize = 0;
    +     } else {
    +        dataSize = v.getPayloadByteCount(valueCount);
    +        stdSize = TypeHelper.getSize(metadata.getType());
    +     }
    +      estSize = roundUp(dataSize, valueCount);
    +      isVariableWidth = v instanceof VariableWidthVector ;
         }
     
         @Override
         public String toString() {
           StringBuilder buf = new StringBuilder()
    +          .append(prefix)
               .append(metadata.getName())
               .append("(type: ")
    +          .append(metadata.getType().getMode().name())
    +          .append(" ")
               .append(metadata.getType().getMinorType().name())
    -          .append(", std col. size: ")
    +          .append(", count: ")
    +          .append(valueCount);
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        buf.append(", total entries: ")
    +           .append(entryCount)
    +           .append(", per-array: ")
    +           .append(estElementCount);
    +      }
    +      buf .append(", std size: ")
               .append(stdSize)
    -          .append(", actual col. size: ")
    +          .append(", actual size: ")
               .append(estSize)
    -          .append(", total size: ")
    -          .append(totalSize)
               .append(", data size: ")
               .append(dataSize)
    -          .append(", row capacity: ")
    -          .append(capacity)
    -          .append(", density: ")
    -          .append(density)
               .append(")");
           return buf.toString();
         }
    +
    +    public void buildAllocHelper(SmartAllocationHelper allocHelper) {
    +      int width = 0;
    +      switch(metadata.getType().getMinorType()) {
    --- End diff --
    
    Could be nicer by adding a method to MinorType called **isVariable()** (i.e. check if any of those 3), then this switch statement can be replaced by a simple if statement.  (Maybe that proposed method is the same as checking the local variable **isVariableWidth** ???)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124437209
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -97,24 +215,43 @@
     
       private SortConfig config;
     
    -  private int estimatedInputSize;
    -
       private boolean potentialOverflow;
     
    -  public SortMemoryManager(SortConfig config, long memoryLimit) {
    +  private boolean isLowMemory;
    +
    +  private boolean performanceWarning;
    +
    +  public SortMemoryManager(SortConfig config, long opMemoryLimit) {
         this.config = config;
     
         // The maximum memory this operator can use as set by the
         // operator definition (propagated to the allocator.)
     
         if (config.maxMemory() > 0) {
    --- End diff --
    
    Since opMemoryLimit > 0, it can be put in the if() condition instead of 0, hence no need for the Math.min() . (Also the whole statement can be written in a single code line using ?:  )



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #860: DRILL-5601: Rollup of external sort fixes and improvements

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/860

# Additional Revisions

The code contains an optimization for short queries. If the query has just one batch then the sort need not copy that batch to produce the output. Instead, the sort simply passes along the input batch along with an SV2. This commit fixes some issues with transferring that single, buffered, input batch to the sort’s output container.

While reproducing problems found by QA, it turned out to be useful to be able to parse SQL statements from a file in the context of the test framework. Added test methods for this purpose.

The “AllocationHelper” class turns out to throw an exception if asked to allocate a vector with zero elements. Since a zero-size can occasionally come from the record batch sizer (via the “smart allocation helper”, special code was added to handle this case.

Backed out some debugging code that accidentally appeared in the original PR.

With this set of improvements, the revised, managed sort has become more stable and reliable than the original version. As a result, this commit enables the managed sort by default.

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127363488
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -70,52 +72,90 @@
          */
     
         public final int estSize;
    +
    +    /**
    +     * Number of times the value here (possibly repeated) appears in
    +     * the record batch.
    +     */
    +
         public final int valueCount;
    -    public final int entryCount;
    -    public final int dataSize;
    -    public final int estElementCount;
    +
    +    /**
    +     * Number of times the entries of this column appears. If this is a
    +     * scalar, the entry count is the same as the value count. If this
    --- End diff --
    
    If a scalar, entryCount would be 1 .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124436676
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -19,7 +19,125 @@
     
     import com.google.common.annotations.VisibleForTesting;
     
    +/**
    + * Computes the memory needs for input batches, spill batches and merge
    + * batches. The key challenges that this code tries to overcome are:
    + * <ul>
    + * <li>Drill is not designed for the small memory allocations,
    + * but the planner may provide such allocations because the memory per
    + * query is divided among slices (minor fragments) and among buffering
    + * operators, leaving very little per operator.</li>
    + * <li>Drill does not provide the detailed memory information needed to
    + * carefully manage memory in tight constraints.</li>
    + * <li>But, Drill has a death penalty for going over the memory limit.</li>
    + * </ul>
    + * As a result, this class is a bit of a hack: it attempt to consider a
    + * number of ill-defined factors in order to divide up memory use in a
    + * way that prevents OOM errors.
    + * <p>
    + * First, it is necessary to differentiate two concepts:
    + * <ul>
    + * <li>The <i>data size</i> of a batch: the amount of memory needed to hold
    + * the data itself. The data size is constant for any given batch.</li>
    + * <li>The <i>buffer size</i> of the buffers that hold the data. The buffer
    + * size varies wildly depending on how the batch was produced.</li>
    + * </ul>
    + * The three kinds of buffer layouts seen to date include:
    + * <ul>
    + * <li>One buffer per vector component (data, offsets, null flags, etc.)
    + * &ndash; create by readers, project and other operators.</li>
    + * <li>One buffer for the entire batch, with each vector component using
    + * a slice of the overall buffer. &ndash; case for batches deserialized from
    + * exchanges.</li>
    + * <li>One buffer for each top-level vector, with component vectors
    + * using slices of the overall vector buffer &ndash; the result of reading
    + * spilled batches from disk.</li>
    + * </ul>
    + * In each case, buffer sizes are power-of-two rounded from the data size.
    + * But since the data is grouped differently in each case, the resulting buffer
    + * sizes vary considerably.
    + * <p>
    + * As a result, we can never be sure of the amount of memory needed for a
    + * batch. So, we have to estimate based on a number of factors:
    + * <ul>
    + * <li>Uses the {@link RecordBatchSizer} to estimate the data size and
    + * buffer size of each incoming batch.</li>
    + * <li>Estimates the internal fragmentation due to power-of-two rounding.</li>
    + * <li>Configured preferences for spill and output batches.</li>
    + * </ul>
    + * The code handles "normal" and "low" memory conditions.
    + * <ul>
    + * <li>In normal memory, we simply work out the number of preferred-size
    + * batches fit in memory (based on the predicted buffer size.)</li>
    + * <li>In low memory, we divide up the available memory to produce the
    + * spill and merge batch sizes. The sizes will be less than the configured
    + * preference.</li>
    + * </ul>
    + */
    +
     public class SortMemoryManager {
    +  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ExternalSortBatch.class);
    +
    +  /**
    +   * Estimate for typical internal fragmentation in a buffer due to power-of-two
    +   * rounding on vectors.
    +   * <p>
    +   * <p>
    +   * <pre>[____|__$__]</pre>
    +   * In the above, the brackets represent the whole vector. The
    +   * first half is always full. When the first half filled, the second
    +   * half was allocated. On average, the second half will be half full.
    +   * This means that, on average, 1/4 of the allocated space is
    +   * unused (the definition of internal fragmentation.)
    +   */
    +
    +  public static final double INTERNAL_FRAGMENTATION_ESTIMATE = 1.0/4.0;
    +
    +  /**
    +   * Given a buffer, this is the assumed amount of space
    +   * available for data. (Adding more will double the buffer
    +   * size half the time.)
    +   */
    +
    +  public static final double PAYLOAD_MULTIPLIER = 1 - INTERNAL_FRAGMENTATION_ESTIMATE;
    +
    +  /**
    +   * Given a data size, this is the multiplier to create the buffer
    +   * size estimate. (Note: since we work with aggregate batches, we
    +   * cannot simply round up to the next power of two: rounding is done
    +   * on a vector-by-vector basis. Here we need to estimate the aggregate
    +   * effect of rounding.
    +   */
    +
    +  public static final double BUFFER_MULTIPLIER = 1/PAYLOAD_MULTIPLIER;
    +  public static final double WORST_CASE_MULTIPLIER = 2.0;
    +
    +  public static final int MIN_ROWS_PER_SORT_BATCH = 100;
    +  public static final double LOW_MEMORY_MERGE_BATCH_RATIO = 0.25;
    +
    +  public static class BatchSizeEstimate {
    +    int dataSize;
    +    int expectedBufferSize;
    +    int maxBufferSize;
    +
    +    public void setFromData(int dataSize) {
    +      this.dataSize = dataSize;
    +      expectedBufferSize = multiply(dataSize, BUFFER_MULTIPLIER);
    +      maxBufferSize = multiply(dataSize, WORST_CASE_MULTIPLIER);
    +    }
    +
    +    public void setFromBuffer(int bufferSize) {
    +      expectedBufferSize = bufferSize;
    +      dataSize = multiply(bufferSize, PAYLOAD_MULTIPLIER);
    +      maxBufferSize = multiply(dataSize, WORST_CASE_MULTIPLIER);
    +    }
    +
    +    public void setFromWorstCaseBuffer(int bufferSize) {
    +      maxBufferSize = bufferSize;
    +      dataSize = multiply(maxBufferSize, 1 / WORST_CASE_MULTIPLIER);
    +      expectedBufferSize = multiply(dataSize, BUFFER_MULTIPLIER);
    --- End diff --
    
    Here and in setFromData, the expected buffer size is  data * 4/3, while the max buffer is 2 * data , hence the expected is %66 of the max. Should it be %75 ? 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127536602
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ---
    @@ -111,6 +120,8 @@ public void buildSchema() throws SchemaChangeException {
           case STOP:
             state = BatchState.STOP;
             return;
    +      default:
    +        break;
    --- End diff --
    
    Not getting into a "religious" debate, but "default+break" is just a useless no-op


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124122920
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -82,7 +84,7 @@ private PriorityQueueCopier newCopier(VectorAccessible batch) {
         ClassGenerator<PriorityQueueCopier> g = cg.getRoot();
         cg.plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
    -//    cg.saveCodeForDebugging(true);
    +    cg.saveCodeForDebugging(true);
    --- End diff --
    
    ditto ... re-comment ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128133649
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -19,7 +19,125 @@
     
     import com.google.common.annotations.VisibleForTesting;
     
    +/**
    + * Computes the memory needs for input batches, spill batches and merge
    + * batches. The key challenges that this code tries to overcome are:
    + * <ul>
    + * <li>Drill is not designed for the small memory allocations,
    + * but the planner may provide such allocations because the memory per
    + * query is divided among slices (minor fragments) and among buffering
    + * operators, leaving very little per operator.</li>
    + * <li>Drill does not provide the detailed memory information needed to
    + * carefully manage memory in tight constraints.</li>
    + * <li>But, Drill has a death penalty for going over the memory limit.</li>
    + * </ul>
    + * As a result, this class is a bit of a hack: it attempt to consider a
    + * number of ill-defined factors in order to divide up memory use in a
    + * way that prevents OOM errors.
    + * <p>
    + * First, it is necessary to differentiate two concepts:
    + * <ul>
    + * <li>The <i>data size</i> of a batch: the amount of memory needed to hold
    + * the data itself. The data size is constant for any given batch.</li>
    + * <li>The <i>buffer size</i> of the buffers that hold the data. The buffer
    + * size varies wildly depending on how the batch was produced.</li>
    + * </ul>
    + * The three kinds of buffer layouts seen to date include:
    + * <ul>
    + * <li>One buffer per vector component (data, offsets, null flags, etc.)
    + * &ndash; create by readers, project and other operators.</li>
    + * <li>One buffer for the entire batch, with each vector component using
    + * a slice of the overall buffer. &ndash; case for batches deserialized from
    + * exchanges.</li>
    + * <li>One buffer for each top-level vector, with component vectors
    + * using slices of the overall vector buffer &ndash; the result of reading
    + * spilled batches from disk.</li>
    + * </ul>
    + * In each case, buffer sizes are power-of-two rounded from the data size.
    + * But since the data is grouped differently in each case, the resulting buffer
    + * sizes vary considerably.
    + * <p>
    + * As a result, we can never be sure of the amount of memory needed for a
    + * batch. So, we have to estimate based on a number of factors:
    + * <ul>
    + * <li>Uses the {@link RecordBatchSizer} to estimate the data size and
    + * buffer size of each incoming batch.</li>
    + * <li>Estimates the internal fragmentation due to power-of-two rounding.</li>
    + * <li>Configured preferences for spill and output batches.</li>
    + * </ul>
    + * The code handles "normal" and "low" memory conditions.
    + * <ul>
    + * <li>In normal memory, we simply work out the number of preferred-size
    + * batches fit in memory (based on the predicted buffer size.)</li>
    + * <li>In low memory, we divide up the available memory to produce the
    + * spill and merge batch sizes. The sizes will be less than the configured
    + * preference.</li>
    + * </ul>
    + */
    +
     public class SortMemoryManager {
    +  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ExternalSortBatch.class);
    +
    +  /**
    +   * Estimate for typical internal fragmentation in a buffer due to power-of-two
    +   * rounding on vectors.
    +   * <p>
    +   * <p>
    +   * <pre>[____|__$__]</pre>
    +   * In the above, the brackets represent the whole vector. The
    +   * first half is always full. When the first half filled, the second
    +   * half was allocated. On average, the second half will be half full.
    +   * This means that, on average, 1/4 of the allocated space is
    +   * unused (the definition of internal fragmentation.)
    +   */
    +
    +  public static final double INTERNAL_FRAGMENTATION_ESTIMATE = 1.0/4.0;
    +
    +  /**
    +   * Given a buffer, this is the assumed amount of space
    +   * available for data. (Adding more will double the buffer
    +   * size half the time.)
    +   */
    +
    +  public static final double PAYLOAD_MULTIPLIER = 1 - INTERNAL_FRAGMENTATION_ESTIMATE;
    +
    +  /**
    +   * Given a data size, this is the multiplier to create the buffer
    +   * size estimate. (Note: since we work with aggregate batches, we
    +   * cannot simply round up to the next power of two: rounding is done
    +   * on a vector-by-vector basis. Here we need to estimate the aggregate
    +   * effect of rounding.
    +   */
    +
    +  public static final double BUFFER_MULTIPLIER = 1/PAYLOAD_MULTIPLIER;
    +  public static final double WORST_CASE_MULTIPLIER = 2.0;
    +
    +  public static final int MIN_ROWS_PER_SORT_BATCH = 100;
    +  public static final double LOW_MEMORY_MERGE_BATCH_RATIO = 0.25;
    +
    +  public static class BatchSizeEstimate {
    +    int dataSize;
    +    int expectedBufferSize;
    +    int maxBufferSize;
    +
    +    public void setFromData(int dataSize) {
    +      this.dataSize = dataSize;
    +      expectedBufferSize = multiply(dataSize, BUFFER_MULTIPLIER);
    +      maxBufferSize = multiply(dataSize, WORST_CASE_MULTIPLIER);
    +    }
    +
    +    public void setFromBuffer(int bufferSize) {
    +      expectedBufferSize = bufferSize;
    +      dataSize = multiply(bufferSize, PAYLOAD_MULTIPLIER);
    +      maxBufferSize = multiply(dataSize, WORST_CASE_MULTIPLIER);
    +    }
    +
    +    public void setFromWorstCaseBuffer(int bufferSize) {
    +      maxBufferSize = bufferSize;
    +      dataSize = multiply(maxBufferSize, 1 / WORST_CASE_MULTIPLIER);
    +      expectedBufferSize = multiply(dataSize, BUFFER_MULTIPLIER);
    --- End diff --
    
    Right. To make sure the numbers are right this time...
    
    * buffer size given is already the 2x the data size.
    * The data size is half that.
    * The expected size is 3/2 of the data size.
    
    For example, assume we have the following buffer, 75% filled:
    
    ```
    [xxxx] <-- actual
    [xxxx] <-- best case buffer
    [xxxx----] <-- worst case buffer
    [xxxx--] <-- average (expected) case buffer
    ```
    
    So, yes, the numbers are off. BUFFER_MULTIPLIER should be 3/2. That is, for 2x data, will, on average need 3x of memory.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128129255
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -245,29 +250,35 @@ private BatchMerger(PriorityQueueCopierWrapper holder, BatchSchema schema, List<
     
         @Override
         public boolean next() {
    -      Stopwatch w = Stopwatch.createStarted();
           long start = holder.getAllocator().getAllocatedMemory();
    +
    +      // Allocate an outgoing container the "dumb" way (based on static sizes)
    +      // for testing, or the "smart" way (based on actual observed data sizes)
    +      // for production code.
    +
    +      if (allocHelper == null) {
    +        VectorAccessibleUtilities.allocateVectors(outputContainer, targetRecordCount);
    +      } else {
    +        allocHelper.allocateBatch(outputContainer, targetRecordCount);
    +      }
    +      logger.trace("Initial output batch allocation: {} bytes",
    +                   holder.getAllocator().getAllocatedMemory() - start);
    +      Stopwatch w = Stopwatch.createStarted();
           int count = holder.copier.next(targetRecordCount);
    -      copyCount += count;
           if (count > 0) {
             long t = w.elapsed(TimeUnit.MICROSECONDS);
             batchCount++;
    -        logger.trace("Took {} us to merge {} records", t, count);
             long size = holder.getAllocator().getAllocatedMemory() - start;
    +        logger.trace("Took {} us to merge {} records, consuming {} bytes of memory",
    +                     t, count, size);
             estBatchSize = Math.max(estBatchSize, size);
           } else {
             logger.trace("copier returned 0 records");
           }
     
    -      // Identify the schema to be used in the output container. (Since
    -      // all merged batches have the same schema, the schema we identify
    -      // here should be the same as that which we already had.
    +      // Initialize output container metadata.
    --- End diff --
    
    They were actually a bit off the mark and reflected a partial understanding. The very nature of the buildSchema() is just to copy schema from vectors into the schema for the batch; it has nothing (directly) to do with the schema of incoming batches.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124428100
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -19,7 +19,125 @@
     
     import com.google.common.annotations.VisibleForTesting;
     
    +/**
    + * Computes the memory needs for input batches, spill batches and merge
    + * batches. The key challenges that this code tries to overcome are:
    + * <ul>
    + * <li>Drill is not designed for the small memory allocations,
    + * but the planner may provide such allocations because the memory per
    + * query is divided among slices (minor fragments) and among buffering
    + * operators, leaving very little per operator.</li>
    + * <li>Drill does not provide the detailed memory information needed to
    + * carefully manage memory in tight constraints.</li>
    + * <li>But, Drill has a death penalty for going over the memory limit.</li>
    + * </ul>
    + * As a result, this class is a bit of a hack: it attempt to consider a
    + * number of ill-defined factors in order to divide up memory use in a
    + * way that prevents OOM errors.
    + * <p>
    + * First, it is necessary to differentiate two concepts:
    + * <ul>
    + * <li>The <i>data size</i> of a batch: the amount of memory needed to hold
    + * the data itself. The data size is constant for any given batch.</li>
    + * <li>The <i>buffer size</i> of the buffers that hold the data. The buffer
    + * size varies wildly depending on how the batch was produced.</li>
    + * </ul>
    + * The three kinds of buffer layouts seen to date include:
    + * <ul>
    + * <li>One buffer per vector component (data, offsets, null flags, etc.)
    + * &ndash; create by readers, project and other operators.</li>
    + * <li>One buffer for the entire batch, with each vector component using
    + * a slice of the overall buffer. &ndash; case for batches deserialized from
    + * exchanges.</li>
    + * <li>One buffer for each top-level vector, with component vectors
    + * using slices of the overall vector buffer &ndash; the result of reading
    + * spilled batches from disk.</li>
    + * </ul>
    + * In each case, buffer sizes are power-of-two rounded from the data size.
    + * But since the data is grouped differently in each case, the resulting buffer
    + * sizes vary considerably.
    + * <p>
    + * As a result, we can never be sure of the amount of memory needed for a
    + * batch. So, we have to estimate based on a number of factors:
    + * <ul>
    + * <li>Uses the {@link RecordBatchSizer} to estimate the data size and
    + * buffer size of each incoming batch.</li>
    + * <li>Estimates the internal fragmentation due to power-of-two rounding.</li>
    + * <li>Configured preferences for spill and output batches.</li>
    + * </ul>
    + * The code handles "normal" and "low" memory conditions.
    + * <ul>
    + * <li>In normal memory, we simply work out the number of preferred-size
    + * batches fit in memory (based on the predicted buffer size.)</li>
    + * <li>In low memory, we divide up the available memory to produce the
    + * spill and merge batch sizes. The sizes will be less than the configured
    --- End diff --
    
    Nice explanation up to here; but need more reasoning/explanation about the "low memory" - what's different here ? What's the purpose of the configured limits ? Why sizes are smaller (won't that yield a higher number of batches, hence may OOM ?)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124132387
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -245,29 +250,35 @@ private BatchMerger(PriorityQueueCopierWrapper holder, BatchSchema schema, List<
     
         @Override
         public boolean next() {
    -      Stopwatch w = Stopwatch.createStarted();
           long start = holder.getAllocator().getAllocatedMemory();
    +
    +      // Allocate an outgoing container the "dumb" way (based on static sizes)
    +      // for testing, or the "smart" way (based on actual observed data sizes)
    +      // for production code.
    +
    +      if (allocHelper == null) {
    +        VectorAccessibleUtilities.allocateVectors(outputContainer, targetRecordCount);
    +      } else {
    +        allocHelper.allocateBatch(outputContainer, targetRecordCount);
    +      }
    +      logger.trace("Initial output batch allocation: {} bytes",
    +                   holder.getAllocator().getAllocatedMemory() - start);
    +      Stopwatch w = Stopwatch.createStarted();
           int count = holder.copier.next(targetRecordCount);
    -      copyCount += count;
           if (count > 0) {
             long t = w.elapsed(TimeUnit.MICROSECONDS);
             batchCount++;
    -        logger.trace("Took {} us to merge {} records", t, count);
             long size = holder.getAllocator().getAllocatedMemory() - start;
    +        logger.trace("Took {} us to merge {} records, consuming {} bytes of memory",
    +                     t, count, size);
             estBatchSize = Math.max(estBatchSize, size);
           } else {
             logger.trace("copier returned 0 records");
           }
     
    -      // Identify the schema to be used in the output container. (Since
    -      // all merged batches have the same schema, the schema we identify
    -      // here should be the same as that which we already had.
    +      // Initialize output container metadata.
    --- End diff --
    
    Why remove the original comments ?  They still look valid.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128143565
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/SmartAllocationHelper.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.record;
    +
    +import java.util.HashMap;
    +import java.util.Map;
    +import java.util.Map.Entry;
    +
    +import org.apache.drill.exec.vector.AllocationHelper;
    +import org.apache.drill.exec.vector.ValueVector;
    +import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedMapVector;
    +
    +/**
    + * Prototype mechanism to allocate vectors based on expected
    + * data sizes. This version uses a name-based map of fields
    + * to sizes. Better to represent the batch structurally and
    + * simply iterate over the schema rather than doing a per-field
    + * lookup. But, the mechanisms needed to do the efficient solution
    + * don't exist yet.
    + */
    +
    +public class SmartAllocationHelper {
    +
    +  public static class AllocationHint {
    +    public final int entryWidth;
    +    public final int elementCount;
    +
    +    private AllocationHint(int width, int elements) {
    +      entryWidth = width;
    +      elementCount = elements;
    +    }
    +
    +    @Override
    +    public String toString() {
    +      StringBuilder buf = new StringBuilder()
    +          .append("{");
    +      boolean comma = false;
    +      if (entryWidth > 0) {
    +        buf.append("width=")
    +           .append(entryWidth);
    +        comma = true;
    +      }
    +      if (elementCount > 0) {
    +        if (comma) {
    +          buf.append(", ");
    +        }
    +        buf.append("elements=")
    +        .append(elementCount);
    +      }
    +      buf.append("}");
    +      return buf.toString();
    +    }
    +  }
    +
    +  private Map<String, AllocationHint> hints = new HashMap<>();
    +
    +  public void variableWidth(String name, int width) {
    +    hints.put(name, new AllocationHint(width, 1));
    +  }
    +
    +  public void fixedWidthArray(String name, int elements) {
    +    hints.put(name, new AllocationHint(0, elements));
    +  }
    +
    +  public void variableWidthArray(String name, int width, int elements) {
    +    hints.put(name, new AllocationHint(width, elements));
    +  }
    +
    +  public void allocateBatch(VectorAccessible va, int recordCount) {
    +    for (VectorWrapper<?> w: va) {
    +      allocateVector(w.getValueVector(), "", recordCount);
    +    }
    +  }
    +
    +  private void allocateVector(ValueVector vector, String prefix, int recordCount) {
    --- End diff --
    
    It certainly could be. But, as it turns out, maps are (strangely) vectors: the MapVector.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128127127
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -139,14 +184,16 @@ public String toString() {
        */
       private int rowCount;
       /**
    -   * Standard row width using Drill meta-data.
    +   * Standard row width using Drill meta-data. Note: this information is
    +   * 100% bogus. Do not use it.
        */
    +  @Deprecated
       private int stdRowWidth;
    --- End diff --
    
    Ah, good. My snarky comment had its affect... Do you want to go ahead and remove the references in HashAgg, then remove the field here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128367397
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -70,52 +72,90 @@
          */
     
         public final int estSize;
    +
    +    /**
    +     * Number of times the value here (possibly repeated) appears in
    +     * the record batch.
    +     */
    +
         public final int valueCount;
    -    public final int entryCount;
    -    public final int dataSize;
    -    public final int estElementCount;
    +
    +    /**
    +     * Number of times the entries of this column appears. If this is a
    +     * scalar, the entry count is the same as the value count. If this
    --- End diff --
    
    The comment is poorly worded. Meant "array of scalar" vs. "Array of variable-width". There is also "array of array of scalar | variable width"... Please check if the revised comment makes this any clearer. (It is hard to explain without first explaining the whole column data model...)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #860: DRILL-5601: Rollup of external sort fixes and improvements

Posted by jinfengni <gi...@git.apache.org>.

Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/860
  
    +1
    
    LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127358527
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/AllocationHelper.java ---
    @@ -26,14 +26,16 @@ public static void allocate(ValueVector v, int valueCount, int bytesPerValue) {
         allocate(v, valueCount, bytesPerValue, 5);
       }
     
    -  public static void allocatePrecomputedChildCount(ValueVector v, int valueCount, int bytesPerValue, int childValCount){
    -    if(v instanceof FixedWidthVector) {
    +  public static void allocatePrecomputedChildCount(ValueVector v, int valueCount, int bytesPerValue, int childValCount) {
    --- End diff --
    
    Discouraged: single letter parameter/field names. Even **"vv"** would be easier to search, if needed.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127344297
  
    --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java ---
    @@ -247,27 +249,26 @@ public void copyEntry(int toIndex, ValueVector from, int fromIndex) {
       }
     
       @Override
    -  public int getAllocatedByteCount() {
    -    return offsetVector.getAllocatedByteCount() + super.getAllocatedByteCount();
    +  public void getLedgers(Set<BufferLedger> ledgers) {
    +    offsetVector.getLedgers(ledgers);
    +    super.getLedgers(ledgers);
       }
     
       @Override
    -  public int getPayloadByteCount() {
    -    UInt${type.width}Vector.Accessor a = offsetVector.getAccessor();
    -    int count = a.getValueCount();
    -    if (count == 0) {
    +  public int getPayloadByteCount(int valueCount) {
    +    if (valueCount == 0) {
           return 0;
    -    } else {
    -      // If 1 or more values, then the last value is set to
    -      // the offset of the next value, which is the same as
    -      // the length of existing values.
    -      // In addition to the actual data bytes, we must also
    -      // include the "overhead" bytes: the offset vector entries
    -      // that accompany each column value. Thus, total payload
    -      // size is consumed text bytes + consumed offset vector
    -      // bytes.
    -      return a.get(count-1) + offsetVector.getPayloadByteCount();
         }
    +    // If 1 or more values, then the last value is set to
    +    // the offset of the next value, which is the same as
    +    // the length of existing values.
    +    // In addition to the actual data bytes, we must also
    +    // include the "overhead" bytes: the offset vector entries
    +    // that accompany each column value. Thus, total payload
    +    // size is consumed text bytes + consumed offset vector
    +    // bytes.
    +    return offsetVector.getAccessor().get(valueCount) +
    --- End diff --
    
    Should this be **(valueCount - 1)** ??



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r123814368
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    --- End diff --
    
    Programming 101: "v" - The use of a single letter variable name should be discouraged ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #860: DRILL-5601: Rollup of external sort fixes and improvements

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/860
  
    Squashed commits.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128365742
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/PriorityQueueCopierWrapper.java ---
    @@ -82,7 +84,7 @@ private PriorityQueueCopier newCopier(VectorAccessible batch) {
         ClassGenerator<PriorityQueueCopier> g = cg.getRoot();
         cg.plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
    -//    cg.saveCodeForDebugging(true);
    +    cg.saveCodeForDebugging(true);
    --- End diff --
    
    I guess the fiddle-factor is getting to high; time to implement your suggestion of having options to enable this stuff. I'll try to add that in some future PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r124435170
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -312,52 +488,66 @@ private void adjustForLowMemory() {
        * one spill batch to make progress.
        */
     
    -  private void lowMemorySpillBatchSize() {
    +  private void lowMemoryInternalBatchSizes() {
     
         // The "expected" size is with power-of-two rounding in some vectors.
         // We later work backwards to the row count assuming average internal
         // fragmentation.
     
    -    // Must hold two input batches. Use (most of) the rest for the spill batch.
    +    // Must hold two input batches. Use half of the rest for the spill batch.
    +    // In a really bad case, the number here may be negative. We'll fix
    +    // it below.
     
    -    expectedSpillBatchSize = (int) (memoryLimit - 2 * estimatedInputSize);
    +    int spillBufferSize = (int) (memoryLimit - 2 * inputBatchSize.maxBufferSize) / 2;
    --- End diff --
    
    (1) spilled "buffer" or spilled "batch" size ?
    (2) Where is the possible negative number fixed ? Looks like setFromWorstCase() would set all three fields to negative numbers. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r123839714
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -139,14 +184,16 @@ public String toString() {
        */
       private int rowCount;
       /**
    -   * Standard row width using Drill meta-data.
    +   * Standard row width using Drill meta-data. Note: this information is
    +   * 100% bogus. Do not use it.
        */
    +  @Deprecated
       private int stdRowWidth;
    --- End diff --
    
    Can just eliminate this field (and the method stdRowWidth() ). It is only being used by the HashAgg, but that code there is dead anyway, as it only applies for rowCount() == 0 , which is false as the method updateEstMaxBatchSize() is only called for a non-empty batch.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r123844716
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -19,117 +19,162 @@
     
     import java.util.ArrayList;
     import java.util.List;
    +import java.util.Set;
     
    +import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MinorType;
     import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.memory.AllocationManager.BufferLedger;
     import org.apache.drill.exec.memory.BaseAllocator;
     import org.apache.drill.exec.record.BatchSchema;
     import org.apache.drill.exec.record.MaterializedField;
     import org.apache.drill.exec.record.RecordBatch;
    +import org.apache.drill.exec.record.SmartAllocationHelper;
     import org.apache.drill.exec.record.VectorAccessible;
     import org.apache.drill.exec.record.VectorWrapper;
     import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.vector.UInt4Vector;
     import org.apache.drill.exec.vector.ValueVector;
     import org.apache.drill.exec.vector.complex.AbstractMapVector;
    +import org.apache.drill.exec.vector.complex.RepeatedValueVector;
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import com.google.common.collect.Sets;
    +
     /**
      * Given a record batch or vector container, determines the actual memory
      * consumed by each column, the average row, and the entire record batch.
      */
     
     public class RecordBatchSizer {
    -//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RecordBatchSizer.class);
     
       /**
        * Column size information.
        */
       public static class ColumnSize {
    +    public final String prefix;
         public final MaterializedField metadata;
     
         /**
    -     * Assumed size from Drill metadata.
    +     * Assumed size from Drill metadata. Note that this information is
    +     * 100% bogus. Do not use it.
          */
     
    +    @Deprecated
         public int stdSize;
     
         /**
    -     * Actual memory consumed by all the vectors associated with this column.
    -     */
    -
    -    public int totalSize;
    -
    -    /**
          * Actual average column width as determined from actual memory use. This
          * size is larger than the actual data size since this size includes per-
          * column overhead such as any unused vector space, etc.
          */
     
    -    public int estSize;
    -    public int capacity;
    -    public int density;
    -    public int dataSize;
    -    public boolean variableWidth;
    -
    -    public ColumnSize(ValueVector vv) {
    -      metadata = vv.getField();
    -      stdSize = TypeHelper.getSize(metadata.getType());
    -
    -      // Can't get size estimates if this is an empty batch.
    -      int rowCount = vv.getAccessor().getValueCount();
    -      if (rowCount == 0) {
    -        estSize = stdSize;
    -        return;
    -      }
    +    public final int estSize;
    +    public final int valueCount;
    +    public final int entryCount;
    +    public final int dataSize;
    +    public final int estElementCount;
    +    public final boolean isVariableWidth;
     
    -      // Total size taken by all vectors (and underlying buffers)
    -      // associated with this vector.
    +    public ColumnSize(ValueVector v, String prefix, int valueCount) {
    +      this.prefix = prefix;
    +      this.valueCount = valueCount;
    +      metadata = v.getField();
    +      boolean isMap = v.getField().getType().getMinorType() == MinorType.MAP;
     
    -      totalSize = vv.getAllocatedByteCount();
    +      // The amount of memory consumed by the payload: the actual
    +      // data stored in the vectors.
     
    -      // Capacity is the number of values that the vector could
    -      // contain. This is useful only for fixed-length vectors.
    +      int mapSize = 0;
    +      if (v.getField().getDataMode() == DataMode.REPEATED) {
     
    -      capacity = vv.getValueCapacity();
    +        // Repeated vectors are special: they have an associated offset vector
    +        // that changes the value count of the contained vectors.
     
    -      // The amount of memory consumed by the payload: the actual
    -      // data stored in the vectors.
    +        @SuppressWarnings("resource")
    +        UInt4Vector offsetVector = ((RepeatedValueVector) v).getOffsetVector();
    +        entryCount = offsetVector.getAccessor().get(valueCount);
    +        estElementCount = roundUp(entryCount, valueCount);
    +        if (isMap) {
     
    -      dataSize = vv.getPayloadByteCount();
    +          // For map, the only data associated with the map vector
    +          // itself is the offset vector, if any.
     
    -      // Determine "density" the number of rows compared to potential
    -      // capacity. Low-density batches occur at block boundaries, ends
    -      // of files and so on. Low-density batches throw off our estimates
    -      // for Varchar columns because we don't know the actual number of
    -      // bytes consumed (that information is hidden behind the Varchar
    -      // implementation where we can't get at it.)
    +          mapSize = offsetVector.getPayloadByteCount(valueCount);
    +        }
    +      } else {
    +        entryCount = valueCount;
    +        estElementCount = 1;
    +      }
     
    -      density = roundUp(dataSize * 100, totalSize);
    -      estSize = roundUp(dataSize, rowCount);
    -      variableWidth = vv instanceof VariableWidthVector ;
    +      if (isMap) {
    +        dataSize = mapSize;
    +        stdSize = 0;
    +     } else {
    +        dataSize = v.getPayloadByteCount(valueCount);
    +        stdSize = TypeHelper.getSize(metadata.getType());
    +     }
    +      estSize = roundUp(dataSize, valueCount);
    +      isVariableWidth = v instanceof VariableWidthVector ;
         }
     
         @Override
         public String toString() {
           StringBuilder buf = new StringBuilder()
    +          .append(prefix)
               .append(metadata.getName())
               .append("(type: ")
    +          .append(metadata.getType().getMode().name())
    +          .append(" ")
               .append(metadata.getType().getMinorType().name())
    -          .append(", std col. size: ")
    +          .append(", count: ")
    +          .append(valueCount);
    +      if (metadata.getDataMode() == DataMode.REPEATED) {
    +        buf.append(", total entries: ")
    +           .append(entryCount)
    +           .append(", per-array: ")
    +           .append(estElementCount);
    +      }
    +      buf .append(", std size: ")
               .append(stdSize)
    -          .append(", actual col. size: ")
    +          .append(", actual size: ")
               .append(estSize)
    -          .append(", total size: ")
    -          .append(totalSize)
               .append(", data size: ")
               .append(dataSize)
    -          .append(", row capacity: ")
    -          .append(capacity)
    -          .append(", density: ")
    -          .append(density)
               .append(")");
           return buf.toString();
         }
    +
    +    public void buildAllocHelper(SmartAllocationHelper allocHelper) {
    --- End diff --
    
    "Smart" and "Helper" are vague terms useful for marketing; we better use a more descriptive terminology to explain what exactly it does. E.g., "batch-allocator" instead of "alloc-helper"; for "smart" -- maybe "deep-check" , or "hint-based" ?? ....
    Also add a (javadoc) comment here. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128134981
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortMemoryManager.java ---
    @@ -312,52 +488,66 @@ private void adjustForLowMemory() {
        * one spill batch to make progress.
        */
     
    -  private void lowMemorySpillBatchSize() {
    +  private void lowMemoryInternalBatchSizes() {
     
         // The "expected" size is with power-of-two rounding in some vectors.
         // We later work backwards to the row count assuming average internal
         // fragmentation.
     
    -    // Must hold two input batches. Use (most of) the rest for the spill batch.
    +    // Must hold two input batches. Use half of the rest for the spill batch.
    +    // In a really bad case, the number here may be negative. We'll fix
    +    // it below.
     
    -    expectedSpillBatchSize = (int) (memoryLimit - 2 * estimatedInputSize);
    +    int spillBufferSize = (int) (memoryLimit - 2 * inputBatchSize.maxBufferSize) / 2;
    --- End diff --
    
    Buffer, in the sense of the amount of memory set aside for the spill batch. We work backwards to get the spill batch size.
    
    Yes, in the worst case, the estimated spill batch size will be negative, meaning we don't even have room to hold two input batches, let alone any spill batches.
    
    The negative number is not fixed. Instead, the resulting spill batch row count is clamped at a minimum of 1 in `rowsPerBatch()`. Also, we whine to the log file that we've got too little memory and that Bad Things are likely to happen.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128128588
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/MSortTemplate.java ---
    @@ -92,8 +93,9 @@ public void setup(final FragmentContext context, final BufferAllocator allocator
        * @return
        */
       public static long memoryNeeded(final int recordCount) {
    -    // We need 4 bytes (SV4) for each record.
    -    return recordCount * 4;
    +    // We need 4 bytes (SV4) for each record, power of 2 rounded.
    --- End diff --
    
    Actually, this is one of the (few) original method names.
    
    Good catch on the other method, fixed that one also.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by Ben-Zvi <gi...@git.apache.org>.

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r127852641
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java ---
    @@ -114,15 +127,25 @@ public SortConfig(DrillConfig config) {
         } else {
           mSortBatchSize = Character.MAX_VALUE;
         }
    +
    +    // Disable the single-batch shortcut; primarily for testing
    +
    +    if (config.hasPath(ExecConstants.EXTERNAL_SORT_OPTIMIZE_SINGLE_BATCH)) {
    +      enableSingleBatchShortcut = config.getBoolean(ExecConstants.EXTERNAL_SORT_OPTIMIZE_SINGLE_BATCH);
    +    } else {
    --- End diff --
    
    Minor comment - can save 3 LOC by initializing to "true" (and dropping the "final") , then no need for the else { }
     


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #860: DRILL-5601: Rollup of external sort fixes and impro...

Posted by paul-rogers <gi...@git.apache.org>.

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128124528
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ---
    @@ -111,6 +120,8 @@ public void buildSchema() throws SchemaChangeException {
           case STOP:
             state = BatchState.STOP;
             return;
    +      default:
    +        break;
    --- End diff --
    
    Agreed. But it is a useless no-op that removes a compiler warning.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---