Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2015/10/16 02:57:27 UTC

[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/9145

    [WIP] Improve cache performance for primitive types

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark byte_buffer

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9145
    
----
commit cea0e330dc7cc32a98946ea0c9915f35c1125b61
Author: Davies Liu <da...@databricks.com>
Date:   2015-10-16T00:01:12Z

    speedup reading from ByteBuffer

commit 7ee54a9aaa457f76e4d6e7585e257909a2b5d6f2
Author: Davies Liu <da...@databricks.com>
Date:   2015-10-16T00:47:01Z

    codegen

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149701868
  
    I'm going to merge this first to unblock my work on outputting UnsafeRow for the columnar cache, because having the Unsafe format inside a MutableRow could result in unexpected behavior. Any new comments will be addressed in a follow-up PR.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148668823
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43836/



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148571238
  
      [Test build #43824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43824/console) for   PR 9145 at commit [`7ee54a9`](https://github.com/apache/spark/commit/7ee54a9aaa457f76e4d6e7585e257909a2b5d6f2).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class ColumnarIterator extends Iterator[InternalRow] `
      * `      class SpecificColumnarIterator extends $`




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42278053
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk throught the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for an [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (creaters, accesses) = columnTypes.zipWithIndex.map { case (dt, index) =>
    +      val accessorName = ctx.freshName("accessor")
    +      val accessorCls = dt match {
    +        case NullType => classOf[NullColumnAccessor].getName
    +        case BooleanType => classOf[BooleanColumnAccessor].getName
    +        case ByteType => classOf[ByteColumnAccessor].getName
    +        case ShortType => classOf[ShortColumnAccessor].getName
    +        case IntegerType | DateType => classOf[IntColumnAccessor].getName
    +        case LongType | TimestampType => classOf[LongColumnAccessor].getName
    +        case FloatType => classOf[FloatColumnAccessor].getName
    +        case DoubleType => classOf[DoubleColumnAccessor].getName
    +        case StringType => classOf[StringColumnAccessor].getName
    +        case BinaryType => classOf[BinaryColumnAccessor].getName
    +        case dt: DecimalType if dt.precision <= Decimal.MAX_LONG_DIGITS =>
    +          classOf[CompactDecimalColumnAccessor].getName
    +        case dt: DecimalType => classOf[DecimalColumnAccessor].getName
    +        case struct: StructType => classOf[StructColumnAccessor].getName
    +        case array: ArrayType => classOf[ArrayColumnAccessor].getName
    +        case t: MapType => classOf[MapColumnAccessor].getName
    +      }
    +      ctx.addMutableState(accessorCls, accessorName, s"$accessorName = null;")
    +
    +      val createCode = dt match {
    +        case t if ctx.isPrimitiveType(dt) =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case NullType | StringType | BinaryType =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case other =>
    +          s"""$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder),
    +             (${dt.getClass.getName}) columnTypes[$index]);"""
    +      }
    +
    +      val extract = s"$accessorName.extractTo(mutableRow, $index);"
    +
    +      (createCode, extract)
    +    }.unzip
    +
    +    val code = s"""
    +      import java.nio.ByteBuffer;
    +      import java.nio.ByteOrder;
    +
    +      public SpecificColumnarIterator generate($exprType[] expr) {
    +        return new SpecificColumnarIterator();
    +      }
    +
    +      class SpecificColumnarIterator extends ${classOf[ColumnarIterator].getName} {
    +
    +        private ByteOrder nativeOrder = null;
    +        private byte[][] buffers = null;
    +
    +        private int currentRow = 0;
    +        private int totalRows = 0;
    +
    +        private scala.collection.Iterator input = null;
    +        private MutableRow mutableRow = null;
    +        private ${classOf[DataType].getName}[] columnTypes = null;
    +        private int[] columnIndexes = null;
    +
    +        ${declareMutableStates(ctx)}
    +
    +        public SpecificColumnarIterator() {
    +          this.nativeOrder = ByteOrder.nativeOrder();
    +          this.buffers = new byte[${columnTypes.length}][];
    +
    +          ${initMutableStates(ctx)}
    +        }
    +
    +        public void initialize(scala.collection.Iterator input, MutableRow mutableRow,
    +                               ${classOf[DataType].getName}[] columnTypes, int[] columnIndexes) {
    +          this.input = input;
    +          this.mutableRow = mutableRow;
    +          this.columnTypes = columnTypes;
    +          this.columnIndexes = columnIndexes;
    +        }
    +
    +        public boolean hasNext() {
    +          if (currentRow < totalRows) {
    +            return true;
    +          }
    +          if (!input.hasNext()) {
    +            return false;
    +          }
    +
    +          ${classOf[CachedBatch].getName} batch = (${classOf[CachedBatch].getName}) input.next();
    +          currentRow = 0;
    +          totalRows = batch.count();
    +          for (int i=0; i<columnIndexes.length; i++) {
    --- End diff --
    
    Might as well follow our own style here: add spaces around `=` and `<`.
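
For readers following the quoted diff: the generated code wraps each column's `byte[]` in a `ByteBuffer` set to the platform's native byte order before handing it to an accessor. The block below is a minimal sketch of that wrap-and-read pattern using only `java.nio`. `IntColumnReader` and its `pack` helper are hypothetical names invented for illustration; they are not Spark's `IntColumnAccessor`, whose actual implementation is not shown in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical reader (not Spark's IntColumnAccessor): decodes ints packed into a byte[].
final class IntColumnReader {
  private final ByteBuffer buffer;

  IntColumnReader(byte[] bytes) {
    // wrap() shares the array without copying; order() controls how multi-byte values decode.
    this.buffer = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder());
  }

  boolean hasNext() {
    return buffer.remaining() >= Integer.BYTES;
  }

  int next() {
    return buffer.getInt(); // relative read: advances the position by 4 bytes
  }

  // Packs ints using the same byte order, so the reader above can decode them.
  static byte[] pack(int... values) {
    ByteBuffer out = ByteBuffer.allocate(values.length * Integer.BYTES)
        .order(ByteOrder.nativeOrder());
    for (int i = 0; i < values.length; i++) {
      out.putInt(values[i]);
    }
    return out.array();
  }
}
```

The writer and the reader have to agree on byte order; the sketch uses `ByteOrder.nativeOrder()` on both sides, mirroring the `nativeOrder` field in the quoted generated class.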



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148639107
  
      [Test build #43836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43836/consoleFull) for   PR 9145 at commit [`1ef3e18`](https://github.com/apache/spark/commit/1ef3e183e522497bcad082b158e014e841efafcf).



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149337859
  
    **[Test build #43936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43936/consoleFull)** for PR 9145 at commit [`f9151cc`](https://github.com/apache/spark/commit/f9151cc553ef45eb41e848ddf1c5cc0f82598062).



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149364620
  
    **[Test build #43936 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43936/consoleFull)** for PR 9145 at commit [`f9151cc`](https://github.com/apache/spark/commit/f9151cc553ef45eb41e848ddf1c5cc0f82598062).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class ColumnarIterator extends Iterator[InternalRow] `
      * `      class SpecificColumnarIterator extends $`



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42277711
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk throught the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for an [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (creaters, accesses) = columnTypes.zipWithIndex.map { case (dt, index) =>
    --- End diff --
    
    Actually, it is probably clearer to name these `initializeAccessors` and `extractors`.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148668765
  
      [Test build #43836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43836/console) for   PR 9145 at commit [`1ef3e18`](https://github.com/apache/spark/commit/1ef3e183e522497bcad082b158e014e841efafcf).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class ColumnarIterator extends Iterator[InternalRow] `
      * `      class SpecificColumnarIterator extends $`




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149335069
  
     Merged build triggered.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148634491
  
      [Test build #43834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43834/console) for   PR 9145 at commit [`8a49887`](https://github.com/apache/spark/commit/8a498871c769fbe940575ed00e670aa180d63ec8).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class ColumnarIterator extends Iterator[InternalRow] `
      * `      class SpecificColumnarIterator extends $`




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42277995
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk throught the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for an [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (creaters, accesses) = columnTypes.zipWithIndex.map { case (dt, index) =>
    +      val accessorName = ctx.freshName("accessor")
    +      val accessorCls = dt match {
    +        case NullType => classOf[NullColumnAccessor].getName
    +        case BooleanType => classOf[BooleanColumnAccessor].getName
    +        case ByteType => classOf[ByteColumnAccessor].getName
    +        case ShortType => classOf[ShortColumnAccessor].getName
    +        case IntegerType | DateType => classOf[IntColumnAccessor].getName
    +        case LongType | TimestampType => classOf[LongColumnAccessor].getName
    +        case FloatType => classOf[FloatColumnAccessor].getName
    +        case DoubleType => classOf[DoubleColumnAccessor].getName
    +        case StringType => classOf[StringColumnAccessor].getName
    +        case BinaryType => classOf[BinaryColumnAccessor].getName
    +        case dt: DecimalType if dt.precision <= Decimal.MAX_LONG_DIGITS =>
    +          classOf[CompactDecimalColumnAccessor].getName
    +        case dt: DecimalType => classOf[DecimalColumnAccessor].getName
    +        case struct: StructType => classOf[StructColumnAccessor].getName
    +        case array: ArrayType => classOf[ArrayColumnAccessor].getName
    +        case t: MapType => classOf[MapColumnAccessor].getName
    +      }
    +      ctx.addMutableState(accessorCls, accessorName, s"$accessorName = null;")
    +
    +      val createCode = dt match {
    +        case t if ctx.isPrimitiveType(dt) =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case NullType | StringType | BinaryType =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case other =>
    +          s"""$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder),
    +             (${dt.getClass.getName}) columnTypes[$index]);"""
    +      }
    +
    +      val extract = s"$accessorName.extractTo(mutableRow, $index);"
    +
    +      (createCode, extract)
    +    }.unzip
    +
    +    val code = s"""
    +      import java.nio.ByteBuffer;
    +      import java.nio.ByteOrder;
    +
    +      public SpecificColumnarIterator generate($exprType[] expr) {
    +        return new SpecificColumnarIterator();
    +      }
    +
    +      class SpecificColumnarIterator extends ${classOf[ColumnarIterator].getName} {
    +
    +        private ByteOrder nativeOrder = null;
    +        private byte[][] buffers = null;
    +
    +        private int currentRow = 0;
    +        private int totalRows = 0;
    +
    +        private scala.collection.Iterator input = null;
    +        private MutableRow mutableRow = null;
    +        private ${classOf[DataType].getName}[] columnTypes = null;
    +        private int[] columnIndexes = null;
    +
    +        ${declareMutableStates(ctx)}
    +
    +        public SpecificColumnarIterator() {
    +          this.nativeOrder = ByteOrder.nativeOrder();
    +          this.buffers = new byte[${columnTypes.length}][];
    +
    +          ${initMutableStates(ctx)}
    +        }
    +
    +        public void initialize(scala.collection.Iterator input, MutableRow mutableRow,
    +                               ${classOf[DataType].getName}[] columnTypes, int[] columnIndexes) {
    +          this.input = input;
    +          this.mutableRow = mutableRow;
    +          this.columnTypes = columnTypes;
    +          this.columnIndexes = columnIndexes;
    +        }
    +
    +        public boolean hasNext() {
    +          if (currentRow < totalRows) {
    +            return true;
    +          }
    +          if (!input.hasNext()) {
    +            return false;
    --- End diff --
    
    when can this happen?
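
As context for the question above, here is a minimal Java sketch of the batch-draining pattern that the quoted `hasNext()` follows. `BatchedIterator` is a hypothetical stand-in for the generated `SpecificColumnarIterator`, and each `int[]` plays the role of a `CachedBatch`. In this sketch the `return false` branch is reached only once the rows of the current batch are used up and the input iterator has no further batches. Unlike the quoted code, the sketch loops instead of checking once, so empty batches are skipped; that is a choice made here, not something the patch is confirmed to do.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the generated iterator: each int[] acts as one cached batch.
final class BatchedIterator implements Iterator<Integer> {
  private final Iterator<int[]> input;
  private int[] batch = new int[0];
  private int currentRow = 0;
  private int totalRows = 0;

  BatchedIterator(Iterator<int[]> input) {
    this.input = input;
  }

  @Override
  public boolean hasNext() {
    // Keep pulling batches until one has an unread row (this skips empty batches).
    while (currentRow >= totalRows) {
      if (!input.hasNext()) {
        return false; // current batch is drained and no batches remain
      }
      batch = input.next();
      currentRow = 0;
      totalRows = batch.length;
    }
    return true;
  }

  @Override
  public Integer next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return batch[currentRow++];
  }
}
```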



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149364736
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148804541
  
    Merged build started.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148806821
  
      [Test build #43844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43844/consoleFull) for   PR 9145 at commit [`4511781`](https://github.com/apache/spark/commit/4511781885fef87a19c5139235fd6b5e7cfa1825).



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42277662
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk throught the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for an [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (creaters, accesses) = columnTypes.zipWithIndex.map { case (dt, index) =>
    --- End diff --
    
    Typo: `creaters` should be `creators`.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149338203
  
    LGTM, although I didn't look super closely, so it might be good to get an extra pair of eyes on it too.




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149364738
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43936/



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148628177
  
    Merged build started.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148637437
  
     Merged build triggered.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148839870
  
      [Test build #43844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43844/console) for   PR 9145 at commit [`4511781`](https://github.com/apache/spark/commit/4511781885fef87a19c5139235fd6b5e7cfa1825).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class ColumnarIterator extends Iterator[InternalRow] `
      * `      class SpecificColumnarIterator extends $`




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149335106
  
    Merged build started.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42566202
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
    @@ -219,11 +251,11 @@ private[sql] object DOUBLE extends NativeColumnType(DoubleType, 8) {
       }
     
       override def extract(buffer: ByteBuffer): Double = {
    -    buffer.getDouble()
    +    ByteBufferHelper.getDouble(buffer)
       }
     
       override def extract(buffer: ByteBuffer, row: MutableRow, ordinal: Int): Unit = {
    -    row.setDouble(ordinal, buffer.getDouble())
    +    row.setDouble(ordinal, ByteBufferHelper.getDouble(buffer))
       }
     
       override def setField(row: MutableRow, ordinal: Int, value: Double): Unit = {
    --- End diff --
    
    I don't think it's worth it.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148634518
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148628168
  
     Merged build triggered.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148840462
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43844/



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148840460
  
    Merged build finished. Test PASSed.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148634519
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43834/



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148629024
  
      [Test build #43834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43834/consoleFull) for   PR 9145 at commit [`8a49887`](https://github.com/apache/spark/commit/8a498871c769fbe940575ed00e670aa180d63ec8).



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-149788964
  
    LGTM



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42278194
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/InMemoryColumnarTableScan.scala ---
    @@ -43,7 +42,7 @@ private[sql] object InMemoryRelation {
           tableName)()
     }
     
    -private[sql] case class CachedBatch(buffers: Array[Array[Byte]], stats: InternalRow)
    +private[sql] case class CachedBatch(count: Int, buffers: Array[Array[Byte]], stats: InternalRow)
    --- End diff --
    
    let's add classdoc for count, buffers, and stats.
    
    Without looking at the code it is less clear what count is here. (Is it the total number of rows?)
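
    For illustration only, the requested classdoc might look roughly like the sketch below. `CachedBatchSketch` is a hypothetical stand-in, not the class in the PR: `stats` is a plain `Seq[Any]` here instead of `InternalRow` so the snippet compiles without Spark, and the meaning given for `count` (rows in the batch) is an assumption based on the question above, not something this comment confirms.

```scala
/**
 * Hypothetical stand-in for the cached batch, documented the way the review asks.
 *
 * @param count   number of rows stored in this batch (assumed meaning, see lead-in)
 * @param buffers serialized column data, one byte array per column
 * @param stats   statistics for the batch (a plain Seq here; the PR uses InternalRow)
 */
final case class CachedBatchSketch(
    count: Int,
    buffers: Array[Array[Byte]],
    stats: Seq[Any])
```

    With the parameters documented, a reader no longer has to open the scan code to learn what the leading `Int` means.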



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148564299
  
     Merged build triggered.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148668822
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42275706
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
    @@ -28,6 +28,36 @@ import org.apache.spark.sql.types._
     import org.apache.spark.unsafe.Platform
     import org.apache.spark.unsafe.types.UTF8String
     
    +
    +/**
    + * A help class for fast reading Int/Long/Float/Double from ByteBuffer in native order.
    + */
    --- End diff --
    
    put a big warning here that this only works with HeapByteBuffer.
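
    A minimal sketch of why such a warning matters, assuming the fast path reads the buffer's backing byte array directly (the helper's actual body is not shown in this thread). `HeapBufferReader` is a hypothetical name, not the PR's `ByteBufferHelper`. A direct (off-heap) buffer has no accessible backing array, so the sketch rejects it up front.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical helper for illustration; it is not the PR's ByteBufferHelper.
object HeapBufferReader {
  /** Reads the next Int from an array-backed buffer, honoring the buffer's byte order. */
  def getInt(buffer: ByteBuffer): Int = {
    // Direct buffers (and read-only views) report hasArray == false.
    require(buffer.hasArray, "only array-backed (heap) ByteBuffers are supported")
    require(buffer.remaining() >= 4, "need at least 4 bytes")
    val pos = buffer.position()
    val bytes = buffer.array()
    val base = buffer.arrayOffset() + pos
    buffer.position(pos + 4) // advance the cursor like ByteBuffer.getInt() does
    val b0 = bytes(base) & 0xff
    val b1 = bytes(base + 1) & 0xff
    val b2 = bytes(base + 2) & 0xff
    val b3 = bytes(base + 3) & 0xff
    if (buffer.order() == ByteOrder.LITTLE_ENDIAN) {
      b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)
    } else {
      (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
    }
  }
}
```

    Calling `HeapBufferReader.getInt` on a buffer from `ByteBuffer.allocateDirect` fails the first `require` rather than reading garbage, which is the kind of guard (or at least documentation) the comment asks for.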




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148804511
  
     Merged build triggered.



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by cloud-fan <gi...@git.apache.org>.

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42584730
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk throught the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for an [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (initializeAccessors, extractors) = columnTypes.zipWithIndex.map { case (dt, index) =>
    +      val accessorName = ctx.freshName("accessor")
    +      val accessorCls = dt match {
    +        case NullType => classOf[NullColumnAccessor].getName
    +        case BooleanType => classOf[BooleanColumnAccessor].getName
    +        case ByteType => classOf[ByteColumnAccessor].getName
    +        case ShortType => classOf[ShortColumnAccessor].getName
    +        case IntegerType | DateType => classOf[IntColumnAccessor].getName
    +        case LongType | TimestampType => classOf[LongColumnAccessor].getName
    +        case FloatType => classOf[FloatColumnAccessor].getName
    +        case DoubleType => classOf[DoubleColumnAccessor].getName
    +        case StringType => classOf[StringColumnAccessor].getName
    +        case BinaryType => classOf[BinaryColumnAccessor].getName
    +        case dt: DecimalType if dt.precision <= Decimal.MAX_LONG_DIGITS =>
    +          classOf[CompactDecimalColumnAccessor].getName
    +        case dt: DecimalType => classOf[DecimalColumnAccessor].getName
    +        case struct: StructType => classOf[StructColumnAccessor].getName
    +        case array: ArrayType => classOf[ArrayColumnAccessor].getName
    +        case t: MapType => classOf[MapColumnAccessor].getName
    +      }
    +      ctx.addMutableState(accessorCls, accessorName, s"$accessorName = null;")
    +
    +      val createCode = dt match {
    +        case t if ctx.isPrimitiveType(dt) =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case NullType | StringType | BinaryType =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case other =>
    +          s"""$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder),
    +             (${dt.getClass.getName}) columnTypes[$index]);"""
    +      }
    +
    +      val extract = s"$accessorName.extractTo(mutableRow, $index);"
    +
    +      (createCode, extract)
    +    }.unzip
    +
    +    val code = s"""
    +      import java.nio.ByteBuffer;
    +      import java.nio.ByteOrder;
    +
    +      public SpecificColumnarIterator generate($exprType[] expr) {
    +        return new SpecificColumnarIterator();
    +      }
    +
    +      class SpecificColumnarIterator extends ${classOf[ColumnarIterator].getName} {
    +
    +        private ByteOrder nativeOrder = null;
    +        private byte[][] buffers = null;
    +
    +        private int currentRow = 0;
    +        private int numRowsInBatch = 0;
    +
    +        private scala.collection.Iterator input = null;
    +        private MutableRow mutableRow = null;
    +        private ${classOf[DataType].getName}[] columnTypes = null;
    +        private int[] columnIndexes = null;
    +
    +        ${declareMutableStates(ctx)}
    +
    +        public SpecificColumnarIterator() {
    +          this.nativeOrder = ByteOrder.nativeOrder();
    +          this.buffers = new byte[${columnTypes.length}][];
    +
    +          ${initMutableStates(ctx)}
    +        }
    +
    +        public void initialize(scala.collection.Iterator input, MutableRow mutableRow,
    +                               ${classOf[DataType].getName}[] columnTypes, int[] columnIndexes) {
    --- End diff --
    
    wrong indent here



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42275556
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala ---
    @@ -44,11 +45,13 @@ private class CodeFormatter {
         } else {
           indentString
         }
    +    code.append(f"${currentLine}%03d ")
    --- End diff --
    
    as discussed offline, we can add `/* ... */` to still enable pasting this into an IDE.
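
    The idea can be sketched as follows. `LineNumberSketch` and `withLineNumbers` are hypothetical names used only for illustration and are not part of `CodeFormatter`. Each line number is wrapped in a block comment, so the numbered output is still valid Java when pasted into an IDE.

```scala
// Hypothetical illustration of the suggestion; not the CodeFormatter implementation.
object LineNumberSketch {
  /** Prefixes every line with a zero-padded line number inside a block comment. */
  def withLineNumbers(code: String): String =
    code.split("\n").zipWithIndex.map { case (line, i) =>
      f"/* ${i + 1}%03d */ $line"
    }.mkString("\n")
}
```

    For example, the two-line input `int a = 1;` / `int b = 2;` becomes `/* 001 */ int a = 1;` / `/* 002 */ int b = 2;`, which a Java compiler accepts unchanged.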



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42412746
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk throught the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for an [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (creaters, accesses) = columnTypes.zipWithIndex.map { case (dt, index) =>
    +      val accessorName = ctx.freshName("accessor")
    +      val accessorCls = dt match {
    +        case NullType => classOf[NullColumnAccessor].getName
    +        case BooleanType => classOf[BooleanColumnAccessor].getName
    +        case ByteType => classOf[ByteColumnAccessor].getName
    +        case ShortType => classOf[ShortColumnAccessor].getName
    +        case IntegerType | DateType => classOf[IntColumnAccessor].getName
    +        case LongType | TimestampType => classOf[LongColumnAccessor].getName
    +        case FloatType => classOf[FloatColumnAccessor].getName
    +        case DoubleType => classOf[DoubleColumnAccessor].getName
    +        case StringType => classOf[StringColumnAccessor].getName
    +        case BinaryType => classOf[BinaryColumnAccessor].getName
    +        case dt: DecimalType if dt.precision <= Decimal.MAX_LONG_DIGITS =>
    +          classOf[CompactDecimalColumnAccessor].getName
    +        case dt: DecimalType => classOf[DecimalColumnAccessor].getName
    +        case struct: StructType => classOf[StructColumnAccessor].getName
    +        case array: ArrayType => classOf[ArrayColumnAccessor].getName
    +        case t: MapType => classOf[MapColumnAccessor].getName
    +      }
    +      ctx.addMutableState(accessorCls, accessorName, s"$accessorName = null;")
    +
    +      val createCode = dt match {
    +        case t if ctx.isPrimitiveType(dt) =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case NullType | StringType | BinaryType =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case other =>
    +          s"""$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder),
    +             (${dt.getClass.getName}) columnTypes[$index]);"""
    +      }
    +
    +      val extract = s"$accessorName.extractTo(mutableRow, $index);"
    +
    +      (createCode, extract)
    +    }.unzip
    +
    +    val code = s"""
    +      import java.nio.ByteBuffer;
    +      import java.nio.ByteOrder;
    +
    +      public SpecificColumnarIterator generate($exprType[] expr) {
    +        return new SpecificColumnarIterator();
    +      }
    +
    +      class SpecificColumnarIterator extends ${classOf[ColumnarIterator].getName} {
    +
    +        private ByteOrder nativeOrder = null;
    +        private byte[][] buffers = null;
    +
    +        private int currentRow = 0;
    +        private int totalRows = 0;
    +
    +        private scala.collection.Iterator input = null;
    +        private MutableRow mutableRow = null;
    +        private ${classOf[DataType].getName}[] columnTypes = null;
    +        private int[] columnIndexes = null;
    +
    +        ${declareMutableStates(ctx)}
    +
    +        public SpecificColumnarIterator() {
    +          this.nativeOrder = ByteOrder.nativeOrder();
    +          this.buffers = new byte[${columnTypes.length}][];
    +
    +          ${initMutableStates(ctx)}
    +        }
    +
    +        public void initialize(scala.collection.Iterator input, MutableRow mutableRow,
    +                               ${classOf[DataType].getName}[] columnTypes, int[] columnIndexes) {
    +          this.input = input;
    +          this.mutableRow = mutableRow;
    +          this.columnTypes = columnTypes;
    +          this.columnIndexes = columnIndexes;
    +        }
    +
    +        public boolean hasNext() {
    +          if (currentRow < totalRows) {
    +            return true;
    +          }
    +          if (!input.hasNext()) {
    +            return false;
    --- End diff --
    
    As discussed offline, the confusion is coming from "totalRows". Maybe name that "numRowsInBatch"?




[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by davies <gi...@git.apache.org>.

Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42411214
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/GenerateColumnAccessor.scala ---
    @@ -0,0 +1,149 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.columnar
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.codegen.{CodeFormatter, CodeGenerator}
    +import org.apache.spark.sql.types._
    +
    +/**
    + * An Iterator to walk through the InternalRows from a CachedBatch
    + */
    +abstract class ColumnarIterator extends Iterator[InternalRow] {
    +  def initialize(input: Iterator[CachedBatch], mutableRow: MutableRow, columnTypes: Array[DataType],
    +    columnIndexes: Array[Int]): Unit
    +}
    +
    +/**
    + * Generates bytecode for a [[ColumnarIterator]] for columnar cache.
    + */
    +object GenerateColumnAccessor extends CodeGenerator[Seq[DataType], ColumnarIterator] with Logging {
    +
    +  protected def canonicalize(in: Seq[DataType]): Seq[DataType] = in
    +  protected def bind(in: Seq[DataType], inputSchema: Seq[Attribute]): Seq[DataType] = in
    +
    +  protected def create(columnTypes: Seq[DataType]): ColumnarIterator = {
    +    val ctx = newCodeGenContext()
    +    val (creaters, accesses) = columnTypes.zipWithIndex.map { case (dt, index) =>
    +      val accessorName = ctx.freshName("accessor")
    +      val accessorCls = dt match {
    +        case NullType => classOf[NullColumnAccessor].getName
    +        case BooleanType => classOf[BooleanColumnAccessor].getName
    +        case ByteType => classOf[ByteColumnAccessor].getName
    +        case ShortType => classOf[ShortColumnAccessor].getName
    +        case IntegerType | DateType => classOf[IntColumnAccessor].getName
    +        case LongType | TimestampType => classOf[LongColumnAccessor].getName
    +        case FloatType => classOf[FloatColumnAccessor].getName
    +        case DoubleType => classOf[DoubleColumnAccessor].getName
    +        case StringType => classOf[StringColumnAccessor].getName
    +        case BinaryType => classOf[BinaryColumnAccessor].getName
    +        case dt: DecimalType if dt.precision <= Decimal.MAX_LONG_DIGITS =>
    +          classOf[CompactDecimalColumnAccessor].getName
    +        case dt: DecimalType => classOf[DecimalColumnAccessor].getName
    +        case struct: StructType => classOf[StructColumnAccessor].getName
    +        case array: ArrayType => classOf[ArrayColumnAccessor].getName
    +        case t: MapType => classOf[MapColumnAccessor].getName
    +      }
    +      ctx.addMutableState(accessorCls, accessorName, s"$accessorName = null;")
    +
    +      val createCode = dt match {
    +        case t if ctx.isPrimitiveType(dt) =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case NullType | StringType | BinaryType =>
    +          s"$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder));"
    +        case other =>
    +          s"""$accessorName = new $accessorCls(ByteBuffer.wrap(buffers[$index]).order(nativeOrder),
    +             (${dt.getClass.getName}) columnTypes[$index]);"""
    +      }
    +
    +      val extract = s"$accessorName.extractTo(mutableRow, $index);"
    +
    +      (createCode, extract)
    +    }.unzip
    +
    +    val code = s"""
    +      import java.nio.ByteBuffer;
    +      import java.nio.ByteOrder;
    +
    +      public SpecificColumnarIterator generate($exprType[] expr) {
    +        return new SpecificColumnarIterator();
    +      }
    +
    +      class SpecificColumnarIterator extends ${classOf[ColumnarIterator].getName} {
    +
    +        private ByteOrder nativeOrder = null;
    +        private byte[][] buffers = null;
    +
    +        private int currentRow = 0;
    +        private int totalRows = 0;
    +
    +        private scala.collection.Iterator input = null;
    +        private MutableRow mutableRow = null;
    +        private ${classOf[DataType].getName}[] columnTypes = null;
    +        private int[] columnIndexes = null;
    +
    +        ${declareMutableStates(ctx)}
    +
    +        public SpecificColumnarIterator() {
    +          this.nativeOrder = ByteOrder.nativeOrder();
    +          this.buffers = new byte[${columnTypes.length}][];
    +
    +          ${initMutableStates(ctx)}
    +        }
    +
    +        public void initialize(scala.collection.Iterator input, MutableRow mutableRow,
    +                               ${classOf[DataType].getName}[] columnTypes, int[] columnIndexes) {
    +          this.input = input;
    +          this.mutableRow = mutableRow;
    +          this.columnTypes = columnTypes;
    +          this.columnIndexes = columnIndexes;
    +        }
    +
    +        public boolean hasNext() {
    +          if (currentRow < totalRows) {
    +            return true;
    +          }
    +          if (!input.hasNext()) {
    +            return false;
    --- End diff --
    
    This happens once the Iterator[CachedBatch] is exhausted.
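
The control flow under discussion (rows served from the current batch, the next batch pulled when it runs out, `false` returned only when no batches remain) can be sketched in isolation. This is a minimal illustration, not the code generated by the PR: `BatchRowIterator` is a hypothetical name, each `int[]` stands in for one cached batch, and the `numRowsInBatch` field uses the name suggested elsewhere in this thread. Looping to skip empty batches is an assumption of the sketch; the quoted diff is cut off before showing how that case is handled.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a row iterator over cached batches.
// Each int[] plays the role of one batch; each int plays the role of one row.
final class BatchRowIterator implements Iterator<Integer> {
    private final Iterator<int[]> input;   // the outer iterator of batches
    private int[] batch = new int[0];      // the batch currently being read
    private int currentRow = 0;            // next row index within the batch
    private int numRowsInBatch = 0;        // row count of the current batch only

    BatchRowIterator(Iterator<int[]> input) {
        this.input = input;
    }

    @Override
    public boolean hasNext() {
        // Loop rather than branch once, so that empty batches are skipped
        // (an assumption of this sketch).
        while (currentRow >= numRowsInBatch) {
            if (!input.hasNext()) {
                return false;              // reached only when no batches remain
            }
            batch = input.next();
            currentRow = 0;
            numRowsInBatch = batch.length;
        }
        return true;
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return batch[currentRow++];
    }
}
```

Because the counter is reset for every batch, it never holds a total across the whole input, which is the reason a per-batch name reads more clearly.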



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9145



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148637449
  
    Merged build started.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148565257
  
    [Test build #43824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43824/consoleFull) for PR 9145 at commit [`7ee54a9`](https://github.com/apache/spark/commit/7ee54a9aaa457f76e4d6e7585e257909a2b5d6f2).



[GitHub] spark pull request: [SPARK-11149] [SQL] Improve cache performance ...

Posted by tedyu <gi...@git.apache.org>.

Github user tedyu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9145#discussion_r42564979
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
    @@ -219,11 +251,11 @@ private[sql] object DOUBLE extends NativeColumnType(DoubleType, 8) {
       }
     
       override def extract(buffer: ByteBuffer): Double = {
    -    buffer.getDouble()
    +    ByteBufferHelper.getDouble(buffer)
       }
     
       override def extract(buffer: ByteBuffer, row: MutableRow, ordinal: Int): Unit = {
    -    row.setDouble(ordinal, buffer.getDouble())
    +    row.setDouble(ordinal, ByteBufferHelper.getDouble(buffer))
       }
     
       override def setField(row: MutableRow, ordinal: Int, value: Double): Unit = {
    --- End diff --
    
    Around line 332, there is a call to buffer.getShort().
    
    Is it worth adding a corresponding method to ByteBufferHelper?
    
    If so, I can send a PR.
    
    Thanks
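
For illustration, a short-reading helper of the kind asked about could look like the sketch below. It is not the `ByteBufferHelper` from this PR, whose implementation is not shown in the thread; `ShortReadSketch` and its `getShort` method are hypothetical names, and the sketch relies only on standard `java.nio` calls. It assumes the general idea is to read straight from the buffer's backing array and advance the position by hand, and it falls back to `ByteBuffer.getShort()` when no backing array is accessible.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, written only to illustrate the shape of such a method.
final class ShortReadSketch {
    private ShortReadSketch() {}

    // Reads a short at the buffer's current position and advances the position by 2.
    static short getShort(ByteBuffer buffer) {
        if (!buffer.hasArray()) {
            // Direct or read-only buffers expose no backing array; use the normal relative read.
            return buffer.getShort();
        }
        int pos = buffer.position();
        // Throws IllegalArgumentException if fewer than 2 bytes remain before the limit.
        buffer.position(pos + 2);
        byte[] bytes = buffer.array();
        int base = buffer.arrayOffset() + pos;
        int first = bytes[base] & 0xFF;
        int second = bytes[base + 1] & 0xFF;
        // Assemble the two bytes according to the buffer's configured byte order.
        return buffer.order() == ByteOrder.LITTLE_ENDIAN
            ? (short) ((second << 8) | first)
            : (short) ((first << 8) | second);
    }
}
```

Whether such a method is worth adding depends on measuring it against the plain `buffer.getShort()` call on the target JVM; the sketch makes no performance claim.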



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148571268
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43824/



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148571266
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [WIP] Improve cache performance for primitive ...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9145#issuecomment-148564318
  
    Merged build started.

