You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by Whoosh <gi...@git.apache.org> on 2017/10/22 22:05:20 UTC

[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

GitHub user Whoosh opened a pull request:

    https://github.com/apache/spark/pull/19553

    [SPARK-22330][CORE] Linear containsKey operation for serialized maps

    …alization.
    
    ## What changes were proposed in this pull request?
    
    Use non-linear containsKey operation for serialized maps, lookup into underlying map.
    
    ## How was this patch tested?
    
    unit tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Whoosh/spark SPARK-22330

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19553.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19553
    
----
commit 518b301d32a77a44812235d42f07302edcc4fda2
Author: Alexander Istomin <is...@rutarget.ru>
Date:   2017-10-22T21:46:50Z

    SPARK-22330 use underlying contains instead of default AbstractMap realization.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r148453286
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    Neveminde, maybe it's my local issues, i will commit suggested chagnes, lets look at jenkins afterall.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r148463125
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,16 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      key.isInstanceOf[A] && underlying.contains(key.asInstanceOf[A])
    +    }
    +
    +    override def get(key: AnyRef): B = {
    +      if (key.isInstanceOf[A]) {
    +        return underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    --- End diff --
    
    It's more conventional to avoid `return`, and just write if () { ... foo } else { ... bar }. No big deal.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    @cloud-fan I've checked all core tests, it was fine, should I do smt in addition to?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by Tagar <gi...@git.apache.org>.
Github user Tagar commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    @Whoosh can you please check if there is a similar problem that affects [SPARK-21657](https://issues.apache.org/jira/browse/SPARK-21657?focusedCommentId=16220469&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16220469) ? thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    @cloud-fan yeah I'm OK coming full circle on this one. I had hoped to avoid it with a simpler mechanism but didn't see that it wouldn't actually work. I apologize to @Whoosh for the run-around.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19553


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #83321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83321/testReport)** for PR 19553 at commit [`c4bd531`](https://github.com/apache/spark/commit/c4bd5311ce0fc733659afc02aff6c1c7e02e13ab).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83198/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    how about we just try catch the ClassCastException? introducing class tag has serialization overhead.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147007902
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,6 +43,13 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = try {
    --- End diff --
    
    @srowen 
    Yeah, you're right -) I have been focused on code around.  Scala isn't my language haven't known that construction, thx. 
    
    > What about other methods -- anything worth plumbing directly to the underlying implementation?
    
    Yep, checked, remove(key) still plumbing, actually, I don't know how to simply fix it, cuz there are no methods to remove it without recreating a new Map. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #83198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83198/testReport)** for PR 19553 at commit [`235f6d6`](https://github.com/apache/spark/commit/235f6d67cf25f4016c8e8ffb77103770e855ec62).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Ye, why not, let's continue this circle ;)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #83506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83506/testReport)** for PR 19553 at commit [`91cdfef`](https://github.com/apache/spark/commit/91cdfef4bcd743635b68ace1dafac309af76289c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #3962 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3962/testReport)** for PR 19553 at commit [`f2cbb0c`](https://github.com/apache/spark/commit/f2cbb0c9278ad7230d54bf00ccce1ff1bcdf75f7).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    @Whoosh do you have time for one more change to bring this full circle, back to catching exceptions? looks like there was a reason that was necessary.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    retest this, please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147006646
  
    --- Diff: core/src/test/scala/org/apache/spark/util/JavaUtils.scala ---
    @@ -0,0 +1,37 @@
    +/*
    --- End diff --
    
    Yep, done. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r146147952
  
    --- Diff: core/src/test/scala/org/apache/spark/util/JavaUtils.scala ---
    @@ -0,0 +1,37 @@
    +/*
    --- End diff --
    
    This class needs to be in the same package as what it tests, and be called `JavaUtilsSuite`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #3962 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3962/testReport)** for PR 19553 at commit [`f2cbb0c`](https://github.com/apache/spark/commit/f2cbb0c9278ad7230d54bf00ccce1ff1bcdf75f7).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147576141
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,15 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = key match {
    +      case key: A => underlying.contains(key)
    --- End diff --
    
    +1


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83506/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83321/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147596592
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    I thought it should throw exception, however there is a test showing that it's fine...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147008530
  
    --- Diff: core/src/test/scala/org/apache/spark/util/JavaUtils.scala ---
    @@ -0,0 +1,37 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.util
    +
    +import org.mockito.Mockito._
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.api.java.JavaUtils
    +
    +class JavaUtils extends SparkFunSuite {
    +
    +  test("containsKey implementation without iteratively entrySet call") {
    --- End diff --
    
    @srowen Added addition assertion, I guess, it's enough.
    Btw, sorry for my slow answers(that you're have pinged me), haven't time all three days =(


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147161700
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,15 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = key match {
    +      case key: A => underlying.contains(key)
    --- End diff --
    
    Ah right, we're not going to be able to pattern match the type here. I'd go with something like `key.instanceOf[A] && underlying.contains(key.asInstanceOf[A])`, and similarly for `get`. Still better than causing the cast exception.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r146147889
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,6 +43,13 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = try {
    --- End diff --
    
    This reminds me that this whole class could go away once only Scala 2.12 is supported, but that's a while away.
    
    Simpler as this?
    
    ```
    override def containsKey(key: AnyRef): Boolean = key match {
       case a: A => underlying.contains(a)
       case _ => false
    }
    ```
    
    if so then get() could be:
    
    ```
    override def get(key: AnyRef): B = key match {
      case a: A => underlying.getOrElse(a, null.asInstanceOf[B])
      case _ => null.asInstanceOf[B]
    }
    ```
    
    What about other methods -- anything worth plumbing directly to the underlying implementation?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Ping @Whoosh


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #83321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83321/testReport)** for PR 19553 at commit [`c4bd531`](https://github.com/apache/spark/commit/c4bd5311ce0fc733659afc02aff6c1c7e02e13ab).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147591080
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,15 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = key match {
    +      case key: A => underlying.contains(key)
    --- End diff --
    
    @srowen 
    It can't be so. Will cause "abstract type A is unchecked since it is eliminated by erasure" compile-time error.
    As I guess, there is no need any type checking before a get because it'll have cast to Object anyway and get(key) it's only compiling issues, please correct me if I'm wrong,  I've added a simple test for this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #83198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83198/testReport)** for PR 19553 at commit [`235f6d6`](https://github.com/apache/spark/commit/235f6d67cf25f4016c8e8ffb77103770e855ec62).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147632343
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    Yeah for historical reasons the signature has a key of type AnyRef instead of A (pre-generics compatibility) and so the behavior for a key of the wrong class can go either way: https://docs.oracle.com/javase/8/docs/api/java/util/Collection.html#optional-restrictions We should probably just keep the behavior.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by Whoosh <gi...@git.apache.org>.
Github user Whoosh commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r148451557
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    @cloud-fan @srowen 
    So, let's have decided what we want for containsKey() here first, before i will do next commit -)
    Because suggested (key.isInstanceOf[A] && underlying.contains(key.asInstanceOf[A]))
    will not work, it cause compile-time error with "abstract type A is unchecked since it is eliminated by erasure" message. 
    



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147657754
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    do we mean we can just write `underlying.contains(key)`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    If for some reason this last approach can't be made to work, we could just fall back to catching the exception. Because it shouldn't happen in any normal use, it doesn't matter much.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    @Tagar it's the same general type of issue, but not directly related nor exactly the same cause. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147593724
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    +    }
    +
    +    override def get(key: AnyRef): B = {
    +      val value = underlying.get(key.asInstanceOf[A])
    +      if (value.isDefined && value.get.isInstanceOf[B]) {
    --- End diff --
    
    `underlying` values are already known to be `B`, so this isn't necessary. But a condition of the key is.
    
    ```
    if (key.instanceOf[A]) {
      underlying.getOrElse(key.asInstanceOf[A], null)
    } else {
      null
    }
    ```
    
    Might need an extra cast in there.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r146147996
  
    --- Diff: core/src/test/scala/org/apache/spark/util/JavaUtils.scala ---
    @@ -0,0 +1,37 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.util
    +
    +import org.mockito.Mockito._
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.api.java.JavaUtils
    +
    +class JavaUtils extends SparkFunSuite {
    +
    +  test("containsKey implementation without iteratively entrySet call") {
    --- End diff --
    
    You'd be welcome to add a test that the resulting map is serializable as that's the primary job of this method, but that's of course not directly related. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147593655
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    Really, this should return `false` if the key isn't an `A`. This will throw an exception now. It should be prefixed with `key.isInstanceOf[A] && ...`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    Ping @Whoosh


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r148464415
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,16 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      key.isInstanceOf[A] && underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    Ah, yeah right, that won't work without a ClassTag.
    
    ```
    [error] [warn] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala:48: abstract type A is unchecked since it is eliminated by erasure
    [error] [warn]       key.isInstanceOf[A] && underlying.contains(key.asInstanceOf[A])
    [error] [warn] 
    [error] [warn] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala:52: abstract type A is unchecked since it is eliminated by erasure
    [error] [warn]       if (key.isInstanceOf[A]) {
    [error] [warn] 
    ```
    
    I think that's why the implementation originally just caught a ClassCastException, but that's ugly.
    
    So the type `A` in the method declaration becomes `A : ClassTag` and the type check becomes `classTag[A].runtimeClass.isAssignableFrom(key.getClass)`
    
    @Whoosh would you mind giving that a try? I think that nails it then.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r147918030
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,17 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    No, because Scala Maps do require their key type in contains() and get(). https://www.scala-lang.org/api/current/scala/collection/Map.html . This is indeed the difference between the APIs that we need to bridge.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19553
  
    **[Test build #83506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83506/testReport)** for PR 19553 at commit [`91cdfef`](https://github.com/apache/spark/commit/91cdfef4bcd743635b68ace1dafac309af76289c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19553#discussion_r148456265
  
    --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
    @@ -43,10 +43,16 @@ private[spark] object JavaUtils {
     
         override def size: Int = underlying.size
     
    -    override def get(key: AnyRef): B = try {
    -      underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B])
    -    } catch {
    -      case ex: ClassCastException => null.asInstanceOf[B]
    +    // Delegate to implementation because AbstractMap implementation iterates over whole key set
    +    override def containsKey(key: AnyRef): Boolean = {
    +      key.isInstanceOf[A] && underlying.contains(key.asInstanceOf[A])
    --- End diff --
    
    Seems this is a real issue, shall we include a class tag and do `clsTag.getClass.isAssignableFrom(key.getClass)`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org