You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/10/31 09:11:46 UTC

[GitHub] [spark] eejbyfeldt commented on pull request #38427: [SPARK-40950][CORE] Fix isRemoteAddressMaxedOut performance overhead on scala 2.13

eejbyfeldt commented on PR #38427:
URL: https://github.com/apache/spark/pull/38427#issuecomment-1296802435

   > > 
   > 
   > A little personal thought：the limitation of `immutable.Seq` is caused by Scala version migration, this comes from Scala 2.13, which changes the predefined of `Seq` from `scala.collection.Seq(2.12)` to `scala.collection.immutable.Seq`, we only promise that it is a `scala.collection.Seq` when using Scala 2.12 now and 2.13 is beta version, so if we agree to explicitly define `Seq` as `scala.collection.Seq` in the API, the things may become simple(can avoid calling `toSeq` or `toIndexedSeq`).
   
   The reason I changed it to `IndexedSeq` is that `mutable.ArrayBuffer` is an `IndexedSeq` on 2.12 so you can do this without copying:
   ```
   scala> val a: IndexedSeq[Int] =  scala.collection.mutable.ArrayBuffer[Int]()
   a: IndexedSeq[Int] = ArrayBuffer()
   ```
   so I assumed that `toIndexedSeq` would be a no-op like `toSeq` in 2.12. But this seems to be wrong.
   
   But I agree that if we change `Seq` to `scala.collection.Seq` or `IndexedSeq` to `scala.collection.IndexedSeq` it would be possible to just remove all the `toSeq` or `toIndexedSeq` (except one place for toIndexSeq where we currently use a ListBuffer but that could be changed to a ArrayBuffer) and get the performance on 2.13 without getting worse on 2.12.
   
   
   > Give more information for this way: convert ArrayBuffer to immutable.Seq or immutable.IndexedSeq, we should use immutable.ArraySeq.from(buffer) or immutable.ArraySeq.unsafeWrapArray(buffer.toArray) instead of buffer.toSeq( or buffer.toIndexedSeq), from and unsafeWrapArray will be more efficient.
   
   >  At the same time, there are many other similar cases. Should we fixed them all in one
   
   So the idea here would be to define some helper that can be used to convert our mutable collection to a `Seq` and  have a no-op operation on scala 2.12  and do it using an ArraySeq on 2.13. And therefore accepting an extra copy on 2.13?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org