You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/04/28 20:20:00 UTC
[jira] [Commented] (SPARK-35263) Refactor
ShuffleBlockFetcherIteratorSuite to reduce duplicated code
[ https://issues.apache.org/jira/browse/SPARK-35263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334986#comment-17334986 ]
Apache Spark commented on SPARK-35263:
--------------------------------------
User 'xkrogen' has created a pull request for this issue:
https://github.com/apache/spark/pull/32389
> Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code
> -------------------------------------------------------------------
>
> Key: SPARK-35263
> URL: https://issues.apache.org/jira/browse/SPARK-35263
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle, Tests
> Affects Versions: 3.1.1
> Reporter: Erik Krogen
> Priority: Major
>
> {{ShuffleFetcherBlockIteratorSuite}} has tons of duplicate code, like:
> {code}
> val iterator = new ShuffleBlockFetcherIterator(
> taskContext,
> transfer,
> blockManager,
> blocksByAddress,
> (_, in) => in,
> 48 * 1024 * 1024,
> Int.MaxValue,
> Int.MaxValue,
> Int.MaxValue,
> true,
> false,
> metrics,
> false)
> {code}
> It's challenging to tell what the interesting parts are vs. what is just being set to some default/unused value.
> Similarly but not as bad, there are 10 calls like:
> {code}
> verify(transfer, times(1)).fetchBlocks(any(), any(), any(), any(), any(), any())
> {code}
> and 7 like
> {code}
> when(transfer.fetchBlocks(any(), any(), any(), any(), any(), any())).thenAnswer ...
> {code}
> This can result in about 10% reduction in both lines and characters in the file:
> {code}
> # Before
> > wc core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
> 1063 3950 43201 core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
> # After
> > wc core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
> 928 3609 39053 core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
> {code}
> It also helps readability:
> {code}
> val iterator = createShuffleBlockIteratorWithDefaults(
> transfer,
> blocksByAddress,
> maxBytesInFlight = 1000L
> )
> {code}
> Now I can clearly tell that {{maxBytesInFlight}} is the main parameter we're interested in here.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org