Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/28 01:56:24 UTC

[GitHub] [spark] dengziming commented on a change in pull request #26266: [SPARK-29611][WEBUI] Sort Kafka metadata by the number of messages

dengziming commented on a change in pull request #26266: [SPARK-29611][WEBUI] Sort Kafka metadata by the number of messages
URL: https://github.com/apache/spark/pull/26266#discussion_r339381599
 
 

 ##########
 File path: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/DirectKafkaInputDStream.scala
 ##########
 @@ -222,9 +222,10 @@ private[spark] class DirectKafkaInputDStream[K, V](
     val description = offsetRanges.filter { offsetRange =>
       // Don't display empty ranges.
       offsetRange.fromOffset != offsetRange.untilOffset
-    }.map { offsetRange =>
+    }.toSeq.sortBy(-_.count()).map { offsetRange =>
 
 Review comment:
   Thank you for reviewing. Sorting by topic (or partition) name seems reasonable, but sorting by count is useful when investigating data skew, because the partitions carrying the most messages appear first. It also gives an intuitive view of which topics are data-intensive.
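
   To make the trade-off concrete, here is a minimal, self-contained sketch of both orderings. It is an illustration only, not the code in this PR: `SimpleOffsetRange`, `SortOrders`, and the sample data are hypothetical stand-ins, so the snippet runs without Spark on the classpath. Only the filter and the `sortBy(-_.count())` idea are taken from the diff above.

```scala
// Hypothetical stand-in for org.apache.spark.streaming.kafka010.OffsetRange,
// used here only so the sketch compiles without Spark.
final case class SimpleOffsetRange(
    topic: String,
    partition: Int,
    fromOffset: Long,
    untilOffset: Long) {
  // Number of messages covered by this range.
  def count(): Long = untilOffset - fromOffset
}

object SortOrders {
  // Same filter as in the diff: don't display empty ranges.
  private def nonEmpty(ranges: Seq[SimpleOffsetRange]): Seq[SimpleOffsetRange] =
    ranges.filter(r => r.fromOffset != r.untilOffset)

  // Ordering proposed in this PR: largest ranges first (descending count).
  def byCountDesc(ranges: Seq[SimpleOffsetRange]): Seq[SimpleOffsetRange] =
    nonEmpty(ranges).sortBy(r => -r.count())

  // Alternative raised in review: alphabetical by topic, then by partition.
  def byTopicPartition(ranges: Seq[SimpleOffsetRange]): Seq[SimpleOffsetRange] =
    nonEmpty(ranges).sortBy(r => (r.topic, r.partition))

  // Made-up sample batch: one skewed partition, one empty range.
  val sample: Seq[SimpleOffsetRange] = Seq(
    SimpleOffsetRange("events", 0, 0L, 40L),
    SimpleOffsetRange("events", 1, 100L, 1000L),
    SimpleOffsetRange("audit", 0, 7L, 7L),
    SimpleOffsetRange("audit", 1, 10L, 15L)
  )
}
```

   With the sample batch, `SortOrders.byCountDesc(SortOrders.sample)` puts the skewed `events` partition 1 at the top, while `SortOrders.byTopicPartition(SortOrders.sample)` puts it last, after `audit` and `events` partition 0. On a batch with many partitions, that is the difference between seeing the skew immediately and having to scan the whole list for it.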
