Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/01/30 11:35:00 UTC

[jira] [Assigned] (SPARK-23270) FileInputDStream Streaming UI 's records should not be set to the default value of 0, it should be the total number of rows of new files.

     [ https://issues.apache.org/jira/browse/SPARK-23270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-23270:
------------------------------------

    Assignee: Apache Spark

> FileInputDStream Streaming UI 's records should not be set to the default value of 0, it should be the total number of rows of new files.
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23270
>                 URL: https://issues.apache.org/jira/browse/SPARK-23270
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.4.0
>            Reporter: guoxiaolongzte
>            Assignee: Apache Spark
>            Priority: Major
>         Attachments: 1.png
>
>
> FileInputDStream's record count in the Streaming UI should not be left at the default value of 0; it should be the total number of rows in the new files.
> ------------------------------- in FileInputDStream.scala start -------------------------------
> val inputInfo = StreamInputInfo(id, 0, metadata)  // numRecords is left at the default value of 0
> ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)
> case class StreamInputInfo(
>     inputStreamId: Int, numRecords: Long, metadata: Map[String, Any] = Map.empty)
> ------------------------------- in FileInputDStream.scala end ---------------------------------
>  
> ------------------------------- in DirectKafkaInputDStream.scala start -------------------------------
> val inputInfo = StreamInputInfo(id, rdd.count, metadata)  // numRecords is set to rdd.count
> ssc.scheduler.inputInfoTracker.reportInfo(validTime, inputInfo)
> case class StreamInputInfo(
>     inputStreamId: Int, numRecords: Long, metadata: Map[String, Any] = Map.empty)
> ------------------------------- in DirectKafkaInputDStream.scala end ---------------------------------
>  
> Test method:
> ./bin/spark-submit --class org.apache.spark.examples.streaming.HdfsWordCount examples/jars/spark-examples_2.11-2.4.0-SNAPSHOT.jar /spark/tmp/
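For reference, a minimal sketch of how numRecords could be derived from a batch of newly discovered files instead of the hard-coded 0. This is an illustration only, not the actual Spark patch: totalRecords is a hypothetical helper, and it assumes line-oriented text files where one line equals one record.

```scala
import java.nio.file.{Files, Paths}

// Count the total number of rows across a batch of new files --
// the value that could be reported via StreamInputInfo instead of 0.
// (totalRecords is a hypothetical helper, not Spark code.)
def totalRecords(newFiles: Seq[String]): Long =
  newFiles.map { path =>
    val lines = Files.lines(Paths.get(path)) // lazily streams the file's lines
    try lines.count()                        // number of rows in this file
    finally lines.close()                    // release the underlying file handle
  }.sum
```

In FileInputDStream, a count of this kind would then take the place of the literal 0 in StreamInputInfo(id, 0, metadata), matching what DirectKafkaInputDStream already does with rdd.count.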



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org