Posted to user@spark.apache.org by danilopds <da...@gmail.com> on 2014/09/08 22:54:18 UTC

Records - Input Byte

Hi,

I was reading the paper of Spark Streaming:
"Discretized Streams: Fault-Tolerant Streaming Computation at Scale"

So,
I read that the performance evaluation used 100-byte input records in the
Grep and WordCount tests.

I don't have much experience and I'd like to know how I can control this
value in my records (for example, the words in an input file).
Can anyone suggest something to get me started?
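(One common way to reproduce that setup is to pad or truncate each input line to a fixed byte length before feeding it to the benchmark. A minimal sketch, assuming plain ASCII input; the helper name `to_fixed_size` is made up for illustration:)

```python
# Hypothetical helper: pad or truncate each line so every record is
# exactly `size` bytes (assumes ASCII input, one byte per character).
def to_fixed_size(line: str, size: int = 100) -> str:
    data = line.encode("ascii", errors="replace")[:size]
    return data.ljust(size, b" ").decode("ascii")

lines = ["the quick brown fox", "jumps over the lazy dog"]
records = [to_fixed_size(line) for line in lines]

# Every record is now exactly 100 bytes.
assert all(len(r.encode("ascii")) == 100 for r in records)
```

(You could write records produced this way to the input files your streaming job reads, so each ingested record has a known, uniform size.)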

Thanks!



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Records-Input-Byte-tp13733.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Records - Input Byte

Posted by Mayur Rustagi <ma...@gmail.com>.
What do you mean by "control your input"? Are you trying to pace your Spark Streaming job by the number of words? If so, that is not supported as of now; you can only control the batch time and consume all the files that arrive within that time period.
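(To illustrate what "control time, not count" means, here is a toy sketch of time-based micro-batching in plain Python, not Spark itself; records tagged with arrival times are grouped into batches purely by the batch interval, never by record count:)

```python
# Toy illustration of time-based micro-batching: each record carries an
# arrival timestamp, and records are grouped by which interval they fall
# into. Batch sizes vary with arrival rate; they are never fixed by count.
def micro_batches(timed_records, batch_interval):
    batches = {}
    for t, rec in timed_records:
        batches.setdefault(int(t // batch_interval), []).append(rec)
    return [batches[k] for k in sorted(batches)]

arrivals = [(0.1, "a"), (0.4, "b"), (1.2, "c"), (2.9, "d")]
print(micro_batches(arrivals, 1.0))  # → [['a', 'b'], ['c'], ['d']]
```

(In actual Spark Streaming the batch interval plays this role: you pick the interval when creating the streaming context, and each batch contains whatever data arrived during it.)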
-- 
Regards,
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi

On Tue, Sep 9, 2014 at 2:24 AM, danilopds <da...@gmail.com> wrote:

> Hi,
> I was reading the paper of Spark Streaming:
> "Discretized Streams: Fault-Tolerant Streaming Computation at Scale"
> So,
> I read that performance evaluation used 100-byte input records in test Grep
> and WordCount.
> I don't have much experience and I'd like to know how can I control this
> value in my records (like words in an input file)?
> Can anyone suggest me something to start?
> Thanks!