Posted to users@kafka.apache.org by Eduardo Costa Alfaia <e....@unibs.it> on 2014/11/03 15:57:05 UTC

Spark Kafka Performance

Hi Guys,
Could anyone explain to me how to make Kafka work with Spark? I am using JavaKafkaWordCount.java as a test, and the command line is:

./run-example org.apache.spark.streaming.examples.JavaKafkaWordCount spark://192.168.0.13:7077 computer49:2181 test-consumer-group unibs.it 3

and as a producer I am using this command:

rdkafka_cachesender -t unibs.nec -p 1 -b 192.168.0.46:9092 -f output.txt -l 100 -n 10


rdkafka_cachesender is a program I developed that sends the content of output.txt to Kafka, where -l is the length of each send (upper bound) and -n is the number of lines to send in a row. Below is the throughput reported by the program:

File is 2235755 bytes
throughput (b/s) = 699751388
throughput (b/s) = 723542382
throughput (b/s) = 662989745
throughput (b/s) = 505028200
throughput (b/s) = 471263416
throughput (b/s) = 446837266
throughput (b/s) = 409856716
throughput (b/s) = 373994467
throughput (b/s) = 366343097
throughput (b/s) = 373240017
throughput (b/s) = 386139016
throughput (b/s) = 373802209
throughput (b/s) = 369308515
throughput (b/s) = 366935820
throughput (b/s) = 365175388
throughput (b/s) = 362175419
throughput (b/s) = 358356633
throughput (b/s) = 357219124
throughput (b/s) = 352174125
throughput (b/s) = 348313093
throughput (b/s) = 355099099
throughput (b/s) = 348069777
throughput (b/s) = 348478302
throughput (b/s) = 340404276
throughput (b/s) = 339876031
throughput (b/s) = 339175102
throughput (b/s) = 327555252
throughput (b/s) = 324272374
throughput (b/s) = 322479222
throughput (b/s) = 319544906
throughput (b/s) = 317201853
throughput (b/s) = 317351399
throughput (b/s) = 315027978
throughput (b/s) = 313831014
throughput (b/s) = 310050384
throughput (b/s) = 307654601
throughput (b/s) = 305707061
throughput (b/s) = 307961102
throughput (b/s) = 296898200
throughput (b/s) = 296409904
throughput (b/s) = 294609332
throughput (b/s) = 293397843
throughput (b/s) = 293194876
throughput (b/s) = 291724886
throughput (b/s) = 290031314
throughput (b/s) = 289747022
throughput (b/s) = 289299632

The throughput drops after a few seconds and does not stay at the initial values:

throughput (b/s) = 699751388
throughput (b/s) = 723542382
throughput (b/s) = 662989745

Another question is about Spark: about 15 seconds after I start the Spark command, Spark keeps repeating the same word counts, even though my program keeps sending words to Kafka, so I would expect the counts in Spark to grow. I have attached the log from Spark.
  
My Case is:

ComputerA (rdkafka_cachesender) -> ComputerB (Kafka brokers + ZooKeeper) -> ComputerC (Spark)
 
If I have not explained this well, please reply and I will clarify.

Thanks Guys
-- 
Informativa sulla Privacy: http://www.unibs.it/node/8155

Re: Spark Kafka Performance

Posted by Eduardo Costa Alfaia <e....@unibs.it>.
Hi Bhavesh

I will collect the dump and send it to you.

I am using a program that I took from https://github.com/edenhill/librdkafka/tree/master/examples and modified for my tests. I have attached the files.

> On Nov 5, 2014, at 04:45, Bhavesh Mistry <mi...@gmail.com> wrote:
> 
> Hi Eduardo,
> 
> Can you please take a thread dump and see if there are any blocking issues
> on the producer side? Do you have a single producer instance and multiple
> threads?
> 
> Are you using the Scala producer or the new Java producer? Also, what are
> your producer properties?
> 
> 
> Thanks,
> 
> Bhavesh

Re: Spark Kafka Performance

Posted by Bhavesh Mistry <mi...@gmail.com>.
Hi Eduardo,

Can you please take a thread dump and see if there are any blocking issues
on the producer side? Do you have a single producer instance and multiple
threads?

Are you using the Scala producer or the new Java producer? Also, what are
your producer properties?


Thanks,

Bhavesh

On Tue, Nov 4, 2014 at 12:40 AM, Eduardo Alfaia <e....@unibs.it>
wrote:

> Hi Gwen,
> I have changed the JavaKafkaWordCount code to use reduceByKeyAndWindow in
> Spark.

R: Spark Kafka Performance

Posted by Eduardo Alfaia <e....@unibs.it>.
Hi Gwen,
I have changed the JavaKafkaWordCount code to use reduceByKeyAndWindow in Spark.
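
For reference, the window semantics that reduceByKeyAndWindow provides can be sketched in plain Java (an editor's illustration only, no Spark API involved; the class and helper names here are hypothetical): each key's count is re-reduced over the batches that fall inside the current window.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowedCount {
    // Sum per-word counts over the last `window` micro-batches, mimicking
    // what reduceByKeyAndWindow((a, b) -> a + b, windowDuration) yields
    // per key at the end of each batch interval.
    static Map<String, Integer> windowedCounts(List<List<String>> batches, int window) {
        Map<String, Integer> counts = new HashMap<>();
        int from = Math.max(0, batches.size() - window);
        for (int i = from; i < batches.size(); i++) {
            for (String word : batches.get(i)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> batches = List.of(
            List.of("kafka", "spark"),
            List.of("kafka"),
            List.of("spark", "spark"),
            List.of("kafka"));
        // Window of 3 batches covers the last three batches only:
        Map<String, Integer> w = windowedCounts(batches, 3);
        System.out.println(w.get("kafka")); // 2 (batches 2 and 4)
        System.out.println(w.get("spark")); // 2 (batch 3)
    }
}
```

A count computed this way rises and falls with the window; it still does not grow without bound.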

----- Original message -----
From: "Gwen Shapira" <gs...@cloudera.com>
Sent: 03/11/2014 21:08
To: "users@kafka.apache.org" <us...@kafka.apache.org>
Cc: "user@spark.incubator.apache.org" <us...@spark.incubator.apache.org>
Subject: Re: Spark Kafka Performance

Not sure about the throughput, but:

"I mean that the words counted in spark should grow up" - The spark
word-count example doesn't accumulate.
It gets an RDD every n seconds and counts the words in that RDD. So we
don't expect the count to go up.





Re: Spark Kafka Performance

Posted by Gwen Shapira <gs...@cloudera.com>.
Not sure about the throughput, but:

"I mean that the words counted in spark should grow up" - the Spark
word-count example doesn't accumulate. It gets an RDD every n seconds and
counts the words in that RDD, so we don't expect the count to go up.


