You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by yishchuk <is...@yahoo.com> on 2017/03/30 11:21:20 UTC

Huge delays while returning data back from remote callable's

<making sure topic is sent to mailing list>



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Huge-delays-while-returning-data-back-from-remote-callable-s-tp11561p11562.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Huge delays while returning data back from remote callable's

Posted by yishchuk <is...@yahoo.com>.
Nikolai,

I've created sample project which reproduces the behavior:  perf-test.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n12306/perf-test.zip> 
.

It contains 4 modules: 
 - common: for common classes
 - daily: where main processing occurs
 - linker: where results of dailies are combined
 - service: web endpoint

The project can be built with command 'gradle assemble', which produces 3
JARs - daily-1.0.0.jar, linker-1.0.0.jar, and service-1.0.0.jar. 
I have the following host configuration (all CentOS 7 VMs with 10Gbps
network, 8 cores and 32Gb RAM): 
 - 10.80.220.4: daily and linker
 - 10.80.220.5: daily and linker
 - 10.80.220.6: daily and linker
 - 10.80.220.40: daily and linker
 - 10.80.220.41: daily and linker
 - 10.80.220.7: service

I ran services with command like this (with appropriate IPs jar names):  
  *SELF_HOST=10.80.220.4 java -Xmx2g -XX:+PrintGC -server -jar
daily-1.0.0.jar*


Service listens to port 7000. When I issue the following command: 
  *curl 'http://10.80.220.7:7000/test/huge' -H 'Content-Type:
application/json' --data-binary '{"size":1000000}'*

many times sequentially, I get results like the following:

/
2017-04-28 10:16:34.959  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.186388594s
2017-04-28 10:16:37.370  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.137624933s
2017-04-28 10:16:42.345  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *4.706772665s*
2017-04-28 10:16:44.935  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.305917127s
2017-04-28 10:16:47.399  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.194614218s
[GC (Allocation Failure)  930736K->441316K(1507328K), 0.0353486 secs]
2017-04-28 10:16:50.025  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.336885696s
2017-04-28 10:16:52.444  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.136979967s
2017-04-28 10:16:58.563  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *5.834498859s*
2017-04-28 10:17:05.652  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *6.821000148s*
2017-04-28 10:17:08.541  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.614002647s
[GC (Allocation Failure)  1002384K->475560K(1519104K), 0.0349871 secs]
2017-04-28 10:17:10.801  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 1.990004232s
2017-04-28 10:17:13.188  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.115146997s
2017-04-28 10:17:19.866  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *6.398794509s*
2017-04-28 10:17:26.970  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *6.829077624s*
2017-04-28 10:17:29.430  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.188620807s
2017-04-28 10:17:31.857  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.158131195s
[GC (Allocation Failure)  1068287K->475632K(1571840K), 0.0347788 secs]
2017-04-28 10:17:34.440  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.307639326s
2017-04-28 10:17:40.887  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *6.179043614s*
2017-04-28 10:17:48.460  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *7.189130919s*
2017-04-28 10:17:50.960  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.205854651s
2017-04-28 10:17:53.283  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.029877931s
2017-04-28 10:17:55.736  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.18218379s
[GC (Allocation Failure)  1104799K->475608K(1550848K), 0.0361552 secs]
2017-04-28 10:18:01.918  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *5.91376447s*
2017-04-28 10:18:08.664  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in *6.47573291s*
2017-04-28 10:18:11.077  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 2.132781788s
2017-04-28 10:43:22.765  INFO 30661 --- [erformanceGrid%]
perf.controller.TestController           : Computed [test-huge(35000539)] 
in 1.98846579s
/

So spikes are still there.





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Huge-delays-while-returning-data-back-from-remote-callable-s-tp11561p12306.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Huge delays while returning data back from remote callable's

Posted by Nikolai Tikhonov <nt...@apache.org>.
Yuriy,

I don't have any ideas. Could you create simple maven project which
reproduce the behaviour and share it?

On Tue, Apr 4, 2017 at 4:23 PM, Nikolai Tikhonov <nt...@apache.org>
wrote:

> I've tried reproduce the behaviour on my side (used your code snippet
> above) but without success. Could you profile your application by fly
> recorder? You can use the following settings https://apacheignite.
> readme.io/docs/jvm-and-system-tuning#section-flightrecorder-settings.
>
> On Fri, Mar 31, 2017 at 4:21 PM, Nikolai Tikhonov <nt...@apache.org>
> wrote:
>
>> Let's try to increase  the following parameter
>> TcpCommunicationSpi#setUnacknowledgedMessagesBufferSize to 1_000_000.
>>
>> On Fri, Mar 31, 2017 at 2:03 PM, yishchuk <is...@yahoo.com> wrote:
>>
>>> I've enabled GC logging as you suggested, and attached the results.
>>>
>>> gc-L01.log
>>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L01.log>
>>> gc-L02.log
>>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L02.log>
>>> gc-L03.log
>>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L03.log>
>>> gc-L05.log
>>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L05.log>
>>> gc-L06.log
>>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L06.log>
>>>
>>> gc-CLIENT.log
>>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/
>>> gc-CLIENT.log>
>>>
>>> last call took on server L06 3.21271845s, and on the client -
>>> 30.815496906s
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context: http://apache-ignite-users.705
>>> 18.x6.nabble.com/Huge-delays-while-returning-data-back-from-
>>> remote-callable-s-tp11561p11606.html
>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>>
>>
>>
>

Re: Huge delays while returning data back from remote callable's

Posted by yishchuk <is...@yahoo.com>.
After I've applied recommended JVM settings from the link you provided, I
couldn't reproduce these long timeouts on this sample code  any more - but
they are still reproducible on original code. I've profiled the application
with flight recorder and attached files for intermediate caller and
calculator ('rec-linker' and 'rec-daily' respectively)

rec-linker.7z
<http://apache-ignite-users.70518.x6.nabble.com/file/n11891/rec-linker.7z>  
rec-daily.7z <http://dropmefiles.com/d21RD>  

Actually, I can't myself see what can be wrong there - could you please help
and point me where the problem can be?

Thanks,

Yuriy



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Huge-delays-while-returning-data-back-from-remote-callable-s-tp11561p11891.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Huge delays while returning data back from remote callable's

Posted by Nikolai Tikhonov <nt...@apache.org>.
I've tried reproduce the behaviour on my side (used your code snippet
above) but without success. Could you profile your application by fly
recorder? You can use the following settings
https://apacheignite.readme.io/docs/jvm-and-system-tuning#section-flightrecorder-settings
.

On Fri, Mar 31, 2017 at 4:21 PM, Nikolai Tikhonov <nt...@apache.org>
wrote:

> Let's try to increase  the following parameter TcpCommunicationSpi#
> setUnacknowledgedMessagesBufferSize to 1_000_000.
>
> On Fri, Mar 31, 2017 at 2:03 PM, yishchuk <is...@yahoo.com> wrote:
>
>> I've enabled GC logging as you suggested, and attached the results.
>>
>> gc-L01.log
>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L01.log>
>> gc-L02.log
>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L02.log>
>> gc-L03.log
>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L03.log>
>> gc-L05.log
>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L05.log>
>> gc-L06.log
>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L06.log>
>>
>> gc-CLIENT.log
>> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/
>> gc-CLIENT.log>
>>
>> last call took on server L06 3.21271845s, and on the client -
>> 30.815496906s
>>
>>
>>
>>
>> --
>> View this message in context: http://apache-ignite-users.705
>> 18.x6.nabble.com/Huge-delays-while-returning-data-back-
>> from-remote-callable-s-tp11561p11606.html
>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>
>
>

Re: Huge delays while returning data back from remote callable's

Posted by yishchuk <is...@yahoo.com>.
Is there anything else I can try to solve the problem?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Huge-delays-while-returning-data-back-from-remote-callable-s-tp11561p11710.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Huge delays while returning data back from remote callable's

Posted by yishchuk <is...@yahoo.com>.
  Sorry, but this didn't help. 

Also, I tried setting 'unacknowledgedMessagesBufferSize' to 500_000,
5_000_000, 15_000_000, 50_000_000 - still the same long responses time to
time.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Huge-delays-while-returning-data-back-from-remote-callable-s-tp11561p11630.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Huge delays while returning data back from remote callable's

Posted by Nikolai Tikhonov <nt...@apache.org>.
Let's try to increase  the following parameter
TcpCommunicationSpi#setUnacknowledgedMessagesBufferSize to 1_000_000.

On Fri, Mar 31, 2017 at 2:03 PM, yishchuk <is...@yahoo.com> wrote:

> I've enabled GC logging as you suggested, and attached the results.
>
> gc-L01.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L01.log>
> gc-L02.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L02.log>
> gc-L03.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L03.log>
> gc-L05.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L05.log>
> gc-L06.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L06.log>
>
> gc-CLIENT.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-CLIENT.log>
>
> last call took on server L06 3.21271845s, and on the client - 30.815496906s
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Huge-delays-while-returning-data-
> back-from-remote-callable-s-tp11561p11606.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>

Re: Huge delays while returning data back from remote callable's

Posted by yishchuk <is...@yahoo.com>.
I've enabled GC logging as you suggested, and attached the results.

gc-L01.log
<http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L01.log>  
gc-L02.log
<http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L02.log>  
gc-L03.log
<http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L03.log>  
gc-L05.log
<http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L05.log>  
gc-L06.log
<http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-L06.log>  

gc-CLIENT.log
<http://apache-ignite-users.70518.x6.nabble.com/file/n11606/gc-CLIENT.log>  

last call took on server L06 3.21271845s, and on the client - 30.815496906s




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Huge-delays-while-returning-data-back-from-remote-callable-s-tp11561p11606.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Huge delays while returning data back from remote callable's

Posted by Nikolai Tikhonov <nt...@apache.org>.
Hi,

Could you enable gc logs and share with us? For example: -Xloggc:./gc.log
-XX:+PrintGCDetails -verbose:gc

On Thu, Mar 30, 2017 at 2:21 PM, yishchuk <is...@yahoo.com> wrote:

> <making sure topic is sent to mailing list>
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Huge-delays-while-returning-data-
> back-from-remote-callable-s-tp11561p11562.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>