Posted to dev@activemq.apache.org by nigro_franz <ni...@gmail.com> on 2017/04/01 07:14:14 UTC

Re: Paging

BIG Update:

I've just completed the performance work to make the MAPPED journal suitable
for use as a first-class journal type for any kind of load (concurrent
and durable ones in particular).

In order to make it suitable for high loads of concurrent persistent-message
requests, I've performed these optimisations:
1) smart batching of concurrent/high-rate sync requests (different from NIO
and ASYNCIO: it is high throughput *and* low latency, falling back to honor
a configurable latency SLA only when unable to achieve both; see the sketch
after this list)
2) page prefetching/zeroing that adapts to the load, making the write
requests more sympathetic to the Unix/Windows page-cache policies and
the hard disk sector sizes
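To make the smart-batching idea concrete, here is a minimal sketch in plain
Java (illustrative names, not the actual Artemis classes): writers append
records under a lock, and a single flusher thread issues one fsync for the
whole coalesced batch, forcing early only when the oldest pending request is
about to miss the configured latency SLA.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class BatchingSyncSketch {
   private final ByteBuffer batch = ByteBuffer.allocateDirect(64 * 1024);
   private final long maxSyncLatencyNanos; // the configurable latency SLA
   private long oldestPendingNanos = -1;

   BatchingSyncSketch(long maxSyncLatencyNanos) {
      this.maxSyncLatencyNanos = maxSyncLatencyNanos;
   }

   // called by the producer threads; assumes the record fits in the batch
   synchronized void append(ByteBuffer record) {
      batch.put(record);
      if (oldestPendingNanos < 0) {
         oldestPendingNanos = System.nanoTime();
      }
   }

   // polled in a loop by the single flusher thread
   synchronized boolean flushIfNeeded(FileChannel channel) throws IOException {
      if (oldestPendingNanos < 0) {
         return false; // nothing pending
      }
      final boolean full = !batch.hasRemaining();
      final boolean slaExpiring =
         System.nanoTime() - oldestPendingNanos >= maxSyncLatencyNanos;
      if (full || slaExpiring) {
         batch.flip();
         channel.write(batch);
         channel.force(false); // one fsync covers the whole coalesced batch
         batch.clear();
         oldestPendingNanos = -1;
         return true;
      }
      return false;
   }
}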

The change is huge, and I'm thinking (feedback needed!!) of keeping a
different implementation for PAGING (the old one would be good): they
serve different purposes and require different optimisations.

I prefer not to publish benchmark results, and I've deliberately not updated
the Artemis journal benchmark tool: it can't easily show the improvements, due
to the kind of optimisations done to make the journal faster on the common
execution paths.
On the other hand, I've reused the tuning performed by the tool on Artemis
startup to configure the latency SLA of the journal and the write buffer
size: hence a user will only need to set the journal type in broker.xml and
everything will work as in the ASYNCIO case (using its same properties,
maxAIO excluded).
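For reference, the selection in broker.xml should look roughly like this
(MAPPED being the new type, alongside the existing NIO and ASYNCIO):

<core>
   <journal-type>MAPPED</journal-type>
</core>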

I suggest trying it on real loads with the disk HW write cache disabled
(e.g. hdparm -W 0 /dev/sda on Linux) to get consistent results: it is more
sensitive than the ASYNCIO journal to these hardware features, in particular
due to the adaptive prefetching feature I've put in.

The branch with its latest version is here:
https://github.com/franz1981/activemq-artemis/tree/batch_buffer






Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
I've continued working to reduce the GC pressure on the hot paths while
journalling and/or paging, and this is the latest result:

https://github.com/franz1981/activemq-artemis/tree/buffer_pooling

This is the list of improvements:
- NIO/ASYNCIO: new TimedBuffer with an adaptive batch-window heuristic (+
throughput and - latency)
- NIO/ASYNCIO: journal/paging operations benefit from fewer buffer copies,
always performed with batch copies under the hood
- NIO: journal operations use pooled off-heap (TLAB-style) allocations
- NIO: improved file copy operations using the zero-copy
FileChannel::transferTo (see the sketch after this list)
- NIO: improved zeroing, using a single OS-page buffer to clean the file
- NIO: deterministic release of unpooled buffers, to avoid OOM errors due to
slow GC
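The zero-copy file copy mentioned above boils down to this pattern (a generic
sketch, not the Artemis code):

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

static void copy(Path src, Path dst) throws IOException {
   try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
        FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE,
                                           StandardOpenOption.CREATE)) {
      long position = 0;
      final long size = in.size();
      while (position < size) {
         // the kernel moves the bytes directly, with no user-space staging buffer
         position += in.transferTo(position, size - position, out);
      }
   }
}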

It currently needs some checking against the test suite, but the first
results are promising :)





Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
I've pooled the ByteBuffer (ByteBuf and ActiveMQBuffer) used to encode each
PagedMessage:

https://github.com/franz1981/activemq-artemis/commit/e1a698d7b97b6a8bfda508c945f45bbf788320a6#diff-f1eb88bba4fa77f5c5ef180231de9965R218
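For readers unfamiliar with Netty's pooling, the pattern is roughly this (an
illustrative sketch, not the actual commit): acquire an off-heap buffer from
the shared pool, encode into it, and release it once the bytes are written, so
nothing is left for the GC to chase.

import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

static void writeEncoded(byte[] encodedMessage) {
   ByteBuf buffer = PooledByteBufAllocator.DEFAULT.directBuffer(encodedMessage.length);
   try {
      buffer.writeBytes(encodedMessage); // encode into the pooled off-heap buffer
      // ... write the buffer content to the page file ...
   } finally {
      buffer.release(); // return the buffer to the pool instead of leaving it to the GC
   }
}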

I'm waiting to open the PR because I need to run other stress tests on it
first, just to be sure it doesn't break anything...




Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
What was your change?

Pull request? :)
-- 
Clebert Suconic

Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
The GC-free paging (on the write path at least) seems to perform pretty well:

This is without using it:

Producer Throughput: 42153 ops/sec
Consumer Throughput: 42453 ops/sec
EndToEnd Throughput: 40818 ops/sec
EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
mean             852567.48
min              517996.54
50.00%           843055.10
90.00%          1098907.65
99.00%          1384120.32
99.90%          1451229.18
99.99%          1493172.22
max             1493172.22
count              1000000

This is using it:

Producer Throughput: 49744 ops/sec
Consumer Throughput: 49739 ops/sec
EndToEnd Throughput: 49738 ops/sec
EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
mean               4948.23
min                  92.16
50.00%             2162.69
90.00%            11337.73
99.00%            42991.62
99.90%           115867.65
99.99%           121110.53
max              122159.10
count              1000000

The throughput increased by roughly 20% and the latencies are far better
(fewer GCs, and shorter ones).






Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
@michael
Before putting mmap into action on paging, I've started making some
improvements around the current implementation:

https://issues.apache.org/jira/browse/ARTEMIS-1104
(https://github.com/franz1981/activemq-artemis/tree/paging_improved)

With this change, on steady state:
- the writes while paging won't produce any garbage (see the sketch after
this list)
- the memory footprint will be constant
- no (hidden) additional copies of the data will be created before writing
into the file (on NIO), improving the write performance
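To illustrate the first and last points: the write path can stay
allocation-free on steady state by reusing one direct buffer per writer
thread instead of allocating per message, and since NIO writes from a heap
buffer go through a hidden copy into an internal direct buffer, writing from
a direct buffer also avoids that copy. A minimal sketch with hypothetical
names (not the actual patch):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class GcFreePageWriterSketch {
   // one reusable direct buffer per writer thread: no per-message allocation
   private static final ThreadLocal<ByteBuffer> ENCODE_BUFFER =
      ThreadLocal.withInitial(() -> ByteBuffer.allocateDirect(8192));

   void write(FileChannel pageFile, byte[] encodedMessage) throws IOException {
      final ByteBuffer buffer = ENCODE_BUFFER.get(); // assumes the message fits
      buffer.clear();
      buffer.put(encodedMessage); // single copy, straight into the direct buffer
      buffer.flip();
      while (buffer.hasRemaining()) {
         pageFile.write(buffer); // no hidden JDK copy: the source is already direct
      }
   }
}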

As soon as I have some numbers on the effects for the overall broker,
I'll publish them, but the first tests on it already show a massive
reduction in the number of minor GCs:

<http://activemq.2283324.n4.nabble.com/file/n4724758/original_paging.png> 
<http://activemq.2283324.n4.nabble.com/file/n4724758/improved_paging.png> 




Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
@clebert
Yes, the change introduces a lot of new stuff, and it needs to play better
with the rest of the system from a user perspective too (definition of
properties/configurations... I'm not used to them at all... you've seen my
last PR on NettyConnection and you know what I mean: a configuration
properties HELL).

@michael
As Clebert said, we need more integration, unit and performance tests to
validate it.

For example, the "smart batching" policy (now enabled by default), used to
coalesce the fsync/msync operations, seems pretty effective from a latency
point of view in most cases (100->4k byte messages with 1->64 clients), but
it requires additional analysis with different load distributions and proper
HW to be sure of its effectiveness.

>I'm unclear why you'd run with, or why you're recommending, disabling the
disk caches?

You're right, I missed some context here :P
I didn't give that advice from a durability point of view; rather, I've
noticed that the background zeroing of the journal files, while under
sustained load, could "steal" (in a vendor- and journal-size-dependent way)
the write cache buffer from the main writer, making it less stable from a
latency perspective, especially in long-running all-out throughput tests.

Different vendors implement that cache in very different ways, and right now
I've not implemented the zeroing very differently from the other two
journals: memory-mapped files are lazily zeroed by default (on *nix), hence I
could avoid any zeroing at all, exploiting additional temporal locality, but
that is something that needs more tests in order to be configurable and
effective on most systems.
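For context, the mapped write/sync path looks like this in miniature (a
generic sketch, not the actual implementation); on *nix the mapped region is
lazily zero-filled by the kernel on first touch, which is why an explicit
zeroing pass could be skipped:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

static void appendAndSync(Path journalFile, long fileSize, byte[] record) throws IOException {
   try (FileChannel channel = FileChannel.open(journalFile,
         StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
      // the kernel zero-fills the mapped pages lazily, on first access
      MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
      mapped.put(record);
      mapped.force(); // msync: push the dirty pages to the device
   }
}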

Anyway, I'll be very happy to receive feedback about its performance on
proper hardware, so thanks :)







Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
> I'm unclear why you'd run with, or why you're recommending, disabling the disk caches?
>


I don't think we need that any longer. We always call sync now on libaio.

We used to require users to disable a level of caching that would keep
the data in the kernel buffer and not on the actual disk. It wouldn't
survive a crash if you didn't disable that.


I don't think Francesco's work would make it into 2.0.1... it would
make it into 2.1, and I think we would need more time to validate it.

Re: Paging

Posted by Michael André Pearce <mi...@me.com>.
Hi Franz,

Sorry if it's obvious: will this be in the 2.0.1 release that Clebert just announced, which looks to be tagged soon?

If so, given work time constraints, and as we are focused on rolling out new brokers, I'll try to run this on the setup we currently have in our physical server lab with SSDs, and try to get some feedback.

One note I have: if the disks you're using are data-centre-quality SSDs (e.g. Intel S3610, S3710), I would expect them to have capacitor-backed write caches that guarantee the write even on lights-out, which will matter more as SSDs become prevalent in the data centre. I'm unclear why you'd run with, or why you're recommending, disabling the disk caches?

Cheers
Mike

