Posted to dev@activemq.apache.org by Michael André Pearce <mi...@me.com> on 2017/03/24 01:04:02 UTC

Paging

Hi All,

Before I go any further, raise a JIRA, and look to implement:

Is there any reason paging only uses NIO and, unlike the journal, doesn't support toggling to AIO or the new Mapped variant?

Cheers
Mike

Sent from my iPhone

Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
I've continued working to reduce GC pressure on the hot paths while
journalling and/or paging, and this is the latest result:

https://github.com/franz1981/activemq-artemis/tree/buffer_pooling

This is the list of improvements:
- NIO/ASYNCIO: new TimedBuffer with an adaptive batch-window heuristic
(higher throughput, lower latency)
- NIO/ASYNCIO: journal/paging operations benefit from fewer buffer copies and
are always performed with batch copies under the hood
- NIO: journal operations use TLAB-like pooled (off-heap) allocations
- NIO: improved file copy operations using zero-copy FileChannel::transferTo
(see the sketch after this list)
- NIO: improved zeroing, using a single OS-page buffer to clean the file
- NIO: deterministic release of unpooled buffers to avoid OOM errors caused by
slow GC
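
For reference, this is roughly what the transferTo-based copy looks like (an
illustrative sketch only, not the actual code in the branch): the kernel moves
the bytes between the two files without staging them through a user-space
buffer.

    import java.io.IOException;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public final class ZeroCopyFileCopy {

        // Copy source into target letting the kernel move the bytes directly,
        // without staging them through a user-space buffer.
        public static void copy(Path source, Path target) throws IOException {
            try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
                 FileChannel out = FileChannel.open(target,
                         StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
                long position = 0;
                final long size = in.size();
                // transferTo may move fewer bytes than requested, so loop until done
                while (position < size) {
                    position += in.transferTo(position, size - position, out);
                }
            }
        }
    }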

It currently needs some checks against the test suite, but the first results
are promising :)





Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
I've pooled the ByteBuffer (ByteBuf and ActiveMQBuffer) used to encode each
PagedMessage:

https://github.com/franz1981/activemq-artemis/commit/e1a698d7b97b6a8bfda508c945f45bbf788320a6#diff-f1eb88bba4fa77f5c5ef180231de9965R218

I'm waiting to open the PR because I need to run more stress tests on it,
just to be sure it doesn't break anything.
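
The pattern, reduced to a minimal sketch (hypothetical names, not the actual
commit): borrow a pooled direct buffer from Netty's allocator, encode into it,
and release it deterministically instead of allocating a fresh buffer per
message.

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.PooledByteBufAllocator;

    final class PooledEncodeSketch {

        // Borrow a pooled direct buffer, encode into it, release it
        // deterministically: no per-message garbage ever reaches the GC.
        void encodeAndWrite(byte[] encodedMessage) {
            ByteBuf buffer = PooledByteBufAllocator.DEFAULT.directBuffer(encodedMessage.length);
            try {
                buffer.writeBytes(encodedMessage); // stand-in for the real encode step
                // ...hand the buffer to the page-file writer here...
            } finally {
                buffer.release(); // returns the buffer to the pool
            }
        }
    }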




Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
What was your change?

Pull request? :)
On Mon, Apr 10, 2017 at 5:35 AM nigro_franz <ni...@gmail.com> wrote:

> The GC-free paging (on the write path at least) seems to perform pretty
> well:
>
> This is without using it:
>
> Producer Throughput: 42153 ops/sec
> Consumer Throughput: 42453 ops/sec
> EndToEnd Throughput: 40818 ops/sec
> EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
> mean             852567.48
> min              517996.54
> 50.00%           843055.10
> 90.00%          1098907.65
> 99.00%          1384120.32
> 99.90%          1451229.18
> 99.99%          1493172.22
> max             1493172.22
> count              1000000
>
> This is using it:
>
> Producer Throughput: 49744 ops/sec
> Consumer Throughput: 49739 ops/sec
> EndToEnd Throughput: 49738 ops/sec
> EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
> mean               4948.23
> min                  92.16
> 50.00%             2162.69
> 90.00%            11337.73
> 99.00%            42991.62
> 99.90%           115867.65
> 99.99%           121110.53
> max              122159.10
> count              1000000
>
> The throughput is increased by about 22% and the latencies are far better
> (less GC and shorter pauses).
-- 
Clebert Suconic

Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
The GC-free paging (on the write path at least) seems to perform pretty well:

This is without using it:

Producer Throughput: 42153 ops/sec
Consumer Throughput: 42453 ops/sec
EndToEnd Throughput: 40818 ops/sec
EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
mean             852567.48
min              517996.54
50.00%           843055.10
90.00%          1098907.65
99.00%          1384120.32
99.90%          1451229.18
99.99%          1493172.22
max             1493172.22
count              1000000

This is using it:

Producer Throughput: 49744 ops/sec
Consumer Throughput: 49739 ops/sec
EndToEnd Throughput: 49738 ops/sec
EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
mean               4948.23
min                  92.16
50.00%             2162.69
90.00%            11337.73
99.00%            42991.62
99.90%           115867.65
99.99%           121110.53
max              122159.10
count              1000000

The throughput is increased by about 22% and the latencies are far better
(less GC and shorter pauses).






Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
@michael
Before putting mmap into action on paging, I've started making some
improvements to the current implementation:

https://issues.apache.org/jira/browse/ARTEMIS-1104
(https://github.com/franz1981/activemq-artemis/tree/paging_improved)

With this change, at steady state:
- the writes while paging won't produce any garbage
- the memory footprint will be constant
- no (hidden) additional copy of the data is created before writing into the
file (on NIO), improving the write performance (see the sketch below)
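
The "hidden copy" point deserves a word: writing a heap ByteBuffer to a
FileChannel makes the JDK copy it into an internal direct buffer first.
Encoding straight into a reused direct buffer avoids that extra copy; a
minimal sketch of the idea (not the actual change):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    final class DirectWriteSketch {

        // One reused direct buffer: no per-write allocation, no hidden JDK copy.
        private final ByteBuffer writeBuffer = ByteBuffer.allocateDirect(64 * 1024);

        void write(FileChannel channel, byte[] encodedMessage) throws IOException {
            writeBuffer.clear();
            writeBuffer.put(encodedMessage); // assumes it fits; real code would chunk
            writeBuffer.flip();
            while (writeBuffer.hasRemaining()) {
                channel.write(writeBuffer);
            }
        }
    }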

As soon as I have some numbers on the effects on the overall broker I'll
publish them, but the first tests already show a massive reduction in minor
GC counts:

<http://activemq.2283324.n4.nabble.com/file/n4724758/original_paging.png> 
<http://activemq.2283324.n4.nabble.com/file/n4724758/improved_paging.png> 




Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
@clebert
Yes, the change introduces a lot of new stuff, and it needs to play better
with the rest of the system from a user perspective too (definition of
properties/configurations... I'm not used to them at all... you've seen my
last PR on NettyConnection and you know what I mean: a configuration-properties
HELL).

@michael
As Clebert said, we need more integration, unit, and performance tests to
validate it.

For example, the "smart batching" policy (enabled by default now) used to
coalesce fsync/msync operations seems to be pretty effective from a latency
point of view in most cases (100 to 4k byte messages with 1 to 64 clients),
but it requires additional analysis with different load distributions and
proper HW to be sure of its effectiveness.
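
The shape of that policy, reduced to its essence (an illustrative sketch, not
the real TimedBuffer): writers enqueue records, and a single flusher thread
drains whatever has accumulated, paying one fsync per drained batch instead of
one per record.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    final class BatchingFlusher implements Runnable {

        private final BlockingQueue<ByteBuffer> pending = new LinkedBlockingQueue<>();
        private final FileChannel channel;

        BatchingFlusher(FileChannel channel) {
            this.channel = channel;
        }

        void submit(ByteBuffer record) {
            pending.add(record);
        }

        @Override
        public void run() {
            final List<ByteBuffer> batch = new ArrayList<>();
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    batch.add(pending.take()); // wait for at least one record
                    pending.drainTo(batch);    // then grab everything else queued
                    for (ByteBuffer record : batch) {
                        while (record.hasRemaining()) {
                            channel.write(record);
                        }
                    }
                    channel.force(false); // one fsync covers the whole batch
                    batch.clear();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }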

>I'm unclear why you'd run with the disk caches disabled, or why you're
>recommending it?

You're right, I missed some context here :P
I didn't give that advice from a durability point of view; rather, I've
noticed that the background zeroing of the journal files under sustained load
could "steal" (vendor- and journal-size-dependent) the write-cache buffer from
the main writer, making it less stable from a latency perspective, especially
in long-running all-out throughput tests.

Different vendors implement that cache in very different ways, and right now
I've not implemented the zeroing very differently from the other two journals:
memory-mapped files are lazily zeroed by default (on *nix), hence I could
avoid any zeroing at all and exploit additional temporal locality, but that
needs more tests in order to be configurable and effective on most systems.

Anyway, I'll be very happy to receive feedback about its performance on
proper hardware, so thanks :)







Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
> I'm unclear why you'd run with the disk caches disabled, or why you're recommending it?
>


I don't think we need that any longer. We always call sync now on libaio.

We used to require users to disable a level of caching that would keep the
data in the kernel buffer and not on the actual disk; the data wouldn't
survive a crash if you didn't disable that.


I don't think Francesco's work would make it into 2.0.1... it would make it
into 2.1, and I think we would need more time to validate it.

Re: Paging

Posted by Michael André Pearce <mi...@me.com>.
Hi Franz,

Sorry if it's obvious, but will this be in the 2.0.1 release that Clebert just announced, which looks to be tagged soon?

If so, work time constraints permitting (as we are focused on rolling out new brokers), I'll try to run this on the setup we currently have in our physical server lab with SSDs, and try to get some feedback.

One note I have: if the disks you're using are data-centre-quality SSDs (e.g. Intel S3610, S3710), I would expect them to have capacitor-backed write caches that guarantee the write even on lights-out, especially as SSDs in the data centre become more prevalent. I'm unclear why you'd run with the disk caches disabled, or why you're recommending it?

Cheers
Mike


> On 1 Apr 2017, at 08:14, nigro_franz <ni...@gmail.com> wrote:
> 
> BIG Update:
> 
> I've just completed the performance work to make the MAPPED Journal suitable
> to be used as a full-citizen Journal type for any kind of load (concurrent
> and durable ones in particular).
> 
> In order to make it suitable for high loads of concurrent requests for
> persistent messages, I've performed these optimisations:
> 1) smart batching of concurrent/high-rate sync requests (different from NIO
> and ASYNCIO; it is high throughput *and* low latency, falling back to honor
> a configurable SLA on latency only when unable to achieve both)
> 2) adaptive (load-dependent) page prefetching/zeroing to make the write
> requests more sympathetic to the Unix/Windows page-caching policies and the
> hard-disk sector sizes
> 
> The change is huge and I'm thinking (I need feedback!!) of making a different
> implementation of it for PAGING (the old one would be good): they satisfy
> different purposes and require different optimisations.
> 
> I prefer not to publish benchmark results, and I've not updated the Artemis
> journal benchmark tool on purpose: it can't easily show the improvements, due
> to the kind of optimisations done to make the journal faster on the common
> execution paths.
> On the other hand, I've reused the tuning performed by the tool on Artemis
> startup to configure the latency SLA of the journal and the write buffer
> size: hence a user only needs to set the Journal type in broker.xml and
> everything will work as in the ASYNCIO case (using the same properties,
> maxAIO excluded).
> 
> I'd suggest trying it on real loads with the disk HW write cache disabled
> (e.g. hdparm -W 0 /dev/sda on Linux) to get consistent results: it is more
> sensitive than the ASYNCIO journal to these hardware features, in particular
> due to the adaptive prefetching feature I've put in.
> 
> The branch with the latest version is here:
> <https://github.com/franz1981/activemq-artemis/tree/batch_buffer>
> 
> 
> 
> 
> 


Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
BIG Update:

I've just completed the performance work to make the MAPPED Journal suitable
to be used as a full-citizen Journal type for any kind of load (concurrent
and durable ones in particular).

In order to make it suitable for high loads of concurrent requests for
persistent messages, I've performed these optimisations:
1) smart batching of concurrent/high-rate sync requests (different from NIO
and ASYNCIO; it is high throughput *and* low latency, falling back to honor
a configurable SLA on latency only when unable to achieve both; see the
sketch after this list)
2) adaptive (load-dependent) page prefetching/zeroing to make the write
requests more sympathetic to the Unix/Windows page-caching policies and the
hard-disk sector sizes
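
To make the mechanism concrete, here is a minimal sketch (illustrative only,
far from the real journal) of a memory-mapped append, where
MappedByteBuffer::force is the msync that the smart batching coalesces:

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    final class MappedAppendSketch {

        private final MappedByteBuffer mapped;

        MappedAppendSketch(Path file, int fileSize) throws IOException {
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE,
                    StandardOpenOption.CREATE)) {
                // the mapping stays valid after the channel is closed
                mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
            }
        }

        void append(byte[] record, boolean sync) {
            // plain memory copy: no syscall per write
            // (real code would roll to a new file when the mapping is full)
            mapped.put(record);
            if (sync) {
                mapped.force(); // msync: flush the dirty pages to the device
            }
        }
    }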

The change is huge and I'm thinking (I need feedback!!) of making a different
implementation of it for PAGING (the old one would be good): they satisfy
different purposes and require different optimisations.

I prefer not to publish benchmark results, and I've not updated the Artemis
journal benchmark tool on purpose: it can't easily show the improvements, due
to the kind of optimisations done to make the journal faster on the common
execution paths.
On the other hand, I've reused the tuning performed by the tool on Artemis
startup to configure the latency SLA of the journal and the write buffer
size: hence a user only needs to set the Journal type in broker.xml and
everything will work as in the ASYNCIO case (using the same properties,
maxAIO excluded).

I'd suggest trying it on real loads with the disk HW write cache disabled
(e.g. hdparm -W 0 /dev/sda on Linux) to get consistent results: it is more
sensitive than the ASYNCIO journal to these hardware features, in particular
due to the adaptive prefetching feature I've put in.

The branch with the latest version is here:
<https://github.com/franz1981/activemq-artemis/tree/batch_buffer>






Re: Paging

Posted by nigro_franz <ni...@gmail.com>.
Sorry for the late response guys, I was killed by spring allergies these last
days; my eyes were burning :(


Clebert Suconic wrote
> I am treating MappedFile as experimental still...

Agreed, it is a real work in progress :)
To use the mapped journal safely there is at least one big blind spot that
needs to be addressed (with datasync on): major page faults.
Right now I've done some benchmarks around multi-threaded appends, and the
biggest bottleneck seems to be the way we cache the journal files: I'm
guessing it will be necessary to leverage the OS page-cache strategies, and
JournalFilesRepository
<https://github.com/apache/activemq-artemis/blob/master/artemis-journal/src/main/java/org/apache/activemq/artemis/core/journal/impl/JournalFilesRepository.java>
could manage the file caching with an LRU policy, but I need some help to
implement it.
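
The direction I have in mind, reduced to a rough sketch (hypothetical, not an
actual patch): keep only the N most recently used journal files mapped,
evicting the eldest, so the amount of mapped memory stays bounded and the
major page faults along with it.

    import java.nio.MappedByteBuffer;
    import java.util.LinkedHashMap;
    import java.util.Map;

    final class MappedFileLruCache extends LinkedHashMap<String, MappedByteBuffer> {

        private final int maxMappedFiles;

        MappedFileLruCache(int maxMappedFiles) {
            super(16, 0.75f, true); // access-order iteration gives LRU semantics
            this.maxMappedFiles = maxMappedFiles;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, MappedByteBuffer> eldest) {
            // real code would also explicitly unmap the evicted buffer here
            return size() > maxMappedFiles;
        }
    }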


Michael André Pearce wrote
> MemoryMapped for paging was actually the one I was more thinking to be
> beneficial. 

Yes, for Paging (or the normal Journal with datasync off), I'm sure it will
be pretty fast as it is, as long as there is enough memory in the HW.
If any of you are curious to try the latest version, here is my latest branch:
https://github.com/franz1981/activemq-artemis/tree/batch_buffer
On it you can find a journal benchmark
<https://github.com/franz1981/activemq-artemis/blob/batch_buffer/artemis-journal/src/test/java/org/apache/activemq/artemis/core/io/JournalTptBenchmark.java>
that shows the page-fault issue (on NIO and Mapped) and gives a general idea
of how it performs compared to the others (in multi-threaded or mixed-producer
scenarios).


Clebert Suconic wrote
> It will be awesome when people start using new devices that are optimized
> around mmap.

Yes, it will be cool! As soon as possible I'll re-contact the guys who
developed PMEM to try to make a working implementation (very likely a
straightforward port of the current MAPPED journal), but I need the hardware
to try it for real :D








Re: Paging

Posted by Michael André Pearce <mi...@me.com>.
Going off Francesco's results when implementing memory-mapped for the journal:
https://issues.apache.org/jira/browse/ARTEMIS-906

Running mapped with no data sync will be faster by orders of magnitude on the
write path.



Sent from my iPhone

> On 24 Mar 2017, at 01:53, Clebert Suconic <cl...@gmail.com> wrote:
> 
> On Thu, Mar 23, 2017 at 9:38 PM Michael André Pearce <
> michael.andre.pearce@me.com> wrote:
> 
>> MemoryMapped for paging was actually the one I was thinking would be more
>> beneficial.
>> 
>> AFAIK paging is rebuildable on server crash, as it's essentially just
>> memory spill from the heap?
>> 
>> What I was thinking is that memory mapped could give improved IO at the
>> expense of guaranteed disk persistence on server crash. But for paging I
>> believe (please correct this assumption) this would be OK?
> 
> 
> 
> It's the same as NIO actually in terms of guarantees.
> 
> But it should save CPU on kernel calls.
> 
> 
> 
> It will be awesome when people start using new devices that are optimized
> around mmap.
> 
>> 
>> 
>> 
>> 
>> Sent from my iPhone
>> 
>> On 24 Mar 2017, at 01:27, Clebert Suconic <cl...@gmail.com>
>> wrote:
>> 
>>>> It would make sense to implement Journal using Mappedfile....
>>> 
>>> 
>>> I meant Paging of course.
>> 
> -- 
> Clebert Suconic

Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
On Thu, Mar 23, 2017 at 9:38 PM Michael André Pearce <
michael.andre.pearce@me.com> wrote:

> MemoryMapped for paging was actually the one I was thinking would be more
> beneficial.
>
> AFAIK paging is rebuildable on server crash, as it's essentially just
> memory spill from the heap?
>
> What I was thinking is that memory mapped could give improved IO at the
> expense of guaranteed disk persistence on server crash. But for paging I
> believe (please correct this assumption) this would be OK?



It's the same as NIO actually in terms of guarantees.

But it should save CPU on kernel calls.



It will be awesome when people start using new devices that are optimized
around mmap.

>
>
>
>
> Sent from my iPhone
>
> On 24 Mar 2017, at 01:27, Clebert Suconic <cl...@gmail.com>
> wrote:
>
> >> It would make sense to implement Journal using Mappedfile....
> >
> >
> > I meant Paging of course.
>
-- 
Clebert Suconic

Re: Paging

Posted by Michael André Pearce <mi...@me.com>.
MemoryMapped for paging was actually the one I was thinking would be more beneficial.

AFAIK paging is rebuildable on server crash, as it's essentially just memory spill from the heap?

What I was thinking is that memory mapped could give improved IO at the expense of guaranteed disk persistence on server crash. But for paging I believe (please correct this assumption) this would be OK?



Sent from my iPhone

On 24 Mar 2017, at 01:27, Clebert Suconic <cl...@gmail.com> wrote:

>> It would make sense to implement Journal using Mappedfile....
> 
> 
> I meant Paging of course.

Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
> It would make sense to implement Journal using Mappedfile....


I meant Paging of course.

Re: Paging

Posted by Clebert Suconic <cl...@gmail.com>.
AIO requires pre-allocating the files, which is not really a good option for
paging, as the files grow from zero and are not reused (different from the
journal, where we reuse the files). (At least I didn't find a good way to use
libaio on paging.)
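
For context, a minimal sketch of what pre-allocation means here (illustrative
only, with hypothetical names): the journal reserves the full file size up
front so the async writes never extend the file, which pays off when files are
reused but is wasted work for page files that grow from zero.

    import java.io.IOException;
    import java.io.RandomAccessFile;

    final class PreAllocateSketch {

        // Reserve the full journal size up front so async writes never
        // extend the file.
        static void preAllocate(String path, long fileSize) throws IOException {
            try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
                file.setLength(fileSize);
            }
        }
    }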


MappedFile is a new thing and we still haven't had time to get that far. I am
treating MappedFile as experimental still... we have the good testsuite, but I
haven't run many performance tests on it yet. I'm not sure if Francesco Nigro
or anyone else has done so.


It would make sense to implement Journal using Mappedfile....


Maybe we could find a better algorithm to use libaio on paging.

On Thu, Mar 23, 2017 at 9:04 PM, Michael André Pearce
<mi...@me.com> wrote:
> Hi All,
>
> Before I go any further, raise a JIRA, and look to implement:
>
> Is there any reason paging only uses NIO and, unlike the journal, doesn't support toggling to AIO or the new Mapped variant?
>
> Cheers
> Mike
>
> Sent from my iPhone



-- 
Clebert Suconic