Posted to issues@activemq.apache.org by "Francesco Nigro (JIRA)" <ji...@apache.org> on 2016/06/10 17:55:21 UTC

[jira] [Comment Edited] (ARTEMIS-508) Sequential File Improvement + Performance Tests

    [ https://issues.apache.org/jira/browse/ARTEMIS-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324956#comment-15324956 ] 

Francesco Nigro edited comment on ARTEMIS-508 at 6/10/16 5:55 PM:
------------------------------------------------------------------

Checked! 

I'm not sure I'd prefer the vanilla single-thread Executor as the first-class choice for reducing contention and the need for synchronization in the journal, for a few reasons:
1) it uses a LinkedBlockingQueue internally:
- it does not support batch offering properly because of the lock taken when new elements are submitted, implying a sort of "OS queue effect" that leads to unpredictable latencies; it would make even less predictable what it is trying to cure
- it creates new nodes without recycling them in any way; this could have a lot of bad effects on latencies and on the overall application behaviour due to the pressure on the garbage collector (e.g. Full GC pauses + http://psy-lob-saw.blogspot.it/2016/03/gc-nepotism-and-linked-queues.html)
- the on-heap Java objects it uses are sensitive to card marking and pointer chasing issues (less cache friendly!)
2) it creates a lot of garbage by submitting a new Runnable for each write request (+ new Future<T>, FutureTask, etc.)
3) it doesn't order the write requests properly (request scheduling cannot be made fair without pain)
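
To make points 1) and 2) concrete, here is a minimal sketch (not code from the Artemis journal; the class and method names are hypothetical). It builds the same configuration that Executors.newSingleThreadExecutor() sets up internally, one worker thread fed by an unbounded LinkedBlockingQueue, and counts the FutureTask wrappers returned by submit(). Every queued task also gets its own freshly allocated queue node.

```java
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, only meant to illustrate the allocation pattern.
final class SingleThreadExecutorGarbage {

    // Same shape as the pool behind Executors.newSingleThreadExecutor():
    // core = max = 1 thread, unbounded LinkedBlockingQueue as work queue.
    static ThreadPoolExecutor newVanillaSingleThread() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Submits `writes` no-op "write requests" and counts how many came back
    // wrapped in a FutureTask (one new wrapper per submit).
    static int countFutureTasks(int writes) {
        ThreadPoolExecutor executor = newVanillaSingleThread();
        try {
            Runnable write = () -> { };
            int wrappers = 0;
            for (int i = 0; i < writes; i++) {
                Future<?> future = executor.submit(write);
                if (future instanceof FutureTask) {
                    wrappers++;
                }
            }
            return wrappers;
        } finally {
            // Already queued tasks still run; the worker thread then exits.
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("FutureTask wrappers: " + countFutureTasks(1_000));
    }
}
```

Even with a shared, non-capturing Runnable, each request still costs one FutureTask, plus a queue node whenever it is enqueued; a real write request would normally capture its buffer and add a third object.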

Thus I see only two opportunities for improvement:
1) single-threaded (like my current implementation): reduce features but make it fast as hell (the single writer principle is always a good idea IMHO, and fewer branches make it more predictable and faster)
2) multi-threaded: using an approach similar to this https://groups.google.com/forum/#!topic/mechanical-sympathy/5jMIVNf7zXA or using the single-threaded version plus a lock-free ring buffer that feeds an event loop (which could use the single-threaded version of the SequentialFile)
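
As an illustration of option 2), the following is a rough sketch of a bounded multi-producer/single-consumer ring buffer drained by a single event-loop thread. It is an assumption about what such a design could look like, not the proposed implementation: MpscRingBuffer and RingBufferEventLoopDemo are hypothetical names, the demo boxes Long values (so it is not garbage-free), and poll() may briefly spin while a producer finishes publishing, so it is not strictly wait-free.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical bounded queue: many producers, exactly one consumer thread.
final class MpscRingBuffer<E> {
    private final AtomicReferenceArray<E> slots;
    private final int mask;
    private final AtomicLong producerIndex = new AtomicLong();
    private final AtomicLong consumerIndex = new AtomicLong();

    MpscRingBuffer(int capacity) {
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.slots = new AtomicReferenceArray<>(capacity);
        this.mask = capacity - 1;
    }

    // Claims a slot with a CAS, then publishes the element; returns false when full.
    boolean offer(E e) {
        if (e == null) throw new NullPointerException();
        long index;
        do {
            index = producerIndex.get();
            if (index - consumerIndex.get() > mask) return false; // full
        } while (!producerIndex.compareAndSet(index, index + 1));
        slots.lazySet((int) (index & mask), e);
        return true;
    }

    // Must be called from the single consumer thread only; returns null when empty.
    E poll() {
        long index = consumerIndex.get();
        int slot = (int) (index & mask);
        E e = slots.get(slot);
        if (e == null) {
            if (index == producerIndex.get()) return null; // empty
            // A producer claimed the slot but has not published yet: wait for it.
            while ((e = slots.get(slot)) == null) Thread.yield();
        }
        slots.lazySet(slot, null);          // free the slot first...
        consumerIndex.lazySet(index + 1);   // ...then make it visible as free
        return e;
    }
}

final class RingBufferEventLoopDemo {
    public static void main(String[] args) throws InterruptedException {
        final int producers = 2;
        final int perProducer = 100_000;
        MpscRingBuffer<Long> ring = new MpscRingBuffer<>(1024);
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (long i = 1; i <= perProducer; i++) {
                    while (!ring.offer(i)) Thread.yield(); // retry when full
                }
            });
            threads[p].start();
        }
        // Event loop: the only polling thread, so it could own a
        // non-thread-safe, single-writer file behind it.
        long received = 0;
        long sum = 0;
        while (received < (long) producers * perProducer) {
            Long value = ring.poll();
            if (value == null) { Thread.yield(); continue; }
            sum += value;
            received++;
        }
        for (Thread t : threads) t.join();
        System.out.println("received=" + received + " sum=" + sum);
    }
}
```

Because offer() returns false instead of blocking when the buffer is full, callers get a natural back-pressure signal and can decide whether to retry, drop, or slow down.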

What do you think about it? Do you see other possibilities?



> Sequential File Improvement + Performance Tests
> -----------------------------------------------
>
>                 Key: ARTEMIS-508
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-508
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>            Reporter: Francesco Nigro
>              Labels: feature, performance, test
>
> https://github.com/franz1981/activemq-artemis/tree/artemis-journal-performance/artemis-journal/src/main/java/org/apache/activemq/artemis/core/io/mapped
> In the package org.apache.activemq.artemis.core.io.mapped I've implemented a new memory-mapped SequentialFile to support fast write/read operations on OSs that cannot use libaio or that need RAM-like access performance on standard files.
> The implementation is not thread-safe (is that needed?) and needs more buffer checks (or a more complex implementation). I'm using Netty's PlatformDependent class to perform bulk copies without safepoint poll issues.
> The implementation is simple, but it's good as a proof of concept to compare against the others: I've added a performance test that accounts for coordinated omission to measure the latency of a directWrite + OS jitter.
> https://github.com/franz1981/activemq-artemis/blob/artemis-journal-performance/artemis-journal/src/test/java/org/apache/activemq/artemis/core/io/aio/SequentialFileBench.java
> The write tests show performance typical of memory-mapped files; quoting Peter Lawrey: "for burst of up to 10% of the main memory, it can sustain rates of 1 - 3 GB/second written. e.g. A laptop with 8 GB of memory might handle bursts of 800 MB at a rate of 1 GB per second. A server with 64 GB of memory might handle a burst of 6.5 GB at a rate of 3 GB per second".
> I want to improve the quality of the implementation by:
> 1- enforcing the original SequentialFile contract
> 2- replacing synchronized reads/writes with lock-free versions
> 3- replacing the EventExecutor with a Lock-Free (even wait-free) Array Queue + EventLoop's poller
> 4- addressing all the False-Sharing issues around all the AtomicLong instances used
> 5- reducing the garbage produced for the fast paths to 0
> [6- exposing try methods to allow direct flow control]
> [7- replacing the Semaphore-based rate limiter with a lock-free one]
> These are all proposals; what do you think about them?
> Regards
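
For readers unfamiliar with the approach described in the issue, a minimal sketch of a single-writer, memory-mapped append using only the JDK is shown below. It is not the code in the linked branch: MappedAppendSketch, tryWrite, sync and roundTrip are hypothetical names, and the copy goes through MappedByteBuffer.put rather than Netty's PlatformDependent. The tryWrite method also hints at the "try methods" idea from item 6: it reports a full buffer instead of blocking.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical single-writer appender over a fixed-size mapped region.
final class MappedAppendSketch implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer mapped;

    MappedAppendSketch(Path file, int capacity) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        // Mapping READ_WRITE beyond the current size grows the file to `capacity`.
        mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
    }

    // Appends at the current position; not thread-safe by design.
    // Returns false (instead of blocking) when the region is full.
    boolean tryWrite(byte[] bytes) {
        if (mapped.remaining() < bytes.length) {
            return false;
        }
        mapped.put(bytes);
        return true;
    }

    // Asks the OS to flush dirty pages of the mapping to the storage device.
    void sync() {
        mapped.force();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    // Writes a payload through the mapping and reads it back with plain file I/O.
    static String roundTrip(String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mapped-sketch", ".dat");
            file.toFile().deleteOnExit();
            try (MappedAppendSketch out = new MappedAppendSketch(file, 4096)) {
                if (!out.tryWrite(bytes)) {
                    throw new IllegalStateException("payload larger than mapped region");
                }
                out.sync();
            }
            byte[] all = Files.readAllBytes(file);
            return new String(all, 0, bytes.length, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("journal record"));
    }
}
```

Writes land in the page cache and only reach the disk on force() or when the OS decides to write back, which is consistent with the burst-rate behaviour quoted in the issue.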


