Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2016/02/01 07:51:39 UTC

[jira] [Commented] (OAK-2761) Persistent cache: add data in a different thread

    [ https://issues.apache.org/jira/browse/OAK-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125848#comment-15125848 ] 

Thomas Mueller commented on OAK-2761:
-------------------------------------

The "different thread" logic is implemented in the (related) TCPBroadcaster class, using:

{noformat}
ArrayBlockingQueue<ByteBuffer> sendBuffer
...
// entries are added to this buffer in the main thread (the thread that must not be blocked):
    @Override
    public void send(ByteBuffer buff) {
        ByteBuffer b = ByteBuffer.allocate(buff.remaining());
        b.put(buff);
        b.flip();
        while (sendBuffer.size() > MAX_BUFFER_SIZE) {
            sendBuffer.poll();
        }
        try {
            sendBuffer.add(b);
        } catch (IllegalStateException e) {
            // ignore - might happen once in a while,
            // if the buffer was not yet full just before, but now
            // many threads concurrently tried to add
        }
    }

// the thread that sends (writes):
    void send() {
        while (isRunning()) {
            try {
                ByteBuffer buff = sendBuffer.poll(10, TimeUnit.MILLISECONDS);
                if (buff != null && isRunning()) {
                    sendBuffer(buff);
                }
            } catch (InterruptedException e) {
                // ignore
            }
        }
    }
...
{noformat}
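
To make the hand-off above easier to try out in isolation, here is a minimal, self-contained sketch of the same drop-oldest idea. It is not the Oak code: the class name DropOldestSendQueue, the constructor parameter, and the queue capacity of twice the limit are assumptions made for illustration. It also uses offer() instead of add() plus catching IllegalStateException, and copies from a duplicate() so that the caller's buffer position is left untouched.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical class for illustration only; not part of Oak.
class DropOldestSendQueue {
    private final int maxBufferSize;
    private final ArrayBlockingQueue<ByteBuffer> sendBuffer;

    // maxBufferSize must be at least 1.
    DropOldestSendQueue(int maxBufferSize) {
        this.maxBufferSize = maxBufferSize;
        // Assumption: capacity is twice the limit, so that offer()
        // only fails when many producers race past the trim loop.
        this.sendBuffer = new ArrayBlockingQueue<>(maxBufferSize * 2);
    }

    // Called by the thread that must not be blocked.
    // Returns false if the entry was dropped because the queue was full.
    boolean send(ByteBuffer buff) {
        // Copy the remaining bytes; duplicate() keeps the caller's position intact.
        ByteBuffer b = ByteBuffer.allocate(buff.remaining());
        b.put(buff.duplicate());
        b.flip();
        // Drop the oldest entries while the queue is over the limit.
        while (sendBuffer.size() > maxBufferSize) {
            sendBuffer.poll();
        }
        // offer() never blocks and never throws when the queue is full.
        return sendBuffer.offer(b);
    }

    // Called by the sending thread; returns null if nothing is queued.
    ByteBuffer poll() {
        return sendBuffer.poll();
    }

    int size() {
        return sendBuffer.size();
    }
}
```

With a limit of 2, sending five buffers leaves the three newest ones queued, because the trim loop only runs while size() is greater than the limit.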

As for threading, I have used an explicit, dedicated thread. I think that's much better than using a thread pool or similar, because we have full control over how the thread is started and stopped. As we have seen with the AsyncIndexUpdate thread, relying on an external thread pool is dangerous: stopping the pool can call Thread.interrupt (which results in all kinds of problems), the pool may be shut down too late, or shutting it down may not wait for all running threads to stop (no Thread.join). Also, you can give the thread a nice, human-readable name, which is not easy with a thread pool. So I have used:

{noformat}
        sendThread = new Thread(new Runnable() {
            @Override
            public void run() {
                send();
            }
        }, "Oak TCPBroadcaster: send #" + id);
        sendThread.setDaemon(true);
        sendThread.start();
...
    @Override
    public void close() {
        if (isRunning()) {
            LOG.debug("Stopping");
            synchronized (stop) {
                stop.set(true);
                stop.notifyAll();
            }
         ...
            try {
                sendThread.join();
            } catch (InterruptedException e) {
                // ignore
            }
{noformat}
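
The start/stop pattern above can be reduced to a small, runnable sketch. Everything named here is an assumption for illustration and not taken from Oak: the class BackgroundSender, the String messages, and the counter that stands in for the actual sending. Unlike the excerpt, close() in this sketch retries join() if the calling thread is interrupted and then restores the interrupt flag; that is a choice made for the sketch only.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class for illustration only; not part of Oak.
class BackgroundSender implements AutoCloseable {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
    private final AtomicBoolean stop = new AtomicBoolean();
    private final AtomicInteger sent = new AtomicInteger();
    private final Thread sendThread;

    BackgroundSender(int id) {
        // A dedicated thread with a human-readable name.
        sendThread = new Thread(this::sendLoop, "Example sender: send #" + id);
        sendThread.setDaemon(true);
        sendThread.start();
    }

    // Non-blocking; returns false if the queue is full.
    boolean offer(String message) {
        return queue.offer(message);
    }

    private void sendLoop() {
        while (!stop.get()) {
            try {
                // Short timeout so the stop flag is re-checked regularly.
                String message = queue.poll(10, TimeUnit.MILLISECONDS);
                if (message != null && !stop.get()) {
                    sent.incrementAndGet(); // stands in for the real send
                }
            } catch (InterruptedException e) {
                // Nobody is expected to interrupt this thread;
                // the loop simply re-checks the stop flag.
            }
        }
    }

    int sentCount() {
        return sent.get();
    }

    String threadName() {
        return sendThread.getName();
    }

    boolean isThreadAlive() {
        return sendThread.isAlive();
    }

    @Override
    public void close() {
        // Only the first caller stops and joins the thread.
        if (stop.compareAndSet(false, true)) {
            boolean interrupted = false;
            while (true) {
                try {
                    sendThread.join(); // wait until the thread has really stopped
                    break;
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag afterwards
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

Because the loop polls with a 10 ms timeout, close() normally returns within roughly that time without ever interrupting the thread.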

> Persistent cache: add data in a different thread
> ------------------------------------------------
>
>                 Key: OAK-2761
>                 URL: https://issues.apache.org/jira/browse/OAK-2761
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: cache, core, mongomk
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>              Labels: resilience
>             Fix For: 1.4, 1.3.15
>
>
> The persistent cache usually stores data in a background thread, but sometimes (if a lot of data is added quickly) the foreground thread is blocked.
> Even worse, switching the cache file can happen in a foreground thread, with the following stack trace.
> {noformat}
> "127.0.0.1 [1428931262206] POST /bin/replicate.json HTTP/1.1" prio=5 tid=0x00007fe5df819800 nid=0x9907 runnable [0x0000000113fc4000]
>    java.lang.Thread.State: RUNNABLE
>         ...
> 	at org.h2.mvstore.MVStoreTool.compact(MVStoreTool.java:404)
> 	at org.apache.jackrabbit.oak.plugins.document.persistentCache.PersistentCache$1.closeStore(PersistentCache.java:213)
> 	- locked <0x0000000782483050> (a org.apache.jackrabbit.oak.plugins.document.persistentCache.PersistentCache$1)
> 	at org.apache.jackrabbit.oak.plugins.document.persistentCache.PersistentCache.switchGenerationIfNeeded(PersistentCache.java:350)
> 	- locked <0x0000000782455710> (a org.apache.jackrabbit.oak.plugins.document.persistentCache.PersistentCache)
> 	at org.apache.jackrabbit.oak.plugins.document.persistentCache.NodeCache.write(NodeCache.java:85)
> 	at org.apache.jackrabbit.oak.plugins.document.persistentCache.NodeCache.put(NodeCache.java:130)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.applyChanges(DocumentNodeStore.java:1060)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.applyToCache(Commit.java:599)
> 	at org.apache.jackrabbit.oak.plugins.document.CommitQueue.afterTrunkCommit(CommitQueue.java:127)
> 	- locked <0x0000000781890788> (a org.apache.jackrabbit.oak.plugins.document.CommitQueue)
> 	at org.apache.jackrabbit.oak.plugins.document.CommitQueue.done(CommitQueue.java:83)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.done(DocumentNodeStore.java:637)
> {noformat}
> To avoid blocking the foreground thread, one solution is to store all data in a separate thread. If too much data is added, then some of it is not stored. If possible, the data that is dropped should be data that was not referenced a lot, and / or old revisions of documents (if new revisions are available).


