Posted to commits@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2015/07/01 15:08:07 UTC

[jira] [Commented] (NIFI-731) If content repo is unable to destroy content as fast as it is generated, nifi performance becomes very sporadic

    [ https://issues.apache.org/jira/browse/NIFI-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610193#comment-14610193 ] 

Mark Payne commented on NIFI-731:
---------------------------------

The patch supplied here makes a few improvements. First, it allows the user to synchronize individual partitions of the FlowFile Repository at regular intervals, which lets some content claims start being archived/destroyed immediately. Currently, we wait until the repo is checkpointed and only then start destroying all content claims, so this should give smoother performance. Second, it allows the user to change the number of partitions used by the FlowFile Repository. Experimentation shows that 16 partitions is generally enough and results in much better performance than 256, so the default was also changed from 256 to 16.

A better but much more involved solution is to allow the Content Repository to append to an existing Content Claim, as described in NIFI-744. That would leave far fewer files to delete, which should greatly alleviate this problem.

> If content repo is unable to destroy content as fast as it is generated, nifi performance becomes very sporadic
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-731
>                 URL: https://issues.apache.org/jira/browse/NIFI-731
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 0.2.0
>
>         Attachments: 0001-NIFI-731-Updated-admin-guide-to-explain-the-flowfile.patch
>
>
> When the FlowFile Repository marks claims as destructible, it puts a notification on a queue that the content repo pulls from. If the content repo cannot keep up, the queue fills, resulting in backpressure that prevents the FlowFile Repository from being updated. This, in turn, causes Processors to block while waiting for space to become available. This is by design.
> However, the capacity of this queue is quite large, and the content repo drains the entire queue and then destroys every content claim that was on it. As a result, destroying claims can take a long time, and Processors can block for an extended period, leading to very sporadic performance.
> The content repo should instead pull from the queue and destroy the claims one at a time or in small batches, rather than draining the entire queue each time. This should result in much less sporadic behavior.
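
To illustrate the difference between draining the whole queue and pulling small batches, here is a minimal sketch using a bounded java.util.concurrent queue. It is not the NiFi implementation: the class ClaimDestructionSketch, the Claim type, the destroyBatch method, and the capacity and batch sizes are all hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ClaimDestructionSketch {

    /** Hypothetical stand-in for a content claim; not NiFi's actual class. */
    static final class Claim {
        final String id;
        Claim(String id) { this.id = id; }
    }

    /**
     * Removes at most maxBatch claims from the queue and "destroys" them
     * (here: records their ids). Space in the bounded queue is freed as soon
     * as drainTo returns, so producers waiting on the queue can proceed
     * without waiting for every queued claim to be destroyed.
     */
    static int destroyBatch(BlockingQueue<Claim> queue, int maxBatch, List<String> destroyed) {
        List<Claim> batch = new ArrayList<>(maxBatch);
        queue.drainTo(batch, maxBatch);   // takes no more than maxBatch elements
        for (Claim claim : batch) {
            destroyed.add(claim.id);      // placeholder for deleting or archiving the file
        }
        return batch.size();
    }

    public static void main(String[] args) {
        // Assumed capacity of 10; the real queue is described only as "quite large".
        BlockingQueue<Claim> queue = new LinkedBlockingQueue<>(10);
        for (int i = 0; i < 10; i++) {
            queue.offer(new Claim("claim-" + i));
        }

        // The queue is full, so a further non-blocking offer is rejected.
        // A blocking put() would wait here, which is the backpressure described above.
        System.out.println("accepted while full: " + queue.offer(new Claim("extra")));

        List<String> destroyed = new ArrayList<>();
        int count = destroyBatch(queue, 3, destroyed);
        System.out.println("destroyed " + count + ", free slots now " + queue.remainingCapacity());
    }
}
```

With a small batch, the time a producer spends blocked is bounded by the cost of destroying one batch rather than the cost of destroying everything that was queued.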



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)