Posted to issues@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2016/10/20 16:51:59 UTC

[jira] [Created] (NIFI-2925) FlowFiles that are swapped out are never released from Content Repository

Mark Payne created NIFI-2925:
--------------------------------

             Summary: FlowFiles that are swapped out are never released from Content Repository
                 Key: NIFI-2925
                 URL: https://issues.apache.org/jira/browse/NIFI-2925
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 1.0.0
            Reporter: Mark Payne
            Assignee: Mark Payne
            Priority: Blocker
             Fix For: 1.1.0


To reproduce this, I created a simple flow: GenerateFlowFile (1 KB file size) with success going to 2 different UpdateAttribute processors (so that the same Content Claim is held by 2 different FlowFiles). I let about 150,000 FlowFiles queue up (with backpressure turned off). I then started one of the UpdateAttribute processors, which drained its queue. I could then look at my content repo for any files not archived:

{code}
content_repository $ find . -type f | grep -v archive | wc -l
     192
{code}
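
The check above excludes any path that merely contains the string "archive". A slightly stricter variant can be sketched as follows. This is only an illustration: the function name count_unarchived is hypothetical, and it assumes that archived content lives in directories named exactly "archive" somewhere under the repository root.

```shell
# Hypothetical helper: count regular files under a content repository that
# are not inside a directory named "archive".
# Usage: count_unarchived [repo_dir]   (defaults to the current directory)
count_unarchived() {
    repo_dir="${1:-.}"
    # -prune stops find from descending into any directory named "archive";
    # every other regular file is printed one per line and counted.
    # tr strips the leading padding that some wc implementations emit.
    find "$repo_dir" -type d -name archive -prune -o -type f -print \
        | wc -l | tr -d ' '
}
```

Running count_unarchived from inside content_repository before and after a FlowFile repo checkpoint gives the two numbers to compare.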

After a few minutes, the FlowFile repo is checkpointed, which results in things getting cleaned up if they can be. The above command shows the same result (expected, since the FlowFiles are still held). I then emptied the remaining queue. After the FlowFile repo checkpoints again, I should see nothing in the content repo outside of archive, but I see:

{code}
content_repository $ find . -type f | grep -v archive | wc -l
     167
{code}

I see the same thing happening if I turn on expiration to remove the FlowFiles instead of clicking Empty Queue, or if a processor runs and completes the processing of the data.


