Posted to commits@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2015/07/11 03:44:04 UTC

[jira] [Created] (NIFI-756) Persistent Provenance Repository can avoid deleting events from lucene

Mark Payne created NIFI-756:
-------------------------------

             Summary: Persistent Provenance Repository can avoid deleting events from lucene
                 Key: NIFI-756
                 URL: https://issues.apache.org/jira/browse/NIFI-756
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
            Reporter: Mark Payne


Currently, when events expire in the repository, they are deleted from the Lucene indices. This is very expensive. Since the index is sharded (by default at 500 MB), we can instead ensure that searches always have a start date no earlier than the earliest provenance event still in the repository. This way, we won't retrieve any expired records, even though they remain in the index. Once all events in an index have expired (which we know from the earliest event of the next index), we can simply close all readers/writers for the expired index and delete the entire index. This is far cheaper than continually updating the Lucene indices and should make a huge difference in performance.
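
A minimal sketch of the idea in plain Java, to make the two steps concrete. It is not NiFi's or Lucene's actual API: the class IndexExpirationSketch, its methods, and the use of a shard name in place of a real index directory are all hypothetical, and timestamps are assumed to be epoch milliseconds. The first method clamps a query's start time so expired events are never returned; the second drops whole index shards once the next shard's earliest event is no later than the earliest retained event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; these names do not exist in NiFi.
final class IndexExpirationSketch {

    // Maps the earliest event time held by each index shard to the shard's name.
    // In a real repository the value would be an index directory, not a String.
    private final TreeMap<Long, String> shardsByEarliestEvent = new TreeMap<>();

    void addShard(long earliestEventTime, String shardName) {
        shardsByEarliestEvent.put(earliestEventTime, shardName);
    }

    // Never search earlier than the earliest event that has not expired,
    // so expired documents left in the index are never returned.
    static long clampSearchStart(long requestedStart, long earliestRetainedEventTime) {
        return Math.max(requestedStart, earliestRetainedEventTime);
    }

    // A shard is fully expired when the NEXT shard's earliest event is at or
    // before the earliest retained event: everything in the older shard must
    // then be older than the retention boundary. The newest shard is never
    // removed here, because there is no next shard to compare against.
    List<String> removeFullyExpiredShards(long earliestRetainedEventTime) {
        List<String> removed = new ArrayList<>();
        while (shardsByEarliestEvent.size() > 1) {
            Long oldest = shardsByEarliestEvent.firstKey();
            Long next = shardsByEarliestEvent.higherKey(oldest);
            if (next > earliestRetainedEventTime) {
                break; // the oldest shard may still hold unexpired events
            }
            // A real implementation would close readers/writers and delete
            // the index directory at this point.
            removed.add(shardsByEarliestEvent.remove(oldest));
        }
        return removed;
    }

    int shardCount() {
        return shardsByEarliestEvent.size();
    }
}
```

With shards starting at times 0, 100 and 200 and an earliest retained event at time 150, only the first shard is dropped: the second shard may still contain events between 150 and 200, so it stays and the clamped search start keeps its expired events out of results.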


