Posted to commits@nifi.apache.org by "Mark Payne (JIRA)" <ji...@apache.org> on 2015/07/29 03:36:04 UTC

[jira] [Updated] (NIFI-756) Persistent Provenance Repository can avoid deleting events from Lucene

     [ https://issues.apache.org/jira/browse/NIFI-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-756:
----------------------------
    Fix Version/s: 0.4.0

> Persistent Provenance Repository can avoid deleting events from Lucene
> ----------------------------------------------------------------------
>
>                 Key: NIFI-756
>                 URL: https://issues.apache.org/jira/browse/NIFI-756
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>             Fix For: 0.4.0
>
>
> Currently, when events expire in the repository, they are also deleted from the Lucene indices, which is very expensive. Since the index is sharded (by default at 500 MB), we can instead ensure that searches always have a start date no earlier than that of the earliest provenance event still in the repository. This way, searches never retrieve expired records, even though those records remain in the index. Once all events in an index have expired (which we can tell from the earliest event of the next index), we can simply close all readers/writers for the expired index and delete the entire index. This is far cheaper than continually updating the Lucene indices and would make a huge difference in performance.
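
The proposal above can be illustrated with a minimal Java sketch. It is not the NiFi implementation and uses no Lucene API: the class IndexExpirySketch, the IndexShard type, and both method names are hypothetical, and the sketch assumes that events are written to shards in time order, so every event in a shard is no later than the earliest event of the next shard.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexExpirySketch {

    /** Hypothetical stand-in for one index shard, described only by its earliest event time. */
    public static final class IndexShard {
        public final String name;
        public final long earliestEventTime;

        public IndexShard(String name, long earliestEventTime) {
            this.name = name;
            this.earliestEventTime = earliestEventTime;
        }
    }

    /**
     * Clamps a search's requested start time so that it is never earlier than
     * the earliest event still retained. Expired records left in an index are
     * then never matched by the time range of a query.
     */
    public static long clampSearchStart(long requestedStart, long earliestRetainedEventTime) {
        return Math.max(requestedStart, earliestRetainedEventTime);
    }

    /**
     * Returns the shards whose events have all expired. The list must be sorted
     * by earliestEventTime, ascending. A shard counts as fully expired when the
     * earliest event of the NEXT shard has itself expired; the newest shard is
     * never returned because it has no successor to compare against.
     */
    public static List<IndexShard> fullyExpired(List<IndexShard> sortedShards,
                                                long earliestRetainedEventTime) {
        List<IndexShard> expired = new ArrayList<>();
        for (int i = 0; i + 1 < sortedShards.size(); i++) {
            IndexShard next = sortedShards.get(i + 1);
            if (next.earliestEventTime < earliestRetainedEventTime) {
                expired.add(sortedShards.get(i));
            } else {
                break; // later shards start even later, so none of them qualify
            }
        }
        return expired;
    }
}
```

In a real repository, each shard returned by fullyExpired would have its readers and writers closed before its directory is deleted; that step is omitted here because it depends on how the indices are managed.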



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)