Posted to dev@nifi.apache.org by Brandon DeVries <br...@jhu.edu> on 2016/10/11 20:03:12 UTC

Provenance repo corruption

Devs,

I just opened a ticket to address an issue we've encountered with
Provenance repo corruption[1].  The ticket covers how to recover from a
corrupt provenance repo, which is currently only partially handled.
However, the question is whether we can avoid this sort of corruption in
the first place.  The immediate thought that jumped to mind was wrapping
the writes to Lucene with a write-ahead log.  Obviously, this would
increase the overhead on something that is already fairly expensive.
However, in cases where provenance is *really* important, it might be worth
considering.  This could potentially be another flavor of
ProvenanceEventRepository, e.g. WriteAheadPersistentProvenanceRepository
or FaultTolerantProvenanceRepository.  Does anyone have any thoughts /
opinions?
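
To make the write-ahead idea concrete, here is a minimal sketch in Java.
It is an illustration only, not NiFi code, and it does not touch Lucene:
the class WriteAheadSketch, its methods, the record layout (length, CRC32,
payload) and the 1 MiB record cap are all assumptions invented for this
example, and the "index" is just a callback standing in for the real
index update.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.zip.CRC32;

// Hypothetical sketch: journal each event durably before updating the index.
final class WriteAheadSketch {
    // Assumed sanity cap so a corrupt length field cannot trigger a huge allocation.
    private static final int MAX_RECORD_BYTES = 1 << 20;

    /** Journals the event first, then hands it to the index callback. */
    public static void write(Path journal, String event, Consumer<String> index)
            throws IOException {
        append(journal, event);  // durable on disk before the index sees it
        index.accept(event);     // stand-in for the real (Lucene) index write
    }

    /** Appends one record as [int length][long crc32][payload] and syncs to disk. */
    public static void append(Path journal, String event) throws IOException {
        byte[] payload = event.getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(payload);
        try (FileOutputStream fos = new FileOutputStream(journal.toFile(), true);
             DataOutputStream out = new DataOutputStream(fos)) {
            out.writeInt(payload.length);
            out.writeLong(crc.getValue());
            out.write(payload);
            out.flush();
            fos.getFD().sync();  // force the bytes to the storage device
        }
    }

    /** Returns every intact record, stopping at the first torn or corrupt one. */
    public static List<String> recover(Path journal) throws IOException {
        List<String> events = new ArrayList<>();
        if (!Files.exists(journal)) {
            return events;
        }
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(journal)))) {
            while (true) {
                int length = in.readInt();
                long expectedCrc = in.readLong();
                if (length < 0 || length > MAX_RECORD_BYTES) {
                    break;  // implausible length: treat as corruption
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                CRC32 crc = new CRC32();
                crc.update(payload);
                if (crc.getValue() != expectedCrc) {
                    break;  // checksum mismatch: stop replaying here
                }
                events.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (EOFException endOfJournal) {
            // Clean end of file or a record cut short by a crash; keep what we have.
        }
        return events;
    }
}
```

After a crash, the events returned by recover() could be replayed to
rebuild whatever the index lost.  The sync on every append is where the
extra overhead mentioned above would come from; batching several events
per sync would be one way to trade durability for throughput.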

Brandon

[1] https://issues.apache.org/jira/browse/NIFI-2890