Posted to issues@nifi.apache.org by "Koji Kawamura (JIRA)" <ji...@apache.org> on 2017/01/20 05:37:27 UTC

[jira] [Updated] (NIFI-3373) Add nifi.flow.configuration.archive.max.count property

     [ https://issues.apache.org/jira/browse/NIFI-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Kawamura updated NIFI-3373:
--------------------------------
    Description: 
Currently we can limit the number of flow.xml.gz archive files by:

* total archive size (nifi.flow.configuration.archive.max.storage)
* archive file age (nifi.flow.configuration.archive.max.time)

In addition to these conditions for managing old archives, there is demand for simply limiting the number of archive files, regardless of time or size constraints.
https://lists.apache.org/thread.html/4d2d9cec46ee896318a5492bf020f60c28396e2850c077dad40d45d2@%3Cusers.nifi.apache.org%3E

We can provide that by adding a new property, 'nifi.flow.configuration.archive.max.count', so that, if it is specified, only the N latest config files are kept as archives.

Make those properties optional, and process them in the following order:

- If max.count is specified, any archive other than the latest (N-1) is removed
- If max.time is specified, any archive that is older than max.time is removed
- If max.storage is specified, the oldest archives are deleted while the total size is greater than the configured limit
- Create a new archive; the latest archive is kept regardless of the above limitations
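The order above can be sketched in a few lines of Python. This is only an illustration of the proposed logic, not NiFi code: the names `Archive`, `prune`, `archive_flow` and `replay` are hypothetical, timestamps are simple logical ticks (t1 = 1, t2 = 2, ...), and "older than max.time" is assumed to mean strictly older.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Archive:
    name: str
    size_mb: int
    created_at: int  # logical timestamp, e.g. 1 for t1 (assumption for the sketch)


def prune(archives: List[Archive], now: int,
          max_count: Optional[int] = None,
          max_time: Optional[int] = None,
          max_storage_mb: Optional[int] = None) -> List[Archive]:
    """Return the archives that survive cleanup, oldest first.

    Runs before the new archive is created; a limit of None means 'not specified'.
    """
    kept = sorted(archives, key=lambda a: a.created_at)
    # 1. max.count: keep only the latest N-1, leaving room for the new archive.
    if max_count is not None:
        keep = max(max_count - 1, 0)
        kept = kept[-keep:] if keep > 0 else []
    # 2. max.time: drop anything older than the limit.
    if max_time is not None:
        kept = [a for a in kept if now - a.created_at <= max_time]
    # 3. max.storage: drop the oldest while the total is over the limit.
    if max_storage_mb is not None:
        while kept and sum(a.size_mb for a in kept) > max_storage_mb:
            kept.pop(0)
    return kept


def archive_flow(archives: List[Archive], new: Archive, **limits) -> List[Archive]:
    # 4. The new archive is always created, regardless of the limits.
    return prune(archives, now=new.created_at, **limits) + [new]


def replay(sizes_mb: List[int], **limits) -> List[List[str]]:
    """Archive one flow.xml per tick and record the archive names after each tick."""
    archives: List[Archive] = []
    history = []
    for t, size in enumerate(sizes_mb, start=1):
        archives = archive_flow(archives, Archive(f"f{t}", size, t), **limits)
        history.append([a.name for a in archives])
    return history


if __name__ == "__main__":
    for row in replay([5, 5, 5, 10, 15, 20, 25], max_count=5, max_storage_mb=10):
        print(", ".join(row))
```

Because the limits are checked before the new archive is written, the total can exceed max.storage right after an archive is created; that is the behaviour the simulations rely on.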

To illustrate how flow.xml archiving works, here are simulations with the updated logic, where the size of flow.xml keeps increasing:

h3. CASE-1

archive.max.storage=10MB
archive.max.count = 5

Time | flow.xml | archives | archive total
t1 | f1 5MB  | f1 | 5MB
t2 | f2 5MB  | f1, f2 | 10MB
t3 | f3 5MB  | f1, f2, f3 | 15MB
t4 | f4 10MB | f2, f3, f4 | 20MB
t5 | f5 15MB | f4, f5 | 25MB
t6 | f6 20MB | f6 | 20MB
t7 | f7 25MB | f7 | 25MB

* t3: f3 is archived even though the total then exceeds 10MB, because
f1 + f2 <= 10MB when the limit is checked. A WARN message starts to be
logged from this point, because the total archive size > 10MB.
* t4: The oldest archive, f1, is removed because f1 + f2 + f3 > 10MB.
* t5: Even though the flow.xml size exceeds max.storage, the latest
archive is still created. f2 and f3 are removed, and f4 is kept because
f4 <= 10MB.
* t6: f4 and f5 are removed because f4 + f5 > 10MB, and also f5 > 10MB.

In this case, from t3 onward NiFi will keep logging a WARN (or should
it be ERROR?) message indicating that the archive storage size exceeds
the limit.
After t6, even with archive.max.count = 5, NiFi will only keep the
latest flow.xml.

h3. CASE-2

If at least 5 archives need to be kept no matter what, then leave
max.storage and max.time blank.

archive.max.storage=
archive.max.time=
archive.max.count = 5 // Only limit archives by count

Time | flow.xml | archives | archive total
t1 | f1 5MB  | f1 | 5MB
t2 | f2 5MB  | f1, f2 | 10MB
t3 | f3 5MB  | f1, f2, f3 | 15MB
t4 | f4 10MB | f1, f2, f3, f4 | 25MB
t5 | f5 15MB | f1, f2, f3, f4, f5 | 40MB
t6 | f6 20MB | f2, f3, f4, f5, f6 | 55MB
t7 | f7 25MB | f3, f4, f5, f6, (f7) | 50MB, (75MB)
t8 | f8 30MB | f3, f4, f5, f6 | 50MB

* From t6, the oldest archive is removed to keep the number of archives <= 5.
* At t7, if the disk has only 60MB of space, f7 won't be archived. After
this point, the archive mechanism stops working (it keeps trying to
create a new archive, but keeps getting an exception: no space left on
device).
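The arithmetic behind the t7 and t8 rows can be checked with a small sketch. It is a simplification under stated assumptions: `try_archive` is a hypothetical helper, the disk is assumed to hold nothing but archives, and a write is treated as failing whenever the surviving archives plus the new one would exceed the disk size.

```python
from typing import List, Tuple


def try_archive(kept_mb: List[int], new_mb: int,
                max_count: int, disk_mb: int) -> Tuple[List[int], bool]:
    """Apply max.count, then add the new archive only if it fits on the disk."""
    keep = max(max_count - 1, 0)
    survivors = kept_mb[-keep:] if keep > 0 else []
    if sum(survivors) + new_mb > disk_mb:
        # Models "no space left on device": the new archive is not written.
        return survivors, False
    return survivors + [new_mb], True


kept: List[int] = []
for t, size in enumerate([5, 5, 5, 10, 15, 20, 25, 30], start=1):
    kept, ok = try_archive(kept, size, max_count=5, disk_mb=60)
    print(f"t{t}: archived={ok} total={sum(kept)}MB")
```

With these inputs the total stays at 50MB from t7 onward: the four surviving archives leave only 10MB free, and every later flow.xml is larger than that.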

In either case above, once flow.xml has grown to that size, some human
intervention would be needed.

  was:
Currently we can limit the number of flow.xml.gz archive files by:

* total archive size (nifi.flow.configuration.archive.max.storage)
* archive file age (nifi.flow.configuration.archive.max.time)

In addition to these conditions for managing old archives, there is demand for simply limiting the number of archive files, regardless of time or size constraints.
https://lists.apache.org/thread.html/4d2d9cec46ee896318a5492bf020f60c28396e2850c077dad40d45d2@%3Cusers.nifi.apache.org%3E

We can provide that by adding a new property, 'nifi.flow.configuration.archive.max.count', so that, if it is specified, only the N latest config files are kept as archives.

Make those properties optional, and process them in the following order:

- If max.count is specified, any archive other than the latest (N-1) is removed
- If max.time is specified, any archive that is older than max.time is removed
- If max.storage is specified, the oldest archives are deleted while the total size is greater than the configured limit
- Create a new archive; the latest archive is kept regardless of the above limitations


> Add nifi.flow.configuration.archive.max.count property
> ------------------------------------------------------
>
>                 Key: NIFI-3373
>                 URL: https://issues.apache.org/jira/browse/NIFI-3373
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Koji Kawamura


