Posted to users@camel.apache.org by Sunil O <su...@gmail.com> on 2018/09/27 12:16:09 UTC

Usage of readLockMinAge

We have an implementation scenario with a File consumer polling a large
number of folders (3000+) and 200K files per day, with a 1-minute SLA per
file processed.

We also need to use readLock to avoid picking files which are being
written.

In this scenario we went for the readLock=changed option. However, this
option results in the thread sleeping for at least the time specified by
the readLockCheckInterval option. While looking for workarounds, we found
the readLockMinAge option, which allows files that are old enough to be
picked up without going into sleep mode. This has reduced the time taken
to pick up files, thereby reducing the overall processing time.
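
For reference, the options discussed above combine into a file endpoint
URI along the following lines. This is only a minimal sketch: the option
names come from the message, but the directory and the millisecond values
are made up for illustration and are not the poster's actual settings.

```java
// Sketch only: builds a Camel file endpoint URI from the options named in
// the message. The directory and the numeric values passed in main() are
// hypothetical.
class FileEndpointUri {

    // Both durations are in milliseconds.
    static String build(String directory, long checkIntervalMs, long minAgeMs) {
        return "file:" + directory
                + "?readLock=changed"
                + "&readLockCheckInterval=" + checkIntervalMs
                + "&readLockMinAge=" + minAgeMs;
    }

    public static void main(String[] args) {
        // Hypothetical values: check every 1 s, treat files older than 5 s as stable.
        System.out.println(build("/data/inbox", 1000, 5000));
    }
}
```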

However, whenever a file is encountered whose age is below readLockMinAge,
the sleep still occurs, as per the logic in
FileChangedExclusiveReadLockStrategy. If this 'sleep' step could be
avoided when readLockMinAge is specified, then instead of sleeping the
consumer could go on to pick up other files. This modified behavior would
be useful in scenarios where overall throughput and processing performance
are more important than sequential processing.


While browsing similar issues, we found JIRA issue 9324, which also
discusses the issue with the sleep step.

So it would be good if one of the following were available:

a) a separate ExclusiveReadLockStrategy, similar to
FileChangedExclusiveReadLockStrategy, which deals only with readLockMinAge
and skips the file if the age is not met instead of sleeping.

Or

b) a skip/sleep option added to FileChangedExclusiveReadLockStrategy for
when readLockMinAge is used.


Please give your suggestions.

Re: Usage of readLockMinAge

Posted by Sunil lam <su...@gmail.com>.
Understood, thanks. That looks like a better approach: it will save time
and CPU cycles, as it happens much earlier, before the file process
strategy.

On Fri, Sep 28, 2018 at 11:25 AM Claus Ibsen <cl...@gmail.com> wrote:

> Hi
>
> What I mean is that the filter can skip the file for now when it's too
> young; later, when it's older, it is included, and the read lock can
> take over from there.

Re: Usage of readLockMinAge

Posted by Claus Ibsen <cl...@gmail.com>.
Hi

What I mean is that the filter can skip the file for now when it's too
young; later, when it's older, it is included, and the read lock can
take over from there.
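
The age check described above can be sketched as follows. This is a
minimal, plain-Java illustration, not code from the thread: the class name
MinAgeFilter and its method signature are hypothetical, and timestamps are
passed in explicitly so the logic can be tried without a file system. In a
Camel route the same comparison would presumably sit in the accept method
of a custom filter set on the file endpoint, using the file's
last-modified time; that wiring is an assumption here.

```java
import java.time.Duration;

// Hypothetical helper: decides whether a file is old enough to be picked up.
// A file that is too young is simply rejected for this poll (no sleeping);
// it will be seen again on the next poll.
class MinAgeFilter {

    private final long minAgeMillis;

    MinAgeFilter(Duration minAge) {
        this.minAgeMillis = minAge.toMillis();
    }

    // Returns true only when the file was last modified at least minAge ago.
    boolean accept(long lastModifiedMillis, long nowMillis) {
        return nowMillis - lastModifiedMillis >= minAgeMillis;
    }

    public static void main(String[] args) {
        MinAgeFilter filter = new MinAgeFilter(Duration.ofSeconds(5));
        long now = System.currentTimeMillis();
        System.out.println(filter.accept(now - 10_000, now)); // modified 10 s ago
        System.out.println(filter.accept(now - 1_000, now));  // modified 1 s ago
    }
}
```

Because the check returns immediately, a young file costs one comparison
per poll instead of holding up the consumer thread.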
On Thu, Sep 27, 2018 at 7:18 PM Sunil lam <su...@gmail.com> wrote:
>
> Here we do not want to skip files based on size, but based on whether the
> file has attained the minAge (as per its last-modified date).
> With respect to readLockMinAge, we feel it is a useful check of whether
> the incoming file is fully written, and used in conjunction with the
> readLock=changed option it helps to pick up completed files. In each
> polling iteration we want to skip files that have not met the minAge
> criterion. The skipped files will be re-evaluated in the next poll. There
> should not be any 'sleep', which would decrease overall throughput. Hence
> the suggestion for either of the two changes (mentioned originally) to
> FileChangedExclusiveReadLockStrategy.
>
> As an additional note, unrelated to the main issue: we are also using a
> custom filter, though for a different purpose. Since the file polling
> consumer is single-threaded, we configured multiple pollers for the same
> root directory, each with a custom filter that filters out specific
> subdirectories. In this way the 3000 subdirectories are divided logically
> among the pollers.
>
>
>



-- 
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2

Re: Usage of readLockMinAge

Posted by Sunil lam <su...@gmail.com>.
Here we do not want to skip files based on size, but based on whether the
file has attained the minAge (as per its last-modified date).
With respect to readLockMinAge, we feel it is a useful check of whether
the incoming file is fully written, and used in conjunction with the
readLock=changed option it helps to pick up completed files. In each
polling iteration we want to skip files that have not met the minAge
criterion. The skipped files will be re-evaluated in the next poll. There
should not be any 'sleep', which would decrease overall throughput. Hence
the suggestion for either of the two changes (mentioned originally) to
FileChangedExclusiveReadLockStrategy.

As an additional note, unrelated to the main issue: we are also using a
custom filter, though for a different purpose. Since the file polling
consumer is single-threaded, we configured multiple pollers for the same
root directory, each with a custom filter that filters out specific
subdirectories. In this way the 3000 subdirectories are divided logically
among the pollers.
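
One possible way to divide subdirectories among pollers is sketched below.
The message does not say how the split is actually done, so the hashing
scheme, the class name SubdirectoryPartition and the example directory
names are all assumptions made for illustration. Each poller would accept
only the subdirectories that map to its own index.

```java
// Hypothetical partitioning rule: a subdirectory belongs to exactly one of
// pollerCount pollers, chosen by the hash of its name.
class SubdirectoryPartition {

    // True when the given subdirectory is assigned to the poller at pollerIndex.
    static boolean ownedBy(String subdirectory, int pollerIndex, int pollerCount) {
        // floorMod keeps the result in [0, pollerCount) even for negative hashes.
        return Math.floorMod(subdirectory.hashCode(), pollerCount) == pollerIndex;
    }

    public static void main(String[] args) {
        String[] subdirectories = {"customer-001", "customer-002", "customer-003"};
        int pollerCount = 4;
        for (String dir : subdirectories) {
            for (int i = 0; i < pollerCount; i++) {
                if (ownedBy(dir, i, pollerCount)) {
                    System.out.println(dir + " is handled by poller " + i);
                }
            }
        }
    }
}
```

A hash-based split needs no list of directories per poller, but it does
not balance load by file volume; an explicit mapping may suit uneven
traffic better.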



On Thu, Sep 27, 2018 at 6:48 PM Claus Ibsen <cl...@gmail.com> wrote:

> Hi
>
> Have you tried a custom filter, where you can check the file size and
> skip small files?

Re: Usage of readLockMinAge

Posted by Claus Ibsen <cl...@gmail.com>.
Hi

Have you tried a custom filter, where you can check the file size and
skip small files?



-- 
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2