Posted to users@sling.apache.org by Roy Teeuwen <ro...@teeuwen.be> on 2017/09/25 16:36:37 UTC

File system resource provider - Performance

Hey all,

We are using the file system resource provider to monitor a directory and expose its contents as resources. Our directory tree is about 5 levels deep and holds around 200,000 files in total. The problem we are facing is that our application takes an unusually long time to come online at startup. It appears that the file system resource provider is building up its monitor cache, and that this is the cause of the slowdown. We currently have a single file system resource provider config for the entire directory. Would it be better to have several configs on lower levels, and if so, what is the maximum number of files we should allow per resource provider config?

Or is the file system resource provider perhaps not made to work with this many files? For our use case it looks perfect, because we also use the events it triggers and because we like to use the files as resources in our application without having to import them as actual JCR nodes.

Greets,
Roy

RE: File system resource provider - Performance

Posted by Stefan Seifert <ss...@pro-vision.de>.
we can definitely think about making that startup async, perhaps configurable (with default=async to speed things up) - the monitoring is mainly there to support the resource events, and for a lot of use cases it may not be important if file changes during the startup phase are not detected.

stefan

>-----Original Message-----
>From: Roy Teeuwen [mailto:roy@teeuwen.be]
>Sent: Friday, September 29, 2017 4:01 PM
>To: users@sling.apache.org
>Cc: Stefan Seifert
>Subject: Re: File system resource provider - Performance
>
>Hey Carsten,
>
>What about also doing the initial scan async? That is, do the initial
>scan in the first scheduled run, without firing change events for that
>first run.
>
>Greets,
>Roy
>
>> On 29 Sep 2017, at 15:57, Carsten Ziegeler <cz...@apache.org> wrote:
>>
>>
>>
>>
>> Stefan Seifert wrote
>>>
>>>> I think we could try using newer file features from Java 7 which might
>>>> make the scanning obsolete. But I've never looked into it.
>>>
>>> you mean with e.g. this?
>>> https://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html
>>>
>>> i had this on my todo list some time ago for fsresource - but in my first experiments it seemed to behave differently depending on the operating system (windows vs. linux), so i dropped it at that time. if someone comes up with a concept that works reliably on all operating systems i would be glad to help integrate it into fsresource.
>>>
>>
>> Yes, I agree - this doesn't seem to help (I now remember that I looked
>> into it a long time ago).
>>
>> I'm not sure if this is a good idea, but we could have a switch that
>> skips the initial scan and also skips registering the periodic scanners.
>> Once a file is accessed we lazily add a scanner for that directory to
>> the list. So once a resource/file is used you get change events. If a
>> directory is never touched, we don't do anything.
>>
>> Regards
>> Carsten
>> --
>> Carsten Ziegeler
>> Adobe Research Switzerland
>> cziegeler@apache.org



Re: File system resource provider - Performance

Posted by Roy Teeuwen <ro...@teeuwen.be>.
Hey Carsten,

What about also doing the initial scan async? That is, do the initial scan in the first scheduled run, without firing change events for that first run.
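
To make the idea concrete, here is a minimal sketch of a polling monitor whose first scheduled run only builds the baseline and fires no events. It is not the fsresource implementation: the class name SilentFirstScanMonitor, the methods runOnce and start, and the string-based events are all made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical sketch, not part of fsresource.
class SilentFirstScanMonitor {

    private final Path root;
    // Last seen modification times; null until the first run has completed.
    private Map<Path, Long> snapshot;

    SilentFirstScanMonitor(Path root) {
        this.root = root;
    }

    // One scan. The first call only records the baseline and returns no events.
    synchronized List<String> runOnce() throws IOException {
        Map<Path, Long> current = new HashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)) {
                    current.put(p, Files.getLastModifiedTime(p).toMillis());
                }
            }
        }
        List<String> events = new ArrayList<>();
        if (snapshot != null) {
            for (Map.Entry<Path, Long> e : current.entrySet()) {
                Long before = snapshot.get(e.getKey());
                if (before == null) {
                    events.add("ADDED " + e.getKey());
                } else if (!before.equals(e.getValue())) {
                    events.add("CHANGED " + e.getKey());
                }
            }
            for (Path p : snapshot.keySet()) {
                if (!current.containsKey(p)) {
                    events.add("REMOVED " + p);
                }
            }
        }
        snapshot = current;
        return events;
    }

    // Runs the scans on a background thread, so startup is not blocked.
    ScheduledExecutorService start(long intervalMillis, Consumer<String> listener) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(() -> {
            try {
                runOnce().forEach(listener);
            } catch (IOException | RuntimeException e) {
                // A real implementation would log this; the next run retries.
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

The trade-off is that changes made while the first scan is still running end up in the baseline and are never reported.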

Greets,
Roy

> On 29 Sep 2017, at 15:57, Carsten Ziegeler <cz...@apache.org> wrote:
> 
> 
> 
> 
> Stefan Seifert wrote
>> 
>>> I think we could try using newer file features from Java 7 which might
>>> make the scanning obsolete. But I've never looked into it.
>> 
>> you mean with e.g. this?
>> https://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html
>> 
>> i had this on my todo list some time ago for fsresource - but in my first experiments it seemed to behave differently depending on the operating system (windows vs. linux), so i dropped it at that time. if someone comes up with a concept that works reliably on all operating systems i would be glad to help integrate it into fsresource.
>> 
> 
> Yes, I agree - this doesn't seem to help (I now remember that I looked
> into it a long time ago).
> 
> I'm not sure if this is a good idea, but we could have a switch that
> skips the initial scan and also skips registering the periodic scanners.
> Once a file is accessed we lazily add a scanner for that directory to
> the list. So once a resource/file is used you get change events. If a
> directory is never touched, we don't do anything.
> 
> Regards
> Carsten
> --
> Carsten Ziegeler
> Adobe Research Switzerland
> cziegeler@apache.org


Re: File system resource provider - Performance

Posted by Carsten Ziegeler <cz...@apache.org>.
 


Stefan Seifert wrote
> 
>> I think we could try using newer file features from Java 7 which might
>> make the scanning obsolete. But I've never looked into it.
> 
> you mean with e.g. this?
> https://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html
> 
> i had this on my todo list some time ago for fsresource - but in my first experiments it seemed to behave differently depending on the operating system (windows vs. linux), so i dropped it at that time. if someone comes up with a concept that works reliably on all operating systems i would be glad to help integrate it into fsresource.
> 

Yes, I agree - this doesn't seem to help (I now remember that I looked
into it a long time ago).

I'm not sure if this is a good idea, but we could have a switch that
skips the initial scan and also skips registering the periodic scanners.
Once a file is accessed we lazily add a scanner for that directory to
the list. So once a resource/file is used you get change events. If a
directory is never touched, we don't do anything.
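
A rough sketch of what such a lazy registration could look like, assuming a hook that is called whenever a file is resolved. The class LazyMonitorRegistry and its methods are hypothetical names for illustration only, not existing fsresource code.

```java
import java.nio.file.Path;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not part of fsresource.
class LazyMonitorRegistry {

    // Directories that already have a scanner; thread-safe set.
    private final Set<Path> monitored = ConcurrentHashMap.newKeySet();

    // Call this when a file is accessed. Returns true if the file's directory
    // was not monitored before, i.e. the caller should start a scanner for it.
    boolean onAccess(Path file) {
        Path dir = file.toAbsolutePath().normalize().getParent();
        if (dir == null) {
            return false; // a file system root has no parent directory
        }
        return monitored.add(dir);
    }

    // Read-only view of the directories that are being monitored.
    Set<Path> monitoredDirectories() {
        return Collections.unmodifiableSet(monitored);
    }
}
```

With this approach the number of scanners grows with the directories actually used, not with the size of the tree.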

Regards
Carsten
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziegeler@apache.org

RE: File system resource provider - Performance

Posted by Stefan Seifert <ss...@pro-vision.de>.
unfortunately i do not remember the details right now, but i assume it was something like "does not seem to work at all on windows" or so. i did not invest much time.
but it's probably worth having a second look.

stefan


>-----Original Message-----
>From: Roy Teeuwen [mailto:roy@teeuwen.be]
>Sent: Friday, September 29, 2017 3:58 PM
>To: users@sling.apache.org
>Subject: Re: File system resource provider - Performance
>
>Hey Stefan,
>
>I was planning to give that a try too. Could you maybe elaborate on what
>hiccups you noticed on the different OSes?
>
>Greets
>Roy
>
>> On 29 Sep 2017, at 15:52, Stefan Seifert <ss...@pro-vision.de> wrote:
>>
>>
>>> I think we could try using newer file features from Java 7 which might
>>> make the scanning obsolete. But I've never looked into it.
>>
>> you mean with e.g. this?
>> https://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html
>>
>> i had this on my todo list some time ago for fsresource - but in my first experiments it seemed to behave differently depending on the operating system (windows vs. linux), so i dropped it at that time. if someone comes up with a concept that works reliably on all operating systems i would be glad to help integrate it into fsresource.
>>
>> stefan
>>
>>



Re: File system resource provider - Performance

Posted by Roy Teeuwen <ro...@teeuwen.be>.
Hey Stefan,

I was planning to give that a try too. Could you maybe elaborate on what hiccups you noticed on the different OSes?

Greets
Roy

> On 29 Sep 2017, at 15:52, Stefan Seifert <ss...@pro-vision.de> wrote:
> 
> 
>> I think we could try using newer file features from Java 7 which might
>> make the scanning obsolete. But I've never looked into it.
> 
> you mean with e.g. this?
> https://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html
> 
> i had this on my todo list some time ago for fsresource - but in my first experiments it seemed to behave differently depending on the operating system (windows vs. linux), so i dropped it at that time. if someone comes up with a concept that works reliably on all operating systems i would be glad to help integrate it into fsresource.
> 
> stefan
> 
> 


RE: File system resource provider - Performance

Posted by Stefan Seifert <ss...@pro-vision.de>.
>I think we could try using newer file features from Java 7 which might
>make the scanning obsolete. But I've never looked into it.

you mean with e.g. this?
https://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html

i had this on my todo list some time ago for fsresource - but in my first experiments it seemed to behave differently depending on the operating system (windows vs. linux), so i dropped it at that time. if someone comes up with a concept that works reliably on all operating systems i would be glad to help integrate it into fsresource.
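
for reference, a minimal sketch of how a directory tree could be watched with java.nio.file.WatchService. this is not fsresource code: the class RecursiveWatchSketch and its methods are made up for illustration. a WatchService only watches the directories registered with it, not their subdirectories, so the sketch registers every directory in the tree once. how quickly events arrive and which ones are reported depends on the platform implementation, which fits the inconsistent behaviour described above.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;
import static java.nio.file.StandardWatchEventKinds.OVERFLOW;

// Hypothetical sketch, not part of fsresource.
class RecursiveWatchSketch {

    private final WatchService watcher;

    RecursiveWatchSketch(Path root) throws IOException {
        this.watcher = root.getFileSystem().newWatchService();
        // Register every existing directory below root. Directories created
        // later would have to be registered when their ENTRY_CREATE arrives.
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                dir.register(watcher, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    // Waits up to timeoutMillis for the next batch of events of one directory
    // and returns them as text; returns an empty list on timeout.
    List<String> poll(long timeoutMillis) throws InterruptedException {
        List<String> changes = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return changes;
        }
        Path dir = (Path) key.watchable();
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == OVERFLOW) {
                // Events were lost; a full rescan of dir would be needed.
                changes.add("OVERFLOW " + dir);
                continue;
            }
            changes.add(event.kind().name() + " " + dir.resolve((Path) event.context()));
        }
        key.reset(); // required to keep receiving events for this directory
        return changes;
    }

    void close() throws IOException {
        watcher.close();
    }
}
```

note that the initial registration still walks the whole tree once, so by itself this does not remove the startup cost for a large directory.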

stefan 



Re: File system resource provider - Performance

Posted by Carsten Ziegeler <cz...@apache.org>.
Hi,

yes, this is odd behaviour. The whole tree is scanned on startup, and
this scan is blocking. In addition, if you have a large tree, the periodic
scanning for changes (which is polling) is probably not optimal either.

I think we could try using newer file features from Java 7 which might
make the scanning obsolete. But I've never looked into it.

Regards

Carsten


Roy Teeuwen wrote
> Hey all,
> 
> We are using the file system resource provider to monitor a directory and expose its contents as resources. Our directory tree is about 5 levels deep and holds around 200,000 files in total. The problem we are facing is that our application takes an unusually long time to come online at startup. It appears that the file system resource provider is building up its monitor cache, and that this is the cause of the slowdown. We currently have a single file system resource provider config for the entire directory. Would it be better to have several configs on lower levels, and if so, what is the maximum number of files we should allow per resource provider config?
> 
> Or is the file system resource provider perhaps not made to work with this many files? For our use case it looks perfect, because we also use the events it triggers and because we like to use the files as resources in our application without having to import them as actual JCR nodes.
> 
> Greets,
> Roy
> 
-- 
Carsten Ziegeler
Adobe Research Switzerland
cziegeler@apache.org