Posted to users@sling.apache.org by Dascalita Dragos <dd...@gmail.com> on 2012/06/03 13:35:11 UTC

DataStore GC for Sling

Hi,
Are there any implementations for Sling to clean up the JCR DataStore? I've
seen an implementation for CRX, cf. [1] ("Running GC" section), but couldn't
find one for Apache Sling.
I found only a JCR doc [2] describing how to do it, but I'm getting errors
that I can't connect to the repository, because the repository is locked.

I'm wondering whether this is supported functionality in Sling. If not, are
there any plans to include it, and is there a temporary workaround?

Thanks,
Dragos


[1] http://dev.day.com/docs/en/cq/current/core/administering/persistence_managers.html
[2] http://wiki.apache.org/jackrabbit/DataStore

Re: DataStore GC for Sling

Posted by Felix Meschberger <fm...@adobe.com>.
Hi

I think what Dascalita is after is GC for the data store, which lives beside the persistence manager and is used to store large binary objects (> 32KB, depending on config, IIRC).

Regards
Felix

On 03.06.2012 at 20:31, Günther Schmidt wrote:

> Hi Dascalita,
> 
> I'm not sure that Sling has the same need for a JCR GC as CRX does. Both
> use Jackrabbit as the JCR backend, but CRX plugs a different Persistence
> Manager into Jackrabbit: TarPM, which is database-less and append-only,
> based on the tar file format and, I believe, proprietary. An append-only
> datastore would definitely need GC. I presume the out-of-the-box
> Jackrabbit implementation does it immediately, but I'm just guessing here.
> 
> Günther
> 


Re: DataStore GC for Sling

Posted by Günther Schmidt <gu...@kmmd.de>.
Hi Dascalita,

I'm not sure that Sling has the same need for a JCR GC as CRX does. Both
use Jackrabbit as the JCR backend, but CRX plugs a different Persistence
Manager into Jackrabbit: TarPM, which is database-less and append-only,
based on the tar file format and, I believe, proprietary. An append-only
datastore would definitely need GC. I presume the out-of-the-box
Jackrabbit implementation does it immediately, but I'm just guessing here.

Günther


Re: DataStore GC for Sling

Posted by Felix Meschberger <fm...@adobe.com>.
Thanks. I have applied it.

Regards
Felix

On 04.06.2012 at 16:24, Dascalita Dragos wrote:

> Hi Felix,
> Following your suggestions, I've created a JIRA issue and submitted a patch,
> cf. [1].
> It works very nicely, and the hook is easy to use afterwards.
> 
> Thanks for your help,
> Dragos
> [1] - https://issues.apache.org/jira/browse/SLING-2501
> 


Re: DataStore GC for Sling

Posted by Dascalita Dragos <dd...@gmail.com>.
Hi Felix,
Following your suggestions, I've created a JIRA issue and submitted a patch,
cf. [1].
It works very nicely, and the hook is easy to use afterwards.

Thanks for your help,
Dragos
[1] - https://issues.apache.org/jira/browse/SLING-2501
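
For readers wondering what using such a hook might look like: the Jackrabbit wiki page cited earlier in the thread describes a mark-and-sweep cycle on the collector obtained from the RepositoryManager. The following is only a sketch of that cycle, not the code from the patch. GarbageCollector, RepoManager, ToyRepository and DataStoreGcRunner are hypothetical stand-ins written so the example compiles on its own; the real interfaces live in org.apache.jackrabbit.api.management and declare checked exceptions.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for Jackrabbit's DataStoreGarbageCollector, reduced to three methods.
interface GarbageCollector {
    void mark();
    int sweep();
    void close();
}

// Stand-in for Jackrabbit's RepositoryManager (stop() omitted for brevity).
interface RepoManager {
    GarbageCollector createDataStoreGarbageCollector();
}

// Toy in-memory repository: 'stored' holds the records in the data store,
// 'referenced' holds the records that content still points to.
class ToyRepository implements RepoManager {
    final Set<String> stored = new HashSet<>();
    final Set<String> referenced = new HashSet<>();

    public GarbageCollector createDataStoreGarbageCollector() {
        return new GarbageCollector() {
            private final Set<String> marked = new HashSet<>();

            // Remember every record that is still in use.
            public void mark() {
                marked.addAll(referenced);
            }

            // Drop every record that was not marked; report how many went away.
            public int sweep() {
                int before = stored.size();
                stored.retainAll(marked);
                return before - stored.size();
            }

            public void close() {
                marked.clear();
            }
        };
    }
}

class DataStoreGcRunner {
    // Runs one mark-and-sweep cycle and returns the number of removed records.
    static int collect(RepoManager manager) {
        GarbageCollector gc = manager.createDataStoreGarbageCollector();
        try {
            gc.mark();
            return gc.sweep();
        } finally {
            gc.close(); // always release the collector, even if the scan fails
        }
    }

    public static void main(String[] args) {
        ToyRepository repo = new ToyRepository();
        repo.stored.add("video-1");
        repo.stored.add("image-2");
        repo.stored.add("orphan-3");
        repo.referenced.add("video-1");
        repo.referenced.add("image-2");
        System.out.println("removed " + collect(repo) + " record(s)");
    }
}
```

In a Sling setup one would presumably obtain the RepositoryManager from the OSGi service registry rather than construct it directly; that lookup is left out here because it depends on how the service ends up being registered.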


Re: DataStore GC for Sling

Posted by Felix Meschberger <fm...@adobe.com>.
Hi,

On 04.06.2012 at 11:52, Dascalita Dragos wrote:

> Hi Felix,
> Thanks for your answer. I've been trying to avoid adding code to
> jackrabbit-server as much as possible, but I found myself ending up in
> exactly this place, just as you said.
> 
> I'm going to implement this logic in jackrabbit-server, and then I'm thinking
> of creating a patch and submitting it as an improvement; it would be great if
> it made its way into the Sling codebase at some point.

Sounds good.

> 
> I'm thinking of creating a class that implements
> org.apache.jackrabbit.api.management.RepositoryManager and exposing it from
> SlingServerRepository. For the stop() method I'm going to throw an
> exception, as you said, and for the other method,
> createDataStoreGarbageCollector(), delegate the call to RepositoryImpl, as
> you advised.

I would make it even simpler: have the SlingServerRepository class implement RepositoryManager and register it as such. This would require overriding the AbstractSlingRepository.registerService method and adding the RepositoryManager interface to the list of registered services.
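
A rough sketch of that idea, under loud assumptions: ToyRegistry stands in for the OSGi service registry, BaseRepositoryService for AbstractSlingRepository, ServerRepositoryService for SlingServerRepository, and Repo and RepoManager for the JCR Repository and Jackrabbit RepositoryManager interfaces. None of these names or signatures are the real Sling or Jackrabbit ones; the only point is that the subclass overrides the registration method so the same object is also published under the manager interface.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for javax.jcr.Repository.
interface Repo {
    String describe();
}

// Stand-in for Jackrabbit's RepositoryManager; the collector is reduced to Object.
interface RepoManager {
    void stop();
    Object createDataStoreGarbageCollector();
}

// Toy registry standing in for the OSGi service registry: one service object
// can be registered under several interface names at once.
class ToyRegistry {
    private final Map<String, Object> services = new HashMap<>();

    void register(List<String> interfaceNames, Object service) {
        for (String name : interfaceNames) {
            services.put(name, service);
        }
    }

    Object lookup(String interfaceName) {
        return services.get(interfaceName);
    }
}

// Hypothetical base class: registers itself as a repository only.
abstract class BaseRepositoryService implements Repo {
    void registerService(ToyRegistry registry) {
        registry.register(Arrays.asList(Repo.class.getName()), this);
    }
}

// Hypothetical subclass: implements the manager interface itself and
// overrides the registration to add that interface to the list.
class ServerRepositoryService extends BaseRepositoryService implements RepoManager {
    public String describe() {
        return "server repository";
    }

    public void stop() {
        throw new UnsupportedOperationException("stop() is not supported here");
    }

    public Object createDataStoreGarbageCollector() {
        return new Object(); // the real code would delegate to the repository
    }

    @Override
    void registerService(ToyRegistry registry) {
        registry.register(
                Arrays.asList(Repo.class.getName(), RepoManager.class.getName()),
                this);
    }
}

class RegistrationDemo {
    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        new ServerRepositoryService().registerService(registry);
        Object manager = registry.lookup(RepoManager.class.getName());
        System.out.println("manager registered: " + (manager instanceof RepoManager));
    }
}
```

Compared with a separate wrapper class, this keeps a single service object, so consumers looking up either interface get the same instance.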

> 
> I think this behavior is very important when you work with the
> DataStore, as at the moment there is no way to remove content from it.
> People working with documents, large images, or videos (which is my case)
> need this a lot.

I agree.

Regards
Felix



Re: DataStore GC for Sling

Posted by Dascalita Dragos <dd...@gmail.com>.
Hi Felix,
Thanks for your answer. I've been trying to avoid adding code to
jackrabbit-server as much as possible, but I found myself ending up in
exactly this place, just as you said.

I'm going to implement this logic in jackrabbit-server, and then I'm thinking
of creating a patch and submitting it as an improvement; it would be great if
it made its way into the Sling codebase at some point.

I'm thinking of creating a class that implements
org.apache.jackrabbit.api.management.RepositoryManager and exposing it from
SlingServerRepository. For the stop() method I'm going to throw an
exception, as you said, and for the other method,
createDataStoreGarbageCollector(), delegate the call to RepositoryImpl, as
you advised.

I think this behavior is very important when you work with the
DataStore, as at the moment there is no way to remove content from it.
People working with documents, large images, or videos (which is my case)
need this a lot.

Thanks for your help,
Dragos




Re: DataStore GC for Sling

Posted by Felix Meschberger <fm...@adobe.com>.
Hi,

On 03.06.2012 at 13:35, Dascalita Dragos wrote:

> Hi,
> Are there any implementations for Sling to clean up the JCR DataStore? I've
> seen an implementation for CRX, cf. [1] ("Running GC" section), but couldn't
> find one for Apache Sling.
> I found only a JCR doc [2] describing how to do it, but I'm getting errors
> that I can't connect to the repository, because the repository is locked.
> 
> I'm wondering whether this is supported functionality in Sling. If not, are
> there any plans to include it, and is there a temporary workaround?

No, the jackrabbit server bundle does not currently expose this functionality.

We could, of course, have the SlingRepository expose the Jackrabbit RepositoryManager interface, delegating the createDataStoreGarbageCollector() method to the RepositoryImpl implementation (and throwing some exception from the stop() method).
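
To make the shape of that proposal concrete, here is a minimal sketch, assuming simplified stand-ins rather than the real types: GarbageCollector and RepoManager mimic Jackrabbit's DataStoreGarbageCollector and RepositoryManager (without their checked exceptions), EmbeddedRepository plays the role of RepositoryImpl, and DelegatingRepositoryManager is a hypothetical wrapper. It is not the code that ended up in Sling.

```java
// Stand-in for org.apache.jackrabbit.api.management.DataStoreGarbageCollector,
// reduced to three methods. The real methods declare checked exceptions.
interface GarbageCollector {
    void mark();
    int sweep();
    void close();
}

// Stand-in for org.apache.jackrabbit.api.management.RepositoryManager.
interface RepoManager {
    void stop();
    GarbageCollector createDataStoreGarbageCollector();
}

// Hypothetical stand-in for the embedded Jackrabbit RepositoryImpl.
class EmbeddedRepository implements RepoManager {
    boolean running = true;

    public void stop() {
        running = false;
    }

    public GarbageCollector createDataStoreGarbageCollector() {
        return new GarbageCollector() {
            public void mark() { }
            public int sweep() { return 0; }
            public void close() { }
        };
    }
}

// Hypothetical wrapper in the role of the Sling-side repository service.
class DelegatingRepositoryManager implements RepoManager {
    private final EmbeddedRepository delegate;

    DelegatingRepositoryManager(EmbeddedRepository delegate) {
        this.delegate = delegate;
    }

    // The bundle owns the repository lifecycle, so clients may not stop it.
    public void stop() {
        throw new UnsupportedOperationException("stop() is not supported here");
    }

    // Garbage collection is handed straight to the underlying repository.
    public GarbageCollector createDataStoreGarbageCollector() {
        return delegate.createDataStoreGarbageCollector();
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        EmbeddedRepository repository = new EmbeddedRepository();
        RepoManager manager = new DelegatingRepositoryManager(repository);
        System.out.println("collector available: "
                + (manager.createDataStoreGarbageCollector() != null));
        try {
            manager.stop();
        } catch (UnsupportedOperationException e) {
            System.out.println("stop refused, still running: " + repository.running);
        }
    }
}
```

Throwing from stop() keeps a client of the manager from shutting down a repository that other bundles still depend on, which appears to be the reason for the parenthetical in the proposal.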

WDYT ?

Regards
Felix

