Posted to oak-dev@jackrabbit.apache.org by Davide Giannella <da...@apache.org> on 2014/08/27 15:02:14 UTC

OrderedIndex does not comply with JCR's compareTo semantics

Hello team,

as already tracked in https://issues.apache.org/jira/browse/OAK-1763, the
ordered index does not comply with JCR's semantics.

We are starting to see issues in (pre)production around it, and
therefore I think we need a fix.

Unfortunately, with the changes to the encoding and algorithm, any
existing ordered index will have to be "translated" into the new encoding.

I therefore thought that for big indexes/repos it would be faster,
instead of issuing a reindex, to have a Groovy script (oak-console)
that applies the "translation" to the already indexed structure. It
would be run with the repository shut down.

The steps of the script would be something along the lines of: walk the
list, take the current key and properties (node), "translate" them into
the new encoding, save, move to the next node.
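
A minimal sketch of those steps, written in Java for illustration rather
than as the actual oak-console Groovy script. Everything here is a
stand-in: the Entry class, the in-memory map and the "v2:" placeholder
encoding are hypothetical and are not Oak's API or the real new
encoding. It only shows the walk / translate / save / next loop.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

class OrderedIndexTranslationSketch {

    /** Hypothetical stand-in for one index node: encoded key, indexed paths, next key. */
    static final class Entry {
        final String key;
        final List<String> paths;
        final String next; // null marks the end of the list

        Entry(String key, List<String> paths, String next) {
            this.key = key;
            this.paths = paths;
            this.next = next;
        }
    }

    /**
     * Walks the list from 'start', re-encodes each key (and its next pointer)
     * with 'newEncoding', and saves the result into a new map in walk order.
     */
    static Map<String, Entry> translate(Map<String, Entry> index, String start,
                                        UnaryOperator<String> newEncoding) {
        Map<String, Entry> translated = new LinkedHashMap<>();
        String current = start;
        while (current != null) {
            Entry entry = index.get(current);
            if (entry == null) {
                throw new IllegalStateException("Broken link, no entry for key: " + current);
            }
            String newKey = newEncoding.apply(entry.key);
            String newNext = entry.next == null ? null : newEncoding.apply(entry.next);
            translated.put(newKey, new Entry(newKey, entry.paths, newNext)); // "save"
            current = entry.next;                                            // "move to the next node"
        }
        return translated;
    }

    public static void main(String[] args) {
        Map<String, Entry> index = new HashMap<>();
        index.put("a", new Entry("a", Arrays.asList("/content/x"), "b"));
        index.put("b", new Entry("b", Arrays.asList("/content/y"), null));

        // Placeholder encoding only; the real one would come from the fix for OAK-1763.
        Map<String, Entry> result = translate(index, "a", key -> "v2:" + key);
        for (Entry e : result.values()) {
            System.out.println(e.key + " -> " + e.next + " " + e.paths);
        }
    }
}
```

The sketch keeps the original walk order. If the new encoding sorts keys
differently from the old one, the links would also have to be rebuilt,
which this does not attempt.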

Thoughts? Otherwise I'll add a note in the ticket itself so as not to
lose track.

Cheers
Davide



Re: OrderedIndex does not comply with JCR's compareTo semantics

Posted by Thomas Mueller <mu...@adobe.com>.
Hi,

>>Could we use a new index type for the new format? For example "ordered2"
>> instead of "ordered". I know not very creative naming :-)
>This is a fix to the current index implementation. I think we should
>stay with the current index.

Well, we need some way to know whether the data was migrated or not. It
could be a separate property, for example "formatVersion". It doesn't need
to be a new index type.
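
As a rough sketch of what I mean (plain Java over a map standing in for
the index definition; this is not Oak's API, and both the property name
"formatVersion" and the value 2 are only assumptions for illustration):
an index without the property, or with an older value, is treated as not
yet migrated.

```java
import java.util.HashMap;
import java.util.Map;

class FormatVersionSketch {

    // Hypothetical property name and version number, for illustration only.
    static final String FORMAT_VERSION = "formatVersion";
    static final long CURRENT_FORMAT = 2L;

    /** True if the definition has no version marker or an older one. */
    static boolean needsMigration(Map<String, Object> indexDefinition) {
        Object version = indexDefinition.get(FORMAT_VERSION);
        return !(version instanceof Number)
                || ((Number) version).longValue() < CURRENT_FORMAT;
    }

    /** Records that the index content now uses the current format. */
    static void markMigrated(Map<String, Object> indexDefinition) {
        indexDefinition.put(FORMAT_VERSION, CURRENT_FORMAT);
    }

    public static void main(String[] args) {
        Map<String, Object> definition = new HashMap<>();
        definition.put("type", "ordered");

        System.out.println("before: " + needsMigration(definition));
        markMigrated(definition);
        System.out.println("after: " + needsMigration(definition));
    }
}
```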

>> I would prefer if the translation is done automatically on restart or
>> re-index. I wouldn't want to install Groovy and / or have to do manual
>> steps if possible.
>>
>A normal reindex would suffice for the vast majority of cases. We have
>some edge ones, however, that take days to complete a reindex.

If it takes days to complete a re-index, then it sounds like something is
broken, and that needs to be investigated. Re-indexing at most scans the
whole repository, so it should not take days, except maybe for a
multi-billion-nodes repository. I guess that's not the case here.

>The groovy script is just a way to speed it up in these edge cases. You
>don't need to install anything as the oak-console can already execute
>groovy scripts.

OK, but no manual steps should be needed for a regular software
upgrade.

Regards,
Thomas


Re: OrderedIndex does not comply with JCR's compareTo semantics

Posted by Davide Giannella <da...@apache.org>.
On 27/08/2014 14:48, Thomas Mueller wrote:
> Hi,
>
> Could we use a new index type for the new format? For example "ordered2"
> instead of "ordered". I know not very creative naming :-)
This is a fix to the current index implementation. I think we should
stay with the current index.

> I would prefer if the translation is done automatically on restart or
> re-index. I wouldn't want to install Groovy and / or have to do manual
> steps if possible.
>
A normal reindex would suffice for the vast majority of cases. We have
some edge ones, however, that take days to complete a reindex.

The groovy script is just a way to speed it up in these edge cases. You
don't need to install anything as the oak-console can already execute
groovy scripts.

Cheers
Davide



Re: OrderedIndex does not comply with JCR's compareTo semantics

Posted by Thomas Mueller <mu...@adobe.com>.
Hi,

Could we use a new index type for the new format? For example "ordered2"
instead of "ordered". I know not very creative naming :-)


I would prefer if the translation is done automatically on restart or
re-index. I wouldn't want to install Groovy and / or have to do manual
steps if possible.

Regards,
Thomas


On 27/08/14 15:02, "Davide Giannella" <da...@apache.org> wrote:

>Hello team,
>
>as already tracked in https://issues.apache.org/jira/browse/OAK-1763, the
>ordered index does not comply with JCR's semantics.
>
>We are starting to see issues in (pre)production around it, and
>therefore I think we need a fix.
>
>Unfortunately, with the changes to the encoding and algorithm, any
>existing ordered index will have to be "translated" into the new encoding.
>
>I therefore thought that for big indexes/repos it would be faster,
>instead of issuing a reindex, to have a Groovy script (oak-console)
>that applies the "translation" to the already indexed structure. It
>would be run with the repository shut down.
>
>The steps of the script would be something along the lines of: walk the
>list, take the current key and properties (node), "translate" them into
>the new encoding, save, move to the next node.
>
>Thoughts? Otherwise I'll add a note in the ticket itself so as not to
>lose track.
>
>Cheers
>Davide
>
>