Posted to oak-dev@jackrabbit.apache.org by Sham Hassan <sc...@gmail.com> on 2016/02/04 04:25:00 UTC

How to calculate the reindex time?

Hi Team,


Is there a formula we can use to estimate how long a reindex will take,
based on the index definition stats from JMX? In earlier AEM 5.x we used to
identify the number of nodes + properties from the index tar file. Then,
based on how long the system takes to process 1000 nodes, we arrived at a
rough estimate. On calls with customers we have to give the business an
appropriate downtime window, so any formula you could share would be a
great help. I am not looking for an accurate figure, just a rough
estimate.


Thanks,
Sham
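
The approach described above (measure how long a sample of nodes takes, then scale to the total node count) amounts to a linear extrapolation. A minimal sketch of that arithmetic follows; the function name and the example figures are made up for illustration, and it assumes throughput stays roughly constant across the repository, which may not hold for every index type.

```python
def estimate_reindex_seconds(total_nodes, sample_nodes, sample_seconds):
    """Linearly scale a measured sample to the full node count.

    Assumes indexing throughput is roughly constant, which is only a
    rough approximation.
    """
    if sample_nodes <= 0:
        raise ValueError("sample_nodes must be positive")
    if total_nodes < 0 or sample_seconds < 0:
        raise ValueError("total_nodes and sample_seconds must be non-negative")
    return total_nodes * sample_seconds / sample_nodes


# Made-up figures: 1000 nodes took 2 seconds; the repository has 5 million nodes.
estimate = estimate_reindex_seconds(5_000_000, 1_000, 2.0)
print(f"estimated reindex time: {estimate / 3600:.1f} hours")
```

The result is only as good as the sample: a sample taken from content that is unusually small or large will skew the estimate in the same direction.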

Re: How to calculate the reindex time?

Posted by Sham Hassan <sc...@gmail.com>.
Thanks all for the feedback & for taking it up.

On Thu, Feb 4, 2016 at 10:50 AM, Alex Parvulescu <al...@gmail.com>
wrote:

> Thanks for the feedback Davide, I've created OAK-3987 and assigned to you,
> as requested. [0]
>
> best,
> alex
>
> [0] https://issues.apache.org/jira/browse/OAK-3987
>
>
> On Thu, Feb 4, 2016 at 4:29 PM, Davide Giannella <da...@apache.org>
> wrote:
>
> > On 04/02/2016 09:31, Alex Parvulescu wrote:
> > > ...
> > > One random thought: would it be useful to add a sort of 'dry run' call to
> > > the indexer, you'd just give it a path (root by default) and a max number
> > > of nodes (let's say 100k) and it would output the usual stats (indexed x
> > > nodes in .... ms) and throw away the result, this would allow for a sort of
> > > guesstimation by type of index, on specific setups.
> > >
> > I think this is an awesome feature we should provide from Oak. I guess
> > by using some JMX input, receiving all the required inputs:
> >
> > - path to index definition (mandatory)
> > - path to the content (mandatory)
> > - max number of nodes (optional)
> >
> > Would you like to file a ticket for it, for oak 1.6 and assign it to me?
> > I think that rather than having a guesstimate it could actually be quite
> > accurate number if targeted properly.
> >
> > Cheeers
> > Davide
> >
> >
> >
>

Re: How to calculate the reindex time?

Posted by Alex Parvulescu <al...@gmail.com>.
Thanks for the feedback Davide, I've created OAK-3987 and assigned to you,
as requested. [0]

best,
alex

[0] https://issues.apache.org/jira/browse/OAK-3987



Re: How to calculate the reindex time?

Posted by Davide Giannella <da...@apache.org>.
On 04/02/2016 09:31, Alex Parvulescu wrote:
> ...
> One random thought: would it be useful to add a sort of 'dry run' call to
> the indexer, you'd just give it a path (root by default) and a max number
> of nodes (let's say 100k) and it would output the usual stats (indexed x
> nodes in .... ms) and throw away the result, this would allow for a sort of
> guesstimation by type of index, on specific setups.
>
I think this is an awesome feature we should provide in Oak. I guess it
could be exposed via JMX, taking the following inputs:

- path to index definition (mandatory)
- path to the content (mandatory)
- max number of nodes (optional)

Would you like to file a ticket for it, for Oak 1.6, and assign it to me?
I think that rather than being a guesstimate, it could actually be quite
an accurate number if targeted properly.
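
The three inputs listed above can be captured in a small request object. This is only a sketch of the shape such a call might take: `DryRunRequest`, its field names, and the example paths are hypothetical and are not an existing Oak or JMX API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DryRunRequest:
    """Hypothetical input for a reindex dry run (not an Oak API)."""

    index_definition_path: str       # mandatory
    content_path: str                # mandatory
    max_nodes: Optional[int] = None  # optional; None means no cap

    def __post_init__(self):
        # Both paths are mandatory and are assumed here to be absolute.
        for name in ("index_definition_path", "content_path"):
            value = getattr(self, name)
            if not value or not value.startswith("/"):
                raise ValueError(f"{name} must be an absolute path")
        if self.max_nodes is not None and self.max_nodes <= 0:
            raise ValueError("max_nodes must be positive when given")


# Example paths are illustrative only.
request = DryRunRequest("/oak:index/lucene", "/content", max_nodes=100_000)
print(request)
```

Validating up front keeps a mistyped path from producing a misleadingly fast "zero nodes indexed" result.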

Cheers
Davide



Re: How to calculate the reindex time?

Posted by Alex Parvulescu <al...@gmail.com>.
Hi,

I don't think we have a formula now, although it is an interesting exercise
(it would probably vary widely based on index type, but maybe you have a
specific index in mind, like reindexing a Lucene property index).

What I wanted to say is that all AEM 6.x customers go through the same
thing: upgrading from 5.x to 6.x. This implies a full reindex, so I would
assume you already have a reference for how long a full reindex might take,
even though it's probably a rough estimate (and you'd also have to account
for a tiny delta of content added since the upgrade).

One random thought: would it be useful to add a sort of 'dry run' call to
the indexer? You'd just give it a path (root by default) and a max number
of nodes (let's say 100k), and it would output the usual stats (indexed x
nodes in .... ms) and throw away the result. This would allow for a sort of
guesstimation by type of index, on specific setups.
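
To make the dry-run idea concrete, here is a minimal sketch of the control flow: visit at most a fixed number of nodes, run the indexing step on each, discard the output, and report the count and elapsed time. Everything in it is a stand-in: `dry_run`, the dictionary "nodes", and the lambda "indexer" are hypothetical and do not use any Oak API.

```python
import time
from itertools import islice


def dry_run(nodes, index_node, max_nodes=100_000):
    """Index up to max_nodes nodes, discard the results, return (count, elapsed_ms)."""
    start = time.perf_counter()
    count = 0
    for node in islice(nodes, max_nodes):
        index_node(node)  # result is intentionally thrown away
        count += 1
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return count, elapsed_ms


# Stand-ins for real content and a real indexer.
sample_nodes = (
    {"path": f"/content/node{i}", "title": f"Title {i}"} for i in range(250_000)
)
count, elapsed_ms = dry_run(
    sample_nodes, lambda node: node["title"].lower().split(), max_nodes=100_000
)
print(f"indexed {count} nodes in {elapsed_ms:.0f} ms")
```

Dividing the elapsed time by the count gives a per-node cost for that index type on that setup, which can then be scaled to the full node count.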

hope this helps,
alex


