Posted to dev@lenya.apache.org by Andreas Hartmann <an...@apache.org> on 2007/05/24 17:31:10 UTC
[1.4] "Indexer is busy" problem
Hi Lenya devs,
what should we do about this issue?
http://issues.apache.org/bugzilla/show_bug.cgi?id=42510
I wouldn't like to implement a queue for incremental indexing
events before 1.4 is out, because I think it's quite a lot of
work (especially in the testing department), and I can't predict
the consequences without giving it some thought. Maybe someone
has a solution at the ready (crossing my fingers ...).
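For readers following the thread, a queue for incremental indexing events could look roughly like the sketch below. It is a minimal illustration and not Lenya code: `IndexingQueue`, its nested `Indexer` interface and the method names are hypothetical stand-ins. The idea is that publishing code only enqueues a document id, and a single worker thread feeds the indexer one document at a time, so the indexer is never asked to do two things at once.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: serializes indexing requests through one worker thread. */
class IndexingQueue {

    /** Stand-in for the real indexer, assumed to handle one document at a time. */
    interface Indexer {
        void index(String documentId) throws Exception;
    }

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> failed = new CopyOnWriteArrayList<>();
    private final Thread worker;
    private volatile boolean running = true;

    IndexingQueue(Indexer indexer) {
        worker = new Thread(() -> {
            // Keep going until shutdown was requested AND the queue is drained.
            while (running || !pending.isEmpty()) {
                try {
                    String id = pending.poll(100, TimeUnit.MILLISECONDS);
                    if (id == null) {
                        continue; // nothing to do yet, re-check the loop condition
                    }
                    try {
                        indexer.index(id);
                    } catch (Exception e) {
                        failed.add(id); // remember the document so it can be re-indexed later
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "indexing-queue");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called by the publishing code; returns immediately. */
    void enqueue(String documentId) {
        pending.add(documentId);
    }

    /** Waits until everything enqueued before this call has been processed. */
    void shutdown() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Ids of documents whose indexing threw an exception. */
    List<String> failedDocuments() {
        return failed;
    }
}
```

The sketch leaves out exactly the parts that make this "quite a lot of work": persistence of the queue across restarts, retries, and what to do with documents that were enqueued after `shutdown()` was called.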
Should we silently ignore the error and simply not trigger
the indexing, or should we continue to throw the exception?
I hope someone comes up with a better idea :)
TIA!
-- Andreas
--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org
Re: [1.4] "Indexer is busy" problem
Posted by Andreas Hartmann <an...@apache.org>.
Joern Nettingsmeier wrote:
> Andreas Hartmann wrote:
>> Bob Harner wrote:
>>> On 5/24/07, Andreas Hartmann <an...@apache.org> wrote:
>>>> Hi Lenya devs,
>>>>
>>>> what should we do about this issue?
>>>>
>>>> http://issues.apache.org/bugzilla/show_bug.cgi?id=42510
>>>>
>>>> I wouldn't like to implement a queue for incremental indexing
>>>> events before 1.4 is out, because I think it's quite a lot of
>>>> work (especially in the testing department), and I can't predict
>>>> the consequences without giving it some thought. Maybe someone
>>>> has a solution at the ready (crossing my fingers ...).
>>>>
>>>> Should we silently ignore the error and simply not trigger
>>>> the indexing, or should we continue to throw the exception?
>>>>
>>>> I hope someone comes up with a better idea :)
>>
>> [...]
>>
>>> A better interim solution might be to display a helpful warning
>>> message rather than an exception:
>>>
>>> "Warning: this document can't be added to the search index yet
>>> because the indexer is currently busy with another document. Please
>>> re-publish this document in a moment to ensure that it is indexed."
>
> +1
>
>> The problem with this approach is that we can't determine if the
>> indexer will be busy before we apply the change to the document.
>> Since the indexer is a shared resource, we'd have to lock it to
>> prevent concurrent tasks from starting an indexing process while
>> the publishing (or any other action which changes the document
>> content) is in progress.
>
> why? bob's suggestion means it can fail, but the user will be given a
> workaround. sounds ok as an interim solution.
The advantage is that the user knows that un-indexed documents
exist (or are published), but she still has to deactivate and re-publish
them. Anyway, I don't think we will be able to achieve anything better
for the moment.
BTW, I don't see how to implement this since repo observation (which is
used for indexing) is asynchronous ...
>> I'd be interested in how other systems handle this. Maybe the indexing
>> has to be part of the transaction, so the transaction can be rolled
>> back if the indexing fails. But maybe we shouldn't invest too much
>> research in this issue but rather choose a powerful back-end which
>> supports indexing for the next major version.
>
> i'd say let's ignore concurrency issues for 1.4 and document the
> shortcomings. we need to get this one out. without being negative, i
> think that most users that are eagerly waiting for a release have small
> to medium-size deployments and will only rarely encounter concurrency
> issues - we just don't have the track record atm to be considered for
> very large scale projects. let's not starve our core users too much by
> delaying 1.4 any further.
> instead, we should put up a roadmap where concurrency is an important
> topic for 1.5. incremental improvements - otherwise we'll die of
> second-system syndrome.
I agree. If anyone has an idea how to implement Bob's suggestion,
feel free to go ahead or make a proposal. I'll try to think about
it too when I find the time.
-- Andreas
--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Re: [1.4] "Indexer is busy" problem
Posted by Joern Nettingsmeier <ne...@folkwang-hochschule.de>.
Andreas Hartmann wrote:
> Bob Harner wrote:
>> On 5/24/07, Andreas Hartmann <an...@apache.org> wrote:
>>> Hi Lenya devs,
>>>
>>> what should we do about this issue?
>>>
>>> http://issues.apache.org/bugzilla/show_bug.cgi?id=42510
>>>
>>> I wouldn't like to implement a queue for incremental indexing
>>> events before 1.4 is out, because I think it's quite a lot of
>>> work (especially in the testing department), and I can't predict
>>> the consequences without giving it some thought. Maybe someone
>>> has a solution at the ready (crossing my fingers ...).
>>>
>>> Should we silently ignore the error and simply not trigger
>>> the indexing, or should we continue to throw the exception?
>>>
>>> I hope someone comes up with a better idea :)
>
> [...]
>
>> A better interim solution might be to display a helpful warning
>> message rather than an exception:
>>
>> "Warning: this document can't be added to the search index yet
>> because the indexer is currently busy with another document. Please
>> re-publish this document in a moment to ensure that it is indexed."
+1
> The problem with this approach is that we can't determine if the
> indexer will be busy before we apply the change to the document.
> Since the indexer is a shared resource, we'd have to lock it to
> prevent concurrent tasks from starting an indexing process while
> the publishing (or any other action which changes the document
> content) is in progress.
why? bob's suggestion means it can fail, but the user will be given a
workaround. sounds ok as an interim solution.
> I'd be interested in how other systems handle this. Maybe the indexing
> has to be part of the transaction, so the transaction can be rolled
> back if the indexing fails. But maybe we shouldn't invest too much
> research in this issue but rather choose a powerful back-end which
> supports indexing for the next major version.
i'd say let's ignore concurrency issues for 1.4 and document the
shortcomings. we need to get this one out. without being negative, i
think that most users that are eagerly waiting for a release have small
to medium-size deployments and will only rarely encounter concurrency
issues - we just don't have the track record atm to be considered for
very large scale projects. let's not starve our core users too much by
delaying 1.4 any further.
instead, we should put up a roadmap where concurrency is an important
topic for 1.5. incremental improvements - otherwise we'll die of
second-system syndrome.
just my thoughts,
jörn
--
jörn nettingsmeier
home://germany/45128 essen/lortzingstr. 11/
http://spunk.dnsalias.org
phone://+49/201/491621
Kurt is up in Heaven now.
Re: [1.4] "Indexer is busy" problem
Posted by Andreas Hartmann <an...@apache.org>.
Bob Harner wrote:
> On 5/24/07, Andreas Hartmann <an...@apache.org> wrote:
>> Hi Lenya devs,
>>
>> what should we do about this issue?
>>
>> http://issues.apache.org/bugzilla/show_bug.cgi?id=42510
>>
>> I wouldn't like to implement a queue for incremental indexing
>> events before 1.4 is out, because I think it's quite a lot of
>> work (especially in the testing department), and I can't predict
>> the consequences without giving it some thought. Maybe someone
>> has a solution at the ready (crossing my fingers ...).
>>
>> Should we silently ignore the error and simply not trigger
>> the indexing, or should we continue to throw the exception?
>>
>> I hope someone comes up with a better idea :)
[...]
> A better interim solution might be to display a helpful warning
> message rather than an exception:
>
> "Warning: this document can't be added to the search index yet
> because the indexer is currently busy with another document. Please
> re-publish this document in a moment to ensure that it is indexed."
The problem with this approach is that we can't determine if the
indexer will be busy before we apply the change to the document.
Since the indexer is a shared resource, we'd have to lock it to
prevent concurrent tasks from starting an indexing process while
the publishing (or any other action which changes the document
content) is in progress.
I'd be interested in how other systems handle this. Maybe the indexing
has to be part of the transaction, so the transaction can be rolled
back if the indexing fails. But maybe we shouldn't invest too much
research in this issue but rather choose a powerful back-end which
supports indexing for the next major version.
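To make the two ideas above concrete (locking the shared indexer before the document changes, and rolling the change back if indexing fails), here is a minimal sketch. None of these names come from Lenya: `PublishWithIndexLock`, `Change`, `Indexer` and `Result` are hypothetical, and the sketch assumes indexing can be called synchronously from the publishing code.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: reserve the indexer first, then change and index the document. */
class PublishWithIndexLock {

    /** Stand-in for a document change that can be undone. */
    interface Change {
        void apply();
        void rollback();
    }

    /** Stand-in for the shared indexer. */
    interface Indexer {
        void index(String documentId) throws Exception;
    }

    enum Result { PUBLISHED, INDEXER_BUSY, ROLLED_BACK }

    private final ReentrantLock indexerLock = new ReentrantLock();
    private final Indexer indexer;

    PublishWithIndexLock(Indexer indexer) {
        this.indexer = indexer;
    }

    Result publish(String documentId, Change change) {
        // Reserve the indexer BEFORE touching the document, so a busy
        // indexer is detected while nothing has been changed yet.
        if (!indexerLock.tryLock()) {
            return Result.INDEXER_BUSY;
        }
        try {
            change.apply();
            try {
                indexer.index(documentId);
                return Result.PUBLISHED;
            } catch (Exception e) {
                // Indexing failed: undo the change so content and index stay consistent.
                change.rollback();
                return Result.ROLLED_BACK;
            }
        } finally {
            indexerLock.unlock();
        }
    }
}
```

The cost of this design is that the lock is held for the whole publish operation, so concurrent publishers are turned away (or would have to wait) even when the indexer itself would have been free again in time.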
-- Andreas
--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Re: [1.4] "Indexer is busy" problem
Posted by Bob Harner <bo...@gmail.com>.
On 5/24/07, Andreas Hartmann <an...@apache.org> wrote:
> Hi Lenya devs,
>
> what should we do about this issue?
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=42510
>
> I wouldn't like to implement a queue for incremental indexing
> events before 1.4 is out, because I think it's quite a lot of
> work (especially in the testing department), and I can't predict
> the consequences without giving it some thought. Maybe someone
> has a solution at the ready (crossing my fingers ...).
>
> Should we silently ignore the error and simply not trigger
> the indexing, or should we continue to throw the exception?
>
> I hope someone comes up with a better idea :)
>
> TIA!
>
> -- Andreas
>
>
> --
> Andreas Hartmann, CTO
> BeCompany GmbH
> http://www.becompany.ch
A better interim solution might be to display a helpful warning
message rather than an exception:
"Warning: this document can't be added to the search index yet
because the indexer is currently busy with another document. Please
re-publish this document in a moment to ensure that it is indexed."
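As an illustration of this suggestion, the sketch below turns a "busy" failure into a warning that the publishing code can show to the user instead of letting the exception propagate. It is only a sketch under assumptions: `PublishWarnings`, `IndexerBusyException` and the `Indexer` interface are hypothetical names, not Lenya's API, and the code assumes that indexing is invoked synchronously so the failure is visible to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: report a busy indexer as a warning, not as an error. */
class PublishWarnings {

    /** Stand-in for whatever exception the real indexer throws when it is busy. */
    static class IndexerBusyException extends Exception {
        IndexerBusyException(String message) {
            super(message);
        }
    }

    /** Stand-in for the shared indexer. */
    interface Indexer {
        void index(String documentId) throws IndexerBusyException;
    }

    static final String BUSY_WARNING =
            "Warning: this document can't be added to the search index yet "
            + "because the indexer is currently busy with another document. Please "
            + "re-publish this document in a moment to ensure that it is indexed.";

    /**
     * Tries to index an already published document.
     * Returns the warnings to display; an empty list means indexing succeeded.
     */
    static List<String> indexAfterPublish(Indexer indexer, String documentId) {
        List<String> warnings = new ArrayList<>();
        try {
            indexer.index(documentId);
        } catch (IndexerBusyException e) {
            // The document itself is published; only the index entry is missing.
            warnings.add(BUSY_WARNING);
        }
        return warnings;
    }
}
```

A caller would publish the document first, then pass the returned warnings to whatever message mechanism the user interface provides.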