Posted to dev@lucene.apache.org by Grant Ingersoll <gs...@apache.org> on 2008/08/04 22:12:44 UTC
CheckIndex tool
Hey Mike,
I'm looking at https://issues.apache.org/jira/browse/SOLR-566 and
thinking about adding some more programmatic access to the
CheckIndex tool, and wanted to see if you had any thoughts. Basically,
I am going to capture info into a simple data structure that can
then be introspected and serialized into a RequestHandler, but that
might also be more generally useful in certain cases where
things go bad. I was debating keeping the inline out.println calls, but
I'm not sure whether they should just be moved to main() so that the cmd
line stuff still works as is, but doesn't clog the logs for those
who want programmatic access.
I'll post a patch soon, but wanted to see if you had any preliminary
insight.
-Grant
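A minimal sketch of what such a "simple data structure" could look like, assuming one record per segment plus an index-wide summary. IndexCheckResult, SegmentResult and their fields are hypothetical names chosen for illustration; they are not taken from the patch or from Lucene's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result holder: every name here is illustrative, not Lucene's API.
class IndexCheckResult {

    /** Findings for one segment; problem stays null when nothing was found. */
    static class SegmentResult {
        final String name;
        final int docCount;
        final String problem;

        SegmentResult(String name, int docCount, String problem) {
            this.name = name;
            this.docCount = docCount;
            this.problem = problem;
        }

        boolean isOk() {
            return problem == null;
        }
    }

    private final List<SegmentResult> segments = new ArrayList<>();

    void add(SegmentResult segment) {
        segments.add(segment);
    }

    List<SegmentResult> segments() {
        return segments;
    }

    int badSegmentCount() {
        int bad = 0;
        for (SegmentResult s : segments) {
            if (!s.isOk()) {
                bad++;
            }
        }
        return bad;
    }

    boolean isClean() {
        return badSegmentCount() == 0;
    }

    public static void main(String[] args) {
        IndexCheckResult result = new IndexCheckResult();
        result.add(new SegmentResult("_0", 1200, null));
        result.add(new SegmentResult("_1", 300, "doc count mismatch"));
        // The caller (a command-line main, or a request handler) decides what to print.
        for (SegmentResult s : result.segments()) {
            System.out.println(s.name + ": " + (s.isOk() ? "OK" : s.problem));
        }
        System.out.println("clean=" + result.isClean());
    }
}
```

With a holder like this, the checking code itself does no printing; only main() does, so programmatic callers can inspect or serialize the result without anything landing in their logs.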
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: CheckIndex tool
Posted by Michael McCandless <lu...@mikemccandless.com>.
Actually, those exceptions are thrown by the code detecting the
mismatch, and then caught by CheckIndex and handled as meaning that
segment is corrupt. This is consistent, e.g., with how Lucene throws
CorruptIndexException deep down if it hits an inconsistency.
I think it's fine if you want to not use exceptions for the "local"
mismatches, and instead record the error in a data structure and then
stop processing that one segment. But for the "deep down" exceptions
you still have to keep the catch in CheckIndex to record those.
Mike
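The split described above could be sketched roughly as follows: a "local" mismatch is recorded and ends the check of that one segment without a throw, while the catch stays in place for failures raised deeper in the reading code. Everything here (SegmentCheckSketch, SegmentStatus, Segment, FakeSegment) is a hypothetical stand-in, and a plain IOException plays the role of CorruptIndexException so that the example runs without Lucene on the classpath.

```java
import java.io.IOException;

// Illustrative only: these types are assumptions, not CheckIndex's real structure.
class SegmentCheckSketch {

    /** Hypothetical per-segment record; error stays null for a healthy segment. */
    static class SegmentStatus {
        final String name;
        String error;

        SegmentStatus(String name) {
            this.name = name;
        }
    }

    /** Hypothetical view of a segment, just enough to drive the example. */
    interface Segment {
        String name();
        int declaredDocCount();
        int countDocs() throws IOException; // may fail "deep down" in the reader
    }

    /** Test double standing in for a real segment reader. */
    static class FakeSegment implements Segment {
        private final String name;
        private final int declared;
        private final int actual;
        private final boolean unreadable;

        FakeSegment(String name, int declared, int actual, boolean unreadable) {
            this.name = name;
            this.declared = declared;
            this.actual = actual;
            this.unreadable = unreadable;
        }

        public String name() { return name; }
        public int declaredDocCount() { return declared; }

        public int countDocs() throws IOException {
            if (unreadable) {
                // Stand-in for an exception thrown by lower-level code.
                throw new IOException("unreadable segment file");
            }
            return actual;
        }
    }

    static SegmentStatus checkSegment(Segment seg) {
        SegmentStatus status = new SegmentStatus(seg.name());
        try {
            int counted = seg.countDocs();
            if (counted != seg.declaredDocCount()) {
                // Local mismatch: record it and stop processing this segment, no throw.
                status.error = "doc count mismatch: declared "
                        + seg.declaredDocCount() + ", counted " + counted;
                return status;
            }
            // Further per-segment checks would follow here.
        } catch (IOException e) {
            // Deep-down failure: the catch is still needed so the error gets recorded.
            status.error = "exception while reading: " + e.getMessage();
        }
        return status;
    }

    public static void main(String[] args) {
        Segment[] segments = {
            new FakeSegment("_0", 5, 5, false),
            new FakeSegment("_1", 5, 4, false),
            new FakeSegment("_2", 5, 5, true),
        };
        for (Segment seg : segments) {
            SegmentStatus s = checkSegment(seg);
            System.out.println(s.name + ": " + (s.error == null ? "OK" : s.error));
        }
    }
}
```

Both failure paths end up in the same status record, so the caller sees one uniform report per segment regardless of where the problem was detected.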
On Aug 5, 2008, at 9:30 AM, Grant Ingersoll wrote:
> I'll look into these. The other part I am not sure about is the
> throwing of exceptions for mismatches. I know they mean CheckIndex
> can't go forward, but they aren't really errors in CheckIndex so
> much as errors in the index, which CheckIndex is just reporting.
> So, I'm inclined to capture that and present it (and return
> immediately) instead of throwing an exception. Is that reasonable?
>
> -Grant
Re: CheckIndex tool
Posted by Grant Ingersoll <gs...@apache.org>.
I'll look into these. The other part I am not sure about is the
throwing of exceptions for mismatches. I know they mean CheckIndex
can't go forward, but they aren't really errors in CheckIndex so much
as errors in the index, which CheckIndex is just reporting. So, I'm
inclined to capture that and present it (and return immediately)
instead of throwing an exception. Is that reasonable?
-Grant
On Aug 4, 2008, at 5:01 PM, Michael McCandless wrote:
>
> This sounds good! I like the idea of checking the index when Solr
> has to force release the write.lock.
>
> The one caveat is, when checking a large index (which can take quite
> some time), it'd be nice to have the equivalent of the inline'd
> out.print/ln calls happen in realtime so that you can see (on the
> command line output) that progress is being made, which segment is
> being checked, etc.?
>
> Maybe change it to an optional "infoStream" (like IndexWriter), and
> then the current inlined prints become calls to message() which
> checks if infoStream is non-null?
>
> Mike
Re: CheckIndex tool
Posted by Michael McCandless <lu...@mikemccandless.com>.
This sounds good! I like the idea of checking the index when Solr has
to force release the write.lock.
The one caveat is, when checking a large index (which can take quite
some time), it'd be nice to have the equivalent of the inline'd
out.print/ln calls happen in realtime so that you can see (on the
command line output) that progress is being made, which segment is
being checked, etc.?
Maybe change it to an optional "infoStream" (like IndexWriter), and
then the current inlined prints become calls to message() which checks
if infoStream is non-null?
Mike
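A rough sketch of the optional infoStream idea, assuming a PrintStream field that is null by default and a private message() helper that every former inline print goes through. The class name InfoStreamSketch and the check() method are hypothetical; only the null-check pattern itself comes from the suggestion above.

```java
import java.io.PrintStream;

// Illustrative only: shows the null-checked infoStream pattern, not CheckIndex itself.
class InfoStreamSketch {

    // null means "stay quiet"; programmatic callers simply never set it.
    private PrintStream infoStream;

    void setInfoStream(PrintStream out) {
        this.infoStream = out;
    }

    /** Replaces the inline out.println calls; prints only if a stream was provided. */
    private void message(String msg) {
        if (infoStream != null) {
            infoStream.println(msg);
        }
    }

    /** Pretends to check each named segment, reporting progress as it goes. */
    int check(String[] segmentNames) {
        int checked = 0;
        for (String name : segmentNames) {
            message("checking segment " + name + " ...");
            checked++;
        }
        message("done: " + checked + " segments checked");
        return checked;
    }

    public static void main(String[] args) {
        InfoStreamSketch checker = new InfoStreamSketch();
        // Command-line use: progress goes to stdout as each segment is visited.
        checker.setInfoStream(System.out);
        checker.check(new String[] {"_0", "_1"});
    }
}
```

The command-line entry point passes System.out and gets progress as each segment is visited, while a caller that wants only the returned result leaves the stream unset and nothing is written.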
Grant Ingersoll wrote:
> Hey Mike,
>
> I'm looking at https://issues.apache.org/jira/browse/SOLR-566
> and thinking about adding some more programmatic access to the
> CheckIndex tool, and wanted to see if you had any thoughts.
> Basically, I am going to capture info into a simple data
> structure that can then be introspected and serialized into a
> RequestHandler, but that might also be more generally
> useful in certain cases where things go bad. I was debating keeping
> the inline out.println calls, but I'm not sure whether they should just be
> moved to main() so that the cmd line stuff still works as is,
> but doesn't clog the logs for those who want programmatic access.
>
> I'll post a patch soon, but wanted to see if you had any preliminary
> insight.
>
> -Grant