Posted to java-user@lucene.apache.org by Ganesh <em...@yahoo.co.in> on 2008/11/28 13:09:07 UTC
Maintain last indexed information in a file or DB
I am using Lucene v2.4. I am indexing files from various folders and I have
to maintain a bookmark of what I last indexed in each folder.
Initially I thought of saving the state in each respective folder. But
IndexWriter always holds documents in memory and commits at intervals. After
an unexpected application crash, the last saved bookmark sometimes does not
match the last indexed document in the database.
Another option is to keep the information in the same or a different
database.
I think many people have faced this situation.
Regards
Ganesh
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Maintain last indexed information in a file or DB
Posted by Michael McCandless <lu...@mikemccandless.com>.
Not sure if it's a fit here, but the 2.9 release of Lucene (not yet
released) lets you specify metadata when you call commit, i.e.
commit(String userData).
This way each commit point can record "something" application-specific
to describe it.
Mike
Re: Maintain last indexed information in a file or DB
Posted by Ganesh <em...@yahoo.co.in>.
My application is similar to Google or MSN desktop search, but the data would
be voluminous. Each folder holds a set of files, and new files can be added
to it. I have to pick up the new ones and index them. I could very well add
fields like folder name, filename, modified datetime etc. in the same DB or
a separate DB. I would then have to search the DB, get the list of files for
a folder, and compare it with the actual folder. This is one approach.
Another approach is to persist the state in a file for every folder.
> Oh, and I'd also try and stop the application from crashing!
I will also try to avoid crashes, but my product will be installed at
different customer sites, and a forced shutdown or a killed Java process is
possible. I need to be prepared for this situation too.
Regards
Ganesh
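The first approach above (look up the files already recorded for a folder and compare them with what is actually on disk) might be sketched roughly as follows. This is only an illustration in plain Java, not Lucene code: the class and method names are made up, and the `indexed` map is an assumption standing in for whatever a query against the index or DB would return for one folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Lucene.
class FolderDiff {

    // Snapshot of the regular files in one folder: file name -> last-modified millis.
    static Map<String, Long> listFolder(Path folder) {
        Map<String, Long> onDisk = new HashMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    onDisk.put(file.getFileName().toString(),
                               Files.getLastModifiedTime(file).toMillis());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return onDisk;
    }

    // Names of files that are not recorded yet, or that are newer on disk
    // than the recorded modified time. Sorted for a stable order.
    static List<String> findUnindexed(Map<String, Long> onDisk, Map<String, Long> indexed) {
        List<String> pending = new ArrayList<>();
        for (Map.Entry<String, Long> e : onDisk.entrySet()) {
            Long recorded = indexed.get(e.getKey());
            if (recorded == null || recorded < e.getValue()) {
                pending.add(e.getKey());
            }
        }
        Collections.sort(pending);
        return pending;
    }

    public static void main(String[] args) {
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        // Assumption: in a real application this map is loaded from the index or DB.
        Map<String, Long> indexed = new HashMap<>();
        System.out.println(findUnindexed(listFolder(folder), indexed));
    }
}
```

With this shape no separate bookmark is needed: after a crash, anything that never reached the committed index simply shows up again as pending on the next scan. The cost is one lookup per folder on every pass.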
Re: Maintain last indexed information in a file or DB
Posted by Ian Lea <ia...@gmail.com>.
I'm a bit confused about what exactly is stored in the folder, the index
and the database, but how about storing the bookmark information in the
same Lucene index that you are using for the file data? One Lucene
document per folder, with fields something like
  folder: /some/dir/somewhere
  bookmark: some_bookmark_value
That way the bookmark info should always be in line with the indexed data.
Oh, and I'd also try and stop the application from crashing!
--
Ian.
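A toy model of why this keeps the two in line, written in plain Java rather than against Lucene: the `ToyIndex` class below is hypothetical and only imitates a writer that buffers changes in memory until commit. Because the bookmark is buffered alongside the file documents and published in the same commit, a crash loses both or neither.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an index writer; this is NOT the Lucene API.
class ToyIndex {
    private Map<String, String> committed = new HashMap<>(); // survives a crash
    private Map<String, String> pending = new HashMap<>();   // held in memory only

    // Buffer a file document and the folder's bookmark together.
    void indexFile(String folder, String fileName, String contents) {
        pending.put("file:" + folder + "/" + fileName, contents);
        pending.put("bookmark:" + folder, fileName);
    }

    // Publish everything buffered so far in a single step.
    void commit() {
        Map<String, String> next = new HashMap<>(committed);
        next.putAll(pending);
        committed = next;
        pending = new HashMap<>();
    }

    // Simulate a kill or forced shutdown: uncommitted changes are lost.
    void crash() {
        pending = new HashMap<>();
    }

    String bookmark(String folder) {
        return committed.get("bookmark:" + folder);
    }

    boolean hasFile(String folder, String fileName) {
        return committed.containsKey("file:" + folder + "/" + fileName);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.indexFile("/some/dir/somewhere", "a.txt", "first");
        index.commit();
        index.indexFile("/some/dir/somewhere", "b.txt", "second");
        index.crash(); // killed before the next commit

        System.out.println(index.bookmark("/some/dir/somewhere"));         // prints a.txt
        System.out.println(index.hasFile("/some/dir/somewhere", "b.txt")); // prints false
    }
}
```

In real Lucene the rough equivalent would presumably be one bookmark document per folder, replaced with IndexWriter.updateDocument(Term, Document) before each commit. Treat that mapping as an assumption to check against the javadocs for your version.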