You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Ganesh <em...@yahoo.co.in> on 2008/11/28 13:09:07 UTC

Maintain last indexed information in a file or DB

I am using Lucene v2.4. I am indexing files from various folder and i have 
to maintain a bookmark of what i have last indexed in each folder.

Initially i thought to save the state in each  respective folder. Index 
Wrtier always has documents in memory and it commits in a intervals. In an 
unexpected application crash, sometimes the last saved bookmark and the last 
indexed document in the database is not matching.

One another option is to keep the information in a same or different 
database.

I think many might have faced this situitation.

Regards
Ganesh 

Send instant messages to your online friends http://in.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Maintain last indexed information in a file or DB

Posted by Michael McCandless <lu...@mikemccandless.com>.
Not sure if it's a fit here, but the 2.9 (not yet released) release of  
Lucene allows you to specify metadata when you call commit, ie  
commit(String userData).

This way each commit point can record "something" application specific  
to describe it.

Mike

Ganesh wrote:

> My application is similar to google or msn desktop but the data  
> would be voluminous. Some set of files are there in each folder and  
> new files could be added to this folder. I have to pick the new one  
> and index it. I could very well add some fields like folder name,  
> filename, modified datetime etc. in the same DB or separate db. I  
> have to search the DB and get the list of files for a folder and  
> compare the same with the actual folder. This is one approach.  
> Another approach is to persist the state in a file for every folder.
>
>> Oh, and I'd also try and stop the application from crashing!
> I will also try to avoid the crash, but my product will be installed  
> in different customer places and there may be a possibility of  
> forceful shutdown or killing java application etc. I need to be  
> prepare for this situitation also.
>
> Regards
> Ganesh
>
> ----- Original Message ----- From: "Ian Lea" <ia...@gmail.com>
> To: <ja...@lucene.apache.org>
> Sent: Friday, November 28, 2008 5:51 PM
> Subject: Re: Maintain last indexed information in a file or DB
>
>
>> I'm a bit confused about what exactly is stored in folder and index
>> and database, but how about you store the bookmark information in the
>> same lucene index that you are using for the file data.  One lucene
>> document per folder, with fields something like
>>
>> folder: /some/dir/somewhere
>> bookmark: some_bookmark_value
>>
>> That way the bookmark info should always be in line with the  
>> indexed data.
>>
>>
>> Oh, and I'd also try and stop the application from crashing!
>>
>>
>> --
>> Ian.
>>
>>
>>
>> On Fri, Nov 28, 2008 at 12:09 PM, Ganesh <em...@yahoo.co.in>  
>> wrote:
>>> I am using Lucene v2.4. I am indexing files from various folder  
>>> and i have
>>> to maintain a bookmark of what i have last indexed in each folder.
>>>
>>> Initially i thought to save the state in each  respective folder.  
>>> Index
>>> Wrtier always has documents in memory and it commits in a  
>>> intervals. In an
>>> unexpected application crash, sometimes the last saved bookmark  
>>> and the last
>>> indexed document in the database is not matching.
>>>
>>> One another option is to keep the information in a same or different
>>> database.
>>>
>>> I think many might have faced this situitation.
>>>
>>> Regards
>>> Ganesh
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
> Send instant messages to your online friends http://in.messenger.yahoo.com
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Maintain last indexed information in a file or DB

Posted by Ganesh <em...@yahoo.co.in>.
My application is similar to google or msn desktop but the data would be 
voluminous. Some set of files are there in each folder and new files could 
be added to this folder. I have to pick the new one and index it. I could 
very well add some fields like folder name, filename, modified datetime etc. 
in the same DB or separate db. I have to search the DB and get the list of 
files for a folder and compare the same with the actual folder. This is one 
approach. Another approach is to persist the state in a file for every 
folder.

> Oh, and I'd also try and stop the application from crashing!
I will also try to avoid the crash, but my product will be installed in 
different customer places and there may be a possibility of forceful 
shutdown or killing java application etc. I need to be prepare for this 
situitation also.

Regards
Ganesh

----- Original Message ----- 
From: "Ian Lea" <ia...@gmail.com>
To: <ja...@lucene.apache.org>
Sent: Friday, November 28, 2008 5:51 PM
Subject: Re: Maintain last indexed information in a file or DB


> I'm a bit confused about what exactly is stored in folder and index
> and database, but how about you store the bookmark information in the
> same lucene index that you are using for the file data.  One lucene
> document per folder, with fields something like
>
> folder: /some/dir/somewhere
> bookmark: some_bookmark_value
>
> That way the bookmark info should always be in line with the indexed data.
>
>
> Oh, and I'd also try and stop the application from crashing!
>
>
> --
> Ian.
>
>
>
> On Fri, Nov 28, 2008 at 12:09 PM, Ganesh <em...@yahoo.co.in> wrote:
>> I am using Lucene v2.4. I am indexing files from various folder and i 
>> have
>> to maintain a bookmark of what i have last indexed in each folder.
>>
>> Initially i thought to save the state in each  respective folder. Index
>> Wrtier always has documents in memory and it commits in a intervals. In 
>> an
>> unexpected application crash, sometimes the last saved bookmark and the 
>> last
>> indexed document in the database is not matching.
>>
>> One another option is to keep the information in a same or different
>> database.
>>
>> I think many might have faced this situitation.
>>
>> Regards
>> Ganesh
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

Send instant messages to your online friends http://in.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Maintain last indexed information in a file or DB

Posted by Ian Lea <ia...@gmail.com>.
I'm a bit confused about what exactly is stored in folder and index
and database, but how about you store the bookmark information in the
same lucene index that you are using for the file data.  One lucene
document per folder, with fields something like

folder: /some/dir/somewhere
bookmark: some_bookmark_value

That way the bookmark info should always be in line with the indexed data.


Oh, and I'd also try and stop the application from crashing!


--
Ian.



On Fri, Nov 28, 2008 at 12:09 PM, Ganesh <em...@yahoo.co.in> wrote:
> I am using Lucene v2.4. I am indexing files from various folder and i have
> to maintain a bookmark of what i have last indexed in each folder.
>
> Initially i thought to save the state in each  respective folder. Index
> Wrtier always has documents in memory and it commits in a intervals. In an
> unexpected application crash, sometimes the last saved bookmark and the last
> indexed document in the database is not matching.
>
> One another option is to keep the information in a same or different
> database.
>
> I think many might have faced this situitation.
>
> Regards
> Ganesh

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org