Posted to java-user@lucene.apache.org by Lars Hammer <ha...@dezide.com> on 2003/08/20 09:22:54 UTC

updating a document

Hello

I'm trying to update a document in my index. As far as I can tell from the FAQ and the rest of the documentation, the only way to do this is to delete the document and add it again.

Now, I want to add the document anew without having to re-parse the original file. That is, I want to extract a document from the index (and keep a copy in memory), delete the document from the index, update a field in the in-memory doc, and add the doc to the index once again.

I imagine it has to be done something like this:

1. extract the desired document from index with a function returning the document (not complete code):

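Document doc;
String fileNameToGetFromIdx;  // path of the file whose document we want
String tmpName;

   // document numbers run from 0 to maxDoc() - 1, including deleted slots
   for ( int i = 0; i < indexreader.maxDoc(); i++ )
   {
    if ( !indexreader.isDeleted( i ) )
    {
     doc = indexreader.document( i );

     if ( doc != null )
     {
      tmpName = doc.get( "pathToFileOnDisk" );

      // compare this way round so a missing field does not throw
      if ( fileNameToGetFromIdx.equals( tmpName ) )
      {
       indexreader.delete( i );
       return doc;
      }
     }
    }
   }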

This would leave me with the document in memory and deleted from the index, right?


2. update a field in the document by adding the field again.

 doc.add( Field.Text( "someField", "value" ) );

The API for Document says that if multiple fields exist with the same name, the last value added is the one returned when getting the value.


3. add the document to the index again

indexwriter.addDocument( doc );
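
To make the intended flow of steps 1 to 3 concrete without depending on any particular Lucene version, here is a minimal self-contained sketch. The class UpdateSketch and its methods are hypothetical stand-ins invented for illustration; they are not Lucene's API, and a "document" is modelled simply as a map of stored field values. It assumes every field needed for the re-add is available on the copy held in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for an index; NOT Lucene's API. */
class UpdateSketch {
    // Each "document" is a map of stored field name -> value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    /** Step 3: add a (copy of a) document to the index. */
    void addDocument(Map<String, String> doc) {
        docs.add(new HashMap<>(doc));
    }

    /** Step 1: find the document by path, delete it, and return the in-memory copy. */
    Map<String, String> extractAndDelete(String path) {
        for (int i = 0; i < docs.size(); i++) {
            // compare this way round so a document without the field is skipped
            if (path.equals(docs.get(i).get("pathToFileOnDisk"))) {
                return docs.remove(i);
            }
        }
        return null; // no document with that path
    }

    /** Steps 1-3 combined for the "visited" counter use case. */
    boolean recordVisit(String path) {
        Map<String, String> doc = extractAndDelete(path);
        if (doc == null) {
            return false;
        }
        // Step 2: update the field on the in-memory copy.
        // A map overwrites the old value; Lucene's Document.add instead
        // adds another field with the same name.
        int visits = Integer.parseInt(doc.getOrDefault("visited", "0"));
        doc.put("visited", Integer.toString(visits + 1));
        addDocument(doc);
        return true;
    }

    /** Look up one stored field of the document with the given path. */
    String get(String path, String field) {
        for (Map<String, String> doc : docs) {
            if (path.equals(doc.get("pathToFileOnDisk"))) {
                return doc.get(field);
            }
        }
        return null;
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        UpdateSketch index = new UpdateSketch();
        Map<String, String> doc = new HashMap<>();
        doc.put("pathToFileOnDisk", "/docs/manual.pdf");
        doc.put("title", "Manual");
        index.addDocument(doc);

        index.recordVisit("/docs/manual.pdf");
        index.recordVisit("/docs/manual.pdf");
        System.out.println("visited = " + index.get("/docs/manual.pdf", "visited"));
    }
}
```

The point of the sketch is only the order of operations: the copy is taken before the delete, modified in memory, and then added back, so the original file is never parsed again.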


Is this a correct way of doing an update? I ask because I can't seem to get it to work properly.
The reason for trying it this way is to avoid reindexing the original file. I have many large PDF documents which take some time to index :-(

Bottom line: when I do a search, a list of search results is displayed to the user; the user clicks the title of a document and the document is shown. Before the document is shown, I execute an update function to increase the number of times the document has been visited, hence I need to update the visited field of that particular document in the index.

Uhm -hope you get the idea :-)

Any suggestions and comments are very welcome

thanks in advance

BTW: does anyone know if an update function is planned for Lucene? Would it be hard to write one yourself?


/Lars Hammer

www.dezide.com