Posted to java-user@lucene.apache.org by Amol Bhutada <am...@synechron.com> on 2006/01/04 16:20:54 UTC

indexreader refresh

If I have a reader and searcher open on an index data folder, and a
separate IndexWriter is writing documents to the same folder, do I need
to close the existing reader and searcher and create new ones so that
the newly indexed data becomes searchable?

I have searched on Google and found some pointers, but some important
links no longer open, so a pointer or a clear explanation would be
great.

I am looking at implementing Lucene search for a site with millions of
user records, so I am also looking for the best way to keep my indexes
up to date while searches are running.

thanks
Amol


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: indexreader refresh

Posted by Doug Cutting <cu...@apache.org>.
Yes, that's a good start.  Your patch does not handle deletions 
correctly.  If a segment has had deletions since it was opened then its 
deletions file needs to be re-read.  I also think returning a new 
IndexReader is preferable to modifying one, since an IndexReader is 
often used as a cache key, and caches should be invalidated when an 
IndexReader is re-opened.
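
The cache-key argument can be illustrated with a small self-contained
sketch. It does not use Lucene: Reader and FilterCache below are
hypothetical stand-ins, and the only assumption is that caches are keyed
on reader identity. Because reopening returns a new object, entries
computed against the old reader are never hit again and can be
collected once the old reader is dropped.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class ReaderCacheSketch {

    /** Hypothetical stand-in for an index reader: an immutable point-in-time view. */
    static final class Reader {
        final long version;

        Reader(long version) { this.version = version; }

        /** Returns this reader if nothing changed, otherwise a NEW reader. */
        Reader reopen(long latestVersion) {
            return latestVersion == version ? this : new Reader(latestVersion);
        }
    }

    /** Hypothetical cache keyed on reader identity (Reader does not override equals). */
    static final class FilterCache {
        private final Map<Reader, String> cache = new WeakHashMap<>();
        int computations = 0;

        String get(Reader reader) {
            return cache.computeIfAbsent(reader, r -> {
                computations++;                 // stands in for an expensive computation
                return "bits@v" + r.version;
            });
        }
    }

    public static void main(String[] args) {
        FilterCache cache = new FilterCache();
        Reader r1 = new Reader(1);
        cache.get(r1);
        cache.get(r1);                          // served from the cache
        Reader r2 = r1.reopen(2);               // index changed: a new instance
        cache.get(r2);                          // cache miss, recomputed for r2
        System.out.println("computations = " + cache.computations);
    }
}
```

Had reopen() mutated r1 in place, the second lookup would have returned
the stale "bits@v1" entry, which is the failure mode described above.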

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


RE: indexreader refresh

Posted by Robert Engels <re...@ix.netcom.com>.
I proposed and posted a patch for this long ago. The only thing missing
would be some sort of reference counting for segments (rather than the
'stayopen' flag).

  /**
   * Reopens the IndexReader, possibly reusing the segments for greater
   * efficiency. The original IndexReader instance is closed, and the
   * reference is no longer valid.
   *
   * @return the new IndexReader
   */
  public IndexReader reopen() throws IOException {
      if (!(this instanceof MultiReader))
          return IndexReader.open(directory);

      MultiReader mr = (MultiReader) this;

      final IndexReader[] oldreaders = mr.getReaders();
      final boolean[] stayopen = new boolean[oldreaders.length];

      synchronized (directory) {                  // in- & inter-process sync
          return (IndexReader) new Lock.With(
              directory.makeLock(IndexWriter.COMMIT_LOCK_NAME),
              IndexWriter.COMMIT_LOCK_TIMEOUT) {
              public Object doBody() throws IOException {
                SegmentInfos infos = new SegmentInfos();
                infos.read(directory);
                if (infos.size() == 1) {          // index is optimized
                  return new SegmentReader(infos, infos.info(0),
                                           closeDirectory);
                } else {
                  IndexReader[] readers = new IndexReader[infos.size()];
                  for (int i = 0; i < infos.size(); i++) {
                      // reuse an already-open reader for a segment of the same name
                      for (int j = 0; j < oldreaders.length; j++) {
                          SegmentReader sr = (SegmentReader) oldreaders[j];
                          if (sr.si.name.equals(infos.info(i).name)) {
                              readers[i] = sr;
                              stayopen[j] = true;
                          }
                      }
                      if (readers[i] == null)
                          readers[i] = new SegmentReader(infos.info(i));
                  }

                  // close readers whose segments are no longer in the index
                  for (int i = 0; i < stayopen.length; i++)
                      if (!stayopen[i])
                          oldreaders[i].close();

                  return new MultiReader(directory, infos, closeDirectory,
                                         readers);
                }
              }
            }.run();
        }
  }
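
The reference counting mentioned above, as an alternative to the
'stayopen' flag, could look roughly like the following self-contained
sketch. It is not part of the patch and does not use Lucene classes:
Segment and CompositeReader are hypothetical stand-ins. Each shared
segment is closed only when the last reader using it is closed, so the
old reader stays valid until its own close().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RefCountSketch {

    /** Hypothetical per-segment reader; a count of zero means it is closed. */
    static final class Segment {
        final String name;
        private int refCount = 1;               // the creator holds the first reference

        Segment(String name) { this.name = name; }

        synchronized void incRef() {
            if (refCount == 0) throw new IllegalStateException(name + " is closed");
            refCount++;
        }

        synchronized void decRef() {
            if (refCount == 0) throw new IllegalStateException(name + " is closed");
            refCount--;                         // real code would release files at zero
        }

        synchronized boolean isClosed() { return refCount == 0; }
    }

    /** Hypothetical composite reader over several segments. */
    static final class CompositeReader {
        private final List<Segment> segments;

        CompositeReader(List<Segment> segments) { this.segments = segments; }

        /** Builds a reader for the current segment names, sharing segments already open here. */
        CompositeReader reopen(List<String> currentNames) {
            Map<String, Segment> open = new HashMap<>();
            for (Segment s : segments) open.put(s.name, s);

            List<Segment> next = new ArrayList<>();
            for (String name : currentNames) {
                Segment s = open.get(name);
                if (s != null) s.incRef();      // shared with the old reader
                else s = new Segment(name);     // newly written segment
                next.add(s);
            }
            return new CompositeReader(next);
        }

        /** Releases this reader's reference on each of its segments. */
        void close() { for (Segment s : segments) s.decRef(); }

        Segment segment(String name) {
            for (Segment s : segments) if (s.name.equals(name)) return s;
            return null;
        }
    }

    public static void main(String[] args) {
        CompositeReader old = new CompositeReader(
            Arrays.asList(new Segment("_a"), new Segment("_b")));
        CompositeReader fresh = old.reopen(Arrays.asList("_b", "_c"));
        Segment a = old.segment("_a");
        Segment b = old.segment("_b");

        old.close();
        System.out.println("_a closed: " + a.isClosed());
        System.out.println("_b closed: " + b.isClosed());
        fresh.close();
        System.out.println("_b closed after both readers closed: " + b.isClosed());
    }
}
```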


Re: indexreader refresh

Posted by Doug Cutting <cu...@apache.org>.
Amol Bhutada wrote:
> If I have a reader and searcher open on an index data folder, and a
> separate IndexWriter is writing documents to the same folder, do I need
> to close the existing reader and searcher and create new ones so that
> the newly indexed data becomes searchable?

[ moved from user to dev list]

This is a frequent request.  While opening an all-new IndexReader is 
effective, it is not always efficient.  It might be nice to support a 
more efficient means of re-opening an index.

Perhaps we should add a few new IndexReader methods, as follows:

/** If <code>reader</code>'s index has not been changed, return
   * <code>reader</code>, otherwise return a new {@link IndexReader}
   * reading the latest version of the index.
   */
public static IndexReader open(IndexReader reader) throws IOException {
   if (reader.isCurrent()) {
     // unchanged: return existing
     return reader;
   }

   // try to incrementally create new reader
   IndexReader result = reader.reopen(reader);
   if (result != null) {
     return result;
   }

   // punt, opening an entirely new reader
   return IndexReader.open(reader.directory());
}

/** Return a new IndexReader reading the current state
   * of the index, re-using reader's resources, or null if this
   * is not possible.
   */
protected IndexReader reopen(IndexReader reader) throws IOException {
   return null;
}

Then we can add implementations of reopen to SegmentReader and 
MultiReader that attempt to re-use the existing, already opened 
segments.  This should mostly be simple, but there are a few tricky 
issues, like detecting whether an already-open segment has had 
deletions, and deciding when to close obsolete segments.
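
One way to picture the segment re-use and the deletions check is the
self-contained sketch below. It is only an illustration, not Lucene
code: OpenSegment is a hypothetical stand-in, and the per-segment
"deletions generation" counter is an assumption made for the example (a
number that changes whenever a segment's deletions are rewritten). An
unchanged segment is re-used as is, a segment with new deletions gets
its deletions re-read, and an unknown segment is opened from scratch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SegmentReuseSketch {

    /** How a segment in the reopened reader was obtained. */
    enum Origin { OPENED, DELETIONS_RELOADED }

    /** Hypothetical open segment: name plus the deletions generation it loaded. */
    static final class OpenSegment {
        final String name;
        final long delGen;
        final Origin origin;

        OpenSegment(String name, long delGen, Origin origin) {
            this.name = name;
            this.delGen = delGen;
            this.origin = origin;
        }
    }

    /**
     * Builds the segment list for a reopened reader.
     *
     * @param old    segments held by the existing reader
     * @param latest segment name -> current deletions generation, in index order
     */
    static List<OpenSegment> reopen(List<OpenSegment> old, Map<String, Long> latest) {
        Map<String, OpenSegment> byName = new HashMap<>();
        for (OpenSegment s : old) byName.put(s.name, s);

        List<OpenSegment> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : latest.entrySet()) {
            OpenSegment prev = byName.get(e.getKey());
            long delGen = e.getValue();
            if (prev == null) {
                // segment written since the old reader was opened
                result.add(new OpenSegment(e.getKey(), delGen, Origin.OPENED));
            } else if (prev.delGen != delGen) {
                // same segment data, but its deletions changed and must be re-read
                result.add(new OpenSegment(prev.name, delGen, Origin.DELETIONS_RELOADED));
            } else {
                result.add(prev);               // unchanged: share the open segment
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<OpenSegment> old = new ArrayList<>();
        old.add(new OpenSegment("_a", 0, Origin.OPENED));
        old.add(new OpenSegment("_b", 0, Origin.OPENED));

        Map<String, Long> latest = new LinkedHashMap<>();
        latest.put("_a", 0L);                   // untouched
        latest.put("_b", 1L);                   // documents deleted since it was opened
        latest.put("_c", 0L);                   // new segment

        for (OpenSegment s : reopen(old, latest)) {
            System.out.println(s.name + " " + s.origin + " delGen=" + s.delGen);
        }
    }
}
```

Segments of the old reader that do not appear in the new list (none in
this example) are the obsolete ones whose closing still has to be
decided.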

Does this sound like it would make a good addition?  Does someone want 
to volunteer to implement it?

Doug


RE: indexreader refresh

Posted by Ramana Jelda <ra...@ciao-group.com>.
Hi Amol,
Yes, you need to close the existing reader (and searcher) and open new
ones for the updated index to take effect.
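
The reason is that an open reader sees the index as it was when the
reader was opened. The toy sketch below shows that behaviour and a
simple way to swap in a fresh reader while searches continue. It is
self-contained and does not use Lucene: ToyIndex, ToyReader and
ReaderHolder are hypothetical names, and a real application would also
have to wait for in-flight searches before closing the old reader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

final class RefreshSketch {

    /** Toy index: added documents are visible only to readers opened afterwards. */
    static final class ToyIndex {
        private final List<String> docs = new ArrayList<>();

        synchronized void add(String doc) { docs.add(doc); }

        synchronized ToyReader openReader() {
            return new ToyReader(new ArrayList<>(docs));   // point-in-time copy
        }
    }

    /** Toy reader: a snapshot that never sees later additions. */
    static final class ToyReader {
        private final List<String> snapshot;
        private volatile boolean closed;

        ToyReader(List<String> snapshot) { this.snapshot = snapshot; }

        int numDocs() { return snapshot.size(); }
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    /** Holds the reader that searches currently use and swaps in a fresh one on demand. */
    static final class ReaderHolder {
        private final ToyIndex index;
        private final AtomicReference<ToyReader> current;

        ReaderHolder(ToyIndex index) {
            this.index = index;
            this.current = new AtomicReference<>(index.openReader());
        }

        ToyReader get() { return current.get(); }

        /** Opens a new reader, publishes it atomically, then closes the old one. */
        void refresh() {
            ToyReader old = current.getAndSet(index.openReader());
            old.close();   // assumption: no search is still using the old reader
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("doc1");
        ReaderHolder holder = new ReaderHolder(index);

        index.add("doc2");                                  // written after the reader was opened
        System.out.println("before refresh: " + holder.get().numDocs());
        holder.refresh();
        System.out.println("after refresh: " + holder.get().numDocs());
    }
}
```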


Regards,
Jelda 
