Posted to java-user@lucene.apache.org by Konstantyn Smirnov <in...@yahoo.com> on 2012/07/11 12:18:23 UTC

RAMDirectory and expungeDeletes()/optimize()

Hi all,

In my app (powered by Lucene 3.5.0) I index documents (not too many, say up
to 100k) using a RAMDirectory.
Then I need to send the segment over the network to be merged with the
existing index over there.

The segment needs to be as "slim" as possible, e.g. without any pending
deleted documents.

My code looks like


Is this a legitimate way to do that? 
Doesn't it conflict with the JavaDoc saying:



TIA

--
View this message in context: http://lucene.472066.n3.nabble.com/RAMDirectory-and-expungeDeletes-optimize-tp3994350.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: RAMDirectory and expungeDeletes()/optimize()

Posted by Thomas Matthijs <li...@selckin.be>.
On Tue, May 21, 2013 at 3:12 PM, Konstantyn Smirnov <in...@yahoo.com> wrote:

> I want to refresh the topic a bit.
>
> Using Lucene 4.3.0, I couldn't find a method like expungeDeletes() in the
> IW anymore.



http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/IndexWriter.html#forceMergeDeletes()

Re: RAMDirectory and expungeDeletes()/optimize()

Posted by Konstantyn Smirnov <in...@yahoo.com>.
I want to refresh the topic a bit.

Using Lucene 4.3.0, I couldn't find a method like expungeDeletes() in the
IW anymore. I rely on Lucene's MergePolicies to do the optimization, but I
need to keep the metadata up to date, docFreqs and termFreqs to name a few.

The only way I found to accomplish that was writer.forceMerge(1), but that
includes the optimization.

Are there any other "cheaper" ways to do that?

TIA





RE: RAMDirectory and expungeDeletes()/optimize()

Posted by Steven A Rowe <sa...@syr.edu>.
Nabble silently drops content from email sent through their interface on a regular basis.  I've told them about it multiple times.  My suggestion: find another way to post to this mailing list.

-----Original Message-----
From: Michael McCandless [mailto:lucene@mikemccandless.com] 
Sent: Wednesday, July 11, 2012 10:07 AM
To: java-user@lucene.apache.org
Subject: Re: RAMDirectory and expungeDeletes()/optimize()


Re: RAMDirectory and expungeDeletes()/optimize()

Posted by Michael McCandless <lu...@mikemccandless.com>.
What I meant was that your original email says "My code looks like",
followed by blank lines, and then "Doesn't it conflict with the
JavaDoc saying:", followed by blank lines. I.e., we can't see your code.

However, when I look at your email here at
http://lucene.472066.n3.nabble.com/RAMDirectory-and-expungeDeletes-optimize-td3994350.html#a3994387
I do see the code and javadocs.

But when I look at http://lucene.markmail.org/thread/z5gcms6lp4bo5hfs
and http://mail-archives.apache.org/mod_mbox/lucene-java-user/201207.mbox/%3c1342001903207-3994350.post@n3.nabble.com%3e
they are missing.

Not sure what's going on.  Maybe your email was originally HTML but
got converted to plain text somewhere along the way, losing those
important parts?

Anyway, to try to answer your question: you should be able to simply
call optimize (forceMerge(1)): it does what expungeDeletes does, and
more (it merges down to 1 segment).  Yes, it's horribly costly, and so
you should do it rarely, but it sounds like it may be OK in this case
(a one-time thing before you send a segment off to the main index).
Still, you should test whether it actually helps in the end, because
the main index will likely have to merge these segments anyway (if
enough are added), which would mean the merging you did before adding
them was redundant (unless bandwidth is very costly...).

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jul 11, 2012 at 9:55 AM, Konstantyn Smirnov <in...@yahoo.com> wrote:
> JavaDoc comes from here
> http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexWriter.html#expungeDeletes()
>
> other blanks are here because it's groovy :) Or what did you mean exactly?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/RAMDirectory-and-expungeDeletes-optimize-tp3994350p3994387.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

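To make the difference between the two calls discussed above more concrete, here is a toy model in plain Java. It is not Lucene code: `MergeSketch`, `Segment`, `expungeDeletes` and `forceMergeToOne` are hypothetical stand-ins written only for illustration, and the sketch ignores the thresholds and segment selection that a real merge policy applies. It only tries to show that both operations drop deleted documents, while merging down to one segment also rewrites segments that had no deletions at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of segment merging. None of these names are Lucene classes. */
public class MergeSketch {

    /** Hypothetical segment: document ids plus a parallel "deleted" flag per document. */
    public static final class Segment {
        public final List<Integer> docs = new ArrayList<>();
        public final List<Boolean> deleted = new ArrayList<>();

        public Segment add(int doc, boolean isDeleted) {
            docs.add(doc);
            deleted.add(isDeleted);
            return this;
        }

        public boolean hasDeletions() {
            return deleted.contains(Boolean.TRUE);
        }

        /** Copies only the live (non-deleted) documents into the target, as a merge would. */
        void copyLiveDocsTo(Segment target) {
            for (int i = 0; i < docs.size(); i++) {
                if (!deleted.get(i)) {
                    target.add(docs.get(i), false);
                }
            }
        }
    }

    /** Models expungeDeletes(): rewrite only the segments that carry deletions. */
    public static List<Segment> expungeDeletes(List<Segment> index) {
        List<Segment> result = new ArrayList<>();
        for (Segment s : index) {
            if (!s.hasDeletions()) {
                result.add(s); // reused as-is, so no rewrite cost
                continue;
            }
            Segment rewritten = new Segment();
            s.copyLiveDocsTo(rewritten);
            if (!rewritten.docs.isEmpty()) {
                result.add(rewritten); // a fully deleted segment simply disappears
            }
        }
        return result;
    }

    /** Models forceMerge(1): rewrite every segment into one new segment. */
    public static List<Segment> forceMergeToOne(List<Segment> index) {
        Segment merged = new Segment();
        for (Segment s : index) {
            s.copyLiveDocsTo(merged);
        }
        List<Segment> result = new ArrayList<>();
        result.add(merged);
        return result;
    }

    public static void main(String[] args) {
        List<Segment> index = new ArrayList<>();
        index.add(new Segment().add(1, false).add(2, true).add(3, false));
        index.add(new Segment().add(4, false).add(5, false));

        System.out.println("segments after expungeDeletes: " + expungeDeletes(index).size());
        System.out.println("segments after forceMergeToOne: " + forceMergeToOne(index).size());
    }
}
```

In this model the second segment is passed through untouched by `expungeDeletes` but copied by `forceMergeToOne`, which is the extra cost the reply above warns about. Whether that cost pays off still depends on what the receiving index does with the segment afterwards.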

Re: RAMDirectory and expungeDeletes()/optimize()

Posted by Konstantyn Smirnov <in...@yahoo.com>.
JavaDoc comes from here
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexWriter.html#expungeDeletes()

The other blanks are there because it's Groovy :) Or what exactly did you mean?



Re: RAMDirectory and expungeDeletes()/optimize()

Posted by Michael McCandless <lu...@mikemccandless.com>.
There are blanks at the important places (your code, and which
JavaDoc) in your email!

Mike McCandless

http://blog.mikemccandless.com
