Posted to solr-user@lucene.apache.org by Greg Pendlebury <gr...@gmail.com> on 2011/04/28 01:23:40 UTC

Embedded Solr Optimize under Windows

Hi All,

Just a quick query of no particular importance to me, but we did observe this
problem:

http://code.google.com/p/solr-geonames/wiki/DeveloperInstall
"It's worth noting that the build has also been run on Mac and Solaris now,
and the Solr index is about half the size. We suspect the optimize() call in
Embedded Solr is not working correctly under Windows."

We've observed that Windows leaves lots of segments on disk and takes up
twice the volume of the other OSs. Perhaps file locking or something
prevents the optimize() call from functioning. This wasn't particularly
important to us since we don't run Windows for any prod systems. For that
reason we haven't looked too closely, but thought it might be of interest to
others... if we are even right of course :)

Ta,
Greg
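
Editorial note: the size comparison described above can be reproduced with a few lines of Java. The following is only a minimal sketch; the class name IndexDirStats is hypothetical and not part of Solr or Lucene, and it assumes the index files sit directly inside a single directory (typically the "index" folder under the Solr data directory). It counts the regular files in that directory and sums their sizes, so the same build can be compared across operating systems.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for comparing index directories; not part of Solr or Lucene. */
final class IndexDirStats {
    final int fileCount;
    final long totalBytes;

    private IndexDirStats(int fileCount, long totalBytes) {
        this.fileCount = fileCount;
        this.totalBytes = totalBytes;
    }

    /** Counts the regular files directly inside indexDir and sums their sizes. */
    static IndexDirStats of(Path indexDir) throws IOException {
        int count = 0;
        long bytes = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                // Subdirectories are skipped; only files in the directory itself count.
                if (Files.isRegularFile(file)) {
                    count++;
                    bytes += Files.size(file);
                }
            }
        }
        return new IndexDirStats(count, bytes);
    }
}
```

Calling IndexDirStats.of(path) after the harvest JVM has shut down, once on each OS, gives a file count and byte total to compare. A much higher file count on Windows would be consistent with leftover segment files rather than a failed merge.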

Re: Embedded Solr Optimize under Windows

Posted by Greg Pendlebury <gr...@gmail.com>.
Ahh, thanks. I might try a basic commit() then and see, although it's not a
huge deal for me. It occurred to me that two optimize() calls would probably
leave exactly the same problem behind.

On 20 May 2011 09:52, Chris Hostetter <ho...@fucit.org> wrote:

>
> : Thanks for the reply. I'm at home right now, or I'd try this myself, but is
> : the suggestion that two optimize() calls in a row would resolve the issue?
>
> it might ... I think the situations in which it happens have evolved a bit
> over the years as IndexWriter has gotten smarter about knowing when it
> really needs to touch the disk to reduce IO.
>
> there's a relatively new explicit method (IndexWriter.deleteUnusedFiles)
> that can force this...
>
> https://issues.apache.org/jira/browse/LUCENE-2259
>
> ...but it's only on trunk, and there isn't any user level hook for it in
> Solr yet (i opened SOLR-2532 to consider adding it)
>
>
> -Hoss
>

Re: Embedded Solr Optimize under Windows

Posted by Chris Hostetter <ho...@fucit.org>.
: Thanks for the reply. I'm at home right now, or I'd try this myself, but is
: the suggestion that two optimize() calls in a row would resolve the issue?

it might ... I think the situations in which it happens have evolved a bit 
over the years as IndexWriter has gotten smarter about knowing when it
really needs to touch the disk to reduce IO.

there's a relatively new explicit method (IndexWriter.deleteUnusedFiles) 
that can force this...

https://issues.apache.org/jira/browse/LUCENE-2259

...but it's only on trunk, and there isn't any user level hook for it in 
Solr yet (i opened SOLR-2532 to consider adding it)


-Hoss

Re: Embedded Solr Optimize under Windows

Posted by Greg Pendlebury <gr...@gmail.com>.
Thanks for the reply. I'm at home right now, or I'd try this myself, but is
the suggestion that two optimize() calls in a row would resolve the issue?
The process in question is a JVM devoted entirely to harvesting; it calls
optimize() and then shuts down.

The least processor intensive way of triggering this behaviour is
desirable... perhaps a commit()? But I wouldn't have expected that to
trigger a write.

On 17 May 2011 10:20, Chris Hostetter <ho...@fucit.org> wrote:

>
> : http://code.google.com/p/solr-geonames/wiki/DeveloperInstall
> : "It's worth noting that the build has also been run on Mac and Solaris now,
> : and the Solr index is about half the size. We suspect the optimize() call in
> : Embedded Solr is not working correctly under Windows."
> :
> : We've observed that Windows leaves lots of segments on disk and takes up
> : twice the volume of the other OSs. Perhaps file locking or something
>
> The problem isn't that "optimize" doesn't work on windows, the problem is
> that windows file semantics won't let files be deleted while there are
> open file handles -- so Lucene's Directory behavior is to leave the files
> on disk, and try to clean them up later.  (on the next write, or next
> optimize call)
>
>
> -Hoss
>

Re: Embedded Solr Optimize under Windows

Posted by Chris Hostetter <ho...@fucit.org>.
: http://code.google.com/p/solr-geonames/wiki/DeveloperInstall
: "It's worth noting that the build has also been run on Mac and Solaris now,
: and the Solr index is about half the size. We suspect the optimize() call in
: Embedded Solr is not working correctly under Windows."
: 
: We've observed that Windows leaves lots of segments on disk and takes up
: twice the volume of the other OSs. Perhaps file locking or something

The problem isn't that "optimize" doesn't work on windows, the problem is 
that windows file semantics won't let files be deleted while there are 
open file handles -- so Lucene's Directory behavior is to leave the files 
on disk, and try to clean them up later.  (on the next write, or next 
optimize call)


-Hoss
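
Editorial note: the "leave the files on disk and clean them up later" behaviour described above can be illustrated with a small, self-contained Java sketch. This is not Lucene's actual code; the class DeferredDeleter and its method names are hypothetical, and it only shows the general pattern: attempt the delete, remember any file the OS refuses to remove, and retry on a later write or optimize.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of deferred deletion; not Lucene's implementation. */
final class DeferredDeleter {
    // Files whose deletion failed and should be retried later.
    private final Set<Path> pending = new LinkedHashSet<>();

    /** Tries to delete now; if the OS refuses, remembers the file for a later retry. */
    void delete(Path file) {
        try {
            Files.deleteIfExists(file);
            pending.remove(file);
        } catch (IOException e) {
            // For example, an open handle on Windows can make the delete fail.
            pending.add(file);
        }
    }

    /** Retries earlier failures and returns how many files are still pending. */
    int retryPending() {
        // Iterate over a copy because delete() modifies the pending set.
        for (Path file : new ArrayList<>(pending)) {
            delete(file);
        }
        return pending.size();
    }

    /** Number of files still waiting to be deleted. */
    int pendingCount() {
        return pending.size();
    }
}
```

Under this pattern, a process that makes one big merge and then exits never reaches the retry step, which would match the harvest-then-shutdown case discussed in this thread: the stale files stay on disk until something writes to the index again.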