
Posted to java-user@lucene.apache.org by petite_abeille <pe...@mac.com> on 2002/04/26 09:52:35 UTC

FileNotFoundException: Too many open files

Hello,

I'm running into this exception quite often while using Lucene (the 
situation is so bad with the latest rc, that I had to revert to the last 
com.lucene package). I'm sure I have my fair share of bugs in my app, 
but nonetheless, how can I "control" Lucene usage of RandomAccessFile? 
The indexes are optimized and I try to keep a close look at how many 
IndexWriter/Reader exists at any point in time... Nevertheless, I run 
into that exception much too often :-( Any help appreciated!

"04/26 00:07:11 (Warning) Finder.findObjectsWithSpecificationInStore: 
java.io.FileNotFoundException:  _la.f9 (Too many open files)"

Also, on a somewhat related note, how do I "shut down" Lucene properly? 
E.g., do I need to do anything with the IndexWriter and so on?

Last, but not least, is there a way to turn off the file locking in the 
latest rc, as it's really getting in the way :-(

Finally, I just wanted to make sure: Lucene is fully multi-threaded 
right? I can do search *and* write concurrently in different threads at 
the same time on the same index?

Any insight much appreciated!

Thanks.

PA.

BTW, should I post this kind of question to user or dev?


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>

Re: Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)

Posted by petite_abeille <pe...@mac.com>.

On Tuesday, April 30, 2002, at 01:57 AM, Steven J. Owens wrote:

> Just be glad you aren't doing this on Solaris with JDK 1.1.6

I know... In fact I'm looking forward to porting my stuff to 1.4... As my 
app is very much I/O bound, I'm really excited by this nio madness... :-)

> Yes and no.  Setting ulimit to a reasonable number of open files is not 
> only not a patch, it's the "right" way to do it.

Of course... Nothing is really black or white... What I wanted to say is 
that -as a first strike- *I* prefer not to mess around with system 
parameters.

>  I understand where you're coming from, really, and in a certain way, 
> it makes sense

Thanks. I already feel less alone... ;-)

> BUT... sometimes the impulse for clean, good design takes you too far 
> down a blind alley.

Sure. At the end of the day, everything is a tradeoff...

>  Sometimes there is no elegant solution. Sometimes there is no "best" 
> way, only one of a limited set of options with different tradeoffs.

Absolutely.

> Most serious applications have to have some sort of OS variable 
> tweaking, you're just used to having it done invisibly and painlessly.

Agreed. In fact, that's my first desktop application in nearly a decade. 
I usually work on large-scale systems. And let me tell you, it's a very 
different pair of sleeves... ;-)

>  You could figure out the "right" way to set the system configuration 
> on install or launch.

One of my design "goals" is to avoid this sort of tweaking as much as I 
can.

>  You could look at the alternative techniques for indexing in Lucene

That's another one of those nasty tradeoffs... ;-) Memory is even more 
precious than file descriptors in my situation... Especially with a JVM 
that has this funky notion of constraining your memory usage...

> if there's anything you're doing wrong (perhaps opening files and not 
> closing them, and leaving them for the garbage collector to eventually 
> get around to closing?)

Sure. I went through all those sanity checks. Also, in my case, the 
garbage collector is my friend as I'm using the java.lang.ref API 
extensively.

>  or if you have a pessimal usage pattern that exacerbates the situation.

Ummmm...?!? You lost me here... What's a "pessimal usage pattern"?

> if you can come up with a scheme to run Lucene indexing with modified 
> code for keeping track of file resources.

Sure, there are many things that one could do... However, I have to 
balance how much time I want to invest in any one of those alleys. One 
thing I really like about Lucene is its very simple API and usage. So far 
it has worked out pretty well for me, as I'm using it pretty extensively. 
And I seem to have found -at last- a good balance between the different 
constraints I'm operating under.

> an anomalous situation (use on a client/desktop machine)

"Anomalous situation"?!?! Ummm... Lucene is just an API... Hopefully 
it's not bundled with some "dogma" attached to it... However, I'm kind 
of starting to wonder about that considering some of the -very 
defensive- responses I got to my postings... Oh, well... I will just go 
back to my cave... :-(

> could configure lucene to be careful about how many files it keeps open 
> at any given time.

That will be great! On a somewhat related note, I have decided to stick 
with the com.lucene package for the time being.... I was pretty excited 
when the rc stuff came out, but it just didn't work out for me. My 
resources problem just went from bad to worse. And also, I have two 
issues with the release candidate: locking and reference counting.

Locking. I don't have anything against locking per se. However, I 
really don't like how it's implemented in the rc. Using files just does 
not work for me. It creates too many problems when something goes wrong 
(e.g. the app is killed without warning and I have to clean up all those 
locks by myself). What about using sockets or something to rendezvous 
on an index? Or, at a bare minimum, being able to disable the locking 
altogether? I understand that most people are using Lucene under a very 
different setup than I do, but nevertheless it should not hurt to make 
it configurable. Anyway, it does not work for older JVMs, as noted in the 
source code. Last, but not least, I always get very scared when I see 
some "platform" dependent code somewhere (e.g. "if version 1 then ") ;-)
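
For what it's worth, the socket rendezvous could look roughly like the 
sketch below. Nothing here exists in Lucene: the class name 
SocketIndexLock and the idea of reserving one loopback port per index 
are assumptions made up for illustration. It relies only on the fact 
that a second bind to the same local port normally fails while the first 
socket is open, and that the OS frees the port when the process dies, so 
a killed app should leave no stale lock behind.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Lucene: a bound loopback port acts as the lock.
final class SocketIndexLock {
    private final int port;      // assumed to be reserved for one index
    private ServerSocket socket; // non-null while this object holds the lock

    SocketIndexLock(int port) {
        this.port = port;
    }

    // Returns true if the lock was obtained, false if the port is already
    // bound (by this object, another object, or another process).
    synchronized boolean tryAcquire() {
        if (socket != null) {
            return false;
        }
        try {
            socket = new ServerSocket(port, 1, InetAddress.getLoopbackAddress());
            return true;
        } catch (IOException alreadyBound) {
            return false;
        }
    }

    // Closes the socket, which frees the port for the next caller.
    synchronized void release() {
        if (socket != null) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do if close fails
            }
            socket = null;
        }
    }
}
```

The obvious tradeoff is that every index needs its own agreed-upon port, 
and an unrelated program could already be using it.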

Reference counting. Well, as noted in a comment in the source code, the 
reference API is really the way to go... And trying to be backward 
compatible with version 0.9 is somehow missing the forest for the trees... 
Just my two cents in any case. And yes, I'm well aware that I can fix 
all these issues by myself... And start to contribute to Lucene instead 
of just ranting left and right... But also keep in mind that I'm just a 
humble Lucene user. And there seems to be a very clear distinction 
between "user" and "developer" in Lucene's world... ;-)

Thanks for your response in any case. I hope I didn't "offend" too many 
people with my ramblings ;-)

PA.


Re: Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)

Posted by "Steven J. Owens" <pu...@darksleep.com>.

petite,

On Mon, Apr 29, 2002 at 07:54:43PM +0200, petite_abeille wrote:
> As a final note, several people suggested to increase the number of file 
> descriptors per process with something like "ulimit"...

     Just be glad you aren't doing this on Solaris with JDK 1.1.6,
where I first ran into ulimit issues - back when I encountered this
problem, the solaris default ulimit setting was 24 files, and JDK
1.1.6 reported the problem as an "OutOfMemory" error!  Looks like
things are improving :-).

> From what I learned today, I think it's a *bad* idea to have to
> change some system parameters just because your/my app is written in
> such a way that it may run out of some system resources. Your/my app
> has to fit in the system.  Hacking "ulimit" and/or other system
> parameters is just a quick patch that will -at best- delay dealing
> with the real problem that's usually one of design.

     Yes and no.  Setting ulimit to a reasonable number of open files
is not only not a patch, it's the "right" way to do it.  I understand
where you're coming from, really, and in a certain way, it makes
sense, BUT... sometimes the impulse for clean, good design takes you
too far down a blind alley.  Sometimes there is no elegant solution.
Sometimes there is no "best" way, only one of a limited set of options
with different tradeoffs.

     By definition, Lucene is an application that trades off up front
CPU (for indexing) and file resources (for storage) for request-time
speed.  The OS's job is to manage resources, and open files are one of
those resources.  That's the tradeoff here, and it's reasonable and
expected.  Most serious applications have to have some sort of OS
variable tweaking, you're just used to having it done invisibly and
painlessly.
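
     Before tweaking anything, it can help to see how close the
process actually is to its limit.  The sketch below is only that, a
sketch: it assumes a much newer JVM than the ones discussed in this
thread, one that exposes com.sun.management.UnixOperatingSystemMXBean
(a JDK-specific interface, typically available only on Unix-like
systems).  The class name DescriptorReport is made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper: reports file descriptor usage where the JVM exposes it.
final class DescriptorReport {

    // Returns {open, max}, or null when this JVM does not expose the counts
    // (for example on Windows).
    static long[] openAndMax() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = openAndMax();
        if (counts == null) {
            System.out.println("descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

     Logging these two numbers around indexing and searching would
show whether the count keeps climbing (a leak) or just sits near a
limit that is too low for the workload.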

     That said, since you're working on a client/desktop application,
not a server application, you need to think about ways to handle this:

     You could figure out the "right" way to set the system
configuration on install or launch.

     You could look at the alternative techniques for indexing in
Lucene, and see if any approaches there can help - for example, maybe
doing a lot of the more intense indexing work in a RAMDirectory, then
merging it into a normal file-based Directory.

     You could look more closely at what your application is doing,
and see if there's anything you're doing wrong (perhaps opening files
and not closing them, and leaving them for the garbage collector to
eventually get around to closing?) or if you have a pessimal usage
pattern that exacerbates the situation.
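
     To illustrate the "not closing them" case: a descriptor stays
allocated until close() is called or the object happens to be
finalized, and the latter may be arbitrarily late.  A minimal sketch
of the explicit version, using plain java.io rather than anything
from Lucene (the class and method names are invented for the example):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical example: release the descriptor explicitly instead of
// waiting for the garbage collector.
final class ExplicitClose {

    // Reads the first byte of a file and closes the file before returning,
    // whether or not the read throws.
    static int firstByte(File file) throws IOException {
        RandomAccessFile in = new RandomAccessFile(file, "r");
        try {
            return in.read(); // -1 if the file is empty
        } finally {
            in.close(); // descriptor released here, not at some later GC
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("explicit-close", ".bin");
        tmp.deleteOnExit();
        System.out.println(firstByte(tmp));
    }
}
```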

     You could take a closer look at the lucene indexing and file
management stuff, and see if you can come up with a scheme to run
Lucene indexing with modified code for keeping track of file
resources. 

     I'll bet Doug and the other developers would rather not add
open-file management as a main, permanent part of lucene, since it
would add overhead to all uses of lucene just to deal with an
anomalous situation (use on a client/desktop machine).  But they might
be interested in a way to offer it as an optional feature, where
people using lucene in a constrained environment could configure
lucene to be careful about how many files it keeps open at any given
time.
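
     One shape such an optional feature might take is a bounded pool
of file handles that closes the least recently used file whenever a
new one is needed.  To be clear, this is a hypothetical sketch and
not how Lucene manages its files; BoundedFilePool and its methods are
names invented here.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Lucene code: keeps at most maxOpen files open.
final class BoundedFilePool {
    private final int maxOpen;
    // accessOrder=true makes iteration order least recently used first.
    private final LinkedHashMap<String, RandomAccessFile> open =
        new LinkedHashMap<String, RandomAccessFile>(16, 0.75f, true);

    BoundedFilePool(int maxOpen) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be >= 1");
        }
        this.maxOpen = maxOpen;
    }

    // Returns an open read-only handle, reopening the file if it was evicted.
    synchronized RandomAccessFile get(File file) throws IOException {
        String key = file.getPath();
        RandomAccessFile handle = open.get(key); // also marks it most recent
        if (handle == null) {
            while (open.size() >= maxOpen) {
                evictEldest();
            }
            handle = new RandomAccessFile(file, "r");
            open.put(key, handle);
        }
        return handle;
    }

    synchronized int openCount() {
        return open.size();
    }

    synchronized void closeAll() throws IOException {
        while (!open.isEmpty()) {
            evictEldest();
        }
    }

    // Removes and closes the least recently used handle.
    private void evictEldest() throws IOException {
        Iterator<Map.Entry<String, RandomAccessFile>> it = open.entrySet().iterator();
        Map.Entry<String, RandomAccessFile> eldest = it.next();
        it.remove();
        eldest.getValue().close();
    }
}
```

     The overhead mentioned above shows up right away: a caller that
still holds an evicted handle will find it closed, and a reopened
file starts again at position 0, so callers would have to fetch the
handle and seek on every use.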

Steven J. Owens
puff@darksleep.com


Re: Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)

Posted by Joshua O'Madadhain <jm...@ics.uci.edu>.

On Mon, 29 Apr 2002, petite_abeille wrote:

> As a final note, several people suggested to increase the number of
> file descriptors per process with something like "ulimit"... From what
> I learned today, I think it's a *bad* idea to have to change some
> system parameters just because your/my app is written in such a way
> that it may run out of some system resources. Your/my app has to fit
> in the system.  Hacking "ulimit" and/or other system parameters is
> just a quick patch that will -at best- delay dealing with the real
> problem that's usually one of design.

If you have a suggestion for how Lucene could use fewer file descriptors
while still maintaining its performance, I'm sure that the developers
would be interested to hear it. 

However, some programs do require more resources than others.  If--as I
suspect is true in this case--this is a consequence of the complexity of
the task, then there's not much point in complaining about it.

Joshua

 jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
    Joshua Madden: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.


Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)

Posted by petite_abeille <pe...@mac.com>.

First off, thanks to Jagadesh Nandasamy, who pointed me in the right 
direction.

It seems that, in my situation, more homogeneous indexes work better 
than fewer heterogeneous indexes:

I have a dozen classes that I'm indexing. They vary from two fields to 
more than a dozen fields per document (aka object). I went through 
different indexing strategies with them (per class, per date, per root 
class, ...) to see how it goes. In any case, while trying to use my 
stuff with rc4, I consolidated all my different class indexes into one 
root class index to see if I could reduce my resource consumption. Fewer 
indexes, fewer RandomAccessFiles was the rationale. Well, I was wrong. In 
fact the exact opposite seems to hold true: more -homogeneous- indexes 
use fewer RandomAccessFiles overall than fewer -heterogeneous- indexes...

One of those -not so obvious- things you have to learn the hard way, I 
guess... ;-)

In any case, I would like to thank Jagadesh again for his insight. Also 
thanks to Pier Fumagalli for pointing out "LSOF". A very handy tool 
indeed.

As a final note, several people suggested to increase the number of file 
descriptors per process with something like "ulimit"... From what I 
learned today, I think it's a *bad* idea to have to change some system 
parameters just because your/my app is written in such a way that it may 
run out of some system resources. Your/my app has to fit in the system. 
Hacking "ulimit" and/or other system parameters is just a quick patch 
that will -at best- delay dealing with the real problem that's usually 
one of design.

Just my two cents.

PA.




Re: FileNotFoundException: code example

Posted by Jagadesh Nandasamy <ja...@eself.com>.

Hi petite,
        I will try to be brief...
        In Lucene, the number of files created depends on the number of
        fields the documents have.
        Let's take an example: you want to index 100 files, and each file
        contains 10 fields:

             document.add(Field.Text(UNIQUE_ID, "12345678"));
             ...
             ...
             ...
             document.add(new Field(UNIQUE_TYPE, "xxxxxxxxxxxxx", true, true, false));
             document.add(new Field(PATH, "c:\\xxx\\yyy\\zzz.doc", true, true, false));

        If, in all 100 documents, the 10 fields have the same keys or
        names (i.e. UNIQUE_ID, UNIQUE_TYPE, PATH), then the number of
        files created by Lucene remains under control. (MIND YOU, the
        values of the fields can be different.)
        Say that for the above scenario about 80 index files are created
        (for 100 documents with 10 fields each). If you add another
        million documents with the same 10 fields, the number of index
        files would remain the same; it would not create any more _f12,
        _xxx files.

        In contrast, say that for the same number of documents you create
        10 fields whose names differ from document to document, for
        example a field like

            document.add(new Field("Doc_PATH_1", "c:\\xxx\\yyy\\zzz.doc", true, true, false));

        for the first document, and

            document.add(new Field("Doc_PATH_2", "c:\\xxx\\yyy\\zzz.doc", true, true, false));

        for the second document.

        I think that for each new field about 3 files are created in the
        index directory, so you would end up having thousands of files in
        the index directory, which would cause the "Too many open files"
        problem.

        And I think you don't have to be bothered about which OS you are
        using.
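
        To put rough numbers on that, here is a small sketch in plain
        Java (no Lucene calls). It only counts distinct field names and
        multiplies by the "about 3 files per field" figure, which is a
        guess from this message and not a documented constant; the class
        and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the argument above, not a Lucene API.
final class FieldNameEstimate {

    // Counts the distinct field names used across all documents.
    static int distinctFieldNames(List<List<String>> documents) {
        Set<String> names = new HashSet<String>();
        for (List<String> fieldNames : documents) {
            names.addAll(fieldNames);
        }
        return names.size();
    }

    // Rough estimate only: filesPerField is the guess from the message.
    static int estimatedFiles(int distinctFieldNames, int filesPerField) {
        return distinctFieldNames * filesPerField;
    }

    public static void main(String[] args) {
        // Same names on every document: the set of names stays at 3.
        List<List<String>> shared = Arrays.asList(
            Arrays.asList("UNIQUE_ID", "UNIQUE_TYPE", "PATH"),
            Arrays.asList("UNIQUE_ID", "UNIQUE_TYPE", "PATH"));
        // A per-document name: the set grows with every document added.
        List<List<String>> perDocument = Arrays.asList(
            Arrays.asList("UNIQUE_ID", "UNIQUE_TYPE", "Doc_PATH_1"),
            Arrays.asList("UNIQUE_ID", "UNIQUE_TYPE", "Doc_PATH_2"));
        System.out.println(estimatedFiles(distinctFieldNames(shared), 3));      // 9
        System.out.println(estimatedFiles(distinctFieldNames(perDocument), 3)); // 12
    }
}
```

        With shared names the estimate stays flat no matter how many
        documents are added; with per-document names it grows by the
        same amount for every new document.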
       
       
        Hope this helps...
       
-Jaggi
           


petite_abeille wrote:

> Hello again,
>
> attached is the source code of the only class interacting directly 
> with Lucene in my app. Sorry for not providing a complete test case as 
> it's hard for me to come up with something self contained. Maybe there 
> is something that's obviously wrong in what I'm doing.
>
> Thanks for any help.
>
> PA
>
>
>------------------------------------------------------------------------
>
>//
>//	===========================================================================
>//
>//	Title:		SZIndex.java
>//	Description:	[Description]
>//	Author:		Raphael Szwarc <ra...@hotmail.com>
>//	Creation Date:	Wed Sep 12 2001
>//	Legal:		Copyright (C) 2001 Raphael Szwarc. All Rights Reserved.
>//
>//	---------------------------------------------------------------------------
>//
>
>package alt.dev.szobject;
>
>import com.lucene.store.Directory;
>import com.lucene.store.FSDirectory;
>import com.lucene.store.RAMDirectory;
>import com.lucene.document.Field;
>import com.lucene.document.DateField;
>import com.lucene.document.Document;
>import com.lucene.analysis.Analyzer;
>import com.lucene.analysis.standard.StandardAnalyzer;
>import com.lucene.index.IndexWriter;
>import com.lucene.index.IndexReader;
>import com.lucene.index.Term;
>import com.lucene.search.IndexSearcher;
>import com.lucene.search.MultiSearcher;
>import com.lucene.search.Searcher;
>import com.lucene.search.Query;
>import com.lucene.search.Hits;
>
>import java.io.FilenameFilter;
>import java.io.File;
>import java.io.IOException;
>
>import java.util.Map;
>import java.util.Collection;
>import java.util.Date;
>import java.util.Iterator;
>
>import alt.dev.szfoundation.SZHexCoder;
>import alt.dev.szfoundation.SZDate;
>import alt.dev.szfoundation.SZSystem;
>import alt.dev.szfoundation.SZLog;
>
>final class SZIndex extends Object
>{
>
>//	===========================================================================
>//	Constant(s)
>//	---------------------------------------------------------------------------
>
>	private static final String	Extension = ".index";
>
>//	===========================================================================
>//	Class variable(s)
>//	---------------------------------------------------------------------------
>
>	private static final Filter	_filter = new Filter();
>
>//	===========================================================================
>//	Instance variable(s)
>//	---------------------------------------------------------------------------
>
>	private String			_path = null;
>	private transient File		_directory = null;
>	private transient Directory	_indexDirectory = null;
>	private transient IndexWriter	_writer = null;
>	
>	private transient IndexReader	_reader = null;
>	private transient Searcher	_searcher = null;
>
>	private transient Directory	_ramDirectory = null;
>	private transient IndexWriter	_ramWriter = null;
>	private transient int		_counter = 0;
>
>//	===========================================================================
>//	Constructor method(s)
>//	---------------------------------------------------------------------------
>
>	private SZIndex()
>	{
>		super();
>	}
>
>//	===========================================================================
>//	Class method(s)
>//	---------------------------------------------------------------------------
>
>	static FilenameFilter filter()
>	{
>		return _filter;
>	}
>	
>	static String stringByDeletingPathExtension(String aPath)
>	{
>		if ( aPath != null )
>		{
>			int	anIndex = aPath.lastIndexOf( SZIndex.Extension );
>			
>			if ( anIndex > 0 )
>			{
>				aPath = aPath.substring( 0, anIndex );
>			}
>			
>			return aPath;
>		}
>		
>		throw new IllegalArgumentException( "SZIndex.stringByDeletingPathExtension: null path." );
>	}
>
>	static SZIndex indexWithNameInDirectory(String aName, File aDirectory)
>	{
>		if ( aName != null )
>		{
>			if ( aDirectory != null )
>			{
>				String	anEncodedName = SZHexCoder.encode( aName.getBytes() );
>				//String	aPath = aDirectory.getPath() + File.separator + anEncodedName + SZIndex.Extension + File.separator;
>				String	aPath = aDirectory.getPath() + File.separator + aName + SZIndex.Extension + File.separator;
>				SZIndex	anIndex = new SZIndex();
>			
>				anIndex.setPath( aPath );
>				
>				return anIndex;
>			}
>
>			throw new IllegalArgumentException( "SZIndex.indexWithNameInDirectory: null directory." );
>		}
>		
>		throw new IllegalArgumentException( "SZIndex.indexWithNameInDirectory: null name." );
>	}
>
>	static String stringForValue(Object aValue )
>	{
>		if ( aValue != null )
>		{
>			String	aStringValue = null;
>						
>			if ( ( aValue instanceof SZDate ) == true )
>			{
>				aValue = ( (SZDate) aValue ).internalDate();
>			}
>			else
>			if ( ( aValue instanceof SZPersistent ) == true )
>			{
>				aValue = ( (SZPersistent) aValue ).id();
>			}
>						
>			if ( ( aValue instanceof Date ) == true )
>			{
>				aStringValue = DateField.dateToString( (Date) aValue );
>			}
>			else
>			if ( ( aValue instanceof SZID ) == true )
>			{
>				aStringValue = ( (SZID) aValue ).uuidString();
>			}
>			else
>			{
>				aStringValue = aValue.toString();
>			}
>			
>			return aStringValue;
>		}
>
>		throw new IllegalArgumentException( "SZIndex.stringForValue: null value." );
>	}
>	
>//	===========================================================================
>//	Instance method(s)
>//	---------------------------------------------------------------------------
>
>	private String path()
>	{
>		return _path;
>	}
>	
>	private void setPath(String aValue)
>	{
>		_path = aValue;
>	}
>	
>	private File directory()
>	{
>		if ( _directory == null )
>		{
>			String	aPath = this.path();
>			
>			if ( aPath != null )
>			{
>				_directory = new File( aPath );
>			
>				if ( _directory.exists() == false )
>				{
>					_directory.mkdirs();
>				}
>			}
>			else
>			{
>				throw new IllegalStateException( "SZIndex.directory: null path." );
>			}
>		}
>		
>		return _directory;
>	}
>	
>	private boolean shouldCreate()
>	{
>		File		aDirectory = this.directory();
>		String[]	aList = aDirectory.list();
>		
>		if ( ( aList == null ) || ( aList.length == 0 ) )
>		{
>			return true;
>		}
>		
>		return false;
>	}
>	
>	boolean exists()
>	{
>		File	aFile = this.directory();
>		
>		if ( aFile != null )
>		{
>			return aFile.exists();
>		}
>		
>		return false;
>	}
>	
>	private SZDate lastModifiedDate()
>	{
>		if ( this.exists() == true )
>		{
>			File	aDirectory = this.directory();
>			Date	aDate = new Date( aDirectory.lastModified() );
>			SZDate	aCalendarDate = SZDate.dateWithDate( aDate );
>			
>			return aCalendarDate;
>		}
>		
>		return null;
>	}
>	
>	public int hashCode()
>	{
>		return this.path().hashCode();
>	}
>	
>	public boolean equals(Object anObject)
>	{
>		if ( this == anObject )
>		{
>			return true;
>		}
>		
>		return this.path().equals( ( (SZIndex) anObject ).path() );
>	}
>
>	protected void finalize() throws Throwable
>	{
>		if ( _writer != null )
>		{
>			this.optimize();
>		}
>					
>		super.finalize();
>	}
>	
>//	===========================================================================
>//	Index method(s)
>//	---------------------------------------------------------------------------
>
>	synchronized void optimize()
>	{
>		try
>		{
>			this.flush();
>			
>			if ( _writer != null )
>			{
>				_writer.optimize();
>				_writer.close();
>			
>			}
>			
>			_writer = null;
>			_indexDirectory = null;
>		}
>		catch(Exception anException)
>		{
>			anException.printStackTrace();
>		
>			SZLog.warning( anException );
>			
>			_writer = null;
>			_indexDirectory = null;
>
>			SZSystem.gc();
>		}
>	}
>	
>				
>	private Directory indexDirectory() throws IOException
>	{
>		if ( _indexDirectory == null )
>		{
>			File	aFile = this.directory();
>			boolean	shouldCreate = this.shouldCreate();
>			
>			//_indexDirectory = FSDirectory.getDirectory( aFile, shouldCreate );
>			_indexDirectory = new FSDirectory( aFile, shouldCreate );
>		}
>		
>		return _indexDirectory;
>	}
>
>	private IndexWriter writer() throws IOException
>	{
>		if ( _writer == null )
>		{
>			Directory	aDirectory = this.indexDirectory();
>			Analyzer	anAnalyzer = new StandardAnalyzer();
>			boolean		shouldCreate = this.shouldCreate();
>			
>			_writer = new IndexWriter( aDirectory, anAnalyzer, shouldCreate );
>			_writer.mergeFactor = 2;
>		}
>
>		return _writer;
>	}
>	
>	private IndexReader reader() throws IOException
>	{
>		if ( _reader == null )
>		{
>			System.gc();
>			
>			_reader = IndexReader.open( this.indexDirectory() );
>		}
>		
>		return _reader;
>	}
>	
>	private Searcher searcher() throws IOException
>	{
>		if ( _searcher == null )
>		{
>			System.gc();
>
>			_searcher = new IndexSearcher( this.reader() );
>		}
>		
>		if ( _ramDirectory != null )
>		{
>			Searcher	aRamSearcher = new IndexSearcher( IndexReader.open( _ramDirectory ) );
>
>			return new MultiSearcher( new Searcher[] { aRamSearcher, _searcher } );
>		}
>	
>		return _searcher;
>	}
>
>//	===========================================================================
>//	RAM method(s)
>//	---------------------------------------------------------------------------
>
>	private Directory ramDirectory() throws IOException
>	{
>		if ( _ramDirectory == null )
>		{
>			_ramDirectory = new RAMDirectory();
>		}
>		
>		return _ramDirectory;
>	}
>
>	private IndexWriter ramWriter() throws IOException
>	{
>		if ( _ramWriter == null )
>		{
>			Directory	aDirectory = this.ramDirectory();
>			Analyzer	anAnalyzer = new StandardAnalyzer();
>			
>			_ramWriter = new IndexWriter( aDirectory, anAnalyzer, true );
>		}
>
>		return _ramWriter;
>	}
>	
>	private void flush() throws IOException
>	{
>		if ( ( _ramDirectory != null ) && 
>			( _ramDirectory.list() != null ) && 
>			( _ramDirectory.list().length > 0 ) && 
>			( _ramWriter != null ) )
>		{
>			_ramWriter.optimize();
>			_ramWriter.close();
>		
>			this.writer().addIndexes( new Directory[] { _ramDirectory } );
>			
>			_ramWriter = null;
>			_ramDirectory = null;
>			
>			_reader = null;
>			_searcher = null;
>		}
>	}
>
>//	===========================================================================
>//	Indexing method(s)
>//	---------------------------------------------------------------------------
>
>	synchronized Hits search(Query aQuery) throws IOException
>	{
>		if ( aQuery != null )
>		{
>			if ( this.shouldCreate() == false )
>			{
>				return this.searcher().search( aQuery );
>			}
>			
>			return null;
>		}
>		
>		throw new IllegalArgumentException( "SZIndex.search: null query." );
>	}
>
>	synchronized void deleteIndexWithID(SZID anID) throws IOException
>	{
>		if ( anID != null )
>		{
>			if ( this.shouldCreate() == false )
>			{
>				String		aValue = SZIndex.stringForValue( anID );
>				Term		aTerm = new Term( SZDescription.IDKey, aValue );
>				IndexReader	aReader = this.reader();
>					
>				aReader.delete( aTerm );
>			}
>
>			return;
>		}
>		
>		throw new IllegalArgumentException( "SZIndex.deleteIndexWithID: null id." );
>	}
>
>	synchronized void indexValuesWithID(Map someValues, SZID anID) throws IOException
>	{
>		if ( someValues != null )
>		{
>			if ( anID != null )
>			{
>				Class		aClass = anID.entity();
>				SZDescription	aDescription = SZDescription.descriptionForClass( aClass );
>				Collection	someUniqueKeys = aDescription.uniqueKeys();
>				String		anIdentifier = SZIndex.stringForValue( anID );
>				Field		anIdentifierField = Field.Keyword( SZDescription.IDKey, anIdentifier );
>				String		aClassName = anID.entity().getName();
>				Field		aClassField = Field.Keyword( SZDescription.ClassKey, aClassName );
>				Document	aDocument = new Document();
>				IndexWriter	aWriter = this.ramWriter();
>				
>				aDocument.add( anIdentifierField );
>				aDocument.add( aClassField );
>				
>				for( Iterator anIterator = someValues.keySet().iterator(); anIterator.hasNext(); )
>				{
>					Object	aKey = anIterator.next();
>					Object	aValue = someValues.get( aKey );
>					String	aKeyName = aKey.toString();
>					String	aStringValue = SZIndex.stringForValue( aValue );
>					Field	aField = null;
>
>					if ( ( ( aValue instanceof SZPersistent ) == true ) ||
>						( ( someUniqueKeys != null ) && ( someUniqueKeys.contains( aKeyName ) == true ) ) )
>					{
>						aField = new Field( aKeyName, aStringValue, false, true, false) ;
>					}
>					else
>					{
>						aField = Field.UnStored( aKeyName, aStringValue );
>					}
>
>					aDocument.add( aField );
>				}
>				
>				aWriter.addDocument( aDocument );
>				aWriter.optimize();
>				
>				_counter += 1;
>				
>				if ( _counter > 100 )
>				{
>					this.flush();
>					_counter = 0;
>				}
>				
>				return;
>			}
>
>			throw new IllegalArgumentException( "SZIndex.indexValues: null id." );
>		}
>		
>		throw new IllegalArgumentException( "SZIndex.indexValues: null values." );
>	}
>	
>//	===========================================================================
>//	FilenameFilter method(s)
>//	---------------------------------------------------------------------------
>
>	private static final class Filter extends Object implements FilenameFilter
>	{
>	
>		private Filter()
>		{
>			super();
>		}
>
>		public boolean accept(File aDirectory, String aName)
>		{
>			if ( aName.endsWith( SZIndex.Extension ) == true )
>			{
>				File	aFile = new File( aDirectory, aName );
>										
>				if ( aFile.isDirectory() == true )
>				{
>					return true;
>				}
>			}
>			
>			return false;
>		}
>	}
>	
>}
>
>
>------------------------------------------------------------------------
>

Re: FileNotFoundException: code example

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Hello,

I'll put my comments inline...

--- petite_abeille <pe...@mac.com> wrote:
> Hello again,
> 
> attached is the source code of the only class interacting directly
> with Lucene in my app. Sorry for not providing a complete test case as
> it's hard for me to come up with something self contained. Maybe there
> is something that's obviously wrong in what I'm doing.
> 
> Thanks for any help.
> 
> PA
> 
> //
> //	===========================================================================
> //
> //	Title:		SZIndex.java
> //	Description:	[Description]
> //	Author:		Raphael Szwarc <ra...@hotmail.com>
> //	Creation Date:	Wed Sep 12 2001
> //	Legal:		Copyright (C) 2001 Raphael Szwarc. All Rights Reserved.
> //
> //	---------------------------------------------------------------------------
> //
> 
> package alt.dev.szobject;
> 
> import com.lucene.store.Directory;
> import com.lucene.store.FSDirectory;
> import com.lucene.store.RAMDirectory;
> import com.lucene.document.Field;
> import com.lucene.document.DateField;
> import com.lucene.document.Document;
> import com.lucene.analysis.Analyzer;
> import com.lucene.analysis.standard.StandardAnalyzer;
> import com.lucene.index.IndexWriter;
> import com.lucene.index.IndexReader;
> import com.lucene.index.Term;
> import com.lucene.search.IndexSearcher;
> import com.lucene.search.MultiSearcher;
> import com.lucene.search.Searcher;
> import com.lucene.search.Query;
> import com.lucene.search.Hits;
> 
> import java.io.FilenameFilter;
> import java.io.File;
> import java.io.IOException;
> 
> import java.util.Map;
> import java.util.Collection;
> import java.util.Date;
> import java.util.Iterator;
> 
> import alt.dev.szfoundation.SZHexCoder;
> import alt.dev.szfoundation.SZDate;
> import alt.dev.szfoundation.SZSystem;
> import alt.dev.szfoundation.SZLog;
> 
> final class SZIndex extends Object
> {
> 
> //	===========================================================================
> //	Constant(s)
> //	---------------------------------------------------------------------------
> 
> 	private static final String	Extension = ".index";
> 
> //	===========================================================================
> //	Class variable(s)
> //	---------------------------------------------------------------------------
> 
> 	private static final Filter	_filter = new Filter();
> 
> // ===========================================================================
> //	Instance variable(s)
> // ---------------------------------------------------------------------------
> 
> 	private String			_path = null;
> 	private transient File		_directory = null;
> 	private transient Directory	_indexDirectory = null;
> 	private transient IndexWriter	_writer = null;
> 	
> 	private transient IndexReader	_reader = null;
> 	private transient Searcher	_searcher = null;
> 
> 	private transient Directory	_ramDirectory = null;
> 	private transient IndexWriter	_ramWriter = null;
> 	private transient int		_counter = 0;
> 
> // ===========================================================================
> //	Constructor method(s)
> // ---------------------------------------------------------------------------
> 
> 	private SZIndex()
> 	{
> 		super();
> 	}
> 
> // ===========================================================================
> //	Class method(s)
> // ---------------------------------------------------------------------------
> 
> 	static FilenameFilter filter()
> 	{
> 		return _filter;
> 	}
> 	
> 	static String stringByDeletingPathExtension(String aPath)
> 	{
> 		if ( aPath != null )
> 		{
> 			int	anIndex = aPath.lastIndexOf( SZIndex.Extension );
> 			
> 			if ( anIndex > 0 )
> 			{
> 				aPath = aPath.substring( 0, anIndex );
> 			}
> 			
> 			return aPath;
> 		}
> 		
> 		throw new IllegalArgumentException(
> "SZIndex.stringByDeletingPathExtension: null path." );
> 	}
> 
> 	static SZIndex indexWithNameInDirectory(String aName, File
> aDirectory)
> 	{
> 		if ( aName != null )
> 		{
> 			if ( aDirectory != null )
> 			{
> 				String	anEncodedName = SZHexCoder.encode( aName.getBytes() );
> 				//String	aPath = aDirectory.getPath() + File.separator + anEncodedName + SZIndex.Extension + File.separator;
> 				String	aPath = aDirectory.getPath() + File.separator + aName + SZIndex.Extension + File.separator;
> 				SZIndex	anIndex = new SZIndex();
> 			
> 				anIndex.setPath( aPath );
> 				
> 				return anIndex;
> 			}
> 
> 			throw new IllegalArgumentException(
> "SZIndex.indexWithNameInDirectory: null directory." );
> 		}
> 		
> 		throw new IllegalArgumentException(
> "SZIndex.indexWithNameInDirectory: null name." );
> 	}
> 
> 	static String stringForValue(Object aValue )
> 	{
> 		if ( aValue != null )
> 		{
> 			String	aStringValue = null;
> 						
> 			if ( ( aValue instanceof SZDate ) == true )
> 			{
> 				aValue = ( (SZDate) aValue ).internalDate();
> 			}
> 			else
> 			if ( ( aValue instanceof SZPersistent ) == true )
> 			{
> 				aValue = ( (SZPersistent) aValue ).id();
> 			}
> 						
> 			if ( ( aValue instanceof Date ) == true )
> 			{
> 				aStringValue = DateField.dateToString( (Date) aValue );
> 			}
> 			else
> 			if ( ( aValue instanceof SZID ) == true )
> 			{
> 				aStringValue = ( (SZID) aValue ).uuidString();
> 			}
> 			else
> 			{
> 				aStringValue = aValue.toString();
> 			}
> 			
> 			return aStringValue;
> 		}
> 
> 		throw new IllegalArgumentException( "SZIndex.stringForValue: null value." );
> 	}
> 	
> // ===========================================================================
> //	Instance method(s)
> // ---------------------------------------------------------------------------
> 
> 	private String path()
> 	{
> 		return _path;
> 	}
> 	
> 	private void setPath(String aValue)
> 	{
> 		_path = aValue;
> 	}
> 	
> 	private File directory()
> 	{
> 		if ( _directory == null )
> 		{
> 			String	aPath = this.path();
> 			
> 			if ( aPath != null )
> 			{
> 				_directory = new File( aPath );
> 			
> 				if ( _directory.exists() == false )
> 				{
> 					_directory.mkdirs();
> 				}
> 			}
> 			else
> 			{
> 				throw new IllegalStateException( "SZIndex.directory: null path."
> );
> 			}
> 		}
> 		
> 		return _directory;
> 	}
> 	
> 	private boolean shouldCreate()
> 	{
> 		File		aDirectory = this.directory();
> 		String[]	aList = aDirectory.list();
> 		
> 		if ( ( aList == null ) || ( aList.length == 0 ) )
> 		{
> 			return true;
> 		}
> 		
> 		return false;
> 	}
> 	
> 	boolean exists()

OG: one may think this method checks for existence of an index, but it
only checks for existence of a directory.  Perhaps directoryExists()
would be a better name.

> 	{
> 		File	aFile = this.directory();
> 		
> 		if ( aFile != null )
> 		{
> 			return aFile.exists();
> 		}
> 		
> 		return false;
> 	}
> 	
> 	private SZDate lastModifiedDate()
> 	{
> 		if ( this.exists() == true )
> 		{
> 			File	aDirectory = this.directory();
> 			Date	aDate = new Date( aDirectory.lastModified() );
> 			SZDate	aCalendarDate = SZDate.dateWithDate( aDate );
> 			
> 			return aCalendarDate;
> 		}
> 		
> 		return null;
> 	}
> 	
> 	public int hashCode()
> 	{
> 		return this.path().hashCode();
> 	}
> 	
> 	public boolean equals(Object anObject)
> 	{
> 		if ( this == anObject )
> 		{
> 			return true;
> 		}
> 		
> 		return this.path().equals( ( (SZIndex) anObject ).path() );
> 	}
> 
> 	protected void finalize() throws Throwable
> 	{
> 		if ( _writer != null )
> 		{
> 			this.optimize();

OG: perhaps you want to close some stuff here, although I'm not sure
about doing that in finalize()...

> 		}
> 					
> 		super.finalize();
> 	}
> 	
> // ===========================================================================
> //	Index method(s)
> // ---------------------------------------------------------------------------
> 
> 	synchronized void optimize()
> 	{
> 		try
> 		{
> 			this.flush();
> 			
> 			if ( _writer != null )
> 			{
> 				_writer.optimize();

OG: optimize can throw IOException.  In that case your close() will not
get executed.  Maybe you can use a finally block.

> 				_writer.close();
> 			
> 			}
> 			
> 			_writer = null;
> 			_indexDirectory = null;
> 		}
> 		catch(Exception anException)
> 		{
> 			anException.printStackTrace();
> 		
> 			SZLog.warning( anException );
> 			
> 			_writer = null;
> 			_indexDirectory = null;

OG: duplicate assignments, suitable for finally block.

> 			SZSystem.gc();
> 		}
> 	}
> 	
> 				
> 	private Directory indexDirectory() throws IOException
> 	{
> 		if ( _indexDirectory == null )
> 		{
> 			File	aFile = this.directory();
> 			boolean	shouldCreate = this.shouldCreate();
> 			
> 			//_indexDirectory = FSDirectory.getDirectory( aFile, shouldCreate );
> 			_indexDirectory = new FSDirectory( aFile, shouldCreate );
> 		}
> 		
> 		return _indexDirectory;
> 	}
> 
> 	private IndexWriter writer() throws IOException
> 	{
> 		if ( _writer == null )
> 		{
> 			Directory	aDirectory = this.indexDirectory();
> 			Analyzer	anAnalyzer = new StandardAnalyzer();
> 			boolean		shouldCreate = this.shouldCreate();
> 			
> 			_writer = new IndexWriter( aDirectory, anAnalyzer, shouldCreate );
> 			_writer.mergeFactor = 2;
> 		}
> 
> 		return _writer;
> 	}
> 	
> 	private IndexReader reader() throws IOException
> 	{
> 		if ( _reader == null )
> 		{
> 			System.gc();
> 			
> 			_reader = IndexReader.open( this.indexDirectory() );
> 		}

OG: you are opening an IndexReader, but I don't think I saw it being
closed anywhere.

> 		return _reader;
> 	}
> 	
> 	private Searcher searcher() throws IOException
> 	{
> 		if ( _searcher == null )
> 		{
> 			System.gc();
> 
> 			_searcher = new IndexSearcher( this.reader() );
> 		}
> 		
> 		if ( _ramDirectory != null )
> 		{
> 			Searcher	aRamSearcher = new IndexSearcher( IndexReader.open(
> _ramDirectory ) );

OG: another open...

> 			return new MultiSearcher( new Searcher[] { aRamSearcher, _searcher
> } );
> 		}
> 	
> 		return _searcher;
> 	}
> 
> // ===========================================================================
> //	RAM method(s)
> // ---------------------------------------------------------------------------
> 
> 	private Directory ramDirectory() throws IOException
> 	{
> 		if ( _ramDirectory == null )
> 		{
> 			_ramDirectory = new RAMDirectory();
> 		}
> 		
> 		return _ramDirectory;
> 	}
> 
> 	private IndexWriter ramWriter() throws IOException
> 	{
> 		if ( _ramWriter == null )
> 		{
> 			Directory	aDirectory = this.ramDirectory();
> 			Analyzer	anAnalyzer = new StandardAnalyzer();
> 			
> 			_ramWriter = new IndexWriter( aDirectory, anAnalyzer, true );
> 		}
> 
> 		return _ramWriter;
> 	}
> 	
> 	private void flush() throws IOException
> 	{
> 		if ( ( _ramDirectory != null ) && 
> 			( _ramDirectory.list() != null ) && 
> 			( _ramDirectory.list().length > 0 ) && 
> 			( _ramWriter != null ) )
> 		{
> 			_ramWriter.optimize();
> 			_ramWriter.close();
> 		
> 			this.writer().addIndexes( new Directory[] { _ramDirectory } );
> 			
> 			_ramWriter = null;
> 			_ramDirectory = null;
> 			
> 			_reader = null;
> 			_searcher = null;

OG: both IndexReader and IndexSearcher have a close() method.  Have you
tried calling them here?  Does it help?  You can still assign nulls
later to help GC.

> 		}
> 	}
> 
> // ===========================================================================
> //	Indexing method(s)
> // ---------------------------------------------------------------------------
> 
> 	synchronized Hits search(Query aQuery) throws IOException
> 	{
> 		if ( aQuery != null )
> 		{
> 			if ( this.shouldCreate() == false )
> 			{
> 				return this.searcher().search( aQuery );
> 			}
> 			
> 			return null;
> 		}
> 		
> 		throw new IllegalArgumentException( "SZIndex.search: null query."
> );
> 	}
> 
> 	synchronized void deleteIndexWithID(SZID anID) throws IOException
> 	{
> 		if ( anID != null )
> 		{
> 			if ( this.shouldCreate() == false )
> 			{
> 				String		aValue = SZIndex.stringForValue( anID );
> 				Term		aTerm = new Term( SZDescription.IDKey, aValue );
> 				IndexReader	aReader = this.reader();
> 					
> 				aReader.delete( aTerm );
> 			}
> 
> 			return;
> 		}
> 		
> 		throw new IllegalArgumentException( "SZIndex.deleteIndexWithID: null id." );
> 	}
> 
> 	synchronized void indexValuesWithID(Map someValues, SZID anID)
> throws IOException
> 	{
> 		if ( someValues != null )
> 		{
> 			if ( anID != null )
> 			{
> 				Class		aClass = anID.entity();
> 				SZDescription	aDescription = SZDescription.descriptionForClass(
> aClass );
> 				Collection	someUniqueKeys = aDescription.uniqueKeys();
> 				String		anIdentifier = SZIndex.stringForValue( anID );
> 				Field		anIdentifierField = Field.Keyword( SZDescription.IDKey,
> anIdentifier );
> 				String		aClassName = anID.entity().getName();
> 				Field		aClassField = Field.Keyword( SZDescription.ClassKey,
> aClassName );
> 				Document	aDocument = new Document();
> 				IndexWriter	aWriter = this.ramWriter();
> 				
> 				aDocument.add( anIdentifierField );
> 				aDocument.add( aClassField );
> 				
> 				for( Iterator anIterator = someValues.keySet().iterator();
> anIterator.hasNext(); )
> 				{
> 					Object	aKey = anIterator.next();
> 					Object	aValue = someValues.get( aKey );
> 					String	aKeyName = aKey.toString();
> 					String	aStringValue = SZIndex.stringForValue( aValue );
> 					Field	aField = null;
> 
> 					if ( ( ( aValue instanceof SZPersistent ) == true ) ||
> 						( ( someUniqueKeys != null ) && ( someUniqueKeys.contains(
> aKeyName ) == true ) ) )
> 					{
> 						aField = new Field( aKeyName, aStringValue, false, true, false)
> ;
> 					}
> 					else
> 					{
> 						aField = Field.UnStored( aKeyName, aStringValue );
> 					}
> 
> 					aDocument.add( aField );
> 				}
> 				
> 				aWriter.addDocument( aDocument );
> 				aWriter.optimize();
> 				
> 				_counter += 1;
> 				
> 				if ( _counter > 100 )
> 				{
> 					this.flush();
> 					_counter = 0;
> 				}
> 				
> 				return;
> 			}
> 
> 			throw new IllegalArgumentException( "SZIndex.indexValues: null id." );
> 		}
> 		
> 		throw new IllegalArgumentException( "SZIndex.indexValues: null values." );
> 	}
> 	
> // ===========================================================================
> //	FilenameFilter method(s)
> // ---------------------------------------------------------------------------
> 
> 	private static final class Filter extends Object implements
> FilenameFilter
> 	{
> 	
> 		private Filter()
> 		{
> 			super();
> 		}
> 
> 		public boolean accept(File aDirectory, String aName)
> 		{
> 			if ( aName.endsWith( SZIndex.Extension ) == true )
> 			{
> 				File	aFile = new File( aDirectory, aName );
> 										
> 				if ( aFile.isDirectory() == true )
> 				{
> 					return true;
> 				}
> 			}
> 			
> 			return false;
> 		}
> 	}
> 	
> }


That's all I can see.
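
To make the finally-block suggestion for optimize() concrete, here is a
minimal sketch. It is not the posted class and not Lucene code: WriterLike
is a hypothetical stand-in for the two IndexWriter calls involved, and
OptimizeSketch, optimizeAndClose() and hasWriter() are names made up for
illustration, so that the example compiles without Lucene.

```java
import java.io.IOException;

// Hypothetical stand-in for the two IndexWriter calls used in optimize();
// it exists only so this sketch compiles without Lucene on the classpath.
interface WriterLike {
    void optimize() throws IOException;
    void close() throws IOException;
}

final class OptimizeSketch {
    private WriterLike writer;

    OptimizeSketch(WriterLike writer) {
        this.writer = writer;
    }

    // Returns true when optimize() succeeded. close() is attempted and the
    // field is cleared on every path, including when optimize() throws.
    synchronized boolean optimizeAndClose() {
        if (writer == null) {
            return true; // nothing to do
        }
        boolean optimized = false;
        try {
            writer.optimize();
            optimized = true;
        } catch (IOException e) {
            System.err.println("optimize failed: " + e.getMessage());
        } finally {
            try {
                writer.close();
            } catch (IOException e) {
                System.err.println("close failed: " + e.getMessage());
            }
            writer = null; // the cleanup assignment now lives in one place
        }
        return optimized;
    }

    boolean hasWriter() {
        return writer != null;
    }
}
```

The point of the shape is that a failed optimize() no longer skips close(),
and the duplicated "= null" assignments collapse into the finally block.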

Otis



FileNotFoundException: code example

Posted by petite_abeille <pe...@mac.com>.

Hello again,

Attached is the source code of the only class interacting directly with 
Lucene in my app. Sorry for not providing a complete test case, as it's 
hard for me to come up with something self-contained. Maybe there is 
something that's obviously wrong in what I'm doing.

Thanks for any help.

PA

FileNotFoundException: a typical stack trace

Posted by petite_abeille <pe...@mac.com>.

Just to follow up on this, here is a typical stack trace for 
FileNotFoundException:

04/28 09:17:55 (Warning) SZIndexer.indexObjectWithValues: 
java.io.FileNotFoundException: _2o.prx (Too many open files)
java.io.FileNotFoundException: _2o.prx (Too many open files)
         at java.io.RandomAccessFile.open(Native Method)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:143)
         at com.lucene.store.FSInputStream.<init>(FSDirectory.java:161)
         at com.lucene.store.FSDirectory.openFile(FSDirectory.java:145)
         at 
com.lucene.index.SegmentReader.openProxStream(SegmentReader.java:178)
         at 
com.lucene.index.SegmentTermPositions.open(SegmentTermPositions.java:39)
         at 
com.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:177)
         at 
com.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:157)
         at 
com.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:138)
         at 
com.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:101)
         at com.lucene.index.SegmentMerger.merge(SegmentMerger.java:54)
         at 
com.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:267)
         at 
com.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:241)
         at com.lucene.index.IndexWriter.optimize(IndexWriter.java:163)
         at com.lucene.index.IndexWriter.addIndexes(IndexWriter.java:178)
         at alt.dev.szobject.SZIndex.flush(SZIndex.java:399)

And here is what I'm doing in the flush method:

	private void flush() throws IOException
	{
		if ( ( _ramDirectory != null ) &&
			( _ramDirectory.list() != null ) &&
			( _ramDirectory.list().length > 0 ) &&
			( _ramWriter != null ) )
		{
			_ramWriter.optimize();
			_ramWriter.close();
		
			this.writer().addIndexes( new Directory[] { _ramDirectory } );
			
			_ramWriter = null;
			_ramDirectory = null;
			
			_reader = null;
			_searcher = null;
		}
	}
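
One variation on this method that may be worth trying is to close the cached
reader and searcher before dropping the references, instead of only setting
them to null and waiting for the garbage collector to release the file
handles. The sketch below shows the pattern only. It uses java.io.Closeable
as a stand-in for IndexReader and Searcher, and HandleHolder, release() and
isReleased() are hypothetical names, not part of Lucene or of SZIndex.

```java
import java.io.Closeable;
import java.io.IOException;

final class HandleHolder {
    private Closeable reader;   // stand-in for the cached IndexReader
    private Closeable searcher; // stand-in for the cached Searcher

    HandleHolder(Closeable reader, Closeable searcher) {
        this.reader = reader;
        this.searcher = searcher;
    }

    // Closes both handles, then clears the fields. Both close() calls are
    // always attempted; the result says whether they both succeeded.
    synchronized boolean release() {
        boolean searcherClosed = closeAndReport(searcher);
        boolean readerClosed = closeAndReport(reader);
        searcher = null;
        reader = null;
        return searcherClosed && readerClosed;
    }

    private static boolean closeAndReport(Closeable handle) {
        if (handle == null) {
            return true; // already released
        }
        try {
            handle.close();
            return true;
        } catch (IOException e) {
            System.err.println("close failed: " + e.getMessage());
            return false;
        }
    }

    boolean isReleased() {
        return reader == null && searcher == null;
    }
}
```

Calling release() a second time is harmless, since the fields are already
null by then. Note that this only covers handles that are kept in fields; a
reader that is opened and never stored anywhere cannot be closed this way.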


Any insight more than welcome.

Thanks.

PA



Re: rc4 and FileNotFoundException: an update

Posted by petite_abeille <pe...@mac.com>.

Hi Steven,

> Sounds like a pretty nasty situation.

It is...

> This makes sense - any effort to solve the problem will first involve 
> isolating the
> bug, and that's a task you're best suited for, since you know your
> system best.

OK... From what I understand, this situation arises depending on my 
"usage pattern" of Lucene. For example, if I use it in "batch" mode (e.g., 
through some tools that stress test my app by loading a couple of million 
objects), everything works perfectly fine. However, when running my 
app in a more "interactive" mode (e.g., with user interaction, object 
indexing, writing and searching at the same time) I run into this 
exception very quickly. The problem seems to have something to do with 
Searcher and/or how I'm using it. I need to investigate in that 
direction... Also, what is the "magic" formula for keeping 
RandomAccessFile usage in Lucene to a strict minimum? Is 
IndexWriter.mergeFactor the only parameter I can play with, or am I 
missing some other configuration that might help?

> Then post your code and ask if some of the more lucene-knowledgeable 
> can take a look.

Unfortunately, it's not that straightforward as I'm using Lucene as part 
of some sort of custom built oodbms and this behavior seems to be usage 
related... You can check the app at http://homepage.mac.com/zoe_info/ if 
that helps.

>  Re: index integrity, I agree that it would be really, really nice to 
> have some sort of "sanity" check.

I'm not familiar with Lucene internals, but is it conceivable to have 
some sort of checksum per document and/or index that will help to 
identify "corrupted" data?
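
To illustrate the kind of check I have in mind, here is a rough sketch that 
works from outside Lucene: record a CRC32 per file in the index directory 
after the writer is closed, and compare later. This is not a Lucene feature, 
and IndexChecksums, snapshot(), matches() and demo() are names invented for 
the example, as is the file name "_1.frq" used in the demo. It assumes a 
Java version with java.nio.file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

final class IndexChecksums {

    // CRC32 of one file, computed by streaming it through a CheckedInputStream.
    static long crc32(Path file) {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = new CheckedInputStream(Files.newInputStream(file), crc)) {
            while (in.read(buffer) != -1) {
                // reading is enough: the stream updates the checksum as it goes
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return crc.getValue();
    }

    // File name -> CRC32 for every regular file directly inside the directory.
    static Map<String, Long> snapshot(Path directory) {
        Map<String, Long> sums = new TreeMap<String, Long>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    sums.put(file.getFileName().toString(), crc32(file));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sums;
    }

    // True when the directory holds the same files with the same checksums.
    static boolean matches(Path directory, Map<String, Long> expected) {
        return snapshot(directory).equals(expected);
    }

    // Self-check on a temporary directory: an untouched file matches its
    // recorded checksum, a modified one does not.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("index-checksum-demo");
            Path segment = dir.resolve("_1.frq"); // hypothetical file name
            Files.write(segment, "postings".getBytes(StandardCharsets.UTF_8));
            Map<String, Long> recorded = snapshot(dir);
            boolean intactBefore = matches(dir, recorded);
            Files.write(segment, "postingz".getBytes(StandardCharsets.UTF_8));
            boolean intactAfter = matches(dir, recorded);
            Files.delete(segment);
            Files.delete(dir);
            return intactBefore && !intactAfter;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("modification detected: " + demo());
    }
}
```

The obvious limitation is that the snapshot has to be refreshed after every 
legitimate write, and that it only tells you that something changed, not 
which document is affected.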

> As for repairing an index, I think that's working sort of against the 
> grain of Lucene.

:-(

> In your case, it sounds like rebuilding the index is important, because 
> you're using Lucene as a data store.

Well, not exactly. I'm just using Lucene to index my data store (with a 
bunch of Field.Keyword and Field.UnStored). The actual object storage is 
handled externally to Lucene. However, I need a consistent index as I'm 
using it as part of my object tree.

> Maybe it'd be a better idea to figure out some way to use Lucene as the 
> indexing
> technology in a data store, the way traditional RDBMSes use indexes,
> for speeding access.

I agree. It's more or less how I'm using it. Nevertheless, for the sake 
of reliability, I need to have some level of confidence that the 
underlying indexes are "sane"... And a way to correct the problem if 
they are not. In my case, I will happily trade speed for reliability as 
I cannot afford to have inconsistent indexes. A corrupted index is of 
no use to me.

> Or possibly you should look at Xindice (http://xml.apache.org/xindice/) 
> which is an XML database.

I'm familiar with Xindice and other related toolboxes. However, I have 
some "peculiar" requirements, so I decided to build my own custom 
persistence layer. Works fine so far. Just this very annoying exception. 
Also, this situation seems to arise on UNIX systems only, as I have never 
heard anybody complaining about it on any Windows-type platform... Very 
odd in any case...

Thanks for your help in any case.

PA



Re: rc4 and FileNotFoundException: an update

Posted by "Steven J. Owens" <pu...@darksleep.com>.

On Fri, Apr 26, 2002 at 07:05:23PM +0200, petite_abeille wrote:
> I guess it's really not my day...
> [...]
> Well, it's pretty ugly. Whatever I'm doing with Lucene in the previous 
> package (com.lucene) is magnified many folds in rc4. After processing a 
> paltry 16 objects I got:
> 
> "SZFinder.findObjectsWithSpecificationInStore: 
> java.io.FileNotFoundException: _2.f14 (Too many open files)"

     Sounds like a pretty nasty situation.  

     One suggestion I have for you is that Doug is usually very
helpful with problems like this IF you can first narrow down what is
happening to the point that you can post a clear, specific, isolated
test that consistently causes the problem to happen.  This makes sense
- any effort to solve the problem will first involve isolating the
bug, and that's a task you're best suited for, since you know your
system best.

     So maybe your best approach would be to take a copy of your
system as above, and start gradually stripping out stuff, testing
between each run, until you have most of the application-specific
stuff removed, but the problem is still reoccurring consistently.
Then post your code and ask if some of the more Lucene-knowledgeable
folks can take a look.

     Re: index integrity, I agree that it would be really, really nice
to have some sort of "sanity" check.  I have yet to actually get into
the internals of the index, but I'd guess that there must be some sort
of at least superficial check, maybe some sort of format check.  

     If I were going to kludge something together, the first approach
I'd take would be to just open the index and roll through all of the
Documents in it, accessing all of the fields (or maybe just a few main
fields per Document).  I'm not sure what I'd *do* with the field
values (printing them out to the screen might take a while), other
than perhaps checking for nulls.  But I suspect that if the code gets
through that without causing an exception or getting null values,
then at least the index's internal format is intact.  Maybe the test
code could save the number of Lucene Document objects in the index in
between checks (and, of course, update this number when you add or
remove documents), and make sure it still has the right number of
documents.
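
     A rough sketch of that kludge, to show the shape of it.  I have not
tied it to Lucene: DocSource is a hypothetical read-only view whose three
methods you would back with an index reader, and IndexSanityCheck and
check() are names invented for the example.  It walks every live
document, checks a few required fields for nulls, and compares the live
count against the number saved from the previous check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical read-only view of an index; an adapter around a real index
// reader would implement these three methods.
interface DocSource {
    int maxDoc();                          // one more than the highest document number
    boolean isDeleted(int n);              // true if document n has been deleted
    Map<String, String> document(int n);   // stored field name -> value
}

final class IndexSanityCheck {

    // Returns a list of human-readable problems; an empty list means the
    // walk completed, every required field was present, and the count matched.
    static List<String> check(DocSource source, List<String> requiredFields,
                              int expectedLiveDocs) {
        List<String> problems = new ArrayList<String>();
        int live = 0;

        for (int n = 0; n < source.maxDoc(); n++) {
            if (source.isDeleted(n)) {
                continue; // deleted slots are expected, not an error
            }
            live++;

            Map<String, String> doc;
            try {
                doc = source.document(n);
            } catch (RuntimeException e) {
                problems.add("doc " + n + ": " + e);
                continue;
            }
            if (doc == null) {
                problems.add("doc " + n + ": null document");
                continue;
            }
            for (String field : requiredFields) {
                if (doc.get(field) == null) {
                    problems.add("doc " + n + ": missing field " + field);
                }
            }
        }

        if (live != expectedLiveDocs) {
            problems.add("expected " + expectedLiveDocs + " documents, found " + live);
        }
        return problems;
    }
}
```

     This only proves the index can be read end to end; it says nothing
about whether the postings still agree with the original data.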

     As for repairing an index, I think that's working sort of against
the grain of Lucene.  In your case, it sounds like rebuilding the
index is important, because you're using Lucene as a data store.  I
have some similar issues myself in some things I want to build (I end
up wanting both a data store and a search index; ultimately I've ended
up choosing to have a separate data store for the extra data).  But
Lucene is a search index, meant to be used more in a cache-like style,
so there's an underlying assumption that the original data is always
around to reindex.  Thus, repairing an index is less important, since
it is assumed you can always rebuild it.  

     I don't know much of the theories behind data store systems.  It
occurs to me that using Lucene as a data store, you'll always be
working against the grain, always swimming upstream.  Maybe it'd be a
better idea to figure out some way to use Lucene as the indexing
technology in a data store, the way traditional RDBMSes use indexes,
for speeding access.  

     Or possibly you should look at Xindice (http://xml.apache.org/xindice/)
which is an XML database.  You might find it easier to adapt that to your
needs.  I'm kind of curious as to how fast Xindice's XPath execution is, and
what their indexing is based on - there might be a use for Lucene there.

Steven J. Owens
puff@darksleep.com


rc4 and FileNotFoundException: an update

Posted by petite_abeille <pe...@mac.com>.

Hello again,

I guess it's really not my day...

Just to make sure I'm not hallucinating too much, I downloaded the latest 
and greatest: rc4. Changed all the package names to org.apache. Updated 
a method here and there to reflect the API changes. And ran my little 
app. I would like to emphasize that, except for updating to the latest 
Lucene release, nothing else has changed.

Well, it's pretty ugly. Whatever I'm doing with Lucene in the previous 
package (com.lucene) is magnified manyfold in rc4. After processing a 
paltry 16 objects I got:

"SZFinder.findObjectsWithSpecificationInStore: 
java.io.FileNotFoundException: _2.f14 (Too many open files)"

At least in the previous version, I would see that only after a couple 
of thousand objects...

So, it seems, that there is something really rotten in the kingdom of 
Denmark...

Any help much appreciated.

Thanks.



Re: FileNotFoundException: Too many open files

Posted by petite_abeille <pe...@mac.com>.

Hi Otis,

> I looked only at your application's screenshots and based on that my
> guess is that you have a fairly high number of index fields, and if I
> recall correctly that can cause the above error.

Well, I used to have an index per class. And I have around a dozen 
classes that get indexed. When trying to switch to the latest rc (with 
the exact same code base), I ran into so many problems with the now 
infamous "FileNotFoundException" that I consolidated everything in one 
index per object store. And switched back to the com.lucene package that 
-as far as I can personally tell- is *much* more stable. I do not store 
the content of the objects in the index, just some uuid as Field.Keyword 
and other attributes as Field.UnStored. On average, there seem to be 
less than one hundred Lucene files per index.

> This was mentioned on the list once, too.
> I suggested using a shutdown hook in Runtime package, but then somebody
> responded with a drawback of that approach.

I have this one under control... Thanks.

> Not that I know.  If locking is getting in the way maybe you are not
> using Lucene properly.  I haven't downloaded your application yet, so I
> haven't had the chance to peek at the source.

Please feel free to do so... ;-)

> Yes, I believe so - I never encountered any problems with that.

Great. That was my assumption all along...

R.



Re: FileNotFoundException: Too many open files

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Hello,

> I'm running into this exception quiet often while using Lucene (the
> situation is so bad with the latest rc, that I had to revert to the
> last com.lucene package). I'm sure I have my fair share of bugs in my
> app, but nonetheless, how can I "control" Lucene usage of
> RandomAccessFile? The indexes are optimized and I try to keep a close
> look at how many IndexWriter/Reader exists at any point in time...
> Nevertheless, I run into that exception much too often :-( Any help
> appreciated!
>
> "04/26 00:07:11 (Warning) Finder.findObjectsWithSpecificationInStore:
> java.io.FileNotFoundException:  _la.f9 (Too many open files)"

I looked only at your application's screenshots and based on that my
guess is that you have a fairly high number of index fields, and if I
recall correctly that can cause the above error.
This was mentioned on one of the lists fairly recently, I believe.
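
A back-of-the-envelope sketch of why the field count matters.  The numbers
rest on assumptions, not on anything measured in your app: as far as I
recall, each segment in this index format has about seven fixed files
(.fnm, .fdx, .fdt, .tii, .tis, .frq, .prx) plus one norms file (.fN) per
indexed field, and the ".f9" in your error message suggests field numbers
run at least up to 9.  OpenFileEstimate and its methods are made-up names,
and the result is an upper bound on what a reader might hold open, not a
prediction.

```java
final class OpenFileEstimate {

    // Assumption: seven fixed files per segment in this index format
    // (.fnm, .fdx, .fdt, .tii, .tis, .frq, .prx).
    static final int FIXED_FILES_PER_SEGMENT = 7;

    // Fixed files plus one norms file (.fN) per indexed field.
    static int filesPerSegment(int indexedFields) {
        return FIXED_FILES_PER_SEGMENT + indexedFields;
    }

    // Upper bound on files behind one index with the given segment count.
    static int filesForIndex(int segments, int indexedFields) {
        return segments * filesPerSegment(indexedFields);
    }

    public static void main(String[] args) {
        System.out.println(filesPerSegment(10));   // prints 17
        System.out.println(filesForIndex(12, 10)); // prints 204
    }
}
```

With a dozen unmerged segments and ten indexed fields that is already
around 200 files for a single index, which is in the neighbourhood of
common default per-process limits once several readers are open.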

> Also, on a somewhat related note, how do I "shut down" Lucene
> properly. 
> Eg, do I need to do anything with the IndexWriter and so on?

This was mentioned on the list once, too.
I suggested using a shutdown hook in Runtime package, but then somebody
responded with a drawback of that approach.

> Last, but not least, is there a way to turn of the file locking in
> the latest rc as it's really getting in the way :-(

Not that I know.  If locking is getting in the way maybe you are not
using Lucene properly.  I haven't downloaded your application yet, so I
haven't had the chance to peek at the source.

> Finally, I just wanted to make sure: Lucene is fully multi-threaded 
> right? I can do search *and* write concurrently in different threads
> at the same time on the same index?

Yes, I believe so - I never encountered any problems with that.

> BTW, should I post this kind of question to user or dev?

I suggest -user until/unless we determine that there is something in
Lucene that we can fix or improve.

Otis

