Posted to dev@lucene.apache.org by Kristian Rickert <Kr...@halo.com> on 2001/12/07 02:59:19 UTC
RAMDirectory bug?
I'm getting the classic ArrayIndexOutOfBoundsException bug when performing a
search in my RAMDirectory.
I'm going to explain this one in full detail, so forgive me if this is a
waste of your time. I really do think this is a resurfaced bug, because when
my RAMDirectory has fewer than ~2000 documents, searching works flawlessly.
So here goes:
I used release candidate 2 and the nightly build, and both yield the same
result.
I am having the same problem with the RAMDirectory search as mentioned in
message # ... Here is a self-contained example that causes the same error.
I'll comment along the way so I'm not one of those asses who says,
"here's my code.. fix it!"
* First of all, I am using this application to index objects that just
contain simple values. I already use a RB tree to retrieve the information,
which is plenty quick for what I need.
* So I add these fields as indexed but not stored (Field.UnStored).
* Furthermore, I do not need to index the primary key, only store it
(Field.UnIndexed), so my "VendorDocument" file goes as follows:
public static Document Document(FourgenVendorInfo fvi) {
// make a new, empty document
Document doc = new Document();
doc.add(Field.UnStored("Address1", "" + fvi.getAddress1()));
doc.add(Field.UnStored("Address2", "" + fvi.getAddress1()));
doc.add(Field.UnStored("BusinessName", "" + fvi.getBus_name()));
doc.add(Field.UnStored("City", "" + fvi.getCity()));
doc.add(Field.UnStored("State", "" + fvi.getState()));
doc.add(Field.UnStored("CountryCode", "" + fvi.getCountry_code()));
doc.add(Field.UnStored("Fax", "" + fvi.getFax_phone()));
doc.add(Field.UnStored("Asi", "" + fvi.getHal_asi_no()));
doc.add(Field.UnStored("Phone", "" + fvi.getPhone()));
doc.add(Field.UnIndexed("VendorCode", "" + fvi.getVend_code()));
doc.add(Field.UnStored("Zip", "" + fvi.getZip()));
doc.add(Field.UnStored("PlusFour", "" +
AddressParsers.parsePlusFour(fvi.getZip())));
return doc;
}
* I have about 8000 of these documents to add to the index.
First, I create RAMStorage:
RAMDirectory RAMStorage = new RAMDirectory();
//RAMStorage.createFile("Vendors");
IndexWriter indexer = null;
try {
I create the IndexWriter with the RAMStorage, using my vendor analyzer, which
is a simplified form of SimpleAnalyzer (it doesn't ignore digits; the code
differs from SimpleAnalyzer by literally 3 letters):
indexer = new IndexWriter(RAMStorage, new VendorAnalyzer(),
true);
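To make the analyzer difference concrete, here is a minimal sketch of the tokenizing rule described above: lowercase everything and split on any character that is not a letter or a digit (SimpleAnalyzer splits on non-letters, so it drops digits). This is plain Java and not the actual VendorAnalyzer source; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the tokenizing rule described above; not Lucene code.
class LetterOrDigitTokenizerSketch {

    // Splits text into lowercase tokens made of letters and digits;
    // every other character acts as a separator.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Digits survive as tokens, which matters for fields like Zip and Phone.
        System.out.println(tokenize("1200 Main St., Saint Paul MN 55101-1234"));
    }
}
```

With a rule like this, a zip code or phone number becomes searchable terms instead of being thrown away.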
I add the documents to the indexer, optimize it and close it:
if (fviAllVendors != null) {
for (int i = 0; i < fviAllVendors.length; i++) {
Document currentDoc =
VendorDocument.Document(fviAllVendors[i]);
indexer.addDocument(currentDoc);
//System.out.println(currentDoc.toString());
}
}
indexer.optimize();
indexer.close();
Finally, I perform the search with the line "+State:mn":
Query query = QueryParser.parse(line, "contents", analyzer);
System.out.println("Searching for: " + query.toString("contents"));
Hits hits = searcher.search(query);
It is at this point that I get the ArrayIndexOutOfBoundsException.
Other facts to especially note:
* This error only happens when there is a successful hit in the search (this
makes sense if you view the stack trace)
* I have noticed that when I have an index size of ~2000, I never get the
thing to break. Thus, I might just break this up into multiple RAM
directories as a hack fix, although I suspect it could be the data I'm
providing.
* Wildcard queries work fine with the parser.
* According to the stack trace, the error happens in a readInternal()
call within the RAMInputStream.
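The "hack fix" of splitting the data into several smaller indexes could start from something like the sketch below, which only does the partitioning step: it cuts an array into consecutive batches of at most 2000 elements. The class and method names are made up, the 2000 limit is just the rough threshold observed above, and building one RAMDirectory per batch and merging the search results is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for the workaround described above; not part of Lucene.
class IndexBatchingSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<T[]> partition(T[] items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T[]> batches = new ArrayList<>();
        for (int start = 0; start < items.length; start += batchSize) {
            int end = Math.min(items.length, start + batchSize);
            batches.add(Arrays.copyOfRange(items, start, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for the ~8000 vendor records; Integer is used only for illustration.
        Integer[] vendors = new Integer[8000];
        for (int i = 0; i < vendors.length; i++) {
            vendors[i] = i;
        }
        List<Integer[]> batches = partition(vendors, 2000);
        System.out.println(batches.size() + " batches");
    }
}
```

Each batch would then be indexed separately, at the cost of having to query every index and combine the hits.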
Oh yeah, my environment:
* Same error on NT 4.0 and Sun OS 7.
* 2 GB memory with a 100MB heap - nothing really taking up memory space
I have worked on a search engine on my own and am willing to contribute if I
find the problem. For now, I may just switch to a file system index
instead. But this will probably be slower than if I optimized the hell out
of Oracle and had that database do the trick for me.
I hope this will help. Below is a listing of the main file I've been using to
test. Also, you'll see a copy of the stack trace.
import java.io.IOException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.log4j.Category;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Hits;
import org.apache.lucene.queryParser.QueryParser;
/**
*
* @author FOOBAR
* @version
*/
public class VendorSearchIndexTest {
public static VendorInfo [] fviAllVendors;
/**
* @param args the command line arguments
*/
public static void main (String args[]) {
VendorInfo [] fviAllVendors =
DB_FourgenVendor.retrieveFourgenVendors();
//create a RAM directory
RAMDirectory RAMStorage = new RAMDirectory();
//I ran this with and without the line below
RAMStorage.createFile("Vendors");
IndexWriter indexer = null;
try {
indexer = new IndexWriter(RAMStorage, new VendorAnalyzer(),
true);
if (fviAllVendors != null) {
for (int i = 0; i < fviAllVendors.length; i++) {
Document currentDoc =
VendorDocument.Document(fviAllVendors[i]);
indexer.addDocument(currentDoc);
//System.out.println(currentDoc.toString());
}
}
indexer.optimize();
indexer.close();
Searcher searcher = new IndexSearcher(RAMStorage);
Analyzer analyzer = new VendorAnalyzer();
BufferedReader in = new BufferedReader(new
InputStreamReader(System.in));
while (true) {
System.out.print("Query: ");
String line = in.readLine();
// readLine() returns null at end of input; length() is never -1
if (line == null || line.length() == 0)
break;
//WildcardQuery query = new WildcardQuery(new Term("+City", "LI*"));
Query query = QueryParser.parse(line, "contents", analyzer);
System.out.println("Searching for: " + query.toString("contents"));
Hits hits = searcher.search(query);
System.out.println(hits.length() + " total matching documents");
final int HITS_PER_PAGE = 10;
for (int start = 0; start < hits.length(); start += HITS_PER_PAGE) {
int end = Math.min(hits.length(), start + HITS_PER_PAGE);
for (int i = start; i < end; i++)
System.out.println(i + ". " + hits.doc(i).get("VendorCode"));
if (hits.length() > end) {
System.out.print("more (y/n) ? ");
line = in.readLine();
if (line.length() == 0 || line.charAt(0) == 'n')
break;
}
}
}
searcher.close();
} catch (Exception e) {
System.out.println("Exception.. what the?: " + e.toString());
}
}}
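Note that the catch block above prints only e.toString(), which drops the stack trace. A minimal sketch of capturing the full trace as text, so it can be pasted into a report like this one, might look as follows. The class and method names are made up, and the out-of-range array access is only a stand-in for the failing search.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper; the names are illustrative and not part of the test class above.
class StackTraceSketch {

    // Renders the exception and its stack frames into a single String.
    static String stackTraceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        t.printStackTrace(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        try {
            int[] values = new int[2];
            // Deliberately out of range, to trigger the same exception type.
            System.out.println(values[5]);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println(stackTraceOf(e));
        }
    }
}
```

Calling e.printStackTrace() directly in the catch block would do the same job when console output is enough.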