You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Ridwan Habbal <ri...@hotmail.com> on 2007/08/02 09:50:12 UTC

RE: IndexReader deletes more than expected











> Subject: RE: IndexReader deletes more that expected> Date: Wed, 1 Aug 2007 09:07:32 -0700> From: steven_parkes@esseff.org> To: java-user@lucene.apache.org> > If I'm reading this correctly, there's something a little wonky here. In> your example code, you close the IndexWriter and then, without creating> a new IndexWriter, you call addDocument again. This shouldn't be> possible (what version of Lucene are you using?)   Yes, you are correct, I close indexWriter and then add more docs. What's wrong? it worked out fine, and add docs i add will appear to NEW INSTANCES OF INDEX SEARCHERS after calling close on the indexWriter.    As for creating new IndexWriter, I tried to, however i suffered of the lock exception i got, although i was closing the IndexWriter instance, before creating a new IndexWriter instance! I don't know why! furthermore, this is useless, for multiThreaded app, cause you can't know who is still writing to your index, and who has colsed his IndexWriter. Even checking that the index is locked b4, leads to unnecessary overhead which can be avoided since it works for me and i can write with one single instance of IndexWriter. > > Assuming for the time being that you are creating the IndexWriter again,> the other issue here is that you shouldn't be able to have a reader and> a writer changing an index at the same time. There should be a lock> failure. This should occur either in the Index Well, I think that i don't get the problems you expect cause i use the Lucene version that is shiped by compass distribution (www.compassframework.org) In short, compass to Lucene is the same of ORM server like hibernate for DBMS like oracle. It's really works fine but i couldn't understand why compass hides the deleteDocuments(Term) method on IndexWriter classhttp://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/index/IndexWriter.html#deleteDocuments(org.apache.lucene.index.Term). This is why i used delete on a reader rather than using it on the same writer instance, and the only one i have. I couldn't manage my index in one particular situation using compass, because i had to store data not in the usual way we do (every row in table is a record). So i think i have to ask The compass team about that. Anyways, if you have coments, you or the others please do. > > Might you be creating your IndexWriters (which you don't show) with the> create flag always set to true? That will wipe your index each time,> ignoring the locks and cause all sorts of weird results.No, I don't create a new Instance of indexWriter, the only one I create is in the Service constructor, so i create a new clean (has no docs) index only when program starts up. public LuceneServiceSHImp(String indexDirectory) throws IOException{this.indexDirectory = indexDirectory;standardAnalyzer = new StandardAnalyzer();indexWriter = new IndexWriter(new java.io.File(indexDirectory), standardAnalyzer, true);indexWriter.close();}> > -----Original Message-----> From: Ridwan Habbal [mailto:ridwanhabbal@hotmail.com] > Sent: Wednesday, August 01, 2007 8:48 AM> To: java-user@lucene.apache.org> Subject: IndexReader deletes more that expected> > Hi, I got unexpected behavior while testing lucene. To shortly address> the problem: Using IndexWriter I add docs with fields named ID with a> consecutive order (1,2,3,4, etc) then close that index. I get new> IndexReader, and call IndexReader.deleteDocuments(Term). The term is> simply new Term("ID", "1"). and then class close on IndexReader. Things> work out fine. But if i add docs using IndexWriter, close writer, then> create new IndexReader to delete one of the docs already inserted, but> without closing index. while the indexReader that perform deletion is> still not closed, I add more docs, and then commit the IndexWriter, so> when i search I get all added docs in the two phases (before using> deleteDocuments() on IndexReader and after because i haven't closed> IndexReader, howerer, closed IndexWriter). I close IndexReader and then> query the index, so i deletes all docs after opening it till closing it,> in addition to the specified doc in the Term object (in this test case:> ID=1). I know that i can avoid this by close IndexReader directly after> deleting docs, but what about runing it on mutiThread app like web> application? There you are the code: > IndexSearcher indexSearcher = new IndexSearcher(this.indexDirectory);> Hits hitsB4InsertAndClose = null;> hitsB4InsertAndClose = getAllAsHits(indexSearcher);> int beforeInsertAndClose = hitsB4InsertAndClose.length();> > indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.close();> IndexSearcher indexSearcherDel = new IndexSearcher(this.indexDirectory);> indexSearcherDel.getIndexReader().deleteDocuments(new Term("ID","1"));> > indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> indexWriter.addDocument(getNewElement());> > indexWriter.close();> Hits hitsAfterInsertAndClose = getAllAsHits(indexSearcher);> int AfterInsertAndClose = hitsAfterInsertAndClose.length();//This is 14> > indexWriter.addDocument(getNewElement());> indexWriter.close();> Hits hitsAfterInsertAndAfterCloseb4Delete = getAllAsHits(indexSearcher);> int hitsAfterInsertButAndAfterCountb4Delete => hitsAfterInsertAndAfterCloseb4Delete.length();//This is 15> > > > indexSearcherDel.close();> Hits hitsAfterInsertAndAfterClose = getAllAsHits(indexSearcher);int> hitsAfterInsertButAndAfterCount => hitsAfterInsertAndAfterClose.length();//This is 2 The two methods I> Use > private Hits getAllAsHits(IndexSearcher indexSearcher){> try{> Analyzer analyzer = new StandardAnalyzer();> String defaultSearchField = "all";> QueryParser parser = new QueryParser(defaultSearchField, analyzer);> indexSearcher = new IndexSearcher(this.indexDirectory);> Hits hits = indexSearcher.search(parser.parse("+alias:mydoc"));> indexSearcher.close();> return hits;> }catch(IOException ex){> throw new RuntimeException(ex);> }catch(org.apache.lucene.queryParser.ParseException ex){> throw new RuntimeException(ex);> }> > }> > private Document getNewElement(){> Map<String, String> map = new HashMap();> map.put("ID", new Integer(insertCounter).toString());> map.put("name", "name"+insertCounter);> insertCounter++;> Document document = new Document();> for (Iterator iter = map.keySet().iterator(); iter.hasNext();) {> String key = (String) iter.next();> document.add(new Field(key, map.get(key), Store.YES, Index.TOKENIZED));> }> document.add(new Field("alias", "mydoc", Store.YES,> Index.UN_TOKENIZED));> return document;} any clue why it works that way? I expect it to delete> only one doc? > _________________________________________________________________> PC Magazine's 2007 editors' choice for best web mail-award-winning> Windows Live Hotmail.> http://imagine-windowslive.com/hotmail/?locale=en-us&ocid=TXT_TAGHM_migr> ation_HMWL_mini_pcmag_0707> > ---------------------------------------------------------------------> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org> For additional commands, e-mail: java-user-help@lucene.apache.org> 

Don't get caught with egg on your face.    Play Chicktionary!  
_________________________________________________________________
Don't get caught with egg on your face. Play Chicktionary!  
http://club.live.com/chicktionary.aspx?icid=chick_wlmailtextlink