You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Java Programmer <jp...@gmail.com> on 2006/11/29 09:36:43 UTC

Writing and searching same time

Hello,
I have trouble with writing and searching on lucene index same time,
all I did so far is making a class which has 2 methods:
private String indexLocation;
	
public void addDocument(int id,String title, String body) throws IOException{
		IndexWriter indexWriter = new IndexWriter(indexLocation, new
SimpleAnalyzer(), false);
		Document doc = new Document();
		doc.add(new Field("id",Integer.toString(id),Store.YES,Index.NO));
		doc.add(new Field("title",title,Store.NO,Index.TOKENIZED));
		doc.add(new Field("body",body,Store.NO,Index.TOKENIZED));
		indexWriter.addDocument(doc);
		indexWriter.close();
}
	
public List<Integer> search(String query) throws IOException, ParseException{
		IndexSearcher indexSearcher = new IndexSearcher(indexLocation);
		MultiFieldQueryParser queryparser = new MultiFieldQueryParser(new
String[]{"title","body"}, new SimpleAnalyzer());
		Query q = queryparser.parse(query);
		Hits hits = indexSearcher.search(q);
		Iterator it = hits.iterator();
		List<Integer> output = new ArrayList<Integer>();
		while(it.hasNext()){
			output.add(Integer.parseInt(((Hit)it.next()).getDocument().get("id")));
		}
		indexSearcher.close();
		return output;
}
What I don't like is that I have in each method opening IndexWriter
and IndexSearcher, I try to open them once and keep opened throught
whole lifecycle of application (which would be very long cause it
would be search for news working as webservice), but when I wasn't
close IndexWriter then IndexSearcher wasn't seen any new documents in
index. Next step was keeping IndexWriter open and reopen only
IndexSearcher but in this case also IndexSearcher was seen old index
without new documents. So my final version is this above, but could it
be better, without closing IndexWriter after each addition, and
opening IndexSearcher before each search query? What is the best
pattern of doing such systems?

Another question: do I need provide any synchronization on
indexWriter.addDocument(doc) method? I see that it isn't synchronized,
so maybe programmer need to do it himself?

Best regards,
Adr

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


[ANN]VTD-XML 1.8

Posted by Jimmy Zhang <cr...@comcast.net>.
Version 1.8 of VTD-XML is now released. The new features are:

·        XMLModifier is a easy to use class that takes advantage of the
incremental update capability offered by VTD-XML

·        XPath built-in functions are now almost complete

·        This release added encoding support for iso-8859-2~10, windows code
page 1250~1258

·        Added various functions to autoPilot that evaluate XPath to string,
number and boolean

·        This release also fixes a number of XPath bugs related to string
handling

To download the latest release please visit http://vtd-xml.sf.net



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Writing and searching same time

Posted by Simon Willnauer <si...@googlemail.com>.
On 11/29/06, Java Programmer <jp...@gmail.com> wrote:
> Hello,
> I have trouble with writing and searching on lucene index same time,
> all I did so far is making a class which has 2 methods:
> private String indexLocation;
>
> public void addDocument(int id,String title, String body) throws IOException{
>                 IndexWriter indexWriter = new IndexWriter(indexLocation, new
> SimpleAnalyzer(), false);
>                 Document doc = new Document();
>                 doc.add(new Field("id",Integer.toString(id),Store.YES,Index.NO));
>                 doc.add(new Field("title",title,Store.NO,Index.TOKENIZED));
>                 doc.add(new Field("body",body,Store.NO,Index.TOKENIZED));
>                 indexWriter.addDocument(doc);
>                 indexWriter.close();
> }
>
> public List<Integer> search(String query) throws IOException, ParseException{
>                 IndexSearcher indexSearcher = new IndexSearcher(indexLocation);
>                 MultiFieldQueryParser queryparser = new MultiFieldQueryParser(new
> String[]{"title","body"}, new SimpleAnalyzer());
>                 Query q = queryparser.parse(query);
>                 Hits hits = indexSearcher.search(q);
>                 Iterator it = hits.iterator();
>                 List<Integer> output = new ArrayList<Integer>();
>                 while(it.hasNext()){
>                         output.add(Integer.parseInt(((Hit)it.next()).getDocument().get("id")));
>                 }
>                 indexSearcher.close();
>                 return output;
> }
> What I don't like is that I have in each method opening IndexWriter
> and IndexSearcher, I try to open them once and keep opened throught
> whole lifecycle of application (which would be very long cause it
> would be search for news working as webservice), but when I wasn't
> close IndexWriter then IndexSearcher wasn't seen any new documents in
> index. Next step was keeping IndexWriter open and reopen only
> IndexSearcher but in this case also IndexSearcher was seen old index
> without new documents. So my final version is this above, but could it
> be better, without closing IndexWriter after each addition, and
> opening IndexSearcher before each search query? What is the best
> pattern of doing such systems?

Hi there, it's quiet a good Idea to keep IndexWriter/Reader(Searcher)
open as long as possible. A quiet nice patter is used by solr and
gdata-server. The indexwriter will be closed after a certain amount of
document have been added to the index or if an timeout exceeded
without any updates inserts. As soon as the writers close method is
called a new index searcher will be opened and released to the
application. If you work in a webapp e.g. in a multithreaded env. you
shout track the references of you searcher to close it if nobody uses
it anymore.
Keeping Searchers and Writers open will result in a much better
performance if it is considerable to have updates and insert invisible
to the searcher for a certain amount of time.

Have a look at this:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/search/index/GDataIndexer.java

http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/utils/ReferenceCounter.java
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/search/index/IndexController.java
--> private ReferenceCounter<IndexSearcher>
getNewServiceSearcher(final Directory dir)



>
> Another question: do I need provide any synchronization on
> indexWriter.addDocument(doc) method? I see that it isn't synchronized,
> so maybe programmer need to do it himself?
>
> Best regards,
> Adr
You could queue the document to add to the index to keep your
indexwriter busy. Might be a good idea anyway.

best regards simon
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org