You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by syedfa <fa...@gmail.com> on 2008/11/03 00:53:21 UTC

Searching over multiple fields using XML document

Dear fellow Java/Lucene developers:

I am trying to search an xml document over multiple fields.  The index I
created using the SAX method.  I am trying to search shakespeare's "Hamlet"
over the <speaker> and <lines> tags for words that the user is looking for. 
I am thinking of using the MultiFieldQueryParser however, I also read that a
better alternative would be to combine the various fields together.  In the
book, "Lucene in Action", the author writes:

"A synthetic 'contents' field in our test environment uses this scheme to
put author and subjects together:

doc.add(Field.UnStored("contents", author + " " + subjects));

We used a space (" ") between author and subjects to separate words for the
analyzer."

I am not sure I fully understand what the author is referring to here.  In
my situation I have the following:

public void endElement(String uri, String localName, String qName) throws
SAXException{
		
		try {
			
			if(qName.equals("REFERENCE")){
				Field reference = new Field(qName, elementBuffer.toString(),
Field.Store.YES, Field.Index.NO, Field.TermVector.NO);
				doc.add(reference);
			}
			
			else if(qName.equals("SPEAKER")){
				Field speaker = new Field(qName, elementBuffer.toString(),
Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES);
				speaker.setBoost(2.0f);
				doc.add(speaker);
			}
			else if(qName.equals("LINES")){
				Field lines = new Field(qName, elementBuffer.toString(),
Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES);
				lines.setBoost(1.0f);
				doc.add(lines);
				indexWriter.addDocument(doc);				
			}
			
			else{
				return;
			}
				
		} catch (CorruptIndexException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}

		
		
	}


How would I combine the fields together into one synthetic field here so
that in my searcher code, I would search over one field, yet retrieve the
results from the several fields that the keyword is found and show that to
the user?

All I want to do, is allow a user to search an xml document over multiple
fields and return the results with the keywords they are searching for,
highlighted in the results list, just as google does when searching for
websites.   At this point, I am able to do a simple/fuzzy/wildcard search
over one field in the xml document, but would like to extend this
functionality over multiple fields.  Any ideas?

Thanks in advance to all who reply.
Sincerely;
Fayyaz
-- 
View this message in context: http://www.nabble.com/Searching-over-multiple-fields-using-XML-document-tp20295306p20295306.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org