Posted to general@lucene.apache.org by SBS <jt...@uow.edu.au> on 2011/06/18 13:35:32 UTC

Highlighting matching words within a document

This is my first look at Lucene and I am sure this must be a common problem. 
I have managed to index all my content and get results from queries but the
question is how do I highlight the matches in a document?  I mean I have
some HTML documents and then I need to display somehow the occurrences
within those documents that match the query.

If the query was a single word then the problem is quite easy but when you
consider that the query can be complex and contain wildcards and compound
expressions, how can I know which words or partial words within the document
are the reasons for the document being returned as a match?

Thanks,

-sbs

--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-matching-words-within-a-document-tp3079505p3079505.html
Sent from the Lucene - General mailing list archive at Nabble.com.

Re: Highlighting matching words within a document

Posted by raphael812 <or...@eecs.qmul.ac.uk>.
I want to build a search engine using the Lucene library. I have not used it
before, but I am currently studying the book "Lucene in Action". I have a
file, Indexer.java, which is supposed to take two command-line arguments: one
specifying the directory containing the files to be indexed, and another
specifying the directory in which to store the index. I want to run the class
file from the command line using this command:

java  -classpath lucene-core-3.2.0.jar:. Indexer  bibletext index

but it fails with the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: Indexer (wrong
name: indexer/Indexer)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: Indexer. Program will exit.

How can I index a directory containing HTML, PDF, and text files from the
command line so that I can run my searches on it? Thanks in advance for your
help.

Raphael812
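
On the error itself: the "(wrong name: indexer/Indexer)" part indicates that
the class file was compiled from a source that declares "package indexer;", so
the launcher has to be given the qualified name indexer.Indexer, run from the
directory that contains the indexer/ folder, with the classpath otherwise
unchanged. On the mixed file types: Lucene indexes text, so PDF and HTML files
need a text-extraction step first (a library such as Apache Tika or PDFBox is
commonly used for that). The part that comes before either of those is just
collecting the files. A minimal sketch of that step in plain Java follows; the
class name FileCollector and the extension list (including ".htm") are
assumptions for illustration and are not taken from the book's Indexer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper: gathers the files an indexer would later process.
public class FileCollector {

    // Assumed extensions for "HTML, PDF, and text files"; adjust as needed.
    static final List<String> EXTENSIONS =
            Arrays.asList(".html", ".htm", ".pdf", ".txt");

    /** Returns every regular file under dataDir with a wanted extension, sorted by path. */
    static List<Path> collect(Path dataDir) throws IOException {
        List<Path> result = new ArrayList<>();
        // Files.walk descends into subdirectories; the stream must be closed.
        try (Stream<Path> paths = Files.walk(dataDir)) {
            paths.filter(Files::isRegularFile)
                 .filter(FileCollector::hasWantedExtension)
                 .sorted()
                 .forEach(result::add);
        }
        return result;
    }

    /** Case-insensitive check of the file name against EXTENSIONS. */
    static boolean hasWantedExtension(Path p) {
        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (name.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java FileCollector <data dir>");
            System.exit(1);
        }
        // Print each candidate file; a real indexer would extract text here
        // and hand it to Lucene instead.
        for (Path p : collect(Paths.get(args[0]))) {
            System.out.println(p);
        }
    }
}
```

Run against the directory from the question, this would be
"java FileCollector bibletext". It only lists the candidate files; the text
extraction and the Lucene indexing calls are deliberately left out.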


--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-matching-words-within-a-document-tp3079505p3084103.html

Re: Highlighting matching words within a document

Posted by SBS <jt...@uow.edu.au>.
Thanks Koji, that's just what I was looking for :-)

-sbs

--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-matching-words-within-a-document-tp3079505p3082029.html

Re: Highlighting matching words within a document

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(11/06/18 20:35), SBS wrote:
> This is my first look at Lucene and I am sure this must be a common problem.
> I have managed to index all my content and get results from queries but the
> question is how do I highlight the matches in a document?  I mean I have
> some HTML documents and then I need to display somehow the occurrences
> within those documents that match the query.
>
> If the query was a single word then the problem is quite easy but when you
> consider that the query can be complex and contain wildcards and compound
> expressions, how can I know which words or partial words within the document
> are the reasons for the document being returned as a match?

Lucene does not only search documents; it can also highlight the matched terms in those documents for you.

Please look at:

http://lucene.apache.org/java/3_2_0/api/all/org/apache/lucene/search/highlight/package-summary.html#package_description

koji
-- 
http://www.rondhuit.com/en/
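
For readers who want a starting point, here is a minimal sketch of how the
classes in the highlight package linked above are typically combined. It
assumes Lucene 3.2 with lucene-core and the highlighter contrib jar on the
classpath, plus whatever that jar itself needs (lucene-memory, as far as I can
tell). The field name "contents", the class name HighlightSketch, and the
sample text are made up for illustration. The text passed in should be the
plain text of the document, so HTML markup would need to be stripped first.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.util.Version;

public class HighlightSketch {

    // Hypothetical field name; use the field the index was built with.
    static final String FIELD = "contents";

    /**
     * Returns the best-scoring fragment of documentText with matched terms
     * wrapped in <b> tags, or null if nothing in the text matches.
     */
    static String highlight(String queryText, String documentText) throws Exception {
        // Use the same analyzer that was used at index time.
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_32);
        Query query = new QueryParser(Version.LUCENE_32, FIELD, analyzer).parse(queryText);

        // QueryScorer works out which terms in the text satisfy the query.
        QueryScorer scorer = new QueryScorer(query, FIELD);
        // Ask for wildcard/prefix queries to be expanded against the text.
        scorer.setExpandMultiTermQuery(true);

        Highlighter highlighter =
                new Highlighter(new SimpleHTMLFormatter("<b>", "</b>"), scorer);
        return highlighter.getBestFragment(analyzer, FIELD, documentText);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input: a wildcard term combined with a plain term.
        System.out.println(highlight("qui* AND fox", "The quick brown fox"));
    }
}
```

The point of going through QueryScorer rather than matching words by hand is
that the query itself decides which terms count, so wildcards and compound
expressions are handled by the same code path as a single word.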