Posted to general@lucene.apache.org by SBS <jt...@uow.edu.au> on 2011/06/18 13:35:32 UTC
Highlighting matching words within a document
This is my first look at Lucene and I am sure this must be a common problem.
I have managed to index all my content and get results from queries but the
question is how do I highlight the matches in a document? I mean I have
some HTML documents and then I need to display somehow the occurrences
within those documents that match the query.
If the query was a single word then the problem is quite easy but when you
consider that the query can be complex and contain wildcards and compound
expressions, how can I know which words or partial words within the document
are the reasons for the document being returned as a match?
Thanks,
-sbs
--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-matching-words-within-a-document-tp3079505p3079505.html
Sent from the Lucene - General mailing list archive at Nabble.com.
Re: Highlighting matching words within a document
Posted by raphael812 <or...@eecs.qmul.ac.uk>.
I want to build a search engine using the Lucene library. I have not used it
before, but I am currently studying the book "Lucene in Action". I have a
file Indexer.java which is supposed to take two command-line arguments: one
specifying the directory containing the files to be indexed and another
specifying the directory in which to store the index. I want to run the class
file from the command line using this command:
java -classpath lucene-core-3.2.0.jar:. Indexer bibletext index
but it fails with the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: Indexer (wrong name: indexer/Indexer)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: Indexer. Program will exit.
How can I index a directory containing HTML, PDF, and text files from the
command line so that I can run my searches on it? Thanks in advance for your
help.
Raphael812
--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-matching-words-within-a-document-tp3079505p3084103.html
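A note on the error in the message above: "wrong name: indexer/Indexer"
normally means the compiled class belongs to a package named indexer (the
source probably starts with "package indexer;"). In that case Indexer.class
has to sit in a directory named indexer, and the JVM has to be started from
the parent of that directory with the fully qualified class name, something
like:
java -classpath lucene-core-3.2.0.jar:. indexer.Indexer bibletext index
(this assumes the jar is also in that parent directory).

As for indexing mixed content, a common first step is to walk the directory
and pick out files by extension before passing each one to a text extractor
and then to Lucene. The following is a minimal plain-Java sketch of that step
only. The class name FileCollector, the method collect, and the extension
list are made up for illustration; it does no Lucene indexing and no HTML or
PDF parsing (a library such as Apache Tika is one option for that part).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FileCollector {

    // Hypothetical list of extensions worth indexing; adjust to your content.
    private static final List<String> EXTENSIONS =
            Arrays.asList(".html", ".htm", ".pdf", ".txt");

    /** Recursively collects files under root whose names end with a known extension. */
    public static List<File> collect(File root) {
        List<File> found = new ArrayList<File>();
        walk(root, found);
        Collections.sort(found); // stable, path-ordered output
        return found;
    }

    private static void walk(File dir, List<File> found) {
        File[] entries = dir.listFiles(); // null if dir is not a readable directory
        if (entries == null) {
            return;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                walk(entry, found);
            } else if (hasKnownExtension(entry.getName())) {
                found.add(entry);
            }
        }
    }

    /** Case-insensitive check of the file name against the extension list. */
    static boolean hasKnownExtension(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: java FileCollector <directory>");
            return;
        }
        // Print every candidate file; a real indexer would extract text here.
        for (File f : collect(new File(args[0]))) {
            System.out.println(f.getPath());
        }
    }
}
```

Each file returned by collect would then be converted to plain text and added
to the index as one document; that part depends on the Indexer.java from the
book and is not shown here.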
Re: Highlighting matching words within a document
Posted by SBS <jt...@uow.edu.au>.
Thanks Koji, that's just what I was looking for :-)
-sbs
--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-matching-words-within-a-document-tp3079505p3082029.html
Re: Highlighting matching words within a document
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(11/06/18 20:35), SBS wrote:
> This is my first look at Lucene and I am sure this must be a common problem.
> I have managed to index all my content and get results from queries but the
> question is how do I highlight the matches in a document? I mean I have
> some HTML documents and then I need to display somehow the occurrences
> within those documents that match the query.
>
> If the query was a single word then the problem is quite easy but when you
> consider that the query can be complex and contain wildcards and compound
> expressions, how can I know which words or partial words within the document
> are the reasons for the document being returned as a match?
Lucene not only searches documents but can also highlight the matched terms in them for you.
Please look at:
http://lucene.apache.org/java/3_2_0/api/all/org/apache/lucene/search/highlight/package-summary.html#package_description
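
For orientation, the classes in that package that do most of the work are
Highlighter, QueryScorer, and SimpleHTMLFormatter. The sketch below is not
that API. It is a plain-Java illustration of the underlying idea for simple
term and wildcard queries: turn each query term into a pattern and wrap every
matching word of the text in a tag. The names SimpleHighlighter, toRegex, and
highlight are invented for this example, and it ignores analyzers, boolean
operators, and phrase queries, so treat it as a way to see the principle
rather than a replacement for the package.

```java
import java.util.regex.Pattern;

public class SimpleHighlighter {

    /** Translates one query term into a regex fragment: * is any run of word chars, ? is one. */
    static String toRegex(String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (c == '*') {
                regex.append("\\w*");
            } else if (c == '?') {
                regex.append("\\w");
            } else {
                // Quote everything else so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    /** Wraps every whole word in text that matches any of the terms in <B>...</B>. */
    public static String highlight(String text, String... terms) {
        if (terms.length == 0) {
            return text;
        }
        // Combine all terms into one alternation so the text is scanned once
        // and the inserted tags are never re-matched by a later term.
        StringBuilder alternation = new StringBuilder();
        for (String term : terms) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(toRegex(term));
        }
        Pattern pattern = Pattern.compile(
                "\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        return pattern.matcher(text).replaceAll("<B>$0</B>");
    }

    public static void main(String[] args) {
        System.out.println(highlight(
                "Lucene makes searching and search results easy",
                "search*", "luc?ne"));
    }
}
```

Running main prints:
<B>Lucene</B> makes <B>searching</B> and <B>search</B> results easy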
koji
--
http://www.rondhuit.com/en/