You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Li Leon <le...@gmail.com> on 2010/01/11 08:00:17 UTC
Highlight the whole sentence instead of the partial matching terms
*Hi all,*
**
*I wanted to use to following code to highlight the whole sentence with
"\"something\"~n" to be parsed.*
**
*The QueryParser part worked well, but when integrated with Highlighter, it
ended up with exception. *
**
*Does anyone have any clue as I'm investigating this.*
**
**
*Thanks,*
**
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/lucene/index/memory/MemoryIndex
at
org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getReaderForField(
*WeightedSpanTermExtractor.java:361*)
at
org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(
*WeightedSpanTermExtractor.java:282*)
at
org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(*
WeightedSpanTermExtractor.java:149*)
at
org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(
*WeightedSpanTermExtractor.java:414*)
at org.apache.lucene.search.highlight.QueryScorer.initExtractor(*
QueryScorer.java:216*)
at org.apache.lucene.search.highlight.QueryScorer.init(*
QueryScorer.java:184*)
at
org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(*
Highlighter.java:226*)
at org.apache.lucene.search.highlight.Highlighter.getBestFragments(*
Highlighter.java:184*)
at org.apache.lucene.search.highlight.Highlighter.getBestFragments(*
Highlighter.java:488*)
at Hightlight.main(*Hightlight.java:182*)
**
*public* *class* Hightlight {
*private* *static* *final* String *text* =
"Giving end users some context around hits from their searches is
friendly and, more important, useful. An useful example is friendly Google
search context.";
*private* *static* *final* String[] *stopwords* = {};
*private* *static* IndexSearcher *searcher*;
*public* *static* *void* main(String[] args) *throws* Exception {
String filename = args[0];
String searchText = "\"Giving and\"~11";
#1
QueryParser parser = *new* QueryParser("f", *new*
StandardAnalyzer(*stopwords*));
Query query = parser.parse(searchText);
SimpleHTMLFormatter formatter =
*new* SimpleHTMLFormatter("<span class=\"highlight\">",
"</span>");
TokenStream tokens = *new* StandardAnalyzer(*stopwords*)
.tokenStream("f", *new* StringReader(*text*));
QueryScorer scorer = *new* QueryScorer(query, "f");
Highlighter highlighter = *new* Highlighter(formatter, scorer);
highlighter.setTextFragmenter(
*new* NullFragmenter());
String result =
highlighter.getBestFragments(tokens, *text*, 3, "...");
FileWriter writer = *new* FileWriter(filename);
writer.write("<html>");
writer.write("<style>\n" +
".highlight {\n" +
" background: yellow;\n" +
"}\n" +
"</style>");
writer.write("<body>");
writer.write(result);
writer.write("</body></html>");
writer.close();
}
}
Re: Highlight the whole sentence instead of the partial matching
terms
Posted by Sanne Grinovero <sa...@gmail.com>.
If you're searching for terms "giving" and "and", it will only
highlight those terms, not the whole sentence.. that's how the
highlighter is meant to work: highlight what the user did query. Also
there's no built-in concept of sentence.
regards,
Sanne
2010/1/11 Li Leon <le...@gmail.com>:
> Just figured out, missed "lucene-memory-2.4.1.jar" external jar inclusion.
>
> With that jar included, "\"Giving and\"~11" only got "Giving" & "and"
> highlighted but not the whole sentence, any ideas?
>
> Thanks,
>
> 2010/1/11 Li Leon <le...@gmail.com>
>
>> *Hi all,*
>> **
>> *I wanted to use to following code to highlight the whole sentence with
>> "\"something\"~n" to be parsed.*
>> **
>> *The QueryParser part worked well, but when integrated with Highlighter,
>> it ended up with exception. *
>> **
>> *Does anyone have any clue as I'm investigating this.*
>> **
>> **
>> *Thanks,*
>> **
>>
>>
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/lucene/index/memory/MemoryIndex
>>
>> at
>> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getReaderForField(
>> *WeightedSpanTermExtractor.java:361*)
>>
>> at
>> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(
>> *WeightedSpanTermExtractor.java:282*)
>>
>> at
>> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(*
>> WeightedSpanTermExtractor.java:149*)
>>
>> at
>> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(
>> *WeightedSpanTermExtractor.java:414*)
>>
>> at org.apache.lucene.search.highlight.QueryScorer.initExtractor(*
>> QueryScorer.java:216*)
>>
>> at org.apache.lucene.search.highlight.QueryScorer.init(*
>> QueryScorer.java:184*)
>>
>> at
>> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(*
>> Highlighter.java:226*)
>>
>> at org.apache.lucene.search.highlight.Highlighter.getBestFragments(*
>> Highlighter.java:184*)
>>
>> at org.apache.lucene.search.highlight.Highlighter.getBestFragments(*
>> Highlighter.java:488*)
>>
>> at Hightlight.main(*Hightlight.java:182*)
>>
>>
>> **
>> *public* *class* Hightlight {
>>
>> *private* *static* *final* String *text* =
>>
>> "Giving end users some context around hits from their searches is
>> friendly and, more important, useful. An useful example is friendly Google
>> search context.";
>>
>>
>>
>> *private* *static* *final* String[] *stopwords* = {};
>>
>> *private* *static* IndexSearcher *searcher*;
>>
>> *public* *static* *void* main(String[] args) *throws* Exception {
>>
>>
>>
>> String filename = args[0];
>>
>>
>>
>> String searchText = "\"Giving and\"~11";
>> #1
>>
>> QueryParser parser = *new* QueryParser("f", *new*
>>
>> StandardAnalyzer(*stopwords*));
>>
>> Query query = parser.parse(searchText);
>>
>>
>>
>> SimpleHTMLFormatter formatter =
>>
>> *new* SimpleHTMLFormatter("<span class=\"highlight\">",
>>
>> "</span>");
>>
>>
>>
>> TokenStream tokens = *new* StandardAnalyzer(*stopwords*)
>>
>> .tokenStream("f", *new* StringReader(*text*));
>>
>>
>>
>> QueryScorer scorer = *new* QueryScorer(query, "f");
>>
>>
>>
>> Highlighter highlighter = *new* Highlighter(formatter, scorer);
>>
>> highlighter.setTextFragmenter(
>>
>> *new* NullFragmenter());
>>
>>
>>
>> String result =
>>
>> highlighter.getBestFragments(tokens, *text*, 3, "...");
>>
>>
>>
>> FileWriter writer = *new* FileWriter(filename);
>>
>> writer.write("<html>");
>>
>> writer.write("<style>\n" +
>>
>> ".highlight {\n" +
>>
>> " background: yellow;\n" +
>>
>> "}\n" +
>>
>> "</style>");
>>
>> writer.write("<body>");
>>
>> writer.write(result);
>>
>> writer.write("</body></html>");
>>
>> writer.close();
>>
>> }
>>
>> }
>>
>>
>>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Highlight the whole sentence instead of the partial matching
terms
Posted by Li Leon <le...@gmail.com>.
Just figured out, missed "lucene-memory-2.4.1.jar" external jar inclusion.
With that jar included, "\"Giving and\"~11" only got "Giving" & "and"
highlighted but not the whole sentence, any ideas?
Thanks,
2010/1/11 Li Leon <le...@gmail.com>
> *Hi all,*
> **
> *I wanted to use to following code to highlight the whole sentence with
> "\"something\"~n" to be parsed.*
> **
> *The QueryParser part worked well, but when integrated with Highlighter,
> it ended up with exception. *
> **
> *Does anyone have any clue as I'm investigating this.*
> **
> **
> *Thanks,*
> **
>
>
>
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/lucene/index/memory/MemoryIndex
>
> at
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getReaderForField(
> *WeightedSpanTermExtractor.java:361*)
>
> at
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(
> *WeightedSpanTermExtractor.java:282*)
>
> at
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(*
> WeightedSpanTermExtractor.java:149*)
>
> at
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(
> *WeightedSpanTermExtractor.java:414*)
>
> at org.apache.lucene.search.highlight.QueryScorer.initExtractor(*
> QueryScorer.java:216*)
>
> at org.apache.lucene.search.highlight.QueryScorer.init(*
> QueryScorer.java:184*)
>
> at
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(*
> Highlighter.java:226*)
>
> at org.apache.lucene.search.highlight.Highlighter.getBestFragments(*
> Highlighter.java:184*)
>
> at org.apache.lucene.search.highlight.Highlighter.getBestFragments(*
> Highlighter.java:488*)
>
> at Hightlight.main(*Hightlight.java:182*)
>
>
> **
> *public* *class* Hightlight {
>
> *private* *static* *final* String *text* =
>
> "Giving end users some context around hits from their searches is
> friendly and, more important, useful. An useful example is friendly Google
> search context.";
>
>
>
> *private* *static* *final* String[] *stopwords* = {};
>
> *private* *static* IndexSearcher *searcher*;
>
> *public* *static* *void* main(String[] args) *throws* Exception {
>
>
>
> String filename = args[0];
>
>
>
> String searchText = "\"Giving and\"~11";
> #1
>
> QueryParser parser = *new* QueryParser("f", *new*
>
> StandardAnalyzer(*stopwords*));
>
> Query query = parser.parse(searchText);
>
>
>
> SimpleHTMLFormatter formatter =
>
> *new* SimpleHTMLFormatter("<span class=\"highlight\">",
>
> "</span>");
>
>
>
> TokenStream tokens = *new* StandardAnalyzer(*stopwords*)
>
> .tokenStream("f", *new* StringReader(*text*));
>
>
>
> QueryScorer scorer = *new* QueryScorer(query, "f");
>
>
>
> Highlighter highlighter = *new* Highlighter(formatter, scorer);
>
> highlighter.setTextFragmenter(
>
> *new* NullFragmenter());
>
>
>
> String result =
>
> highlighter.getBestFragments(tokens, *text*, 3, "...");
>
>
>
> FileWriter writer = *new* FileWriter(filename);
>
> writer.write("<html>");
>
> writer.write("<style>\n" +
>
> ".highlight {\n" +
>
> " background: yellow;\n" +
>
> "}\n" +
>
> "</style>");
>
> writer.write("<body>");
>
> writer.write(result);
>
> writer.write("</body></html>");
>
> writer.close();
>
> }
>
> }
>
>
>