Posted to dev@lucenenet.apache.org by Reuben Tonna <re...@onvol.net> on 2007/01/04 15:28:47 UTC
Lucene highlighter - issue - no highlights returned - a proposed
solution
Hi,
I am currently using the latest port of Lucene and Highlighter for
an application that I am working on. I found that the highlighting
module was failing to return any highlight results, even if the term
existed in the document. After some investigation, the issue turns out
to be in the GetTerms(..) method in QueryTermExtractor.cs, reproduced
below with the changes necessary for the code to work correctly.
A couple of notes:
1. Iteration should be over the nonWeightedTerms hash. I confirmed that
this is what the Java version does.
2. The loop uses dictionary iteration over the hash. Each iteration
gives us access to the key or the value. Both are identical, so I opted
for the value (more readable) to get the term from. Note that the
original dictionary iteration will fail because an iterator cannot be
cast to a Term type (only the value or key pointed to by the iterator
can).
//fieldname MUST be interned prior to this call
private static void GetTerms(Query query,
    System.Collections.Hashtable terms, bool prohibited,
    System.String fieldName)
{
    try
    {
        System.Collections.Hashtable nonWeightedTerms =
            new System.Collections.Hashtable();
        query.ExtractTerms(nonWeightedTerms);
        /*
        foreach (Term term in terms.Values)
        {
            if ((fieldName == null) || (term.Field() == fieldName))
            {
                WeightedTerm temp =
                    new WeightedTerm(query.GetBoost(), term.Text());
                terms.Add(temp, temp);
            }
        }
        */
        System.Collections.IDictionaryEnumerator iter =
            nonWeightedTerms.GetEnumerator();
        while (iter.MoveNext())
        {
            Term term = (Term) iter.Value;
            if ((fieldName == null) || (term.Field() == fieldName))
            {
                WeightedTerm temp =
                    new WeightedTerm(query.GetBoost(), term.Text());
                terms.Add(temp, temp);
            }
        }
        /*
        for (System.Collections.IEnumerator iter =
            nonWeightedTerms.GetEnumerator(); iter.MoveNext(); )
        {
            Term term = (Term) iter.Current;
            if ((fieldName == null) || (term.Field() == fieldName))
            {
                WeightedTerm temp =
                    new WeightedTerm(query.GetBoost(), term.Text());
                terms.Add(temp, temp);
            }
        }
        */
    }
    catch (System.NotSupportedException ignore)
    {
        //this is non-fatal for our purposes
    }
}
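For anyone who wants to see note 2 in isolation, here is a minimal
sketch that does not depend on Lucene at all. It assumes plain strings
standing in for Term objects, and the class and method names
(HashtableIterationDemo, CollectValues, DirectCastSucceeds) are made up
for the illustration. It shows that the enumerator of a Hashtable hands
back DictionaryEntry items, so the element has to be read through
Value (or Key) rather than cast directly from Current.

```csharp
using System;
using System.Collections;

public static class HashtableIterationDemo
{
    // Hypothetical helper: reads each entry through
    // IDictionaryEnumerator.Value, the same pattern the corrected
    // GetTerms uses. Strings stand in for Term objects here.
    public static ArrayList CollectValues(Hashtable source)
    {
        ArrayList result = new ArrayList();
        IDictionaryEnumerator iter = source.GetEnumerator();
        while (iter.MoveNext())
        {
            result.Add((string) iter.Value);
        }
        return result;
    }

    // Hypothetical helper: tries to cast the enumerator's Current item
    // straight to the element type. Current is a boxed DictionaryEntry,
    // so the cast throws InvalidCastException and this returns false.
    public static bool DirectCastSucceeds(Hashtable source)
    {
        IEnumerator iter = source.GetEnumerator();
        if (!iter.MoveNext())
        {
            return false; // empty table, nothing to cast
        }
        try
        {
            string text = (string) iter.Current;
            return text != null;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Key and value are identical, mirroring what the post
        // describes for the nonWeightedTerms hash.
        Hashtable table = new Hashtable();
        table.Add("lucene", "lucene");
        table.Add("highlighter", "highlighter");

        Console.WriteLine(CollectValues(table).Count);  // prints 2
        Console.WriteLine(DirectCastSucceeds(table));   // prints False
    }
}
```

The enumeration order of a Hashtable is not guaranteed, which is why
the sketch only prints the number of collected values.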
Hth
Reuben.
RE: Lucene highlighter - issue - no highlights returned - a proposed solution
Posted by George Aroush <ge...@aroush.net>.
Hi Reuben,
By latest you mean Lucene.Net 2.0 build 3 and Highlighter.Net 2.0 build 0,
right? If so, as you may have noticed in the release note, Highlighter.Net
2.0 is not fully debugged yet; it is an "alpha" release, and a large
number of NUnit tests are failing. The code change you provided might fix
all those issues. I will try it tonight.
Many thanks for looking into this issue.
Regards,
-- George Aroush