Posted to dev@lucenenet.apache.org by Reuben Tonna <re...@onvol.net> on 2007/01/04 15:28:47 UTC

Lucene highlighter - issue - no highlights returned - a proposed solution

Hi,
    I am currently using the latest port of Lucene and Highlighter for 
an application that I am working on.  I found that the highlighting 
module was failing to return any highlights, even when the term 
existed in the document.  After some investigation, the issue turned out 
to be in the GetTerms(..) method in QueryTermExtractor.cs, reproduced 
below with the changes necessary for the code to work correctly.

A couple of notes:

1. Iteration should be over the nonWeightedTerms hash, not over terms.  
I confirmed that this is what the Java version does.
2. The loop now uses a dictionary enumerator over the hash.  Each 
iteration gives us access to the key or the value.  Both are identical, 
so I opted for the value (more readable) to get the term from.  Note 
that the original enumerator loop fails because iter.Current is a 
DictionaryEntry, which cannot be cast to a Term (only the key or value 
it holds can).

//fieldname MUST be interned prior to this call
        private static void  GetTerms(Query query, System.Collections.Hashtable terms, bool prohibited, System.String fieldName)
        {
            try
            {
                System.Collections.Hashtable nonWeightedTerms = new System.Collections.Hashtable();
                query.ExtractTerms(nonWeightedTerms);

                /*
                foreach (Term term in terms.Values)
                {
                    if ((fieldName == null) || (term.Field() == fieldName))
                    {
                        WeightedTerm temp = new WeightedTerm(query.GetBoost(), term.Text());
                        terms.Add(temp, temp);
                    }
                }
                */

                System.Collections.IDictionaryEnumerator iter = nonWeightedTerms.GetEnumerator();
                while (iter.MoveNext()) {
                    Term term = (Term)iter.Value;
                    if ((fieldName == null) || (term.Field() == fieldName)) {
                        WeightedTerm temp = new WeightedTerm(query.GetBoost(), term.Text());
                        terms.Add(temp, temp);
                    }
                }

                /*
                for (System.Collections.IEnumerator iter = nonWeightedTerms.GetEnumerator(); iter.MoveNext(); )
                {
                    Term term = (Term) iter.Current;
                    if ((fieldName == null) || (term.Field() == fieldName))
                    {
                        WeightedTerm temp = new WeightedTerm(query.GetBoost(), term.Text());
                        terms.Add(temp, temp);
                    }
                }
                 */

            }
            catch (System.NotSupportedException ignore)
            {
                //this is non-fatal for our purposes
            }
        }
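
As a standalone illustration of note 2 (this is not part of the 
Lucene.Net source), here is a minimal sketch of why the cast on 
iter.Current fails while reading iter.Value works.  Plain strings stand 
in for Term objects, and the class and method names are hypothetical, 
chosen only for this example.

```csharp
using System;
using System.Collections;

// Hypothetical demo class; strings stand in for Lucene Term objects.
public static class HashtableEnumerationSketch
{
    // Mirrors the failing loop: cast the enumerator's Current element
    // directly to the stored type and report what happens.
    public static string DescribeDirectCast(Hashtable table)
    {
        IEnumerator iter = table.GetEnumerator();
        if (!iter.MoveNext())
        {
            return "empty table";
        }
        try
        {
            // For a Hashtable, Current is a boxed DictionaryEntry.
            string text = (string) iter.Current;
            return "cast ok: " + text;
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    // Mirrors the working loop: read the stored object through Value.
    public static ArrayList CollectValues(Hashtable table)
    {
        ArrayList values = new ArrayList();
        IDictionaryEnumerator iter = table.GetEnumerator();
        while (iter.MoveNext())
        {
            values.Add((string) iter.Value);
        }
        return values;
    }

    public static void Main()
    {
        // Same object as key and value, as assumed for nonWeightedTerms.
        Hashtable table = new Hashtable();
        table.Add("lucene", "lucene");
        table.Add("highlighter", "highlighter");

        Console.WriteLine(DescribeDirectCast(table));
        foreach (string value in CollectValues(table))
        {
            Console.WriteLine(value);
        }
    }
}
```

A foreach over nonWeightedTerms.Values (or .Keys) would be an equivalent 
way to reach the stored objects without handling the enumerator directly.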

Hth

Reuben.

RE: Lucene highlighter - issue - no highlights returned - a proposed solution

Posted by George Aroush <ge...@aroush.net>.
Hi Reuben,

By "latest" you mean Lucene.Net 2.0 build 3 and Highlighter.Net 2.0 build 0,
right?  If so, as you may have noticed in the release notes, Highlighter.Net
2.0 is not fully debugged yet; it is an "alpha" release, and a large number
of its NUnit tests are failing.  The code change you provided might fix all
of those issues.  I will try it tonight.

Many thanks for looking into this issue.

Regards,

-- George Aroush

