Posted to dev@nutch.apache.org by Jack Tang <hi...@gmail.com> on 2005/04/07 09:20:37 UTC

Re: Highlighting query words in cached html

Hi Ferenc

This is not the answer to your question, but I have extracted the
highlighting of query words in summaries into configuration. Here is what
I have done:


---------- Configuration file (nutch-site.xml) ----------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="nutch-conf.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<nutch-conf>

	<!-- terms highlight style -->
	<property>
	  <name>summary.fragment.highlight.exp</name>
	  <value>token</value>
	  <description>Default is '%token%'. The placeholder (expression) in
	   the highlight style that is replaced by the matched term.</description>
	</property>
	<property>
	  <name>summary.fragment.highlight.style</name>
	  <value>&lt;font color="red"&gt;token&lt;/font&gt;</value>
	  <description>Default is '&lt;b&gt;%token%&lt;/b&gt;'. The markup used
	  to highlight matched terms in the summary.</description>
	</property>

</nutch-conf>
 
-------- Change code in Summary.java-------------
In the toString() method of the inner class Highlight:
- return "<b>" + super.toString() + "</b>";

+ NutchConf conf = NutchConf.get();
+ String termName = conf.get("summary.fragment.highlight.exp","%token%");
+ String style = conf.get("summary.fragment.highlight.style","<b>%token%</b>");
+
+ return style.replaceAll(termName,super.toString());
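
One caveat with the replaceAll call above: String.replaceAll treats its first argument as a regular expression and gives '$' and '\' special meaning in the replacement, so a matched term such as "US$1" would likely throw an exception or produce the wrong output. Here is a minimal standalone sketch of a literal substitution, assuming Java 5 or later for String.replace(CharSequence, CharSequence). The class HighlightSketch and the method apply are hypothetical names for illustration only and are not part of Nutch:

```java
public class HighlightSketch {

    // Hypothetical helper (not Nutch API): inserts the matched term into the
    // style template. String.replace does a literal, non-regex substitution
    // of every occurrence of the placeholder.
    static String apply(String style, String placeholder, String term) {
        return style.replace(placeholder, term);
    }

    public static void main(String[] args) {
        // Same style and placeholder as in the nutch-site.xml example.
        String style = "<font color=\"red\">token</font>";
        // Prints: <font color="red">US$1</font>
        System.out.println(apply(style, "token", "US$1"));
    }
}
```

In Summary.java this would amount to returning style.replace(termName, super.toString()) instead of calling replaceAll.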

In the future, you could write some JavaScript to turn the highlighting on and off :)

Regards 
  
/Jack
 



On Apr 7, 2005 5:14 PM, yoursoft@freemail.hu <yo...@freemail.hu> wrote:
> Dear Guys,
> 
> I would like to highlight the searched words in cached HTML content (as
> Google does).
> I have a problem with it:
> If the query is, e.g., window, and the HTML contains JavaScript with, e.g.,
> window.open, then replacing every occurrence of "window" in the content
> breaks the cached page. I need to change only the 'text' content of the
> cached HTML.
> Does anyone have an idea how to do this?
> 
> Best Regards,
>    Ferenc
>