Posted to dev@nutch.apache.org by "Sami Siren (JIRA)" <ji...@apache.org> on 2005/04/06 15:25:16 UTC
[jira] Updated: (NUTCH-4) Serious bug: OutOfMemoryError: Java heap space
[ http://issues.apache.org/jira/browse/NUTCH-4?page=history ]
Sami Siren updated NUTCH-4:
---------------------------
Attachment: query_parser_unbalanced_fix.tar.gz
The attached file contains a JUnit test for the query parser and a proposed fix for the unbalanced quote bug.
To apply:
- copy the test case into the right directory
- apply the patch
To generate the new parser:
ant generate-src
...and compile normally.
> Serious bug: OutOfMemoryError: Java heap space
> ----------------------------------------------
>
> Key: NUTCH-4
> URL: http://issues.apache.org/jira/browse/NUTCH-4
> Project: Nutch
> Type: Bug
> Reporter: Stefan Groschupf
> Assignee: Sami Siren
> Attachments: query_parser_unbalanced_fix.tar.gz
>
> posted by: msashnikov
> http://sourceforge.net/tracker/index.php?func=detail&aid=1110947&group_id=59548&atid=491356
> Serious bug: OutOfMemoryError: Java heap space
> Nutch 0.6 throws the following exception when the
> search phrase includes just a single quote. Something
> like "java or ja"va.
> Here is the exception:
> javax.servlet.ServletException: Java heap space
> org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:845)
> org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:778)
> org.apache.jsp.search_jsp._jspService(org.apache.jsp.search_jsp:685)
> org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:99)
> javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:325)
> org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:295)
> org.apache.jasper.servlet.JspServlet.service(JspServlet.java:245)
> javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> root cause
> java.lang.OutOfMemoryError: Java heap space
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
http://www.atlassian.com/software/jira
Re: Highlighting query words in cached html
Posted by Jack Tang <hi...@gmail.com>.
Hi Ferenc
This is not the answer to your question, but I have worked on highlighting
query words in summaries. Here is what I have done:
---------- Configuration file(nutch-site.xml)------------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="nutch-conf.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<nutch-conf>
<!-- terms highlight style -->
<property>
<name>summary.fragment.highlight.exp</name>
<value>token</value>
<description>Default is '%token%'. The placeholder (expression) that
will be replaced in the highlight style.</description>
</property>
<property>
<name>summary.fragment.highlight.style</name>
<value>&lt;font color="red"&gt;token&lt;/font&gt;</value>
<description>Default is '&lt;b&gt;%token%&lt;/b&gt;'. The style of the
highlight in the summary.</description>
</property>
</nutch-conf>
-------- Change code in Summary.java-------------
In the toString() method of the inner class Highlight:
- return "<b>" + super.toString() + "</b>";
+ NutchConf conf = NutchConf.get();
+ String termName = conf.get("summary.fragment.highlight.exp","%token%");
+ String style = conf.get("summary.fragment.highlight.style","<b>%token%</b>");
+
+ return style.replaceAll(termName,super.toString());
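
One caveat with the patch above: String.replaceAll treats its first argument as a regular expression and gives '$' and '\' a special meaning in the replacement, so a highlighted term such as $5 can throw an exception or come out altered. A minimal sketch of a literal substitution follows, assuming Java 5 or later for String.replace(CharSequence, CharSequence). The names HighlightTemplate and render are hypothetical and not part of Nutch.

```java
final class HighlightTemplate {
    private HighlightTemplate() {}

    /** Replaces every literal occurrence of placeholder in style with term. */
    static String render(String style, String placeholder, String term) {
        // String.replace works on literal text: the placeholder is not a
        // regex, and '$' or '\' in the term are copied through unchanged.
        return style.replace(placeholder, term);
    }

    public static void main(String[] args) {
        // Default style from the patch, with a plain term.
        System.out.println(render("<b>%token%</b>", "%token%", "nutch"));
        // Style from the sample configuration, with a term containing '$'.
        System.out.println(render("<font color=\"red\">token</font>", "token", "$5"));
    }
}
```

With this variant the placeholder can also safely contain regex metacharacters such as '.' or '*'.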
In the future, you could write some JavaScript to turn the highlighting on and off :)
Regards
/Jack
On Apr 7, 2005 5:14 PM, yoursoft@freemail.hu <yo...@freemail.hu> wrote:
> Dear Guys,
>
> I would like to highlight the searched words in cached HTML content (as
> Google does).
> I have a problem with it:
> If the query is, e.g., window, and the HTML contains JavaScript with, e.g.,
> window.open, then replacing every occurrence of "window" in the content
> will break the cached page. I need to change only the 'text' content of
> the cached HTML.
> Does anyone have an idea how to do this?
>
> Best Regards,
> Ferenc
>
Highlighting query words in cached html
Posted by "yoursoft@freemail.hu" <yo...@freemail.hu>.
Dear Guys,
I would like to highlight the searched words in cached HTML content (as
Google does).
I have a problem with it:
If the query is, e.g., window, and the HTML contains JavaScript with, e.g.,
window.open, then replacing every occurrence of "window" in the content
will break the cached page. I need to change only the 'text' content of
the cached HTML.
Does anyone have an idea how to do this?
Best Regards,
Ferenc
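
One possible approach to the question above, as a rough sketch only: copy all markup (tags, comments, and whole script and style blocks) through unchanged and apply the highlight only to the text between them. The class name CachedHtmlHighlighter and its method are hypothetical and not part of Nutch. The sketch assumes the query term is a plain word and reasonably well-formed HTML; a real HTML parser would be more robust than this regex-based split.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CachedHtmlHighlighter {
    private CachedHtmlHighlighter() {}

    // Spans copied verbatim: whole script and style blocks, comments,
    // and any other single tag (including its attributes).
    private static final Pattern MARKUP = Pattern.compile(
        "<script\\b.*?</script\\s*>|<style\\b.*?</style\\s*>|<!--.*?-->|<[^>]*>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Wraps whole-word matches of term in <b>...</b>, but only in text. */
    static String highlight(String html, String term) {
        Pattern word = Pattern.compile(
            "\\b" + Pattern.quote(term) + "\\b", Pattern.CASE_INSENSITIVE);
        StringBuilder out = new StringBuilder();
        Matcher markup = MARKUP.matcher(html);
        int pos = 0;
        while (markup.find()) {
            // Text before this markup span gets highlighted.
            out.append(highlightText(html.substring(pos, markup.start()), word));
            // The markup span itself is left untouched.
            out.append(markup.group());
            pos = markup.end();
        }
        out.append(highlightText(html.substring(pos), word));
        return out.toString();
    }

    private static String highlightText(String text, Pattern word) {
        // $0 is the matched text, so the original casing is preserved.
        return word.matcher(text).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        String html = "<p>Open the window</p><script>window.open('x');</script>";
        System.out.println(highlight(html, "window"));
    }
}
```

In the example, only the "window" inside the paragraph is wrapped; the window.open call inside the script block is passed through as-is.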