Posted to dev@nutch.apache.org by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/10/24 17:28:17 UTC

[jira] Resolved: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory

     [ http://issues.apache.org/jira/browse/NUTCH-379?page=all ]

Sami Siren resolved NUTCH-379.
------------------------------

    Resolution: Fixed

Committed this to 0.8(.x) branch and trunk. Thanks Chris.

> ParseUtil does not pass through the content's URL to the ParserFactory
> ----------------------------------------------------------------------
>
>                 Key: NUTCH-379
>                 URL: http://issues.apache.org/jira/browse/NUTCH-379
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.8.1, 0.8, 0.9.0
>         Environment: Power Mac Dual G5, 2.0 GHz, although fix is independent of environment
>            Reporter: Chris A. Mattmann
>         Assigned To: Chris A. Mattmann
>             Fix For: 0.8.2, 0.9.0
>
>         Attachments: NUTCH-379.Mattmann.100406.patch.txt
>
>
> Currently, the ParseUtil class, which the Fetcher calls to parse content, does not forward the content's URL to the ParserFactory. A bigger issue, however, is that the URL (and, for that matter, the pathSuffix) is no longer used to determine which parsing plugin should be called. My colleague at JPL discovered that more serious bug and will soon file a JIRA issue for it. In the meantime, this small patch at least sets up the forwarding of the content's URL to the ParserFactory.
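
For readers unfamiliar with the code, the kind of change described above can be illustrated with a minimal, self-contained sketch. This is not the attached patch and not Nutch's actual API: the Content and ParserFactory classes below are simplified stand-ins, and the method names (getParserName, parseBefore, parseAfter), the parser labels, and the suffix-based lookup are all assumptions made for illustration only. The sketch only shows why a factory that never receives the URL cannot take it into account.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class UrlForwardingSketch {

    /** Hypothetical stand-in for fetched content: a URL plus a content type. */
    static final class Content {
        final String url;
        final String contentType;

        Content(String url, String contentType) {
            this.url = url;
            this.contentType = contentType;
        }
    }

    /** Hypothetical factory that may use the URL suffix to choose a parser. */
    static final class ParserFactory {
        private final Map<String, String> bySuffix = new LinkedHashMap<>();

        ParserFactory() {
            // Illustrative labels only, not real plugin identifiers.
            bySuffix.put(".pdf", "pdf-parser");
            bySuffix.put(".html", "html-parser");
        }

        String getParserName(String contentType, String url) {
            // Prefer a suffix match when a non-empty URL is available.
            if (url != null && !url.isEmpty()) {
                String lower = url.toLowerCase(Locale.ROOT);
                for (Map.Entry<String, String> e : bySuffix.entrySet()) {
                    if (lower.endsWith(e.getKey())) {
                        return e.getValue();
                    }
                }
            }
            // Otherwise fall back to the content type alone.
            return "text/html".equals(contentType) ? "html-parser" : "default-parser";
        }
    }

    /** The caller drops the URL, so the factory only sees the content type. */
    static String parseBefore(ParserFactory factory, Content content) {
        return factory.getParserName(content.contentType, "");
    }

    /** The caller forwards the content's URL, guarding against null. */
    static String parseAfter(ParserFactory factory, Content content) {
        String url = content.url != null ? content.url : "";
        return factory.getParserName(content.contentType, url);
    }

    public static void main(String[] args) {
        ParserFactory factory = new ParserFactory();
        Content pdf = new Content("http://example.com/report.pdf", "application/octet-stream");
        System.out.println("without URL: " + parseBefore(factory, pdf));
        System.out.println("with URL:    " + parseAfter(factory, pdf));
    }
}
```

In this sketch the two call paths differ only in the second argument, which is the point: forwarding the URL is a small change in the caller, while whether the factory actually uses it for plugin selection is the separate, larger issue mentioned above.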

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira