Posted to dev@nutch.apache.org by "Rohit Kulkarni (JIRA)" <ji...@apache.org> on 2005/04/26 08:21:23 UTC

[jira] Created: (NUTCH-53) Parser plugin for Zip files

Parser plugin for Zip files
---------------------------

         Key: NUTCH-53
         URL: http://issues.apache.org/jira/browse/NUTCH-53
     Project: Nutch
        Type: Improvement
  Components: fetcher  
    Reporter: Rohit Kulkarni
    Priority: Trivial


Nutch plugin to parse Zip files (using java.util.zip)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Updated: (NUTCH-53) Parser plugin for Zip files

Posted by "Rohit Kulkarni (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/NUTCH-53?page=all ]

Rohit Kulkarni updated NUTCH-53:
--------------------------------

    Attachment: parse-zip.zip

The plugin has been tested against the latest Nutch SVN and seems to work fine.
It currently handles, and calls the parsers for, the following types of files within the zip file:
text/plain
text/html
msexcel
mspowerpoint
msword
pdf
rtf
mp3
zip
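
For readers without the attachment, here is a rough sketch of the general idea: walking the entries of a zip with java.util.zip and deciding which parser each entry would go to. This is not the code in parse-zip.zip. The class name, the helper methods, and the extension-to-type table are all assumptions made up for illustration, and the sketch only reports the guessed type instead of calling a real Nutch parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustrative sketch only; not the attached plugin.
class ZipEntrySketch {

    // Hypothetical extension-to-type table (assumption, not taken from the plugin).
    static String guessType(String name) {
        String lower = name.toLowerCase();
        if (lower.endsWith(".txt")) return "text/plain";
        if (lower.endsWith(".html") || lower.endsWith(".htm")) return "text/html";
        if (lower.endsWith(".pdf")) return "application/pdf";
        if (lower.endsWith(".zip")) return "application/zip";
        return "unknown";
    }

    // Reads every file entry and maps its name to "<guessed type> (<size> bytes)".
    static Map<String, String> describeEntries(InputStream in) throws IOException {
        Map<String, String> described = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(in)) {
            byte[] buf = new byte[4096];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content to parse
                }
                // read() returns -1 at the end of the current entry, not of the whole archive
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                while ((n = zin.read(buf)) != -1) {
                    content.write(buf, 0, n);
                }
                // A real plugin would hand content.toByteArray() to the parser for this type.
                described.put(entry.getName(),
                        guessType(entry.getName()) + " (" + content.size() + " bytes)");
            }
        }
        return described;
    }

    // Builds a tiny two-entry archive in memory so the sketch can run on its own.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry("notes.txt"));
            zout.write("hello".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
            zout.putNextEntry(new ZipEntry("page.html"));
            zout.write("<html></html>".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> described =
                describeEntries(new ByteArrayInputStream(sampleZip()));
        for (Map.Entry<String, String> e : described.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A nested zip entry could in principle be handled by passing its bytes back through the same method, but whether the attached plugin does it that way is not stated in this thread.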

Please try it out and let me know if you have any suggestions.

The plugin is attached as a zip file.

thanks,

Rohit & Ashish

> Parser plugin for Zip files
> ---------------------------
>
>          Key: NUTCH-53
>          URL: http://issues.apache.org/jira/browse/NUTCH-53
>      Project: Nutch
>         Type: Improvement
>   Components: fetcher
>     Reporter: Rohit Kulkarni
>     Priority: Trivial
>  Attachments: parse-zip.zip
>
> Nutch plugin to parse Zip files (using java.util.zip)



[jira] Commented: (NUTCH-53) Parser plugin for Zip files

Posted by "ilango gurusamy (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/NUTCH-53?page=comments#action_12366059 ] 

ilango gurusamy commented on NUTCH-53:
--------------------------------------

Hi Rohit
Are there instructions for installing the plugin in Nutch? Are there any other areas where work is needed on this plugin?

thanks
ilango

> Parser plugin for Zip files
> ---------------------------
>
>          Key: NUTCH-53
>          URL: http://issues.apache.org/jira/browse/NUTCH-53
>      Project: Nutch
>         Type: Improvement
>   Components: fetcher
>     Reporter: Rohit Kulkarni
>     Priority: Trivial
>      Fix For: 0.8-dev
>  Attachments: parse-zip.zip
>
> Nutch plugin to parse Zip files (using java.util.zip)



[jira] Resolved: (NUTCH-53) Parser plugin for Zip files

Posted by "Jerome Charron (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/NUTCH-53?page=all ]
     
Jerome Charron resolved NUTCH-53:
---------------------------------

    Fix Version: 0.8-dev
     Resolution: Fixed

Parser committed after some minor refactoring due to some API changes.
(http://svn.apache.org/viewcvs.cgi?rev=278626&view=rev)

Thanks to Rohit Kulkarni.


