Posted to dev@nutch.apache.org by "Stephan Strittmatter (JIRA)" <ji...@apache.org> on 2005/04/01 10:22:43 UTC
[jira] Updated: (NUTCH-21) parser plugin for MS PowerPoint slides
[ http://issues.apache.org/jira/browse/NUTCH-21?page=history ]
Stephan Strittmatter updated NUTCH-21:
--------------------------------------
Attachment: parse-mspowerpoint.zip
build.xml.patch.txt
Attached you can find the complete PowerPoint parser.
Also included in the zip are:
* JUnit test with one sample (the protocol-file plugin is required to run this)
It is a very detailed test that checks the result character by character.
I think it would be a good idea to extract this as a small test
environment for all parsers to get the most useful parsing results.
* The required POI jars are also included.
The build.xml file of the plugin directory has to be updated for this additional plugin. For this I attached build.xml.patch.txt.
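As a rough illustration of the character-by-character check described above, a parser-independent helper might look like the sketch below. This is not the code from the attached zip: the class name CharDiff and its methods are hypothetical, and it assumes the expected and the parsed text are both available as plain Java strings.

```java
/**
 * Hypothetical helper for comparing parser output with an expected text
 * one character at a time. Not part of Nutch or of the attached plugin.
 */
public final class CharDiff {

    private CharDiff() {
        // static helpers only
    }

    /**
     * Returns the index of the first character at which the two strings
     * differ, or -1 if they are identical. If one string is a prefix of
     * the other, the length of the shorter string is returned.
     */
    public static int firstMismatch(String expected, String actual) {
        int common = Math.min(expected.length(), actual.length());
        for (int i = 0; i < common; i++) {
            if (expected.charAt(i) != actual.charAt(i)) {
                return i;
            }
        }
        return expected.length() == actual.length() ? -1 : common;
    }

    /** Throws an AssertionError naming the first differing position. */
    public static void assertSameText(String expected, String actual) {
        int pos = firstMismatch(expected, actual);
        if (pos >= 0) {
            throw new AssertionError("Parsed text differs at char index " + pos);
        }
    }
}
```

A test for any parser could then call CharDiff.assertSameText(expectedText, parsedText) and, on failure, report the exact position of the first difference instead of only "not equal".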
> parser plugin for MS PowerPoint slides
> --------------------------------------
>
> Key: NUTCH-21
> URL: http://issues.apache.org/jira/browse/NUTCH-21
> Project: Nutch
> Type: Improvement
> Components: fetcher
> Reporter: Stefan Groschupf
> Priority: Trivial
> Attachments: build.xml.patch.txt, parse-mspowerpoint.zip
>
> transferred from:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1109321&group_id=59548&atid=491356
> submitted by:
> Stephan Strittmatter
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
http://www.atlassian.com/software/jira