Posted to dev@nutch.apache.org by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2021/04/19 08:36:00 UTC

[jira] [Created] (NUTCH-2861) Remove parse-swf

Sebastian Nagel created NUTCH-2861:
--------------------------------------

             Summary: Remove parse-swf
                 Key: NUTCH-2861
                 URL: https://issues.apache.org/jira/browse/NUTCH-2861
             Project: Nutch
          Issue Type: Improvement
          Components: parser, plugin
    Affects Versions: 1.18
            Reporter: Sebastian Nagel
             Fix For: 1.19


We should consider removing the Shockwave Flash parser plugin ([parse-swf|https://github.com/apache/nutch/tree/master/src/plugin/parse-swf]):
- Shockwave/[Adobe Flash|https://en.wikipedia.org/wiki/Adobe_Flash] has reached [end-of-life|https://helpx.adobe.com/shockwave/shockwave-end-of-life-faq.html]
- major browsers now block playing Flash content
- the plugin is based on a 15-year-old library ([javaswf|https://github.com/apache/nutch/tree/master/src/plugin/parse-swf/lib]) that is no longer maintained and is not available in a Maven repository
- it is shipped in binary form, even in the source package, which contradicts the [Apache release policy|https://www.apache.org/legal/release-policy.html#source-packages]

Notes:
- we should place a notice about the removal in the release notes, as parse-tika is not able to extract textual content from *.swf files
- do not forget to unregister the plugin in [parse-plugins.xml|https://github.com/apache/nutch/blob/6c02da053d8ce65e0283a144ab59586e563608b8/conf/parse-plugins.xml.template#L54]




--
This message was sent by Atlassian Jira
(v8.3.4#803005)