Posted to dev@nutch.apache.org by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2021/04/19 08:36:00 UTC
[jira] [Created] (NUTCH-2861) Remove parse-swf
Sebastian Nagel created NUTCH-2861:
--------------------------------------
Summary: Remove parse-swf
Key: NUTCH-2861
URL: https://issues.apache.org/jira/browse/NUTCH-2861
Project: Nutch
Issue Type: Improvement
Components: parser, plugin
Affects Versions: 1.18
Reporter: Sebastian Nagel
Fix For: 1.19
We should consider removing the Shockwave Flash parser plugin ([parse-swf|https://github.com/apache/nutch/tree/master/src/plugin/parse-swf]):
- Shockwave/[Adobe Flash|https://en.wikipedia.org/wiki/Adobe_Flash] has reached [end-of-life|https://helpx.adobe.com/shockwave/shockwave-end-of-life-faq.html]
- major browsers now block playing Flash content
- the plugin is based on a 15-year-old library ([javaswf|https://github.com/apache/nutch/tree/master/src/plugin/parse-swf/lib]) that is no longer maintained and is not available in a Maven repository
- it is also shipped in binary form in the source package, which contradicts the [Apache release policy|https://www.apache.org/legal/release-policy.html#source-packages]
Notes:
- a notice about the removal should be placed in the release notes, as parse-tika is not able to extract textual content from *.swf files
- do not forget to unregister the plugin in [parse-plugins.xml|https://github.com/apache/nutch/blob/6c02da053d8ce65e0283a144ab59586e563608b8/conf/parse-plugins.xml.template#L54]
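To illustrate the unregistering step, here is a minimal sketch of how the references to a plugin could be stripped from a parse-plugins.xml-style document. The sample XML is a hypothetical, shortened stand-in for the real conf/parse-plugins.xml.template (its entries and the extension ids are assumptions, not copied from the repository), and the helper name unregister_plugin is made up for this example. In practice the entries would most likely be removed by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample modelled on the layout of parse-plugins.xml:
# <mimeType> elements list plugin ids, <aliases> maps plugin ids to extensions.
SAMPLE = """<parse-plugins>
  <mimeType name="*">
    <plugin id="parse-tika" />
  </mimeType>
  <mimeType name="application/x-shockwave-flash">
    <plugin id="parse-swf" />
  </mimeType>
  <aliases>
    <alias name="parse-tika" extension-id="org.apache.nutch.parse.tika.TikaParser" />
    <alias name="parse-swf" extension-id="org.apache.nutch.parse.swf.SWFParser" />
  </aliases>
</parse-plugins>"""


def unregister_plugin(xml_text, plugin_id):
    """Return xml_text with every reference to plugin_id removed.

    Note: ElementTree drops XML comments when parsing, so this is only
    suitable as an illustration, not for rewriting the real template.
    """
    root = ET.fromstring(xml_text)

    # Drop <plugin id="..."> entries; remove a <mimeType> that becomes empty.
    for mime in list(root.findall("mimeType")):
        removed = False
        for plugin in list(mime.findall("plugin")):
            if plugin.get("id") == plugin_id:
                mime.remove(plugin)
                removed = True
        if removed and len(mime) == 0:
            root.remove(mime)

    # Drop the matching <alias name="..."> entry, if any.
    for aliases in root.findall("aliases"):
        for alias in list(aliases.findall("alias")):
            if alias.get("name") == plugin_id:
                aliases.remove(alias)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(unregister_plugin(SAMPLE, "parse-swf"))
```

With the sample above, the call removes both the application/x-shockwave-flash mapping and the parse-swf alias, and leaves the parse-tika entries untouched.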
--
This message was sent by Atlassian Jira
(v8.3.4#803005)