Posted to user@nutch.apache.org by Viksit Gaur <vi...@gmail.com> on 2008/07/01 18:40:54 UTC
Nutch SWF based on Adobe's latest spec?
Hi All,
I was wondering if there are plans to update the SWF parser with the
latest Adobe specification? Could someone also shed some light on how
the new specifications could (would?) change the way this parser works?
Cheers,
Viksit
Re: Nutch SWF based on Adobe's latest spec?
Posted by Andrzej Bialecki <ab...@getopt.org>.
Viksit Gaur wrote:
> Hi All,
>
> I was wondering if there are plans to update the SWF parser with the
> latest Adobe specification? Could someone also shed some light on how
> the new specifications could (would?) change the way this parser works?
The Nutch community welcomes Slashdot readers ;)
Currently the Nutch SWF parser is based on a fairly old library,
Java/SWF. I just browsed through the Open Source Flex SDK, and there is
a chance that the swfutils/ package could provide the text / outlink
extraction functionality that we need. It remains to be seen whether it
works significantly better than the current parser ...
As far as I'm aware there are no plans in Nutch or Tika to write an SWF
parser from scratch, based on this updated specification. If someone
were to submit code that uses swfutils.jar to parse the newer
formats, we would certainly welcome such a contribution. :)
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com