Posted to dev@streams.apache.org by "Steve Blackmon (JIRA)" <ji...@apache.org> on 2015/07/01 03:31:05 UTC

[jira] [Created] (STREAMS-345) LinkCrawler in streams-processor-urls

Steve Blackmon created STREAMS-345:
--------------------------------------

             Summary: LinkCrawler in streams-processor-urls
                 Key: STREAMS-345
                 URL: https://issues.apache.org/jira/browse/STREAMS-345
             Project: Streams
          Issue Type: Improvement
            Reporter: Steve Blackmon
            Assignee: Steve Blackmon


LinkResolverProcessor can follow links through redirects, tracking status codes and other metadata, but does not save the content of the page.

Add a processor to the streams-processor-urls module that retrieves and saves the content of the web pages referenced in the links field or in the url fields of activity objects.
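
A rough sketch of the requested behavior, for discussion only. It does not use the Streams processor interfaces. The class name LinkCrawlerSketch, the method names, and the document shape (a "links" list of strings plus an "object" map with a "url" string) are assumptions made for illustration. The page fetcher is passed in as a function so that the HTTP client stays a separate choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: not the Apache Streams API, just the idea in the issue.
public class LinkCrawlerSketch {

    // Collects candidate URLs from the assumed "links" list and "object.url" field.
    public static List<String> extractUrls(Map<String, Object> activity) {
        List<String> urls = new ArrayList<>();
        Object links = activity.get("links");
        if (links instanceof List) {
            for (Object link : (List<?>) links) {
                if (link instanceof String && !urls.contains(link)) {
                    urls.add((String) link);
                }
            }
        }
        Object object = activity.get("object");
        if (object instanceof Map) {
            Object url = ((Map<?, ?>) object).get("url");
            if (url instanceof String && !urls.contains(url)) {
                urls.add((String) url);
            }
        }
        return urls;
    }

    // Fetches every extracted URL with the supplied fetcher and returns the
    // page content keyed by URL. URLs whose fetch fails or returns null are skipped.
    public static Map<String, String> crawl(Map<String, Object> activity,
                                            Function<String, String> fetcher) {
        Map<String, String> pages = new LinkedHashMap<>();
        for (String url : extractUrls(activity)) {
            try {
                String body = fetcher.apply(url);
                if (body != null) {
                    pages.put(url, body);
                }
            } catch (RuntimeException e) {
                // A single bad link should not stop the rest of the document.
            }
        }
        return pages;
    }
}
```

An actual processor would presumably attach the returned content to the outgoing document instead of returning a separate map; where the content should be stored is left open here.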


