You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@nutch.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/10/18 21:35:05 UTC

[jira] [Assigned] (NUTCH-2141) Change the InteractiveSelenium plugin handler Interface to return page content

     [ https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann reassigned NUTCH-2141:
----------------------------------------

    Assignee: Chris A. Mattmann

> Change the InteractiveSelenium plugin handler Interface to return page content
> ------------------------------------------------------------------------------
>
>                 Key: NUTCH-2141
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2141
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin
>            Reporter: Balaji Gurumurthy
>            Assignee: Chris A. Mattmann
>              Labels: selenium
>
> The handler interface in the protocol-interactiveselenium plugin currently provide methods to manipulate the page content and the HTTPResponse class read's the page content from the driver. This limits the amount of HTML content that could be returned to nutch.
> The processDriver method could return a String object instead. This is particularly helpful  in cases such as handling pagination when multiple pages' content can be appended and returned from the handler. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)