Posted to dev@nutch.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/10/15 05:13:05 UTC
[jira] [Commented] (NUTCH-2141) Change the InteractiveSelenium
plugin handler Interface to return page content
[ https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958233#comment-14958233 ]
ASF GitHub Bot commented on NUTCH-2141:
---------------------------------------
GitHub user balajig17 opened a pull request:
https://github.com/apache/nutch/pull/77
fix for NUTCH-2141 contributed by Balaji Gurumurthy
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/balajig17/nutch NUTCH-2141
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nutch/pull/77.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #77
----
commit d9486a5567ceb9a6c77e6fe3994350f37a433510
Author: Balaji <ba...@gmail.com>
Date: 2015-10-15T03:10:16Z
fix for NUTCH-2141 contributed by Balaji Gurumurthy
----
> Change the InteractiveSelenium plugin handler Interface to return page content
> ------------------------------------------------------------------------------
>
> Key: NUTCH-2141
> URL: https://issues.apache.org/jira/browse/NUTCH-2141
> Project: Nutch
> Issue Type: Improvement
> Components: plugin
> Reporter: Balaji Gurumurthy
> Labels: selenium
>
> The handler interface in the protocol-interactiveselenium plugin currently provides methods to manipulate the page content, and the HTTPResponse class reads the page content from the driver. This limits the amount of HTML content that can be returned to Nutch.
> The processDriver method could return a String object instead. This is particularly helpful in cases such as handling pagination, where the content of multiple pages can be appended and returned from the handler.
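
A minimal sketch of the pagination case described above, assuming a processDriver method that returns a String. Only the processDriver name and the String return type come from the issue text. PageDriver, Handler, PaginationHandler and FakeDriver are hypothetical names for illustration: PageDriver stands in for the Selenium driver so that the example runs without Selenium, and none of this is taken from the actual patch.

```java
import java.util.Arrays;
import java.util.List;

public class PaginationHandlerSketch {

    /** Hypothetical stand-in for the Selenium driver; not the real WebDriver API. */
    interface PageDriver {
        String getPageSource();   // HTML of the page currently loaded
        boolean hasNextPage();    // true if another page can be loaded
        void goToNextPage();      // loads the next page
    }

    /** Assumed shape of the handler once processDriver returns the content. */
    interface Handler {
        String processDriver(PageDriver driver);
    }

    /** Appends the HTML of every page and returns the combined content. */
    static class PaginationHandler implements Handler {
        @Override
        public String processDriver(PageDriver driver) {
            StringBuilder content = new StringBuilder(driver.getPageSource());
            while (driver.hasNextPage()) {
                driver.goToNextPage();
                content.append(driver.getPageSource());
            }
            return content.toString();
        }
    }

    /** In-memory driver that serves a fixed list of pages, for demonstration only. */
    static class FakeDriver implements PageDriver {
        private final List<String> pages;
        private int index = 0;

        FakeDriver(List<String> pages) {
            this.pages = pages;
        }

        @Override
        public String getPageSource() {
            return pages.get(index);
        }

        @Override
        public boolean hasNextPage() {
            return index < pages.size() - 1;
        }

        @Override
        public void goToNextPage() {
            index++;
        }
    }

    public static void main(String[] args) {
        PageDriver driver = new FakeDriver(Arrays.asList("<p>page 1</p>", "<p>page 2</p>"));
        String html = new PaginationHandler().processDriver(driver);
        System.out.println(html);
    }
}
```

Because the handler returns the content itself, the caller no longer has to read it from the driver's current page, which would hold only the last page visited.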
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)