Posted to dev@nutch.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/10/18 21:41:05 UTC

[jira] [Resolved] (NUTCH-2141) Change the InteractiveSelenium plugin handler Interface to return page content

     [ https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann resolved NUTCH-2141.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.11

Thanks [~BalaJira] [~joyce@apache.org], plenty to improve on but a great start!

{noformat}
[chipotle:~/tmp/nutch1.11] mattmann% svn commit -m "Fix for NUTCH-2141: Change the InteractiveSelenium plugin handler Interface to return page content contributed by Balaji <ba...@gmail.com> this closes #77 #75"
Sending        CHANGES.txt
Sending        src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/HttpResponse.java
Sending        src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefalultMultiInteractionHandler.java
Sending        src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefaultClickAllAjaxLinksHandler.java
Sending        src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefaultHandler.java
Sending        src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/InteractiveSeleniumHandler.java
Transmitting file data ......
Committed revision 1709307.
[chipotle:~/tmp/nutch1.11] mattmann% 
{noformat}


> Change the InteractiveSelenium plugin handler Interface to return page content
> ------------------------------------------------------------------------------
>
>                 Key: NUTCH-2141
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2141
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin
>            Reporter: Balaji Gurumurthy
>            Assignee: Chris A. Mattmann
>              Labels: selenium
>             Fix For: 1.11
>
>
> The handler interface in the protocol-interactiveselenium plugin currently provides methods to manipulate the page content, and the HttpResponse class reads the page content from the driver. This limits the amount of HTML content that can be returned to Nutch.
> The processDriver method could return a String object instead. This is particularly helpful in cases such as handling pagination, where the content of multiple pages can be appended and returned from the handler.
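
The pagination case described in the issue can be sketched roughly as follows. This is only an illustration, not the committed plugin code: PageDriver is a hypothetical stand-in for Selenium's WebDriver so that the sketch compiles without Selenium on the classpath, and Handler, PaginationHandler, FakeDriver and goToNextPage() are names invented for this example. Only the idea that processDriver returns a String comes from the issue text.

```java
import java.util.Arrays;
import java.util.List;

public class PaginationHandlerSketch {

    /** Hypothetical stand-in for Selenium's WebDriver (assumption, not the plugin's type). */
    interface PageDriver {
        /** Returns the HTML of the page currently loaded. */
        String getPageSource();

        /** Advances to the next page; returns false when there is no next page. */
        boolean goToNextPage();
    }

    /** Handler shape proposed in the issue: processDriver returns the content as a String. */
    interface Handler {
        String processDriver(PageDriver driver);
    }

    /** Appends the source of every page reachable through goToNextPage(). */
    static class PaginationHandler implements Handler {
        @Override
        public String processDriver(PageDriver driver) {
            StringBuilder content = new StringBuilder();
            do {
                content.append(driver.getPageSource());
            } while (driver.goToNextPage());
            return content.toString();
        }
    }

    /** In-memory driver over a non-empty list of pages, used only to exercise the sketch. */
    static class FakeDriver implements PageDriver {
        private final List<String> pages;
        private int index = 0;

        FakeDriver(List<String> pages) {
            this.pages = pages;
        }

        @Override
        public String getPageSource() {
            return pages.get(index);
        }

        @Override
        public boolean goToNextPage() {
            if (index + 1 >= pages.size()) {
                return false;
            }
            index++;
            return true;
        }
    }

    public static void main(String[] args) {
        PageDriver driver = new FakeDriver(Arrays.asList("<p>page 1</p>", "<p>page 2</p>"));
        // The caller receives the combined content of all pages, not only the last page loaded.
        System.out.println(new PaginationHandler().processDriver(driver));
    }
}
```

Because the handler returns the accumulated String, the caller no longer has to read the content from the driver after the handler finishes, which would yield only the page the driver is on at that moment.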



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)