Posted to dev@nutch.apache.org by "Brian Zhao (JIRA)" <ji...@apache.org> on 2016/06/02 16:09:59 UTC

[jira] [Created] (NUTCH-2273) Selenium and InteractiveSelenium Do Not Support HTTPS

Brian Zhao created NUTCH-2273:
---------------------------------

             Summary: Selenium and InteractiveSelenium Do Not Support HTTPS
                 Key: NUTCH-2273
                 URL: https://issues.apache.org/jira/browse/NUTCH-2273
             Project: Nutch
          Issue Type: Bug
          Components: plugin
    Affects Versions: 1.11
            Reporter: Brian Zhao


Neither the Selenium nor the InteractiveSelenium plugin declares the https protocol in its plugin.xml, so neither will fetch https links.

To fix this for the Selenium plugin, add:
  
      <implementation id="org.apache.nutch.protocol.selenium.Http"
                      class="org.apache.nutch.protocol.selenium.Http">
         <parameter name="protocolName" value="https"/>
      </implementation>

to Selenium's plugin.xml, as a child element of the "extension" element.
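
For context, here is a rough sketch of how the "extension" element might look once the https entry sits next to the existing one. The attributes on "extension" and the existing http entry are assumptions based on how Nutch protocol plugins are usually declared, not copied from the actual file, so keep whatever your plugin.xml already contains and only add the second "implementation" block.

```xml
<!-- Sketch only: the attributes on <extension> are assumed;
     keep the values already present in your plugin.xml. -->
<extension id="org.apache.nutch.protocol.selenium"
           name="HttpProtocol"
           point="org.apache.nutch.protocol.Protocol">

   <!-- Assumed existing entry: registers the class for http URLs -->
   <implementation id="org.apache.nutch.protocol.selenium.Http"
                   class="org.apache.nutch.protocol.selenium.Http">
      <parameter name="protocolName" value="http"/>
   </implementation>

   <!-- Proposed addition from this issue: registers the same class for https URLs -->
   <implementation id="org.apache.nutch.protocol.selenium.Http"
                   class="org.apache.nutch.protocol.selenium.Http">
      <parameter name="protocolName" value="https"/>
   </implementation>

</extension>
```

The plugin.xml entry only makes Nutch route https URLs to the plugin; the plugin's HttpResponse.java still has to handle the TLS connection itself.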

An HTTPS implementation already exists in protocol-http's HttpResponse.java, and I've merged it into Selenium's HttpResponse.java here: http://pastebin.com/ZAPfwee4

The same change should probably be made for the InteractiveSelenium plugin.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)