Posted to dev@nutch.apache.org by "Stas Batururimi (JIRA)" <ji...@apache.org> on 2018/12/10 07:57:00 UTC

[jira] [Commented] (NUTCH-2676) Update to the latest selenium and add code to use chrome and firefox headless mode with the remote web driver

    [ https://issues.apache.org/jira/browse/NUTCH-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714396#comment-16714396 ] 

Stas Batururimi commented on NUTCH-2676:
----------------------------------------

[~wastl-nagel] Some updates: the work is still in progress. I have updated some parts and am working on others while testing different configurations. The patch hasn't been abandoned.

By the way, can I add an option to ignore robots.txt, or is it better to keep that private and not push it to the main repository?

> Update to the latest selenium and add code to use chrome and firefox headless mode with the remote web driver
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-2676
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2676
>             Project: Nutch
>          Issue Type: New Feature
>          Components: protocol
>    Affects Versions: 1.15
>            Reporter: Stas Batururimi
>            Priority: Major
>             Fix For: 1.16
>
>         Attachments: Screenshot 2018-11-16 at 18.15.52.png
>
>
>  * Selenium needs to be updated
>  * the remote web driver for Chrome is missing
>  * headless mode needs to be added for both remote WebDriverBase Firefox & Chrome
>  * use case with Selenium Grid using Docker (1 hub Docker container, several nodes in different Docker containers, Nutch in another Docker container, streaming to Apache Solr in a Docker container; that is, at least 4 different Docker containers)
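
For illustration only, the headless request described in the list above could be sketched as follows. This is not the code from the patch: the class HeadlessCapabilities and the method newSessionPayload are hypothetical names, and the sketch uses only the Java standard library so that it does not depend on a particular Selenium version. It builds the JSON body of a W3C WebDriver "New Session" request, which is roughly what a remote web driver would send to a Selenium hub, using the vendor option keys "goog:chromeOptions" and "moz:firefoxOptions" and the browsers' headless command-line arguments.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Nutch or Selenium. */
final class HeadlessCapabilities {

    private HeadlessCapabilities() {
    }

    /**
     * Builds the JSON body of a W3C WebDriver "New Session" request that
     * asks for the given browser in headless mode.
     */
    static String newSessionPayload(String browser) {
        String name = browser.toLowerCase(Locale.ROOT);
        String optionsKey;
        String headlessArg;
        switch (name) {
            case "chrome":
                optionsKey = "goog:chromeOptions";
                headlessArg = "--headless"; // Chrome command-line switch
                break;
            case "firefox":
                optionsKey = "moz:firefoxOptions";
                headlessArg = "-headless"; // Firefox command-line switch
                break;
            default:
                throw new IllegalArgumentException("Unsupported browser: " + browser);
        }
        // The values are fixed ASCII strings, so no JSON escaping is needed here.
        return "{\"capabilities\":{\"alwaysMatch\":{"
                + "\"browserName\":\"" + name + "\","
                + "\"" + optionsKey + "\":{\"args\":[\"" + headlessArg + "\"]}"
                + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(newSessionPayload("chrome"));
        System.out.println(newSessionPayload("firefox"));
    }
}
```

In a Docker-based grid like the one described, such a payload would be posted to the hub's session endpoint. The hub address depends on the deployment and is therefore left out of the sketch.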



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)