Posted to dev@nutch.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/01/20 05:37:00 UTC

[jira] [Commented] (NUTCH-2980) Upgrade Selenium Java to 4.7.2

    [ https://issues.apache.org/jira/browse/NUTCH-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678993#comment-17678993 ] 

ASF GitHub Bot commented on NUTCH-2980:
---------------------------------------

KamilMroczek opened a new pull request, #753:
URL: https://github.com/apache/nutch/pull/753

   - Disabled the PhantomJS driver: it was causing problems casting TakesScreenshot to HtmlUnitWebDriver, and the PhantomJS project has been archived since 2018
   - Improved README setup instructions for IntelliJ
   
   The following libraries were added as part of the selenium-java and htmlunit upgrades. They are all Apache 2.0, MIT or EDL.
   
   async-http-client
   async-http-client-netty-utils
   auto-common
   auto-service
   auto-service-annotations
   checker-qual
   dec
   failsafe
   failureaccess
   htmlunit-xpath
   jakarta.activation
   jcommander
   jtoml
   listenablefuture
   netty-buffer
   netty-codec
   netty-codec-http
   netty-codec-socks
   netty-common
   netty-handler
   netty-handler-proxy
   netty-reactive-streams
   netty-resolver
   netty-transport
   netty-transport-classes-epoll
   netty-transport-classes-kqueue
   netty-transport-native-epoll
   netty-transport-native-kqueue
   netty-transport-native-unix-common
   opentelemetry-api
   opentelemetry-api-logs
   opentelemetry-context
   opentelemetry-exporter-common
   opentelemetry-exporter-logging
   opentelemetry-sdk
   opentelemetry-sdk-common
   opentelemetry-sdk-extension-autoconfigure
   opentelemetry-sdk-extension-autoconfigure-spi
   opentelemetry-sdk-logs
   opentelemetry-sdk-metrics
   opentelemetry-sdk-trace
   opentelemetry-semconv
   reactive-streams
   salvation2
   selenium-chromium-driver
   selenium-devtools-v106
   selenium-devtools-v107
   selenium-devtools-v108
   selenium-devtools-v85
   selenium-http
   selenium-json
   selenium-manager




> Upgrade Selenium Java to 4.7.2
> ------------------------------
>
>                 Key: NUTCH-2980
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2980
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin, protocol
>    Affects Versions: 1.19
>            Reporter: Kamil Mroczek
>            Priority: Major
>             Fix For: 1.20
>
>
> The Selenium version is quite old and had some issues processing a website. Once I switched to the latest version, I was able to scrape that website. It is good to keep it up to date, since we were already one major release behind.
> Upgrading Selenium Java from 3.141.59 to 4.7.2 and Selenium HTMLUnit from 2.35.1 to 4.7.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)