Posted to dev@nutch.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/01/20 05:37:00 UTC
[jira] [Commented] (NUTCH-2980) Upgrade Selenium Java to 4.7.2
[ https://issues.apache.org/jira/browse/NUTCH-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678993#comment-17678993 ]
ASF GitHub Bot commented on NUTCH-2980:
---------------------------------------
KamilMroczek opened a new pull request, #753:
URL: https://github.com/apache/nutch/pull/753
- Disabled the PhantomJS driver: it caused problems casting TakesScreenshot to HtmlUnitWebDriver, and the PhantomJS project has been archived since 2018
- Improved README setup instructions for IntelliJ
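For readers unfamiliar with this kind of cast failure, here is a minimal sketch of the general pattern. It assumes nothing about the actual Nutch plugin code: Driver, ScreenshotCapable, HeadlessDriver, BrowserDriver and tryScreenshot are hypothetical stand-ins, not Selenium types. It only illustrates why an unconditional cast between a driver and a screenshot capability can throw ClassCastException, and how an instanceof guard avoids it.

```java
import java.util.Optional;

// Hypothetical stand-in types for illustration only; these are NOT Selenium classes.
final class ScreenshotCastSketch {

    interface Driver {
        String name();
    }

    interface ScreenshotCapable {
        byte[] screenshot();
    }

    // A driver that does not offer the screenshot capability.
    static final class HeadlessDriver implements Driver {
        public String name() {
            return "headless";
        }
    }

    // A driver that does offer the screenshot capability.
    static final class BrowserDriver implements Driver, ScreenshotCapable {
        public String name() {
            return "browser";
        }

        public byte[] screenshot() {
            return new byte[] {1, 2, 3};
        }
    }

    // Casting unconditionally would throw ClassCastException for HeadlessDriver.
    // Checking with instanceof first returns an empty Optional instead.
    static Optional<byte[]> tryScreenshot(Driver driver) {
        if (driver instanceof ScreenshotCapable) {
            return Optional.of(((ScreenshotCapable) driver).screenshot());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(tryScreenshot(new BrowserDriver()).isPresent());
        System.out.println(tryScreenshot(new HeadlessDriver()).isPresent());
    }
}
```

Whether a guard like this or removing the driver is the better fix depends on the plugin; this pull request chose to disable the PhantomJS driver.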
The following libraries were added as part of the selenium-java and htmlunit upgrades. All of them are licensed under Apache 2.0, MIT, or EDL:
async-http-client
async-http-client-netty-utils
auto-common
auto-service
auto-service-annotations
checker-qual
dec
failsafe
failureaccess
htmlunit-xpath
jakarta.activation
jcommander
jtoml
listenablefuture
netty-buffer
netty-codec
netty-codec-http
netty-codec-socks
netty-common
netty-handler
netty-handler-proxy
netty-reactive-streams
netty-resolver
netty-transport
netty-transport-classes-epoll
netty-transport-classes-kqueue
netty-transport-native-epoll
netty-transport-native-kqueue
netty-transport-native-unix-common
opentelemetry-api
opentelemetry-api-logs
opentelemetry-context
opentelemetry-exporter-common
opentelemetry-exporter-logging
opentelemetry-sdk
opentelemetry-sdk-common
opentelemetry-sdk-extension-autoconfigure
opentelemetry-sdk-extension-autoconfigure-spi
opentelemetry-sdk-logs
opentelemetry-sdk-metrics
opentelemetry-sdk-trace
opentelemetry-semconv
reactive-streams
salvation2
selenium-chromium-driver
selenium-devtools-v106
selenium-devtools-v107
selenium-devtools-v108
selenium-devtools-v85
selenium-http
selenium-json
selenium-manager
> Upgrade Selenium Java to 4.7.2
> ------------------------------
>
> Key: NUTCH-2980
> URL: https://issues.apache.org/jira/browse/NUTCH-2980
> Project: Nutch
> Issue Type: Improvement
> Components: plugin, protocol
> Affects Versions: 1.19
> Reporter: Kamil Mroczek
> Priority: Major
> Fix For: 1.20
>
>
> The Selenium version is quite old and had some issues processing a website. Once I switched to the latest version, I was able to scrape that website. It is good to keep it up to date, since we were already one major release behind.
> This upgrades Selenium Java from 3.141.59 to 4.7.2 and Selenium HTMLUnit from 2.35.1 to 4.7.0.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)