Posted to dev@nutch.apache.org by GitBox <gi...@apache.org> on 2023/01/20 05:36:21 UTC
[GitHub] [nutch] KamilMroczek opened a new pull request, #753: NUTCH-2980: Upgraded Selenium to 4.7.2 + HTMLUnit
KamilMroczek opened a new pull request, #753:
URL: https://github.com/apache/nutch/pull/753
- Disabled the PhantomJS driver: it was causing problems casting TakeScreenshot to HtmlUnitWebDriver, and the PhantomJS project has been archived since 2018
- Improved README setup instructions for IntelliJ
The following libraries were added as part of the selenium-java and htmlunit upgrades. They are all Apache 2.0, MIT or EDL.
async-http-client
async-http-client-netty-utils
auto-common
auto-service
auto-service-annotations
checker-qual
dec
failsafe
failureaccess
htmlunit-xpath
jakarta.activation
jcommander
jtoml
listenablefuture
netty-buffer
netty-codec
netty-codec-http
netty-codec-socks
netty-common
netty-handler
netty-handler-proxy
netty-reactive-streams
netty-resolver
netty-transport
netty-transport-classes-epoll
netty-transport-classes-kqueue
netty-transport-native-epoll
netty-transport-native-kqueue
netty-transport-native-unix-common
opentelemetry-api
opentelemetry-api-logs
opentelemetry-context
opentelemetry-exporter-common
opentelemetry-exporter-logging
opentelemetry-sdk
opentelemetry-sdk-common
opentelemetry-sdk-extension-autoconfigure
opentelemetry-sdk-extension-autoconfigure-spi
opentelemetry-sdk-logs
opentelemetry-sdk-metrics
opentelemetry-sdk-trace
opentelemetry-semconv
reactive-streams
salvation2
selenium-chromium-driver
selenium-devtools-v106
selenium-devtools-v107
selenium-devtools-v108
selenium-devtools-v85
selenium-http
selenium-json
selenium-manager
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@nutch.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [nutch] sebastian-nagel merged pull request #753: NUTCH-2980: Upgraded Selenium to 4.7.2 + HTMLUnit
Posted by "sebastian-nagel (via GitHub)" <gi...@apache.org>.
sebastian-nagel merged PR #753:
URL: https://github.com/apache/nutch/pull/753
[GitHub] [nutch] KamilMroczek commented on pull request #753: NUTCH-2980: Upgraded Selenium to 4.7.2 + HTMLUnit
Posted by "KamilMroczek (via GitHub)" <gi...@apache.org>.
KamilMroczek commented on PR #753:
URL: https://github.com/apache/nutch/pull/753#issuecomment-1399257710
OK. I was able to run in local mode on macOS with Firefox and Chrome, and also on AWS Linux with Chrome.
[GitHub] [nutch] sebastian-nagel commented on pull request #753: NUTCH-2980: Upgraded Selenium to 4.7.2 + HTMLUnit
Posted by "sebastian-nagel (via GitHub)" <gi...@apache.org>.
sebastian-nagel commented on PR #753:
URL: https://github.com/apache/nutch/pull/753#issuecomment-1435694153
Finally, I was able to test it successfully. The earlier failures occurred because on recent Ubuntu systems Firefox and Chromium are installed as snap packages. This adds extra sandboxing and requires that `TMPDIR` points to a folder the snap packages are allowed to write to (they cannot write to the default `/tmp/`).
Thanks, @KamilMroczek!
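For anyone hitting the same problem, the `TMPDIR` workaround described above might be sketched roughly as follows. This is only an illustration, not what was actually run for this PR: the helper names `env_with_tmpdir` and `run_with_tmpdir` are hypothetical, and the example directory under the home directory is an assumption about where the snap sandbox permits writes, so adjust it for your system.

```python
import os
import subprocess
from pathlib import Path


def env_with_tmpdir(tmp_dir: Path) -> dict:
    """Return a copy of the current environment with TMPDIR set to tmp_dir.

    The directory is created if it does not exist yet.
    """
    tmp_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)
    env["TMPDIR"] = str(tmp_dir)
    return env


def run_with_tmpdir(cmd: list, tmp_dir: Path) -> int:
    """Run cmd with TMPDIR pointing at tmp_dir and return its exit code."""
    return subprocess.call(cmd, env=env_with_tmpdir(tmp_dir))


# Example (hypothetical paths and command; assumes a non-hidden folder in
# the home directory is writable from inside the snap sandbox):
#
#   run_with_tmpdir(["bin/nutch", "parsechecker", "https://example.org/"],
#                   Path.home() / "nutch-selenium-tmp")
```

The child process, and therefore any browser driver it starts, inherits the modified environment, while the environment of the calling process is left unchanged.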
[GitHub] [nutch] KamilMroczek commented on pull request #753: NUTCH-2980: Upgraded Selenium to 4.7.2 + HTMLUnit
Posted by GitBox <gi...@apache.org>.
KamilMroczek commented on PR #753:
URL: https://github.com/apache/nutch/pull/753#issuecomment-1398516536
Yeah, verifying the licenses was a bit of a pain. I found some tools that help with finding licenses for a batch of libraries, but none of the ones I could find supported our format.
[GitHub] [nutch] sebastian-nagel commented on pull request #753: NUTCH-2980: Upgraded Selenium to 4.7.2 + HTMLUnit
Posted by "sebastian-nagel (via GitHub)" <gi...@apache.org>.
sebastian-nagel commented on PR #753:
URL: https://github.com/apache/nutch/pull/753#issuecomment-1399245784
Hi @KamilMroczek, indeed - keeping the licenses up-to-date is a difficult task. See also NUTCH-2290 and NUTCH-2981.
Unfortunately, so far I wasn't able to successfully test protocol-selenium and this PR. But this is on my side. Both Chrome and Firefox (recent browser and driver versions) show up (in headful mode), but for some reason the driver then times out with obscure error messages. It's reproducible from Python, so it's something with my system (maybe because I recently switched to using Wayland instead of X11). Will try it on a different system...