You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/03 22:28:20 UTC

[GitHub] [arrow] jonkeane opened a new pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

jonkeane opened a new pull request #10237:
URL: https://github.com/apache/arrow/pull/10237


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625872881



##########
File path: docker-compose.yml
##########
@@ -1050,6 +1051,30 @@ services:
       /bin/bash -c "
         /arrow/ci/scripts/r_sanitize.sh /arrow"
 
+  ubuntu-r-valgrind:
+    # Only 18.04 and amd64 supported
+    # Usage:
+    #   docker-compose build ubuntu-r-valgrind
+    #   docker-compose run ubuntu-r-valgrind
+    image: ${REPO}:amd64-ubuntu-18.04-r-valgrind-r
+    build:
+      context: .
+      dockerfile: ci/docker/linux-r.dockerfile
+      args:
+        base: wch1/r-debug:latest
+        r_bin: RDvalgrind
+    environment:
+      <<: *ccache
+      # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
+      # so some build might pass without this setting
+      ARROW_USER_SIMD_LEVEL: "AVX2"
+      ARROW_RUNTIME_SIMD_LEVEL: "AVX2"

Review comment:
       Since you're using the R package's bundled build script, I think this is how you need to pass in extra args: 
   
   ```suggestion
         EXTRA_CMAKE_FLAGS: "-DARROW_USER_SIMD_LEVEL=AVX2 -DARROW_RUNTIME_SIMD_LEVEL=AVX2"
   ```
   
   https://github.com/apache/arrow/blob/master/r/inst/build_arrow_static.sh#L75




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832029988


   Revision: 7f29abc988154513ad8f43a789b3399c3dfd971c
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-384](https://github.com/ursacomputing/crossbow/branches/all?query=actions-384)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-384-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-384-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625934420



##########
File path: docker-compose.yml
##########
@@ -1050,6 +1051,30 @@ services:
       /bin/bash -c "
         /arrow/ci/scripts/r_sanitize.sh /arrow"
 
+  ubuntu-r-valgrind:
+    # Only 18.04 and amd64 supported
+    # Usage:
+    #   docker-compose build ubuntu-r-valgrind
+    #   docker-compose run ubuntu-r-valgrind
+    image: ${REPO}:amd64-ubuntu-18.04-r-valgrind-r
+    build:
+      context: .
+      dockerfile: ci/docker/linux-r.dockerfile
+      args:
+        base: wch1/r-debug:latest
+        r_bin: RDvalgrind
+    environment:
+      <<: *ccache
+      # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
+      # so some build might pass without this setting
+      ARROW_USER_SIMD_LEVEL: "AVX2"
+      ARROW_RUNTIME_SIMD_LEVEL: "AVX2"

Review comment:
       Thanks, suggestion amended




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831825313


   I'm trying this out and I see that it takes a lot of time without outputting anything:
   ```
   + RDvalgrind --vanilla -d 'valgrind --tool=memcheck --leak-check=full --track-origins=yes' -f testthat.R
   ```
   
   Can the above command output progress while running the test suite?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625748167



##########
File path: ci/scripts/r_valgrind.sh
##########
@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+
+: ${R_BIN:=RDvalgrind}
+
+printenv

Review comment:
       I was using it to diagnose if the `ARROW_RUNTIME_SIMD_LEVEL` var was being passed — secrets should be redacted even with something like `printenv`, but I've removed it regardless.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625870268



##########
File path: r/tests/testthat/test-compute-aggregate.R
##########
@@ -37,7 +37,8 @@ test_that("sum.Array", {
 
   floats <- c(floats, NA)
   na <- Array$create(floats)
-  expect_identical(as.numeric(sum(na)), sum(floats))
+  # expect_identical(as.numeric(sum(na)), sum(floats))
+  expect_identical(as.numeric(sum(na)), NA_real_)

Review comment:
       Could also do something like (of course confirming the right way to determine a devel version). Could also add the devel check to the `skip_on_linux_cran` (and call it `skip_on_valgrind()`) if you wanted to make that narrower. 
   
   ```suggestion
     if (!grepl("devel", R.version.string)) {
       # Valgrind on R-devel confuses NaN and NA_real_
       expect_identical(as.numeric(sum(na)), sum(floats))
     }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832029204


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625876806



##########
File path: r/tests/testthat/test-compute-aggregate.R
##########
@@ -37,7 +37,8 @@ test_that("sum.Array", {
 
   floats <- c(floats, NA)
   na <- Array$create(floats)
-  expect_identical(as.numeric(sum(na)), sum(floats))
+  # expect_identical(as.numeric(sum(na)), sum(floats))
+  expect_identical(as.numeric(sum(na)), NA_real_)

Review comment:
       Yeah, I'll add the devel check to `skip_on_linux_cran()` (and probably rename it too, though I'm a bit hesitant to make it _so_ obvious that we are skipping the problematic tests — is that something we should worry about?).
   
   As for these specific tests, looking at it now I wonder if we have benefit from `expect_identical(as.numeric(sum(na)), sum(floats))` versus `expect_identical(as.numeric(sum(na)), NA_real_)`? Is there a circumstance where we expect `sum(floats)` to change behavior and we would want to match it? I find `expect_identical(as.numeric(sum(na)), NA_real_)` to be easier to read at a glance and see exactly what it's testing/expecting.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625877050



##########
File path: docker-compose.yml
##########
@@ -1050,6 +1051,30 @@ services:
       /bin/bash -c "
         /arrow/ci/scripts/r_sanitize.sh /arrow"
 
+  ubuntu-r-valgrind:
+    # Only 18.04 and amd64 supported
+    # Usage:
+    #   docker-compose build ubuntu-r-valgrind
+    #   docker-compose run ubuntu-r-valgrind
+    image: ${REPO}:amd64-ubuntu-18.04-r-valgrind-r
+    build:
+      context: .
+      dockerfile: ci/docker/linux-r.dockerfile
+      args:
+        base: wch1/r-debug:latest
+        r_bin: RDvalgrind
+    environment:
+      <<: *ccache
+      # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
+      # so some build might pass without this setting
+      ARROW_USER_SIMD_LEVEL: "AVX2"
+      ARROW_RUNTIME_SIMD_LEVEL: "AVX2"

Review comment:
       `ARROW_USER_SIMD_LEVEL` isn't a CMake option, which is the whole point. You use the same binary and change the instruction set at runtime.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832168744


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831948627


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831949275


   Revision: 4ede120d38854d6bbf7556450ed15e8345b59fcd
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-381](https://github.com/ursacomputing/crossbow/branches/all?query=actions-381)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-381-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-381-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane closed pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane closed pull request #10237:
URL: https://github.com/apache/arrow/pull/10237


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831921766






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832169162


   Revision: 324786f8752fb7ba926d6742e4229726f0b4c534
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-390](https://github.com/ursacomputing/crossbow/branches/all?query=actions-390)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-390-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-390-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625986802



##########
File path: ci/scripts/r_valgrind.sh
##########
@@ -0,0 +1,34 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+
+: ${R_BIN:=RDvalgrind}
+
+source_dir=${1}/r
+
+${R_BIN} CMD INSTALL ${source_dir}
+pushd ${source_dir}/tests
+
+export TEST_R_WITH_ARROW=TRUE
+# to generate suppression files run:
+# ${R_BIN} --vanilla -d "valgrind --tool=memcheck --leak-check=full --track-origins=yes --gen-suppressions=all --log-file=memcheck.log" -f testthat.R 
+${R_BIN} --vanilla -d "valgrind --tool=memcheck --leak-check=full --track-origins=yes --error-exitcode=1 --suppressions=/${1}/ci/etc/valgrind-cran.supp" -f testthat.R | tee testthat.out 

Review comment:
       Yes it does, though I will re-confirm that once we are close to merging




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832057311


   Revision: a94cff3c451e708e387d27dd89e9832b2eedbfea
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-385](https://github.com/ursacomputing/crossbow/branches/all?query=actions-385)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-385-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-385-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832256320


   Revision: 3bb94e67303db5dad8dd1e144bc77653cf955662
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-391](https://github.com/ursacomputing/crossbow/branches/all?query=actions-391)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-391-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-391-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831922352


   Revision: 041f4095fb71adfbfb5d504ff83da04c07cb4352
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-379](https://github.com/ursacomputing/crossbow/branches/all?query=actions-379)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-379-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-379-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832294509


   Revision: 5812a194c3d97f9db6386ce48d3c0c0e3d4a0eea
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-392](https://github.com/ursacomputing/crossbow/branches/all?query=actions-392)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-392-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-392-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832294144


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832294273


   And ^^^ should pass now that we reverted our intentional failures.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane closed pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane closed pull request #10237:
URL: https://github.com/apache/arrow/pull/10237


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832168913


   We don't expect ^^^ to pass, it is intentionally broken


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832055156


   @github-actions crossbow submit test-r-linux-valgrind
   
   (with mimalloc as default)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831576872


   Revision: 0e1d0e219c3e05b52ad180bd626a499138b577d6
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-377](https://github.com/ursacomputing/crossbow/branches/all?query=actions-377)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-377-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-377-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832142215


   @github-actions crossbow submit test-r-linux-valgrind
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625956698



##########
File path: r/tests/testthat/helper-skip.R
##########
@@ -22,13 +22,19 @@ build_features <- c(
 )
 
 skip_if_not_available <- function(feature) {
+  if (feature == "re2" && identical(tolower(Sys.info()[["sysname"]]), "linux")) {
+    skip_on_cran()
+  }
+

Review comment:
       ```suggestion
     if (feature == "re2") {
       # RE2 does not support valgrind (on purpose): https://github.com/google/re2/issues/177
       skip_on_valgrind()
     }
   
   ```

##########
File path: docker-compose.yml
##########
@@ -1050,6 +1051,33 @@ services:
       /bin/bash -c "
         /arrow/ci/scripts/r_sanitize.sh /arrow"
 
+  ubuntu-r-valgrind:
+    # Only 18.04 and amd64 supported
+    # Usage:
+    #   docker-compose build ubuntu-r-valgrind
+    #   docker-compose run ubuntu-r-valgrind
+    image: ${REPO}:amd64-ubuntu-18.04-r-valgrind
+    build:
+      context: .
+      dockerfile: ci/docker/linux-r.dockerfile
+      cache_from:
+        - ${REPO}:amd64-ubuntu-18.04-r-valgrind
+      args:
+        base: wch1/r-debug:latest
+        r_bin: RDvalgrind
+    environment:
+      <<: *ccache
+      # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
+      # so some build might pass without this setting
+      ARROW_USER_SIMD_LEVEL: "AVX2"
+      ARROW_RUNTIME_SIMD_LEVEL: "AVX2"

Review comment:
       ```suggestion
   ```

##########
File path: r/tests/testthat/helper-skip.R
##########
@@ -49,6 +55,18 @@ skip_if_not_running_large_memory_tests <- function() {
   )
 }
 
+skip_on_valgrind <- function() {
+  # This does not actually skip on valgrind, however it skips on CRAN when the 
+  # OS is linux + and the R versoin is development (which is where valgrind is run
+  # as of this code)

Review comment:
       ```suggestion
     # This does not actually skip on valgrind because we can't exactly detect it.
     # Instead, it skips on CRAN when the OS is linux + and the R version is development 
     # (which is where valgrind is run as of this code)
   ```

##########
File path: ci/scripts/r_valgrind.sh
##########
@@ -0,0 +1,34 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+
+: ${R_BIN:=RDvalgrind}
+
+source_dir=${1}/r
+
+${R_BIN} CMD INSTALL ${source_dir}
+pushd ${source_dir}/tests
+
+export TEST_R_WITH_ARROW=TRUE
+# to generate suppression files run:
+# ${R_BIN} --vanilla -d "valgrind --tool=memcheck --leak-check=full --track-origins=yes --gen-suppressions=all --log-file=memcheck.log" -f testthat.R 
+${R_BIN} --vanilla -d "valgrind --tool=memcheck --leak-check=full --track-origins=yes --error-exitcode=1 --suppressions=/${1}/ci/etc/valgrind-cran.supp" -f testthat.R | tee testthat.out 

Review comment:
       Have you confirmed that this job (in the script's final form) correctly fails if there's a valgrind failure? You could try this by making `skip_on_valgrind()` no-op and see if it fails as expected, then restore the skips.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831575024


   https://issues.apache.org/jira/browse/ARROW-12571


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625657634



##########
File path: ci/scripts/r_valgrind.sh
##########
@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+
+: ${R_BIN:=RDvalgrind}
+
+printenv

Review comment:
       Isn't this dangerous if there are CI secrets in environment variables? Why is this needed?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832255913


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625619463



##########
File path: ci/scripts/r_valgrind.sh
##########
@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+
+: ${R_BIN:=RDvalgrind}
+
+printenv
+
+source_dir=${1}/r
+
+${R_BIN} CMD INSTALL ${source_dir}
+pushd ${source_dir}/tests
+
+export TEST_R_WITH_ARROW=TRUE
+${R_BIN} --vanilla -d "valgrind --tool=memcheck --leak-check=full --track-origins=yes" -f testthat.R > testthat.out 2>&1 || true
+
+# TODO: examples? vignettes? R CMD check with valgrind on?
+
+cat testthat.out
+
+# Check the valgrind output
+# We might also considering using the greps that LibthGBM uses:
+# https://github.com/microsoft/LightGBM/blob/fa6d356555f9ef888acf5f5e259dca958ca24f6d/.ci/test_r_package_valgrind.sh#L20-L85
+
+# No errors at all

Review comment:
       You should use `valgrind --error-exitcode=1` instead of grepping.

##########
File path: ci/scripts/r_valgrind.sh
##########
@@ -0,0 +1,81 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+
+: ${R_BIN:=RDvalgrind}
+
+printenv
+
+source_dir=${1}/r
+
+${R_BIN} CMD INSTALL ${source_dir}
+pushd ${source_dir}/tests
+
+export TEST_R_WITH_ARROW=TRUE
+${R_BIN} --vanilla -d "valgrind --tool=memcheck --leak-check=full --track-origins=yes" -f testthat.R > testthat.out 2>&1 || true
+
+# TODO: examples? vignettes? R CMD check with valgrind on?
+
+cat testthat.out
+
+# Check the valgrind output
+# We might also considering using the greps that LibthGBM uses:
+# https://github.com/microsoft/LightGBM/blob/fa6d356555f9ef888acf5f5e259dca958ca24f6d/.ci/test_r_package_valgrind.sh#L20-L85
+
+# No errors at all
+if [[ "$(grep -c 'ERROR SUMMARY: 0 errors' testthat.out)" == 1 ]]; then
+  no_errors=true
+else
+  no_errors=false
+fi
+
+# This error happens in our CI when there is a `skip()` of any sort. It does not show up on the cran 
+# logs so we should be able to ignore it.
+# ==273== Conditional jump or move depends on uninitialised value(s)
+# ==273==    at 0x49CFFB6: gregexpr_Regexc (grep.c:2413)
+# ==273==    by 0x49D2CB2: do_regexpr (grep.c:3050)
+# ==273==    by 0x49A08C8: bcEval (eval.c:7115)
+# ==273==    by 0x498BF52: Rf_eval (eval.c:727)
+# ==273==    by 0x498ECC1: R_execClosure (eval.c:1897)
+# ==273==    by 0x498E974: Rf_applyClosure (eval.c:1823)
+# ==273==    by 0x49A04FC: bcEval (eval.c:7083)
+# ==273==    by 0x498BF52: Rf_eval (eval.c:727)
+# ==273==    by 0x498BA9E: forcePromise (eval.c:555)
+# ==273==    by 0x4996C1B: FORCE_PROMISE (eval.c:5136)
+# ==273==    by 0x4996DD6: getvar (eval.c:5177)
+# ==273==    by 0x499DA15: bcEval (eval.c:6867)
+# ==273==  Uninitialised value was created by a stack allocation
+# ==273==    at 0x49CFAB5: gregexpr_Regexc (grep.c:2343)
+# ==273==
+if [[   "$(grep -c 'ERROR SUMMARY: 2 errors' testthat.out)" = 1 &&
+  "$(grep -c 'Conditional jump or move depends on uninitialised value' testthat.out)" = 1 &&
+  "$(grep -c 'Uninitialised value was created by a stack allocation' testthat.out)" = 1 ]]; then

Review comment:
       We should use a Valgrind suppressions file instead of grepping:
   https://stackoverflow.com/questions/17159578/generating-suppressions-for-memory-leaks
   

##########
File path: r/tests/testthat/test-compute-aggregate.R
##########
@@ -37,7 +37,8 @@ test_that("sum.Array", {
 
   floats <- c(floats, NA)
   na <- Array$create(floats)
-  expect_identical(as.numeric(sum(na)), sum(floats))
+  # expect_identical(as.numeric(sum(na)), sum(floats))
+  expect_identical(as.numeric(sum(na)), NA_real_)

Review comment:
       Why these changes? If they have a specific purpose, there should be a comment explaining them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#discussion_r625872881



##########
File path: docker-compose.yml
##########
@@ -1050,6 +1051,30 @@ services:
       /bin/bash -c "
         /arrow/ci/scripts/r_sanitize.sh /arrow"
 
+  ubuntu-r-valgrind:
+    # Only 18.04 and amd64 supported
+    # Usage:
+    #   docker-compose build ubuntu-r-valgrind
+    #   docker-compose run ubuntu-r-valgrind
+    image: ${REPO}:amd64-ubuntu-18.04-r-valgrind-r
+    build:
+      context: .
+      dockerfile: ci/docker/linux-r.dockerfile
+      args:
+        base: wch1/r-debug:latest
+        r_bin: RDvalgrind
+    environment:
+      <<: *ccache
+      # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
+      # so some build might pass without this setting
+      ARROW_USER_SIMD_LEVEL: "AVX2"
+      ARROW_RUNTIME_SIMD_LEVEL: "AVX2"

Review comment:
       Since you're using the R package's bundled build script, I think this is how you need to pass in extra args: 
   
   ```suggestion
         EXTRA_CMAKE_FLAGS: "-DARROW_RUNTIME_SIMD_LEVEL=AVX2"
   ```
   
   https://github.com/apache/arrow/blob/master/r/inst/build_arrow_static.sh#L75




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832256389


   We don't expect ^^^ to pass, it is intentionally broken


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831576557


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-831922518


   Revision: 041f4095fb71adfbfb5d504ff83da04c07cb4352
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-380](https://github.com/ursacomputing/crossbow/branches/all?query=actions-380)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-380-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-380-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832144273


   Revision: 3ab88713346805c19145a94247e5fa596c3dccac
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-389](https://github.com/ursacomputing/crossbow/branches/all?query=actions-389)
   
   |Task|Status|
   |----|------|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-389-azure-test-r-linux-valgrind)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-389-azure-test-r-linux-valgrind)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832168626


   If ^^^ doesn't pass, there was a merge/squash error


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #10237: ARROW-12571: [R][CI] Run nightly R with valgrind

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #10237:
URL: https://github.com/apache/arrow/pull/10237#issuecomment-832056545


   @github-actions crossbow submit test-r-linux-valgrind


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org