Posted to jira@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2022/07/26 15:06:00 UTC

[jira] [Updated] (ARROW-17213) [C++] Compute kernel change introduced test-r-linux-valgrind failure

     [ https://issues.apache.org/jira/browse/ARROW-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dewey Dunnington updated ARROW-17213:
-------------------------------------
    Description: 
It looks like a change in ARROW-17135 may have introduced a test-r-linux-valgrind nightly failure: according to valgrind, uninitialized values are being used when two arrays are compared (the R call is `identical()`).

The build log is here: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30075&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=25758

What I think is the relevant portion of the valgrind output:

{code}
==5249== Conditional jump or move depends on uninitialised value(s)
==5249==    at 0x485207E: bcmp (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5249==    by 0x49E831E: R_compute_identical (identical.c:233)
==5249==    by 0x49E7B8E: do_identical (identical.c:94)
==5249==    by 0x49B24EC: bcEval (eval.c:7126)
==5249==    by 0x499DB93: Rf_eval (eval.c:748)
==5249==    by 0x49A0902: R_execClosure (eval.c:1918)
==5249==    by 0x49A05B5: Rf_applyClosure (eval.c:1844)
==5249==    by 0x49B2120: bcEval (eval.c:7094)
==5249==    by 0x499DB93: Rf_eval (eval.c:748)
==5249==    by 0x49A0902: R_execClosure (eval.c:1918)
==5249==    by 0x49A05B5: Rf_applyClosure (eval.c:1844)
==5249==    by 0x49B2120: bcEval (eval.c:7094)
==5249==  Uninitialised value was created by a heap allocation
==5249==    at 0x484DE30: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5249==    by 0x484DF92: posix_memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5249==    by 0xFF7D27F: arrow::(anonymous namespace)::SystemAllocator::AllocateAligned(long, unsigned char**) (memory_pool.cc:318)
==5249==    by 0xFF7D3F9: arrow::BaseMemoryPoolImpl<arrow::(anonymous namespace)::SystemAllocator>::Allocate(long, unsigned char**) (memory_pool.cc:458)
==5249==    by 0xFB31A46: GcMemoryPool::Allocate(long, unsigned char**)::{lambda()#1}::operator()() const (memorypool.cpp:27)
==5249==    by 0xFB31E10: arrow::Status GcMemoryPool::GcAndTryAgain<GcMemoryPool::Allocate(long, unsigned char**)::{lambda()#1}>(GcMemoryPool::Allocate(long, unsigned char**)::{lambda()#1} const&) (memorypool.cpp:45)
==5249==    by 0xFB31ABB: GcMemoryPool::Allocate(long, unsigned char**) (memorypool.cpp:27)
==5249==    by 0xFF81A97: arrow::PoolBuffer::Reserve(long) (memory_pool.cc:806)
==5249==    by 0xFF81B6D: arrow::PoolBuffer::Resize(long, bool) (memory_pool.cc:830)
==5249==    by 0xFF80D94: ResizePoolBuffer<std::unique_ptr<arrow::ResizableBuffer>, std::unique_ptr<arrow::PoolBuffer> > (memory_pool.cc:869)
==5249==    by 0xFF80D94: arrow::AllocateResizableBuffer(long, arrow::MemoryPool*) (memory_pool.cc:882)
==5249==    by 0x1022D197: arrow::compute::KernelContext::Allocate(long) (kernel.cc:48)
==5249==    by 0x10584C80: arrow::compute::internal::(anonymous namespace)::CompareKernel<arrow::Int32Type>::Exec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*) (scalar_compare.cc:274)
==5249== 
{code}

Possible reprex for the R code that triggered this. I can't run valgrind right now, but this is the test that triggered the failure, so running this code after starting R with {{R -d valgrind}} should reproduce it.

{code:R}
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
library(testthat, warn.conflicts = FALSE)

expect_type_equal <- function(object, expected, ...) {
  if (inherits(object, c("Array", "ChunkedArray"))) {
    object <- object$type
  }
  if (inherits(expected, c("Array", "ChunkedArray"))) {
    expected <- expected$type
  }
  expect_equal(object, expected, ...)
}

expect_r6_class <- function(object, class) {
  expect_s3_class(object, class)
  expect_s3_class(object, "R6")
}

expect_bool_function_equal <- function(array_exp, r_exp) {
  # Assert that the Array operation returns a boolean array
  # and that its contents are equal to expected
  expect_r6_class(array_exp, "ArrowDatum")
  expect_type_equal(array_exp, bool())
  expect_identical(as.vector(array_exp), r_exp)
}

expect_array_compares <- function(x, compared_to) {
  r_values <- as.vector(x)
  r_compared_to <- as.vector(compared_to)
  # Iterate over all comparison functions
  expect_bool_function_equal(x == compared_to, r_values == r_compared_to)
  expect_bool_function_equal(x != compared_to, r_values != r_compared_to)
  expect_bool_function_equal(x > compared_to, r_values > r_compared_to)
  expect_bool_function_equal(x >= compared_to, r_values >= r_compared_to)
  expect_bool_function_equal(x < compared_to, r_values < r_compared_to)
  expect_bool_function_equal(x <= compared_to, r_values <= r_compared_to)
}

expect_array_compares(ChunkedArray$create(1:3, 4:5), 4L)
{code}
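
For anyone trying this locally, here is a rough sketch of a non-interactive run. The file name {{reprex.R}} and the reduced three-line script are assumptions made for illustration, not part of the nightly job, and the reduced script may or may not be enough to trigger the report (the full reprex above is the known trigger). The {{--track-origins=yes}} option is what makes memcheck print the "Uninitialised value was created by a heap allocation" section seen in the log.

```shell
# Hypothetical file name; the script reduces the reprex above to the one
# comparison whose result is handed to identical().
cat > reprex.R <<'EOF'
library(arrow, warn.conflicts = FALSE)
x <- ChunkedArray$create(1:3, 4:5)
identical(as.vector(x == 4L), as.vector(x) == 4L)
EOF

# Run the script under memcheck only if both tools are installed.
if command -v R >/dev/null 2>&1 && command -v valgrind >/dev/null 2>&1; then
  R -d "valgrind --track-origins=yes" --vanilla -f reprex.R
else
  echo "R or valgrind not found; skipping the run" >&2
fi
```

If the reduced script comes back clean, the same command can be pointed at a file containing the full reprex instead.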



  was:
It looks like a change in ARROW-17135 may have introduced a test-r-linux-valgrind nightly failure where (valgrind thinks) uninitialized values are somehow being used when comparing two arrays (the R call is `identical()`):

The build log is here: https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=30075&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=25758

The relevant (I think) sample of the valgrind output:

{code}
==5249== Conditional jump or move depends on uninitialised value(s)
==5249==    at 0x485207E: bcmp (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5249==    by 0x49E831E: R_compute_identical (identical.c:233)
==5249==    by 0x49E7B8E: do_identical (identical.c:94)
==5249==    by 0x49B24EC: bcEval (eval.c:7126)
==5249==    by 0x499DB93: Rf_eval (eval.c:748)
==5249==    by 0x49A0902: R_execClosure (eval.c:1918)
==5249==    by 0x49A05B5: Rf_applyClosure (eval.c:1844)
==5249==    by 0x49B2120: bcEval (eval.c:7094)
==5249==    by 0x499DB93: Rf_eval (eval.c:748)
==5249==    by 0x49A0902: R_execClosure (eval.c:1918)
==5249==    by 0x49A05B5: Rf_applyClosure (eval.c:1844)
==5249==    by 0x49B2120: bcEval (eval.c:7094)
==5249==  Uninitialised value was created by a heap allocation
==5249==    at 0x484DE30: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5249==    by 0x484DF92: posix_memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==5249==    by 0xFF7D27F: arrow::(anonymous namespace)::SystemAllocator::AllocateAligned(long, unsigned char**) (memory_pool.cc:318)
==5249==    by 0xFF7D3F9: arrow::BaseMemoryPoolImpl<arrow::(anonymous namespace)::SystemAllocator>::Allocate(long, unsigned char**) (memory_pool.cc:458)
==5249==    by 0xFB31A46: GcMemoryPool::Allocate(long, unsigned char**)::{lambda()#1}::operator()() const (memorypool.cpp:27)
==5249==    by 0xFB31E10: arrow::Status GcMemoryPool::GcAndTryAgain<GcMemoryPool::Allocate(long, unsigned char**)::{lambda()#1}>(GcMemoryPool::Allocate(long, unsigned char**)::{lambda()#1} const&) (memorypool.cpp:45)
==5249==    by 0xFB31ABB: GcMemoryPool::Allocate(long, unsigned char**) (memorypool.cpp:27)
==5249==    by 0xFF81A97: arrow::PoolBuffer::Reserve(long) (memory_pool.cc:806)
==5249==    by 0xFF81B6D: arrow::PoolBuffer::Resize(long, bool) (memory_pool.cc:830)
==5249==    by 0xFF80D94: ResizePoolBuffer<std::unique_ptr<arrow::ResizableBuffer>, std::unique_ptr<arrow::PoolBuffer> > (memory_pool.cc:869)
==5249==    by 0xFF80D94: arrow::AllocateResizableBuffer(long, arrow::MemoryPool*) (memory_pool.cc:882)
==5249==    by 0x1022D197: arrow::compute::KernelContext::Allocate(long) (kernel.cc:48)
==5249==    by 0x10584C80: arrow::compute::internal::(anonymous namespace)::CompareKernel<arrow::Int32Type>::Exec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*) (scalar_compare.cc:274)
==5249== 
{code}

Possible reprex for R code that triggered this (I can't run valgrind at this second but it's this test that triggered the failure):

{code:R}
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
library(testthat, warn.conflicts = FALSE)

expect_type_equal <- function(object, expected, ...) {
  if (inherits(object, c("Array", "ChunkedArray"))) {
    object <- object$type
  }
  if (inherits(expected, c("Array", "ChunkedArray"))) {
    expected <- expected$type
  }
  expect_equal(object, expected, ...)
}

expect_r6_class <- function(object, class) {
  expect_s3_class(object, class)
  expect_s3_class(object, "R6")
}

expect_bool_function_equal <- function(array_exp, r_exp) {
  # Assert that the Array operation returns a boolean array
  # and that its contents are equal to expected
  expect_r6_class(array_exp, "ArrowDatum")
  expect_type_equal(array_exp, bool())
  expect_identical(as.vector(array_exp), r_exp)
}

expect_array_compares <- function(x, compared_to) {
  r_values <- as.vector(x)
  r_compared_to <- as.vector(compared_to)
  # Iterate over all comparison functions
  expect_bool_function_equal(x == compared_to, r_values == r_compared_to)
  expect_bool_function_equal(x != compared_to, r_values != r_compared_to)
  expect_bool_function_equal(x > compared_to, r_values > r_compared_to)
  expect_bool_function_equal(x >= compared_to, r_values >= r_compared_to)
  expect_bool_function_equal(x < compared_to, r_values < r_compared_to)
  expect_bool_function_equal(x <= compared_to, r_values <= r_compared_to)
}

expect_array_compares(ChunkedArray$create(1:3, 4:5), 4L)
{code}




> [C++] Compute kernel change introduced test-r-linux-valgrind failure
> --------------------------------------------------------------------
>
>                 Key: ARROW-17213
>                 URL: https://issues.apache.org/jira/browse/ARROW-17213
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Dewey Dunnington
>            Priority: Major


