Posted to jira@arrow.apache.org by "Vitalie Spinu (Jira)" <ji...@apache.org> on 2022/07/23 18:04:00 UTC

[jira] [Commented] (ARROW-16680) [R] Weird R error: Error in fs___FileSystem__GetTargetInfos_FileSelector(self, x) : ignoring SIGPIPE signal

    [ https://issues.apache.org/jira/browse/ARROW-16680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570344#comment-17570344 ] 

Vitalie Spinu commented on ARROW-16680:
---------------------------------------

I am not seeing this from `fs___FileSystem__GetTargetInfos_FileSelector`. In fact, all I see is `Error: ignoring SIGPIPE signal Execution halted`, which pops up after my entire R script completes.


To my eye this comes from aws -> curl. This is the gdb backtrace:
{code:java}
Thread 12 "R" received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7fffd37fe700 (LWP 207327)]
__libc_write (nbytes=31, buf=0x7fffc05828a3, fd=15) at ../sysdeps/unix/sysv/linux/write.c:26
26      ../sysdeps/unix/sysv/linux/write.c: No such file or directory.
(gdb) backtrace
#0  __libc_write (nbytes=31, buf=0x7fffc05828a3, fd=15) at ../sysdeps/unix/sysv/linux/write.c:26
#1  __libc_write (fd=15, buf=0x7fffc05828a3, nbytes=31) at ../sysdeps/unix/sysv/linux/write.c:24
#2  0x00007fffec3ad459 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#3  0x00007fffec3a863e in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#4  0x00007fffec3a7654 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#5  0x00007fffec3a7b17 in BIO_write () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#6  0x00007fffec113dde in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#7  0x00007fffec114cd9 in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#8  0x00007fffec11e88e in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#9  0x00007fffec11ca65 in ?? () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#10 0x00007fffec127ec3 in SSL_shutdown () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#11 0x00007fffec2d37c5 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#12 0x00007fffec2d3835 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#13 0x00007fffec2918ce in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#14 0x00007fffec294216 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#15 0x00007fffec2a6ecf in ?? () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#16 0x00007fffec2a7d31 in curl_multi_perform () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#17 0x00007fffec29e1bb in curl_easy_perform () from /usr/lib/x86_64-linux-gnu/libcurl.so.4
#18 0x00007fffee3e247b in Aws::Http::CurlHttpClient::MakeRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const ()
   from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#19 0x00007fffee1a721a in Aws::Client::AWSClient::AttemptOneRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const ()
   from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#20 0x00007fffee1bc1a3 in Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
   from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#21 0x00007fffee1bd448 in Aws::Client::AWSClient::MakeRequestWithUnparsedResponse(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
   from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#22 0x00007fffee2e5933 in Aws::S3::S3Client::GetObject(Aws::S3::Model::GetObjectRequest const&) const () from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#23 0x00007fffedfb6234 in arrow::fs::(anonymous namespace)::ObjectInputFile::ReadAt(long, long, void*) () from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#24 0x00007fffedfb6931 in arrow::fs::(anonymous namespace)::ObjectInputFile::ReadAt(long, long) () from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#25 0x00007fffed378474 in arrow::internal::FnOnce<void ()>::FnImpl<std::_Bind<arrow::detail::ContinueFuture (arrow::Future<std::shared_ptr<arrow::Buffer> >, arrow::io::RandomAccessFile::ReadAsync(arrow::io::IOContext const&, long, long)::{lambda()#1})> >::invoke() () from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#26 0x00007fffed402ca7 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::{lambda()#1}> > >::_M_run() ()
   from /path/to/renv/library/R-4.2/x86_64-pc-linux-gnu/arrow/libs/arrow.so
#27 0x00007ffff59d0de4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#28 0x00007ffff77c7609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#29 0x00007ffff76ec133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
 {code}
I am using arrow's dataset on an S3 location, and since the error does not occur on a specific call, I cannot apply the catch-retry strategy.
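
For reference, the catch-retry strategy mentioned above could look roughly like the sketch below. It is an illustration only: `with_retry`, `max_tries`, and `wait_seconds` are hypothetical names that are not part of arrow, and the helper only works when the error is raised by one identifiable call, which is not the case for the failure at script exit described here.

```r
# Hypothetical helper (not part of arrow): call `expr_fn` up to `max_tries`
# times, retrying whenever it signals an R error.
with_retry <- function(expr_fn, max_tries = 3, wait_seconds = 1) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    # Wrap the outcome so a successful NULL result is not mistaken for a failure.
    result <- tryCatch(
      list(ok = TRUE, value = expr_fn()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) {
      return(result$value)
    }
    last_error <- result$error
    message(sprintf("Attempt %d of %d failed: %s",
                    attempt, max_tries, conditionMessage(last_error)))
    # Pause between attempts, but not after the final one.
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  # Every attempt failed: re-signal the last error to the caller.
  stop(last_error)
}

# Assumed usage with the calls from the original report (needs network access):
# ds <- with_retry(function() arrow::open_dataset(s3, partitioning = c("theme", "year")))
```

Wrapping a single call this way can hide a transient failure, but it cannot intercept an error that R reports only after the script body has finished.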

 

I have also seen this error when using the Athena ODBC driver from R.

> [R] Weird R error: Error in fs___FileSystem__GetTargetInfos_FileSelector(self, x) :    ignoring SIGPIPE signal
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16680
>                 URL: https://issues.apache.org/jira/browse/ARROW-16680
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 8.0.0
>            Reporter: Carl Boettiger
>            Priority: Major
>
> Okay, apologies, this is a bit of a weird error, but it is annoying the heck out of me. The following block of pure R code, when run with Rscript (or embedded into any form of Rmd, quarto, or knitr doc), produces the error below (at least most of the time):
>  
> {code:java}
> library(arrow)
> library(dplyr){code}
> {code:java}
> Sys.setenv(AWS_EC2_METADATA_DISABLED = "TRUE")
> Sys.unsetenv("AWS_ACCESS_KEY_ID")
> Sys.unsetenv("AWS_SECRET_ACCESS_KEY")
> Sys.unsetenv("AWS_DEFAULT_REGION")
> Sys.unsetenv("AWS_S3_ENDPOINT")
> s3 <- arrow::s3_bucket(bucket = "scores/parquet",
>                        endpoint_override = "data.ecoforecast.org")
> ds <- arrow::open_dataset(s3, partitioning = c("theme", "year"))
> ds |> dplyr::filter(theme == "phenology") |> dplyr::collect()
> {code}
> Gives the error
>  
>  
> {code:java}
> Error in fs___FileSystem__GetTargetInfos_FileSelector(self, x) : 
>   ignoring SIGPIPE signal
> Calls: %>% ... <Anonymous> -> fs___FileSystem__GetTargetInfos_FileSelector {code}
> But only when run as a script! When run interactively in an R console, this code runs just fine. Even as a script the code seems to run fine, but it then fails with this SIGPIPE error, which I don't understand.
> If the script is executed with littler ([https://dirk.eddelbuettel.com/code/littler.html]) then it runs fine, since littler handles SIGPIPE but Rscript doesn't. But I have no idea why the above code triggers a SIGPIPE in the first place. Worse, if I choose a different filter for the above, like "aquatics", it (usually) works without the error.
> I have no idea why `fs___FileSystem__GetTargetInfos_FileSelector` results in this, but would really appreciate any hints on how to avoid this as it makes it very hard to use arrow in workflows right now! 
>  
> thanks for all you do!
>  


