Posted to jira@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2021/11/05 17:35:00 UTC
[jira] [Commented] (ARROW-14611) [R][C++] Reporting progress from copy_files()?
[ https://issues.apache.org/jira/browse/ARROW-14611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17439398#comment-17439398 ]
Dewey Dunnington commented on ARROW-14611:
------------------------------------------
I took the opportunity to learn a bit about the C++ sources! I didn’t find a way to pass a callback to CopyFiles that reports progress, but perhaps there is one.
C++ source from the R package: <https://github.com/apache/arrow/blob/master/r/src/filesystem.cpp#L267-L275>
Implementation for arrow::fs::CopyFiles: <https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/filesystem.cc#L586-L607>
{code:R}
library(cpp11)

# Point the compiler and linker at the Arrow C++ installation in ARROW_HOME
Sys.setenv(
  PKG_CXXFLAGS = paste0("-I", Sys.getenv("ARROW_HOME"), "/include"),
  PKG_LIBS = paste0("-L", Sys.getenv("ARROW_HOME"), "/lib", " -larrow")
)

cpp11::cpp_source(code = '
#include <cpp11.hpp>
#include <arrow/filesystem/api.h>

using namespace cpp11;
using namespace arrow;

[[cpp11::register]]
void copy_files2(std::string src_dir, std::string dst_dir) {
  // Use the local filesystem as both source and destination
  auto fs = std::make_shared<fs::LocalFileSystem>();
  fs::FileSelector source_sel;
  source_sel.base_dir = src_dir;

  // This call blocks until the copy finishes; nothing is reported along the way
  Status status = fs::CopyFiles(fs, source_sel, fs, dst_dir);
  if (!status.ok()) {
    std::string s = status.ToString();
    stop("%s", s.c_str());
  }
}
')

# Create 1000 small text files to copy
source_dir <- tempfile()
dest_dir <- tempfile()
dir.create(source_dir)
for (i in 1:1000) {
  write(
    as.character(1:i),
    sprintf("%s/file%03d.txt", source_dir, i)
  )
}

dir.create(dest_dir)
copy_files2(source_dir, dest_dir)

# Check that every file arrived
waldo::compare(list.files(source_dir), list.files(dest_dir))
#> ✓ No differences
{code}
If there is a way to do this with a callback, C++ progress bars using the progress package might be useful? <https://github.com/r-lib/progress#c-api>
> [R][C++] Reporting progress from copy_files()?
> ----------------------------------------------
>
> Key: ARROW-14611
> URL: https://issues.apache.org/jira/browse/ARROW-14611
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, R
> Reporter: Nicola Crane
> Priority: Minor
>
> Would it be possible to have something that reports progress from {{copy_files()}} which calls CopyFiles from FileSystem? When copying huge files, the R session just hangs and the user doesn't know if it's working or not.