You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/21 21:57:49 UTC

[GitHub] [arrow] nealrichardson opened a new pull request #7245: ARROW-8885: [R] Don't include everything everywhere

nealrichardson opened a new pull request #7245:
URL: https://github.com/apache/arrow/pull/7245


   Rather than `#including` everything in one header file and `#including` that in every .cpp file, this breaks out those includes to each .cpp file. There's also a more limited `arrow_exports.h` now that is included in the `arrowExports.cpp` that gets generated. 
   
   Installation/compilation time for the R bindings on my laptop (single-threaded) dropped from 130s to 90s (roughly 1/3 less, 40s total) as a result, so it seems like we were wasting a lot of CPU cycles before. 
   
   There's probably more improvements that could be made here, open to suggestion from those who understand C++ better.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429372456



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below

Review comment:
       `RBuffer` is a templated class, the definition must be in the header and all it's dependencies must also be included in the headers.
   
   In other words, by moving `RBuffer` in it's own file, the using classes will also need to include this new header + the original `<arrow/buffer.h>`, i.e. no benefits.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#issuecomment-632369134


   https://issues.apache.org/jira/browse/ARROW-8885


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429431625



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below

Review comment:
       đź‘Ť I don't have great instincts about what drives compile time and how to optimize it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#issuecomment-632384317


   @github-actions autotune


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429372456



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below

Review comment:
       `RBuffer` is a templated class, the definition must be in the header and all it's dependencies must also be included in the headers.
   
   In other words, by moving `RBuffer` in it's own file, the using classes will also need to include this new header + the original `<arrow/buffer.h>`, e.g. no benefits.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429369559



##########
File path: r/src/array_from_vector.cpp
##########
@@ -20,6 +20,11 @@
 #include "./arrow_types.h"
 
 #if defined(ARROW_R_WITH_ARROW)
+#include <arrow/array.h>
+#include <arrow/builder.h>
+#include <arrow/table.h>
+// This says "Private header, not to be exported"

Review comment:
       Yep, this is fine.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques edited a comment on pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques edited a comment on pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#issuecomment-632825223


   @nealrichardson the next step is probably to break `arrowExport.{h,cpp}` into one per file cpp file, e.g.
   
   ```
   src/buffer.cpp
   src/buffer.h
   src/buffer_export.cpp
   ...
   ```
   
   This will make the export file much smaller and require less headers to read, on top of allowing this to be parallelized (as opposed to a single thread for the huge file as it is). I suspect we can shave probably another 33%, the final link will still be a bottleneck. I'm not familiar with rcpp and it's restrictions/requirements, @romainfrancois might say if this is never going to work.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429388011



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below

Review comment:
       Right but only 2 cpp files need RBuffer, so we wouldn’t have to include it for the other 20 files. So does that help?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429395569



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below

Review comment:
       It doesn't, I had to touch 6 files, but it's exactly the same time to compile.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#issuecomment-632825223


   @nealrichardson the next step is probably to break `arrowExport.{h,cpp}` into one per file cpp file, e.g.
   
   ```
   src/buffer.cpp
   src/buffer.h
   src/buffer_export.cpp
   ...
   ```
   
   This will make the export file much smaller and require less headers to read, on top of allowing this to be parallelized (as opposed to a single thread for the huge file as it is). I suspect we can shave probably another 33%, the final link will still be a bottleneck.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429371319



##########
File path: r/src/arrow_exports.h
##########
@@ -0,0 +1,56 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This gets included in arrowExports.cpp
+
+#pragma once
+
+#include "arrow_rcpp.h"
+
+#if defined(ARROW_R_WITH_ARROW)
+#include <arrow/csv/reader.h>
+#include <arrow/dataset/api.h>

Review comment:
       It ain't going to help, you'll have to include `<arrow/dataset/api.h` in `arrowExports.cpp` anyway (since it calls the dataset constructor/destructor/move.

##########
File path: r/src/arrow_exports.h
##########
@@ -0,0 +1,56 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This gets included in arrowExports.cpp
+
+#pragma once
+
+#include "arrow_rcpp.h"
+
+#if defined(ARROW_R_WITH_ARROW)
+#include <arrow/csv/reader.h>
+#include <arrow/dataset/api.h>

Review comment:
       It ain't going to help, you'll have to include `<arrow/dataset/api.h` in `arrowExports.cpp` anyway (since it calls the dataset constructor/destructor/move).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429372721



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below
 #include <arrow/result.h>
-#include <arrow/type.h>
 #include <arrow/type_fwd.h>
-#include <arrow/util/checked_cast.h>
-#include <arrow/util/compression.h>
-#include <arrow/util/iterator.h>
+// Do we need ubsan always there?

Review comment:
       It's not needed, I removed it (will be in #7251).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] fsaintjacques closed pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
fsaintjacques closed pull request #7245:
URL: https://github.com/apache/arrow/pull/7245


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429303750



##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below
 #include <arrow/result.h>
-#include <arrow/type.h>
 #include <arrow/type_fwd.h>
-#include <arrow/util/checked_cast.h>
-#include <arrow/util/compression.h>
-#include <arrow/util/iterator.h>
+// Do we need ubsan always there?

Review comment:
       question

##########
File path: r/src/array_from_vector.cpp
##########
@@ -20,6 +20,11 @@
 #include "./arrow_types.h"
 
 #if defined(ARROW_R_WITH_ARROW)
+#include <arrow/array.h>
+#include <arrow/builder.h>
+#include <arrow/table.h>
+// This says "Private header, not to be exported"

Review comment:
       Is this ok?

##########
File path: r/src/arrow_types.h
##########
@@ -176,64 +37,21 @@ inline constexpr Rbyte default_value<RAWSXP>() {
 
 }  // namespace Rcpp
 
-namespace arrow {
-namespace r {
-
-template <typename T>
-inline std::shared_ptr<T> extract(SEXP x) {
-  return Rcpp::ConstReferenceSmartPtrInputParameter<std::shared_ptr<T>>(x);
-}
-
-}  // namespace r
-}  // namespace arrow
-
 #if defined(ARROW_R_WITH_ARROW)
-#include <arrow/api.h>
-#include <arrow/c/bridge.h>
-#include <arrow/compute/api.h>
-#include <arrow/csv/reader.h>
-#include <arrow/dataset/api.h>
-#include <arrow/filesystem/filesystem.h>
-#include <arrow/filesystem/localfs.h>
-#include <arrow/filesystem/s3fs.h>
-#include <arrow/io/compressed.h>
-#include <arrow/io/file.h>
-#include <arrow/io/memory.h>
-#include <arrow/ipc/feather.h>
-#include <arrow/ipc/reader.h>
-#include <arrow/ipc/writer.h>
-#include <arrow/json/reader.h>
+#include <arrow/buffer.h>  // for RBuffer definition below

Review comment:
       The RBuffer class definition should probably be moved to buffer.cpp but I didn't work out how to register it in the namespace in this file.

##########
File path: r/src/arrow_exports.h
##########
@@ -0,0 +1,56 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This gets included in arrowExports.cpp
+
+#pragma once
+
+#include "arrow_rcpp.h"
+
+#if defined(ARROW_R_WITH_ARROW)
+#include <arrow/csv/reader.h>
+#include <arrow/dataset/api.h>

Review comment:
       I suspect if there were more type_fwd.h files (like for dataset), we could trim this file back further and shave off some more seconds. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #7245: ARROW-8885: [R] Don't include everything everywhere

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #7245:
URL: https://github.com/apache/arrow/pull/7245#discussion_r429433084



##########
File path: r/src/arrow_exports.h
##########
@@ -0,0 +1,56 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// This gets included in arrowExports.cpp
+
+#pragma once
+
+#include "arrow_rcpp.h"
+
+#if defined(ARROW_R_WITH_ARROW)
+#include <arrow/csv/reader.h>
+#include <arrow/dataset/api.h>

Review comment:
       Are you sure? IIUC (which I may very well not) arrowExports.cpp itself doesn't call the constructors etc., it just wraps, so you just need to have the types in the signature declared somewhere. See e.g. https://github.com/apache/arrow/blob/master/r/src/arrowExports.cpp#L1534-L1547




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org