You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "ildipo (via GitHub)" <gi...@apache.org> on 2023/03/09 16:23:15 UTC

[GitHub] [arrow] ildipo opened a new pull request, #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

ildipo opened a new pull request, #34518:
URL: https://github.com/apache/arrow/pull/34518

   
   ### Rationale for this change
   
   In order to remove acero from libarrow we need to reorganize the code to reduce entanglement between compute/exec and the rest of the product by removing any dependency from the codebase (except dataset) into compute/exec.
   
   ### What changes are included in this PR?
   
   This PR removes some of the dependencies:
   * some of the functions in `compute/exec/util.[h,cc]` that are widely used are moved to a new `compute/util.[h,cc] `file
   * `key_map` and `key_hash` are widely used and thus moved from compute/exec to compute
   * some light_array tests do use functionalities inside `compute/exec`, so the tests are split into 2 parts, one inside `compute` and another inside `compute/exec`
   
   ### Are these changes tested?
   
   The existing tests cover all these changes.
   One additional test program has been added by splitting light_array_test in two parts.
   
   ### Are there any user-facing changes?
   
   no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] icexelloss merged pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss merged PR #34518:
URL: https://github.com/apache/arrow/pull/34518


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] icexelloss commented on pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on PR #34518:
URL: https://github.com/apache/arrow/pull/34518#issuecomment-1466516064

   @ildipo Is on vacation I will double check with him and I will merge


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ildipo commented on a diff in pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "ildipo (via GitHub)" <gi...@apache.org>.
ildipo commented on code in PR #34518:
URL: https://github.com/apache/arrow/pull/34518#discussion_r1135569492


##########
cpp/src/arrow/compute/light_array_test.cc:
##########
@@ -216,27 +215,6 @@ TEST(KeyColumnArray, SliceBool) {
   }
 }
 
-TEST(KeyColumnArray, FromExecBatch) {

Review Comment:
   In the coming CR I'm moving a few more things around so I'll change this to a single file as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] westonpace commented on a diff in pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #34518:
URL: https://github.com/apache/arrow/pull/34518#discussion_r1133910492


##########
cpp/src/arrow/compute/light_array_test.cc:
##########
@@ -216,27 +215,6 @@ TEST(KeyColumnArray, SliceBool) {
   }
 }
 
-TEST(KeyColumnArray, FromExecBatch) {

Review Comment:
   Hmm, it's a bit odd to have these tests split this way.  Do you want to just move this entire test file to `exec`?  Maybe later we can, in a follow-up, attempt to pull out the utilities that are needed in `exec/test_util.h` so the entire test file can move out of `exec`.
   
   However, for now, I think it's probably ok that this is tested in exec.



##########
cpp/src/arrow/compute/util_internal.h:
##########
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "arrow/util/logging.h"
+
+namespace arrow {
+namespace util {
+
+template <typename T>
+void CheckAlignment(const void* ptr) {

Review Comment:
   Yes, internal is correct for this, good catch.  `exec/util.h` has always been "implicitly internal" and I've been wanting to rename it forever but hadn't quite gotten around to it.  This is a good start.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34518:
URL: https://github.com/apache/arrow/pull/34518#issuecomment-1462357068

   :warning: GitHub issue #15280 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ildipo commented on a diff in pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "ildipo (via GitHub)" <gi...@apache.org>.
ildipo commented on code in PR #34518:
URL: https://github.com/apache/arrow/pull/34518#discussion_r1132436441


##########
cpp/src/arrow/compute/util_internal.h:
##########
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "arrow/util/logging.h"
+
+namespace arrow {
+namespace util {
+
+template <typename T>
+void CheckAlignment(const void* ptr) {

Review Comment:
   this uses a macro that should not end up in the external include files.
   util.h ends up in external include files (since we'll need it inside the future libacero)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34518:
URL: https://github.com/apache/arrow/pull/34518#issuecomment-1462356977

   * Closes: #15280


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] icexelloss commented on a diff in pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on code in PR #34518:
URL: https://github.com/apache/arrow/pull/34518#discussion_r1132375052


##########
cpp/src/arrow/compute/util_internal.h:
##########
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "arrow/util/logging.h"
+
+namespace arrow {
+namespace util {
+
+template <typename T>
+void CheckAlignment(const void* ptr) {

Review Comment:
   Curious why moving this from util.h to a new util_internal.h?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34518:
URL: https://github.com/apache/arrow/pull/34518#issuecomment-1468772311

   Benchmark runs are scheduled for baseline = 0582122b41949790b795c3760ab41f21275f393c and contender = 44edc27e549d82db930421b0d4c76098941afd71. 44edc27e549d82db930421b0d4c76098941afd71 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/b36ac80d86ba4b3eba7ee8a9a3b1f455...a352a5d3951c4e5389244094a09b4ff7/)
   [Failed :arrow_down:0.45% :arrow_up:0.03%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/41116055d4de4e17915acfb5a1e9b8c9...fff6f7f0873243e7854566819aad5122/)
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/c2b535ec182741d8a009ba6ce0506f50...fe0d64be1b4b4b73b134d4f3d2e9d4a5/)
   [Finished :arrow_down:0.94% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/42298ce99c30489d92c4e9ee705df5b5...74e48f7b404f49d5a46709e56071227b/)
   Buildkite builds:
   [Finished] [`44edc27e` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2522)
   [Failed] [`44edc27e` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2552)
   [Finished] [`44edc27e` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2520)
   [Finished] [`44edc27e` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2543)
   [Finished] [`0582122b` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2521)
   [Finished] [`0582122b` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2551)
   [Finished] [`0582122b` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2519)
   [Finished] [`0582122b` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2542)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #34518: GH-15280: [C++] Refactor to reorganize dependencies as a prequel to moving acero out of libarrow

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34518:
URL: https://github.com/apache/arrow/pull/34518#issuecomment-1468903711

   ['Python', 'R'] benchmarks have high level of regressions.
   [test-mac-arm](https://conbench.ursa.dev/compare/runs/41116055d4de4e17915acfb5a1e9b8c9...fff6f7f0873243e7854566819aad5122/)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org