You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "rtpsw (via GitHub)" <gi...@apache.org> on 2023/02/04 08:11:37 UTC
[GitHub] [arrow] rtpsw opened a new pull request, #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
rtpsw opened a new pull request, #34042:
URL: https://github.com/apache/arrow/pull/34042
See #33850
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1422667510
Yaron, Probably easier to create a new branch./PR to fix the issue. I am
not sure how reopening a merged PR would work.
On Wed, Feb 8, 2023 at 4:17 AM rtpsw ***@***.***> wrote:
> @westonpace <https://github.com/westonpace>, could you please reopen this
> PR? I already pushed a commit to the same branch.
>
> —
> Reply to this email directly, view it on GitHub
> <https://github.com/apache/arrow/pull/34042#issuecomment-1422276974>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAGBXLFAQPAHW6W7BRHPZBLWWNQCFANCNFSM6AAAAAAURBC37M>
> .
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1416693996
cc @westonpace @icexelloss
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1096718718
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
+ return kDefaultExtensionProvider;
+}
+
+void set_default_extension_provider(const std::shared_ptr<ExtensionProvider>& provider) {
Review Comment:
Can we document these method?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1419455493
> I'd rather stick to something like what we already have as it is very simple. Yes, most providers will want to do something similar but I don't think it is a large burden for them.
This sounds fine to me. We can implement the fallback logic internally.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1418060328
IMO we can handle the "fallback" behavior in Acero instead of custom code so not everyone that wants to provide custom extension need to implement this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1422191511
@rtpsw @westonpace This broke `test-ubuntu-18.04-cpp` job:
https://github.com/ursacomputing/crossbow/actions/runs/4121522608/jobs/7117290841#step:5:1695
```text
FAILED: src/arrow/engine/CMakeFiles/arrow_substrait_shared.dir/substrait/options.cc.o
/usr/local/bin/sccache /usr/bin/c++ -DARROW_ENGINE_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_HDFS -DARROW_MIMALLOC -DARROW_NO_DEPRECATED_API -DARROW_WITH_BROTLI -DARROW_WITH_BZ2 -DARROW_WITH_LZ4 -DARROW_WITH_RE2 -DARROW_WITH_SNAPPY -DARROW_WITH_TIMING_TESTS -DARROW_WITH_UTF8PROC -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -DURI_STATIC_BUILD -DUTF8PROC_STATIC -Darrow_substrait_shared_EXPORTS -Isrc -I/arrow/cpp/src -I/arrow/cpp/src/generated -Isubstrait_ep-generated -isystem /arrow/cpp/thirdparty/flatbuffers/include -isystem /arrow/cpp/thirdparty/hadoop/include -isystem orc_ep-install/include -isystem zstd_ep-install/include -isystem utf8proc_ep-install/include -isystem xsimd_ep/src/xsimd_ep-install/include -isystem jemalloc_ep-prefix/src -isystem mimalloc_ep/src/mimalloc_ep/include/mimalloc-2.0 -Wno-noexcept-type -fdiagnostics-color=always -Wall -Wno-conversion -Wno-sign-conversion -W
unused-result -fno-semantic-interposition -msse4.2 -g -Werror -O0 -ggdb -fPIC -pthread -std=c++1z -MD -MT src/arrow/engine/CMakeFiles/arrow_substrait_shared.dir/substrait/options.cc.o -MF src/arrow/engine/CMakeFiles/arrow_substrait_shared.dir/substrait/options.cc.o.d -o src/arrow/engine/CMakeFiles/arrow_substrait_shared.dir/substrait/options.cc.o -c /arrow/cpp/src/arrow/engine/substrait/options.cc
/arrow/cpp/src/arrow/engine/substrait/options.cc:130:6: error: 'mutex' in namespace 'std' does not name a type
std::mutex g_default_extension_provider_mutex;
^~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc: In function 'std::shared_ptr<arrow::engine::ExtensionProvider> arrow::engine::default_extension_provider()':
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:8: error: 'unique_lock' is not a member of 'std'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:8: note: suggested alternative: 'unique_copy'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~
unique_copy
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:25: error: 'mutex' is not a member of 'std'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:37: error: 'g_default_extension_provider_mutex' was not declared in this scope
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:37: note: suggested alternative: 'default_extension_provider'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
default_extension_provider
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:32: error: 'lock' was not declared in this scope
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:135:32: note: suggested alternative: 'clock'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~
clock
/arrow/cpp/src/arrow/engine/substrait/options.cc: In function 'void arrow::engine::set_default_extension_provider(const std::shared_ptr<arrow::engine::ExtensionProvider>&)':
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:8: error: 'unique_lock' is not a member of 'std'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:8: note: suggested alternative: 'unique_copy'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~
unique_copy
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:25: error: 'mutex' is not a member of 'std'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:37: error: 'g_default_extension_provider_mutex' was not declared in this scope
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:37: note: suggested alternative: 'default_extension_provider'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
default_extension_provider
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:32: error: 'lock' was not declared in this scope
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~
/arrow/cpp/src/arrow/engine/substrait/options.cc:140:32: note: suggested alternative: 'clock'
std::unique_lock<std::mutex> lock(g_default_extension_provider_mutex);
^~~~
clock
```
Could you check this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] westonpace commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1097626452
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
Review Comment:
Better safe than sorry I think.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1097905231
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
+ return kDefaultExtensionProvider;
+}
+
+void set_default_extension_provider(const std::shared_ptr<ExtensionProvider>& provider) {
Review Comment:
It is declared in the public header `options.h` and documented there.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] westonpace commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1419410914
We could make the chaining of extension providers more explicit (e.g. Acero could have a stack of extension providers). Or we could create a base class that we offer as a helper / example. However, I'm not sure that this is a good idea. I'd rather stick to something like what we already have as it is very simple. Yes, most providers will want to do something similar but I don't think it is a large burden for them.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1416693992
* Closes: #33850
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1096727243
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
Review Comment:
Not a big issue though - IMO this is a bit of overkill but wouldn't hurt.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1097911741
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
Review Comment:
You're right we can't use `kName`. I believe the right way to use is `g_name`, which I just used in the recent commit.
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
Review Comment:
You're right we can't use `kName`. I believe the right way is to use `g_name`, which I just used in the recent commit.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1422093375
Revision: aa571530da72b8e866b465d36413f1ad3af3602a
Submitted crossbow builds: [ursacomputing/crossbow @ actions-c920441bc9](https://github.com/ursacomputing/crossbow/branches/all?query=actions-c920441bc9)
|Task|Status|
|----|------|
|test-ubuntu-18.04-cpp|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-c920441bc9-github-test-ubuntu-18.04-cpp)](https://github.com/ursacomputing/crossbow/actions/runs/4121522608/jobs/7117290841)|
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1418041925
> One thing wasn't clear to be is how do we dispatch some of the message to our extension provider vs use the default extension provider. For example, for the asof join message, I would imagine that being passed to the existing Acero extension provider - how would we handle that?
The usual pattern would be chaining of an extension provider - something along the lines of
```
class MyExtensionProvider {
public:
MyExtensionProvider(const std::shared_ptr<ExtensionProvider>& parent_provider) :
parent_provider_(parent_provider) {}
Result<RelationInfo> MakeRel(const std::vector<DeclarationInfo>& inputs,
const google::protobuf::Any& rel,
const ExtensionSet& ext_set) override {
if (rel.Is<substrait_ext::MyRel>()) {
substrait_ext::MyRel my_rel;
rel.UnpackTo(&my_rel);
return MakeMyRel(inputs, my_rel, ext_set);
}
return parent_provider_->MakeRel(inputs, rel, ext_set);
}
// MakeMyRel implementation here
private:
std::shared_ptr<ExtensionProvider> parent_provider_;
};
```
and for initialization
```
set_default_extension_provider(std::make_shared<MyExtensionProvider>(default_extension_provider());
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1418075564
> IMO we can handle the "fallback" behavior in Acero instead of custom code so not everyone that wants to provide custom extension need to implement this.
What do you have in mind for the API? My suggestion would be to encapsulate the chaining pattern via an API like:
```
chain_extension_provider(const std::shared_ptr<ExtensionProvider>& provider);
```
Here `provider` would not need to call a parent because this would be done automatically.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1422276974
@westonpace, could you please reopen this PR? I already pushed a commit to the same branch.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1096718632
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
Review Comment:
Don't think we need the lock here - IMO this method doesn't need to be thread safe (I cannot imagine who would want to do this in multi threaded way)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1097690244
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
Review Comment:
SGTM
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] ursabot commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1420676291
Benchmark runs are scheduled for baseline = 2b661bad06bdb31eb54e1abadf396e95466637ca and contender = 7423f0332cb11eb780f421c07bac71f87bf44a03. 7423f0332cb11eb780f421c07bac71f87bf44a03 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/7dce38208957469d82d3900fefd83896...1fa959e4d4b14267a05fc085417464bd/)
[Failed :arrow_down:0.24% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/e259efc66c5a42c08417b799dd69733c...500a58e0187d433cb34558a9702fec5d/)
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/003712cda92b4a67b9ef2e1e7e64a30f...8fb37e4c79ad4829a67e9b0cdc521e39/)
[Finished :arrow_down:0.03% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/6300d1d633994c388fb3c41cacc97a99...b82a0da8dee741228f8fbf19232054dd/)
Buildkite builds:
[Finished] [`7423f033` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2337)
[Failed] [`7423f033` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2365)
[Finished] [`7423f033` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2335)
[Finished] [`7423f033` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2356)
[Finished] [`2b661bad` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2336)
[Finished] [`2b661bad` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2364)
[Finished] [`2b661bad` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2334)
[Finished] [`2b661bad` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2355)
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] westonpace commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1097879089
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
Review Comment:
```suggestion
std::shared_ptr<ExtensionProvider> default_extension_provider =
std::make_shared<DefaultExtensionProvider>();
std::mutex default_extension_provider_mutex;
```
These aren't constants so we can't use `kName`. Will this work?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] westonpace merged pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace merged PR #34042:
URL: https://github.com/apache/arrow/pull/34042
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1421990857
@github-actions crossbuw submit test-ubuntu-18.04-cpp
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on a diff in pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on code in PR #34042:
URL: https://github.com/apache/arrow/pull/34042#discussion_r1096718914
##########
cpp/src/arrow/engine/substrait/options.cc:
##########
@@ -122,11 +122,23 @@ class DefaultExtensionProvider : public BaseExtensionProvider {
}
};
-std::shared_ptr<ExtensionProvider> ExtensionProvider::kDefaultExtensionProvider =
+namespace {
+
+std::shared_ptr<ExtensionProvider> kDefaultExtensionProvider =
std::make_shared<DefaultExtensionProvider>();
+std::mutex kDefaultExtensionProviderMutex;
+
+} // namespace
+
std::shared_ptr<ExtensionProvider> default_extension_provider() {
- return ExtensionProvider::kDefaultExtensionProvider;
+ std::unique_lock<std::mutex> lock(kDefaultExtensionProviderMutex);
+ return kDefaultExtensionProvider;
+}
+
+void set_default_extension_provider(const std::shared_ptr<ExtensionProvider>& provider) {
Review Comment:
Also - do we need to expose these in public headers?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] icexelloss commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "icexelloss (via GitHub)" <gi...@apache.org>.
icexelloss commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1418023500
One thing wasn't clear to be is how do we dispatch some of the message to our extension provider vs use the default extension provider. For example, for the asof join message, I would imagine that being passed to the existing Acero extension provider - how would we handle that?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1422841215
Create a [fix-PR](https://github.com/apache/arrow/pull/34075).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] rtpsw commented on pull request #34042: GH-33850: [C++] Allow Substrait's default extension provider to be configured
Posted by "rtpsw (via GitHub)" <gi...@apache.org>.
rtpsw commented on PR #34042:
URL: https://github.com/apache/arrow/pull/34042#issuecomment-1422255071
@kou, I'll handle this. I don't have the same platform readily available, but I have a reasonable guess of the cause.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org