You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/12 13:24:59 UTC

[GitHub] [arrow] rtpsw opened a new pull request, #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

rtpsw opened a new pull request, #14386:
URL: https://github.com/apache/arrow/pull/14386

   See https://issues.apache.org/jira/browse/ARROW-18004


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276340243

   Ok, it looks like Acero relies on being able to silently truncate the number of fields in that method. Which is quite unfortunate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1277813829

   Thanks @pitrou !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276179877

   cc @icexelloss @westonpace


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276588518

   @pitrou, please let me know which checks to remove from the test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276185070

   If we wanted to check that the ExecBatch values corresponding to the Schema, we should check for schema equality on each of the Datum values. That can unfortunately be expensive.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993497169


##########
cpp/src/arrow/compute/exec_test.cc:
##########
@@ -55,6 +55,19 @@ using ::arrow::internal::BitmapEquals;
 using ::arrow::internal::CopyBitmap;
 using ::arrow::internal::CountSetBits;
 
+TEST(ExecBatch, Basics) {
+  auto i32_array = ArrayFromJSON(int32(), "[0, 1, 2]");
+  auto utf8_array = ArrayFromJSON(utf8(), R"(["a", "b", "c"])");
+  ExecBatch exec_batch({Datum(i32_array), Datum(utf8_array)}, 3);
+  auto right_schema = schema({field("a", int32()), field("b", utf8())});
+  ASSERT_OK_AND_ASSIGN(auto right_record_batch, exec_batch.ToRecordBatch(right_schema));
+  auto accept_schema = schema({field("a", int32())});
+  ASSERT_OK_AND_ASSIGN(auto accept_record_batch, exec_batch.ToRecordBatch(accept_schema));

Review Comment:
   Why would this succeed and produce a truncated result here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993696113


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   Ok. Let's at least add a meaningful error message to the `DCHECK` :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993598429


##########
cpp/src/arrow/compute/exec_test.cc:
##########
@@ -55,6 +55,19 @@ using ::arrow::internal::BitmapEquals;
 using ::arrow::internal::CopyBitmap;
 using ::arrow::internal::CountSetBits;
 
+TEST(ExecBatch, Basics) {

Review Comment:
   Nit :-)
   ```suggestion
   TEST(ExecBatch, ToRecordBatch) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r994747470


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,23 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("ExecBatch::ToTRecordBatch mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {

Review Comment:
   Right, it doesn't hurt to return a nice error here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993766078


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   Created https://issues.apache.org/jira/browse/ARROW-18015 for the validation. In the meantime, I added a meaningful error message.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276246581

   > Edit: if this avoids an immediate crash and allows you to see `Validate` failing afterwards, then I would be ok with this
   
   With the current PR code, `exec_batch.ToRecordBatch(reject_schema)` just raises an error, i.e., it does not let an invalid record batch get created. Should I add some other check, and which?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] bkietz commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
bkietz commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993595985


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   ```suggestion
         Unreachable();
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993597997


##########
cpp/src/arrow/compute/exec_test.cc:
##########
@@ -55,6 +55,19 @@ using ::arrow::internal::BitmapEquals;
 using ::arrow::internal::CopyBitmap;
 using ::arrow::internal::CountSetBits;
 
+TEST(ExecBatch, Basics) {
+  auto i32_array = ArrayFromJSON(int32(), "[0, 1, 2]");
+  auto utf8_array = ArrayFromJSON(utf8(), R"(["a", "b", "c"])");
+  ExecBatch exec_batch({Datum(i32_array), Datum(utf8_array)}, 3);
+  auto right_schema = schema({field("a", int32()), field("b", utf8())});
+  ASSERT_OK_AND_ASSIGN(auto right_record_batch, exec_batch.ToRecordBatch(right_schema));
+  auto accept_schema = schema({field("a", int32())});
+  ASSERT_OK_AND_ASSIGN(auto accept_record_batch, exec_batch.ToRecordBatch(accept_schema));

Review Comment:
   Yes, I checked this locally and it occurs in multiple places unfortunately. However, I don't think this is a behavior that we want to set in stone, so I think we should remove this particular test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276261064

   > I agree with this, but it is actually misleading here, because we're only checking a single condition and otherwise let errors crash silently (or corrupt memory etc.).
   
   I agree the leniency of the checks, as compared to validation, could be unexpected to some developers and therefore misleading. I'd suggest adding doc strings to clarify this. OTOH, I'm not sure which crash condition you mean could still occur. Since the current PR code check the number of values and their types before applying type-specific operations, I believe it can only crash on the `DCHECK(false)` line. I could fix this to return an error.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993811161


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,23 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("ExecBatch::ToTRecordBatch mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {

Review Comment:
   This code was there to begin with, so I'll let others chime in. My guess is this is related to the `RecordBatch` API coming much earlier (and tied more closely to the Array API) than that of `ExecBatch`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276220334

   Edit: if this avoids an immediate crash and allows you to see `Validate` failing afterwards, then I would be ok with this (but let's not silently truncate output columns either).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276210638

   > Another reason is that developers of large apps would generally prefer an error the app can handle, and perhaps raise to its user, over crashing the app.
   
   I agree with this, but it is actually misleading here, because we're only checking a single condition and otherwise let errors crash silently (or corrupt memory etc.).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] icexelloss commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
icexelloss commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993806477


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,23 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("ExecBatch::ToTRecordBatch mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {

Review Comment:
   Hmm..when would a ExecBatch contain scalar value and why do we convert that into array? (For my knowledge)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276242123

   @westonpace Are the failures mentioned above (mismatching schema size) legitimate?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276247931

   > @westonpace Are the failures mentioned above (mismatching schema size) legitimate?
   
   Don't know yet; I'll need to examine. If I manage to do so quickly, I'll report here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] bkietz commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
bkietz commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r994747026


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,23 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("ExecBatch::ToTRecordBatch mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {

Review Comment:
   RecordBatch requires Array columns. ExecBatch also supports for Scalars, which indicate a column with constant value. Such constant columns arise frequently (for example when the dataset has been partitioned on a column so a batch read from `cyl=4/dat.parquet` has a constant `cyl` column) and many kernels can take directly accept a scalar argument (for example the arithmetic kernels can add a scalar to an array), saving memory and processor overhead of broadcasting to Array.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou merged pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou merged PR #14386:
URL: https://github.com/apache/arrow/pull/14386


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993695241


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   I'm in favor of adding these validation methods - let's create a separate jira for this.
   
   > The constructor and ExecBatch::Make don't enforce this either.
   
   Right. Also note that `ExecBatch` has public members, which various pieces of code access directly, so it can be easy to make it invalid.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993693488


##########
cpp/src/arrow/compute/exec_test.cc:
##########
@@ -55,6 +55,19 @@ using ::arrow::internal::BitmapEquals;
 using ::arrow::internal::CopyBitmap;
 using ::arrow::internal::CountSetBits;
 
+TEST(ExecBatch, Basics) {
+  auto i32_array = ArrayFromJSON(int32(), "[0, 1, 2]");
+  auto utf8_array = ArrayFromJSON(utf8(), R"(["a", "b", "c"])");
+  ExecBatch exec_batch({Datum(i32_array), Datum(utf8_array)}, 3);
+  auto right_schema = schema({field("a", int32()), field("b", utf8())});
+  ASSERT_OK_AND_ASSIGN(auto right_record_batch, exec_batch.ToRecordBatch(right_schema));
+  auto accept_schema = schema({field("a", int32())});
+  ASSERT_OK_AND_ASSIGN(auto accept_record_batch, exec_batch.ToRecordBatch(accept_schema));

Review Comment:
   I'll soon add a commit with checks showing the current situation, if only for posterity, and then I'll remove ones we do not want to keep.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993520835


##########
cpp/src/arrow/compute/exec_test.cc:
##########
@@ -55,6 +55,19 @@ using ::arrow::internal::BitmapEquals;
 using ::arrow::internal::CopyBitmap;
 using ::arrow::internal::CountSetBits;
 
+TEST(ExecBatch, Basics) {
+  auto i32_array = ArrayFromJSON(int32(), "[0, 1, 2]");
+  auto utf8_array = ArrayFromJSON(utf8(), R"(["a", "b", "c"])");
+  ExecBatch exec_batch({Datum(i32_array), Datum(utf8_array)}, 3);
+  auto right_schema = schema({field("a", int32()), field("b", utf8())});
+  ASSERT_OK_AND_ASSIGN(auto right_record_batch, exec_batch.ToRecordBatch(right_schema));
+  auto accept_schema = schema({field("a", int32())});
+  ASSERT_OK_AND_ASSIGN(auto accept_record_batch, exec_batch.ToRecordBatch(accept_schema));

Review Comment:
   [This comment](https://github.com/apache/arrow/pull/14386#issuecomment-1276240482) gives the background.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1279383357

   Benchmark runs are scheduled for baseline = f3327d2c37c375abdcd6299d4ea2cdbdcbc4cb62 and contender = b5b41ccfb05bfcacc674167c50e1f2894bbd3683. b5b41ccfb05bfcacc674167c50e1f2894bbd3683 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/91c0159651a64d29936a4e299ee8bf55...b8cff6ad40eb4edc84f0eee36dec4e0f/)
   [Failed :arrow_down:0.56% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/627afbdf41aa4243990160ee48184f01...e31fe0755d4f4916a53da1edac1f9d9f/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/48ddc8c6fdbf4afb9fd5d8b59719e12a...7323b36592a241b68c64e67c9d37dd25/)
   [Finished :arrow_down:0.82% :arrow_up:0.04%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/43ddf2397b1d46ac9d3fe53d4d68f638...546721c504f04a59a8d05fece4014e20/)
   Buildkite builds:
   [Finished] [`b5b41ccf` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/1690)
   [Failed] [`b5b41ccf` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/1709)
   [Failed] [`b5b41ccf` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/1692)
   [Finished] [`b5b41ccf` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/1703)
   [Finished] [`f3327d2c` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/1689)
   [Failed] [`f3327d2c` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/1708)
   [Failed] [`f3327d2c` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/1691)
   [Finished] [`f3327d2c` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/1702)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1277777245

   > @pitrou, please let me know which checks to remove from the test.
   
   In the interest of moving this forward before 10.0.0, I pushed some changes myself.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1277927368

   CI passed on @rtpsw 's fork.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276201896

   > Same problem as #14347: this is basically adding a partial, incomplete guard for a condition the caller is supposed to check for themselves.
   
   > If we wanted to check that the ExecBatch values corresponding to the Schema, we should check for schema equality on each of the Datum values. That can unfortunately be expensive.
   
   To be sure, I understand that by partial incomplete guard you mean not checking for full schema equality. My goal here (and in the PR you linked) is not to fully guard, and I understand the cost of trying to do so. My goal is to avoid runtime crashes that are hard to debug when the cost of doing so is small. IMHO, this PR is a good tradeoff because it uses a cheap check to transform a crash failure to a raised error, which is easier to debug. I run into such crash failures while developing test cases. I agree that when a test case is correct then the failure cases do not occur, yet during development test cases are frequently not correct. Another reason is that developers of large apps would generally prefer an error the app can handle, and perhaps raise to its user, over crashing the app.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276429841

   :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276183261

   Same problem as #14347: this is basically adding a partial, incomplete guard for a condition the caller is supposed to check for themselves.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] icexelloss commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
icexelloss commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993806477


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,23 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("ExecBatch::ToTRecordBatch mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {

Review Comment:
   Hmm..when would a ExecBatch contains scalar value and why do we convert that into array?



##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,23 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("ExecBatch::ToTRecordBatch mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {

Review Comment:
   Hmm..when would a ExecBatch contain scalar value and why do we convert that into array?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] bkietz commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
bkietz commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993669256


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   It would be a violation of ExecBatch's class invariant for the values to be other than Array or Scalar. Now that I'm looking for a statement of that invariant it's not easy to point at something, the closest I've got is in [streaming_execution.rst](https://github.com/apache/arrow/blob/f941118ea6ffbe1d1d8367d0218566e9e9dae550/docs/source/cpp/streaming_execution.rst#L317-L319). The constructor and ExecBatch::Make don't enforce this either. This validation should be explicit and centralized in ExecBatch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993599193


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   Let's instead return `Status::TypeError` with a nice error message.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993600098


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   Well, it's not really unreachable :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14386:
URL: https://github.com/apache/arrow/pull/14386#discussion_r993681528


##########
cpp/src/arrow/compute/exec.cc:
##########
@@ -156,15 +156,22 @@ Result<ExecBatch> ExecBatch::Make(std::vector<Datum> values) {
 
 Result<std::shared_ptr<RecordBatch>> ExecBatch::ToRecordBatch(
     std::shared_ptr<Schema> schema, MemoryPool* pool) const {
+  if (static_cast<size_t>(schema->num_fields()) > values.size()) {
+    return Status::Invalid("mismatching schema size");
+  }
   ArrayVector columns(schema->num_fields());
 
   for (size_t i = 0; i < columns.size(); ++i) {
     const Datum& value = values[i];
     if (value.is_array()) {
       columns[i] = value.make_array();
       continue;
+    } else if (value.is_scalar()) {
+      ARROW_ASSIGN_OR_RAISE(columns[i],
+                            MakeArrayFromScalar(*value.scalar(), length, pool));
+    } else {
+      DCHECK(false);

Review Comment:
   We should certainly add `ExecBatch::Validate` (and perhaps `ExecBatch::ValidateFull`).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276240482

   >  (but let's not silently truncate output columns either).
   
   Before coding the condition `if (static_cast<size_t>(schema->num_fields()) > values.size())` with `>` I tried coding it with `!=` but got test failures elsewhere:
   ```
   [ RUN      ] ExecPlanExecution.StressSourceOrderBy
   /mnt/user1/tscontract/github/rtpsw/arrow/cpp/src/arrow/compute/exec/plan_test.cc:739: Failure
   Failed
   '_error_or_value56.status()' failed with Invalid: mismatching schema size
   Google Test trace:
   /mnt/user1/tscontract/github/rtpsw/arrow/cpp/src/arrow/compute/exec/plan_test.cc:720: single threaded
   /mnt/user1/tscontract/github/rtpsw/arrow/cpp/src/arrow/compute/exec/plan_test.cc:717: unslowed
   [  FAILED  ] ExecPlanExecution.StressSourceOrderBy (1 ms)
   ```
   and
   ```
   [ RUN      ] Substrait.BasicPlanRoundTrippingEndToEnd
   /mnt/user1/tscontract/github/rtpsw/arrow/cpp/src/arrow/compute/exec/exec_plan.cc:58: Plan was destroyed before finishing
   /mnt/user1/tscontract/github/rtpsw/arrow/cpp/src/arrow/engine/substrait/serde_test.cc:2083: Failure
   Failed
   '_error_or_value131.status()' failed with Invalid: mismatching schema size
   [  FAILED  ] Substrait.BasicPlanRoundTrippingEndToEnd (1 ms)
   ```
   
   If we'd like to fix these test failures, I'd suggest doing so separately.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276429804

   https://issues.apache.org/jira/browse/ARROW-18004


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] rtpsw commented on pull request #14386: ARROW-18004: [C++] ExecBatch conversion to RecordBatch may go out of bounds

Posted by GitBox <gi...@apache.org>.
rtpsw commented on PR #14386:
URL: https://github.com/apache/arrow/pull/14386#issuecomment-1276466786

   [This comment](https://github.com/apache/arrow/pull/14386#discussion_r993693488) applies to the commit I just pushed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org