You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/16 15:23:47 UTC

[GitHub] [arrow] Yuhta opened a new pull request #8937: Arrow 10932

Yuhta opened a new pull request #8937:
URL: https://github.com/apache/arrow/pull/8937


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544519106



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       Is `start >= binary_builder_.length()` possible though? And in that case, should `delta` really be 0?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544537227



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       Good point, then I agree the current fix is fine. Thank you for the explanation!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Yuhta commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
Yuhta commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544522474



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       In case of `binary_builder_.length() > 0`, caller is responsible to make sure `start < binary_builder_.length()`.  So it's ok to just check `binary_builder_.length() > 0`.  It's just a little bit safer to check `start < binary_builder_.length()` ourselves.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544446692



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       Perhaps the proper condition should be `binary_builder_.length() != 0 ? offsets[start] : 0`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#issuecomment-746531687


   https://issues.apache.org/jira/browse/ARROW-10932


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Yuhta commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
Yuhta commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544534942



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       Depends on if you want to allow `start >= binary_builder_.length()`.  If it is allowed and the range is selected as ![formula](https://render.githubusercontent.com/render/math?math=index%20\in%20(-\infty,%20length)%20\cap%20[start,%20%2b\infty)), it is ok to set `delta = 0`, and leave the only element in `out_data` as `binary_builder_.value_data_length()`.  If we set `delta = offsets[start]` in this case, we are going to copy some unauthorized memory to `out_data` and it's a potential vulnerability.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #8937:
URL: https://github.com/apache/arrow/pull/8937


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Yuhta commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
Yuhta commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544500867



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       `binary_builder_.length() != 0` is the minimal case we need to fix, and we still do out-of-bound access (and write the data read to `out_data`) when `start >= binary_builder_.length()`.
   If we do `start < binary_builder_.length()`, we cover the case when `start >= binary_builder_.length()` as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#issuecomment-746770741


   Previous PR: #8933


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8937: ARROW-10932: [C++] BinaryMemoTable::CopyOffsets access out-of-bound address when data is empty

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8937:
URL: https://github.com/apache/arrow/pull/8937#discussion_r544523627



##########
File path: cpp/src/arrow/util/hashing.h
##########
@@ -691,7 +691,7 @@ class BinaryMemoTable : public MemoTable {
 
     const builder_offset_type* offsets = binary_builder_.offsets_data();
     const builder_offset_type delta =
-        start < binary_builder_.value_data_length() ? offsets[start] : 0;
+        start < binary_builder_.length() ? offsets[start] : 0;

Review comment:
       I'd rather use the strict check. It's not obvious that `delta = 0` is the right choice otherwise (since the table is non-empty).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org