Posted to dev@parquet.apache.org by GitBox <gi...@apache.org> on 2021/05/24 02:57:07 UTC

[GitHub] [parquet-mr] shangxinli commented on a change in pull request #910: PARQUET-2052: Integer overflow when writing huge binary using dictionary encoding

shangxinli commented on a change in pull request #910:
URL: https://github.com/apache/parquet-mr/pull/910#discussion_r637657084



##########
File path: parquet-column/src/main/java/org/apache/parquet/column/values/dictionary/DictionaryValuesWriter.java
##########
@@ -173,7 +173,7 @@ public BytesInput getBytes() {
       BytesInput bytes = concat(BytesInput.from(bytesHeader), rleEncodedBytes);
       // remember size of dictionary when we last wrote a page
       lastUsedDictionarySize = getDictionarySize();
-      lastUsedDictionaryByteSize = dictionaryByteSize;
+      lastUsedDictionaryByteSize = (int) dictionaryByteSize;

Review comment:
       If the user continues writing that large string, will this line throw an exception? If so, it would block the writer, right?
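
       For reference, a minimal sketch of how the two narrowing options behave in plain Java. This is not code from the PR: the class `NarrowingDemo` and the helpers `castToInt` and `checkedToInt` are hypothetical names used only for illustration. A bare `(int)` cast never throws; it keeps the low 32 bits and wraps silently. `Math.toIntExact` throws `ArithmeticException` when the value does not fit in an `int`.

```java
public class NarrowingDemo {
    // Hypothetical helper: same kind of plain (int) cast as in the diff above.
    // A narrowing cast keeps only the low 32 bits and never throws.
    static int castToInt(long byteSize) {
        return (int) byteSize;
    }

    // Hypothetical helper: a checked alternative.
    // Math.toIntExact throws ArithmeticException if the value overflows an int.
    static int checkedToInt(long byteSize) {
        return Math.toIntExact(byteSize);
    }

    public static void main(String[] args) {
        long huge = Integer.MAX_VALUE + 1L; // one past the largest int

        // Silent wrap-around: prints -2147483648
        System.out.println(castToInt(huge));

        try {
            checkedToInt(huge);
        } catch (ArithmeticException e) {
            // Reached only with the checked conversion
            System.out.println("overflow detected");
        }
    }
}
```

       Under these assumptions, the cast on this line would not raise on its own; whether the writer fails elsewhere depends on checks outside this hunk.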



