You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/19 23:07:37 UTC

[GitHub] [iceberg] kbendick opened a new pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

kbendick opened a new pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636


   But not sure if the possibility of integer overflow in `cv.childCount` still exists.
   
   I think because this output value is only used the one time, that the overflow isn't a concern and I can just change the addition in this way to remove the warnings.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#issuecomment-760663505


   I'm gonna close this PR since it's relatively old and likely not that up to date with master. We can open another PR to fix these warnings.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508115113



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes

Review comment:
       As it is, my point is we can't necessarily change it to be a long unless somebody on the orc side can do that (which I think this is a possible bug in their interface given that the arrays `lengths` and `offsets` are both long arrays but I've no idea the ramifications of that being changed).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508113404



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes
+      // the warning message. Whether that's preferred or not, I don't know as the
+      // issue still very much remains.
+      // This new statement is equivalent to the old, but because it's adding an int vs adding
+      // the int after the it has been cast to long when assigned to `cv.lengths[rowId]`,
+      // the narrowing compound assignment warning is gone. This could be added to any other
+      // occurrences of this same warning.
+      cv.childCount += data.size();
       // make sure the child is big enough.
+      // TODO - Would ensuring the child is big enough before adding to it help to suss out when this overflow happens?

Review comment:
       I see now that it doesn't. This is just ensuring the buffer is large enough to write all of the data to it.

##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -285,8 +296,10 @@ public void nonNullWrite(int rowId, MapData data, ColumnVector output) {
       // record the length and start of the list elements
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      cv.childCount += data.size();
       // make sure the child is big enough
+      // TODO - Should this be done before adding to childCount in case of overflow?
+      //      - Possibly we could sum up the values of `cv.length` if we're really worried about overflow?

Review comment:
       Again, not needed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508113462



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -285,8 +296,10 @@ public void nonNullWrite(int rowId, MapData data, ColumnVector output) {
       // record the length and start of the list elements
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      cv.childCount += data.size();
       // make sure the child is big enough
+      // TODO - Should this be done before adding to childCount in case of overflow?
+      //      - Possibly we could sum up the values of `cv.length` if we're really worried about overflow?

Review comment:
       Again, this TODO is not needed but I'd rather not trigger CI, which it seems drafts do get run (at least initially). I may need to learn Travis so I can possibly help out with that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508115113



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes

Review comment:
       As it is, my point is we can't necessarily change it to be a long unless somebody on the orc side can do that (which I think this is a possible bug in their interface given that the arrays `lengths` and `offsets` are both long arrays but I've no idea the ramifications of that being changed, if anybody was so inclined).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick edited a comment on pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick edited a comment on pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#issuecomment-712490984


   cc @rdblue @RussellSpitzer if you wouldn't mind taking a look at my draft approach to removing a good handful of `NarrowingCompoundAssignment` warnings, I'd be very appreciative.
   
   Specifically, I'm not 100% sure that there isn't still an issue with `cv.childCount` still being overflown. But that depends on whether or not `output` is constantly reused. Your input would be appreciated before I fix the rest of them (one way or another).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick closed pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick closed pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r511099333



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes

Review comment:
       I think the solution to use the int source data makes sense. We probably don't need to have the long comment, since it is clear that this is correct.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick edited a comment on pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick edited a comment on pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#issuecomment-712490984


   cc @rdblue @RussellSpitzer if you could take a look at my draft approach to removing a good handful of `NarrowingCompoundAssignment` warnings.
   
   Specifically, I'm not 100% sure that there isn't still an issue with `cv.childCount` still being overflown. But that depends on whether or not `output` is constantly reused. Your input would be appreciated before I fix the rest of them (one way or another).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#issuecomment-716301353


   > Looks correct to me. I'd just remove most of the comments because all you're doing is setting the value from the source that is already an `int`.
   > 
   > FYI @shardulm94.
   
   Cool I'll go ahead and then fix it this way then.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#issuecomment-712490984


   cc @rdblue was @RussellSpitzer if you could take a look at my draft approach to removing a good handful of `NarrowingCompoundAssignment` warnings.
   
   Specifically, I'm not 100% sure that there isn't still an issue with `cv.childCount` still being overflown. But that depends on whether or not `output` is constantly reused. Your input would be appreciated before I fix the rest of them (one way or another).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508115113



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes

Review comment:
       As it is, my point is we can't necessarily change it to be a long unless somebody on the orc side can do that (which I think this is a possible bug in their interface given that the arrays `lengths` and `offsets` are both long arrays but I've no idea the ramifications of that being changed, if anybody was even so inclined).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508113404



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes
+      // the warning message. Whether that's preferred or not, I don't know as the
+      // issue still very much remains.
+      // This new statement is equivalent to the old, but because it's adding an int vs adding
+      // the int after the it has been cast to long when assigned to `cv.lengths[rowId]`,
+      // the narrowing compound assignment warning is gone. This could be added to any other
+      // occurrences of this same warning.
+      cv.childCount += data.size();
       // make sure the child is big enough.
+      // TODO - Would ensuring the child is big enough before adding to it help to suss out when this overflow happens?

Review comment:
       I see now that it doesn't. This is just ensuring the buffer is large enough to write all of the data to it.
   
   I'd push a change to remove these TODOs but I don't want to trigger CI unnecessarily during the work day.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#discussion_r508115113



##########
File path: flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java
##########
@@ -245,8 +245,19 @@ public void nonNullWrite(int rowId, ArrayData data, ColumnVector output) {
       ListColumnVector cv = (ListColumnVector) output;
       cv.lengths[rowId] = data.size();
       cv.offsets[rowId] = cv.childCount;
-      cv.childCount += cv.lengths[rowId];
+      // cv.childCount is for some reason an int, which generates all of these
+      // NarrowingCompoundAssignment warnings. Although this does nothing to prevent
+      // overflow from adding too many values to cv.childCount because `ListColumnVector`
+      // comes from package `org.apache.orc.storage.ql.exec.vector`, it at least removes

Review comment:
       As it is, my point is we can't necessarily change it to be a long unless somebody on the orc side can do that (which I think is a possible bug given that the arrays `lengths` and `offsets` are both long arrays).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #1636: WIP - Draft to remove NarrowingCompoundAssignment warnings

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #1636:
URL: https://github.com/apache/iceberg/pull/1636#issuecomment-715535478


   Looks correct to me. I'd just remove most of the comments because all you're doing is setting the value from the source that is already an `int`.
   
   FYI @shardulm94.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org