Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/06/09 16:06:19 UTC

[GitHub] [spark] srowen commented on a change in pull request #28750: [SPARK-31916][SQL] StringConcat can overflow , leads to StringIndexOutOfBoundsException

srowen commented on a change in pull request #28750:
URL: https://github.com/apache/spark/pull/28750#discussion_r436668197



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
##########
@@ -122,8 +122,8 @@ object StringUtils extends Logging {
           val available = maxLength - length
           val stringToAppend = if (available >= sLen) s else s.substring(0, available)
           strings.append(stringToAppend)
+          length += sLen
         }
-        length += sLen

Review comment:
       Nitpicking, but:
   Why not update `length` only inside the `if (!atLimit)` block, as originally shown?
   And add only `stringToAppend.length()` there, if the point of `Math.min()` is to avoid overflow when a massive string is appended but only part of it is kept.
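
To make the suggestion concrete, here is a minimal sketch of counting only the characters that were actually kept. `BoundedConcat` is a hypothetical, simplified stand-in written for this illustration; it is not the class in `StringUtils.scala`, and its `toString` is an assumption.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified stand-in for the class under review (not Spark's code).
class BoundedConcat(val maxLength: Int) {
  private val strings = new ArrayBuffer[String]
  private var length: Int = 0

  def atLimit: Boolean = length >= maxLength

  def append(s: String): Unit = {
    if (s != null && !atLimit) {
      val available = maxLength - length
      val stringToAppend = if (available >= s.length) s else s.substring(0, available)
      strings.append(stringToAppend)
      // Count only what was kept, so length can never exceed maxLength.
      length += stringToAppend.length
    }
  }

  override def toString: String = {
    val result = new java.lang.StringBuilder(length)
    strings.foreach(s => result.append(s))
    result.toString
  }
}
```

With this shape the counter is bounded by `maxLength`, so it cannot wrap around, but it also stops reporting how much input was offered once the limit is reached.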
   


Review comment:
       Oh, I see. If overflow is the issue, why not make it a long? Capping it at this constant would seem to underreport the actual appended total.
   Maybe that creates other problems; I'm not sure.
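
A small sketch of the difference between the two counter types, assuming the running total is advanced by each string's full length. The object and method names are hypothetical and exist only for this illustration.

```scala
// Hypothetical illustration; these names are not from the pull request.
object CounterOverflow {
  // An Int total silently wraps to a negative value once it passes Int.MaxValue.
  def addInt(total: Int, sLen: Int): Int = total + sLen

  // A Long total keeps reporting the true number of characters offered.
  def addLong(total: Long, sLen: Int): Long = total + sLen
}
```

A wrapped, negative total would make a check such as `length >= maxLength` false again, which is presumably how the out-of-bounds `substring` call arises; a long avoids that without capping the reported total.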


Review comment:
       I'm not sure you can overflow a long: even if 2^31-1 strings (the maximum length of the array tracking them), each of length 2^31-1, are added, the total still doesn't overflow.
   It's probably not worth debating, so this is fine, but I think a simpler change would also have been fine.
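
The arithmetic behind that bound can be checked directly. This is only a sketch; `LongHeadroom` is a hypothetical name, and `BigInt` is used so the check itself cannot overflow.

```scala
// Hypothetical check of the worst case described above.
object LongHeadroom {
  // Largest possible string count and largest possible string length: 2^31 - 1 each.
  val maxInt: BigInt = BigInt(Int.MaxValue)

  // (2^31 - 1)^2 = 2^62 - 2^32 + 1
  val worstCase: BigInt = maxInt * maxInt

  // Long.MaxValue is 2^63 - 1, roughly twice the worst case.
  val fitsInLong: Boolean = worstCase < BigInt(Long.MaxValue)
}
```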



