You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "zeroshade (via GitHub)" <gi...@apache.org> on 2023/04/17 14:47:10 UTC

[GitHub] [arrow] zeroshade commented on a diff in pull request #35161: GH-35160: [Go] Add arrayApproxEqualString to ignore null chars

zeroshade commented on code in PR #35161:
URL: https://github.com/apache/arrow/pull/35161#discussion_r1168824967


##########
go/arrow/array/binarybuilder.go:
##########
@@ -319,7 +320,10 @@ func (b *BinaryBuilder) UnmarshalOne(dec *json.Decoder) error {
 	case string:
 		data, err := base64.StdEncoding.DecodeString(v)
 		if err != nil {
-			return err
+			data, err = hex.DecodeString(v)
+			if err != nil {
+				return err
+			}

Review Comment:
   this seems odd to add here. Should we also add this to `AppendValueFromString` and update the docs that you can also use hex to specify binary strings?



##########
go/arrow/array/compare.go:
##########
@@ -630,6 +631,34 @@ func validityBitmapEqual(left, right arrow.Array) bool {
 	return true
 }
 
+func stripNulls(s string) string {
+	return strings.ReplaceAll(s, "\x00", "")
+}
+

Review Comment:
   In the Arrow spec a String array should be all valid utf8. So shouldn't this only be for trimming trailing Nulls? (ie: use `strings.TrimRight` instead of `strings.ReplaceAll` )?
   
   To this end, can you have a test with a string that contains a null inside it that *isn't* at the end and ensure that this approxequal only trims nulls, and doesn't strip them from within the string?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org