You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "hermanschaaf (via GitHub)" <gi...@apache.org> on 2023/04/17 15:02:54 UTC

[GitHub] [arrow] hermanschaaf opened a new pull request, #35191: GH-35190: [Go] Correctly handle null values in CSV reader

hermanschaaf opened a new pull request, #35191:
URL: https://github.com/apache/arrow/pull/35191

   The way the CSV reader handles null values currently makes it impossible for empty strings to be interpreted as null, even when this is explicitly enabled through the `csv.WithNullReader(stringsCanBeNull: true, "", "NULL", "null")` option. This change fixes this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] hermanschaaf commented on a diff in pull request #35191: GH-35190: [Go] Correctly handle null values in CSV reader

Posted by "hermanschaaf (via GitHub)" <gi...@apache.org>.
hermanschaaf commented on code in PR #35191:
URL: https://github.com/apache/arrow/pull/35191#discussion_r1168921586


##########
go/arrow/csv/reader.go:
##########
@@ -767,7 +767,7 @@ func (r *Reader) parseList(field array.Builder, str string) {
 
 func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	// specialize the implementation when we know we cannot have nulls
-	if str != "" && r.isNull(str) {
+	if r.isNull(str) {
 		field.AppendNull()
 		return
 	}

Review Comment:
   Added two new test cases for this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #35191: GH-35190: [Go] Correctly handle null values in CSV reader

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #35191:
URL: https://github.com/apache/arrow/pull/35191#discussion_r1168865762


##########
go/arrow/csv/reader.go:
##########
@@ -767,7 +767,7 @@ func (r *Reader) parseList(field array.Builder, str string) {
 
 func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	// specialize the implementation when we know we cannot have nulls
-	if str != "" && r.isNull(str) {
+	if r.isNull(str) {
 		field.AppendNull()
 		return
 	}

Review Comment:
   Do we currently have a test that verifies you can still get an empty string if it is not in the list of Null values? / `stringsCanBeNull` == `false` ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] hermanschaaf commented on a diff in pull request #35191: GH-35190: [Go] Correctly handle null values in CSV reader

Posted by "hermanschaaf (via GitHub)" <gi...@apache.org>.
hermanschaaf commented on code in PR #35191:
URL: https://github.com/apache/arrow/pull/35191#discussion_r1168886527


##########
go/arrow/csv/reader.go:
##########
@@ -767,7 +767,7 @@ func (r *Reader) parseList(field array.Builder, str string) {
 
 func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	// specialize the implementation when we know we cannot have nulls
-	if str != "" && r.isNull(str) {
+	if r.isNull(str) {
 		field.AppendNull()
 		return
 	}

Review Comment:
   I think not, but happy to add one here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade merged pull request #35191: GH-35190: [Go] Correctly handle null values in CSV reader

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade merged PR #35191:
URL: https://github.com/apache/arrow/pull/35191


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #35191: GH-35190: [Go] Correctly handle null values in CSV reader

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35191:
URL: https://github.com/apache/arrow/pull/35191#issuecomment-1511541438

   * Closes: #35190


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #35191: GH-35190: [Go] Correctly handle null values in CSV reader

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35191:
URL: https://github.com/apache/arrow/pull/35191#issuecomment-1511541554

   :warning: GitHub issue #35190 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org