You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "yevgenypats (via GitHub)" <gi...@apache.org> on 2023/03/16 11:30:05 UTC

[GitHub] [arrow] yevgenypats opened a new pull request, #34585: GH-34584: [GO][CSV] Add extension types support.

yevgenypats opened a new pull request, #34585:
URL: https://github.com/apache/arrow/pull/34585

   Built on top of https://github.com/apache/arrow/issues/34453
   
   
   <!--
   Thanks for opening a pull request!
   If this is your first pull request you can find detailed information on how 
   to contribute here:
     * [New Contributor's Guide](https://arrow.apache.org/docs/dev/developers/guide/step_by_step/pr_lifecycle.html#reviews-and-merge-of-the-pull-request)
     * [Contributing Overview](https://arrow.apache.org/docs/dev/developers/overview.html)
   
   
   If this is not a [minor PR](https://github.com/apache/arrow/blob/main/CONTRIBUTING.md#Minor-Fixes). Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose
   
   Opening GitHub issues ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
   
   Then could you also rename the pull request title in the following format?
   
       GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
       MINOR: [${COMPONENT}] ${SUMMARY}
   
   In the case of PARQUET issues on JIRA the title also supports:
   
       PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
   
   -->
   
   ### Rationale for this change
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   ### What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   ### Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   ### Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please uncomment the line below and explain which changes are breaking.
   -->
   <!-- **This PR includes breaking changes to public APIs.** -->
   
   <!--
   Please uncomment the line below (and provide explanation) if the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld). We use this to highlight fixes to issues that may affect users without their knowledge. For this reason, fixing bugs that cause errors don't count, since those are usually obvious.
   -->
   <!-- **This PR contains a "Critical Fix".** -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1142673122


##########
go/arrow/csv/transformer.go:
##########
@@ -225,6 +227,52 @@ func (w *Writer) transformColToStringArr(typ arrow.DataType, col arrow.Array) []
 				res[i] = w.nullValue
 			}
 		}
+	case arrow.ExtensionType:
+		arr := col.(array.ExtensionArray)
+		b, err := arr.MarshalJSON()
+		if err != nil {
+			panic(fmt.Errorf("arrow/csv: could not marshal extension array: %w", err))
+		}
+		var stringArr []interface{}
+		if err := json.Unmarshal(b, &stringArr); err != nil {
+			panic(fmt.Errorf("arrow/csv: could not unmarshal extnesion to string array: %s", err))
+		}

Review Comment:
   why marshal to json and then back? 
   
   I'd much prefer a new method added to an ExtensionType or the `ExtensionArray` interface that implements `Value(int) fmt.Stringer` to allow customized retrieval as whatever type is desired and requiring it to have a `String() string` method so it can just be printed out here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] yevgenypats commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "yevgenypats (via GitHub)" <gi...@apache.org>.
yevgenypats commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1145418190


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   Done. Tacked here - https://github.com/apache/arrow/issues/34695



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1143488837


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   the raw value will be a string, but for example "3" would decode to the string "3" while the underlying storage or the custom unmarshalling may expect the number `3` instead. Essentially my objection here is the explicit addition of quotes and explicitly requiring JSON decoding. Also, if we're going to go this direction, we should at least use `strconv.Quote` instead :smile:



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34585: GH-34584: [GO][CSV] Add extension types support.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34585:
URL: https://github.com/apache/arrow/pull/34585#issuecomment-1471782613

   :warning: GitHub issue #34584 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34585: GH-34584: [GO][CSV] Add extension types support.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34585:
URL: https://github.com/apache/arrow/pull/34585#issuecomment-1471782554

   * Closes: #34584


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade merged pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade merged PR #34585:
URL: https://github.com/apache/arrow/pull/34585


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34585:
URL: https://github.com/apache/arrow/pull/34585#issuecomment-1482005384

   Benchmark runs are scheduled for baseline = 1140a53f51a35c37008d97d43b2d00cff4ae2caa and contender = c68f76838ccaeaf97d8001237fd0ac89a58c293b. c68f76838ccaeaf97d8001237fd0ac89a58c293b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/7aa486ca0e29400988b4369834b9a88f...ec7a2be8f37a42e7b49739c955b01567/)
   [Failed :arrow_down:0.24% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/119c2b2db73f4f01876e1feda0789cad...826a9f06aac348dda92405ca51f953d6/)
   [Finished :arrow_down:1.53% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7508ee19ae4347f8a80270d6d9910bf0...5f64f8bac6aa4dbc9c7bf8456d76a92d/)
   [Failed :arrow_down:0.16% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/b7ba2a4ce62f4c2d905d7d7a210731ff...d525dac01b114358943049ac8408bcdf/)
   Buildkite builds:
   [Finished] [`c68f7683` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2564)
   [Failed] [`c68f7683` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2594)
   [Finished] [`c68f7683` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2562)
   [Failed] [`c68f7683` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2585)
   [Finished] [`1140a53f` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2563)
   [Failed] [`1140a53f` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2593)
   [Finished] [`1140a53f` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2561)
   [Finished] [`1140a53f` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2584)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1142670675


##########
go/arrow/csv/transformer.go:
##########
@@ -225,6 +227,52 @@ func (w *Writer) transformColToStringArr(typ arrow.DataType, col arrow.Array) []
 				res[i] = w.nullValue
 			}
 		}
+	case arrow.ExtensionType:
+		arr := col.(array.ExtensionArray)
+		b, err := arr.MarshalJSON()
+		if err != nil {
+			panic(fmt.Errorf("arrow/csv: could not marshal extension array: %w", err))
+		}
+		var stringArr []interface{}
+		if err := json.Unmarshal(b, &stringArr); err != nil {
+			panic(fmt.Errorf("arrow/csv: could not unmarshal extnesion to string array: %s", err))
+		}
+		fmt.Println(stringArr)

Review Comment:
   this should be removed, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1142670113


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   what if the underlying storage for the extension type isn't a string/binary data? This would break.
   
   Personally I'm not a fan of delegating the CSV decoding to the JSON decoder. It might make more sense to add a `FromText` or otherwise method to the `ExtensionBuilder` or some other additional interface method for use with CSV parsing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1145097963


##########
go/arrow/array/extension.go:
##########
@@ -38,7 +38,8 @@ type ExtensionArray interface {
 	ExtensionType() arrow.ExtensionType
 	// Storage returns the underlying storage array for this array.
 	Storage() arrow.Array
-
+	// ValueString returns a string represenation of the value at the given index for the extension array.
+	ValueString(i int) string

Review Comment:
   looks like you need to add a default implementation of `ValueString` for `ExtensionArrayBase` in order for `staticcheck` not to complain that `mustEmbedExtensionArrayBase` is unused. 
   
   Essentially, `ExtensionArrayBase` needs to meet the `ExtensionArray` interface for staticcheck linting to be happy. I say the default impl can panic unless you want to implement the whole `UntypedValue` idea from #34657 here as part of this PR and then just call that on the underlying storage array as the default impl.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1145097963


##########
go/arrow/array/extension.go:
##########
@@ -38,7 +38,8 @@ type ExtensionArray interface {
 	ExtensionType() arrow.ExtensionType
 	// Storage returns the underlying storage array for this array.
 	Storage() arrow.Array
-
+	// ValueString returns a string represenation of the value at the given index for the extension array.
+	ValueString(i int) string

Review Comment:
   looks like you need to add a default implementation of `ValueString` for `ExtensionArrayBase` in order for `staticcheck` not to complain that `mustEmbedExtensionArrayBase` is unused. 
   
   Essentially, `ExtensionArrayBase` needs to meet the `ExtensionArray` interface for staticcheck linting to be happy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] yevgenypats commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "yevgenypats (via GitHub)" <gi...@apache.org>.
yevgenypats commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1144376461


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   I can go that direction, just needs some more guidance - so Im going to add an `AppendFromText` to the `Builder` interface. This will mean we will have to implement it in all types (which is fine). If yes, then Im going to open a new PR for that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1145313995


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   I like the idea of `AppendFromText` because that could remove the need for the separate parse methods here. You'd just need to first check `r.isNull(str)`  before you could hand it off to the `AppendFromText` and have that return an `error` so it can error if it fails to parse the value. I like it. For the time being I think it's fine to leave the JSON style extension handling here for now and make that improvement as a separate PR. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on PR #34585:
URL: https://github.com/apache/arrow/pull/34585#issuecomment-1473914820

   @yevgenypats looks like you're utilizing `any` in the CSV package which currently, like the rest of the Arrow package, is listed as go 1.17. I'm considering that after the v12 release it might make sense for us to finally officially drop support for versions < go1.18. But until then you'll need to confine any features that require go1.18 to a file with a build constraint please.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] yevgenypats commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "yevgenypats (via GitHub)" <gi...@apache.org>.
yevgenypats commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1143314586


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   Given this is a reader and it comes from the CSV it will always be a string, no?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] yevgenypats commented on pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "yevgenypats (via GitHub)" <gi...@apache.org>.
yevgenypats commented on PR #34585:
URL: https://github.com/apache/arrow/pull/34585#issuecomment-1473950082

   > @yevgenypats looks like you're utilizing `any` in the CSV package which currently, like the rest of the Arrow package, is listed as go 1.17. I'm considering that after the v12 release it might make sense for us to finally officially drop support for versions < go1.18. But until then you'll need to confine any features that require go1.18 to a file with a build constraint please.
   
   Gotcha. Pushed a fix to use `interface{}`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] yevgenypats commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "yevgenypats (via GitHub)" <gi...@apache.org>.
yevgenypats commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1143641506


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   I think you are right. Basically if before that I used marshal to save it to the csv then unmarshalling could work but if we introduce `ValueString` then also we should do `FromString`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1145315341


##########
go/arrow/csv/reader.go:
##########
@@ -773,6 +778,18 @@ func (r *Reader) parseBinaryType(field array.Builder, str string) {
 	field.(*array.BinaryBuilder).Append(decodedVal)
 }
 
+func (r *Reader) parseExtension(field array.Builder, str string) {
+	if r.isNull(str) {
+		field.AppendNull()
+		return
+	}
+	dec := json.NewDecoder(strings.NewReader(`"` + str + `"`))

Review Comment:
   Can you file an issue to track this, referencing this PR? I'll merge this after the issue gets created.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] yevgenypats commented on a diff in pull request #34585: GH-34584: [Go][CSV] Add extension types support

Posted by "yevgenypats (via GitHub)" <gi...@apache.org>.
yevgenypats commented on code in PR #34585:
URL: https://github.com/apache/arrow/pull/34585#discussion_r1143315829


##########
go/arrow/csv/transformer.go:
##########
@@ -225,6 +227,52 @@ func (w *Writer) transformColToStringArr(typ arrow.DataType, col arrow.Array) []
 				res[i] = w.nullValue
 			}
 		}
+	case arrow.ExtensionType:
+		arr := col.(array.ExtensionArray)
+		b, err := arr.MarshalJSON()
+		if err != nil {
+			panic(fmt.Errorf("arrow/csv: could not marshal extension array: %w", err))
+		}
+		var stringArr []interface{}
+		if err := json.Unmarshal(b, &stringArr); err != nil {
+			panic(fmt.Errorf("arrow/csv: could not unmarshal extnesion to string array: %s", err))
+		}

Review Comment:
   Yeah. I think this is a good idea.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org