You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "zeroshade (via GitHub)" <gi...@apache.org> on 2023/06/21 14:31:49 UTC

[GitHub] [arrow-adbc] zeroshade opened a new pull request, #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

zeroshade opened a new pull request, #828:
URL: https://github.com/apache/arrow-adbc/pull/828

   Found these when trying to do some performance testing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm merged pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm merged PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on code in PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828#discussion_r1237294633


##########
go/adbc/driver/snowflake/driver.go:
##########
@@ -162,7 +162,11 @@ func errToAdbcErr(code adbc.Status, err error) error {
 	if errors.As(err, &sferr) {
 		var sqlstate [5]byte
 		if len(sferr.SQLState) > 0 {
-			copy(sqlstate[:], sferr.SQLState[:5])
+			if len(sferr.SQLState) <= 5 {
+				copy(sqlstate[:], sferr.SQLState)
+			} else {
+				copy(sqlstate[:], sferr.SQLState[:5])

Review Comment:
   OK, sounds good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828#discussion_r1237185546


##########
go/adbc/driver/snowflake/record_reader.go:
##########
@@ -297,39 +298,41 @@ func newRecordReader(ctx context.Context, alloc memory.Allocator, ld gosnowflake
 	}
 
 	lastChannelIndex := len(chs) - 1
-	for i, b := range batches[1:] {
-		batch, batchIdx := b, i+1
-		chs[batchIdx] = make(chan arrow.Record, bufferSize)
-		group.Go(func() error {
-			// close channels (except the last) so that Next can move on to the next channel properly
-			if batchIdx != lastChannelIndex {
-				defer close(chs[batchIdx])
-			}
-
-			rdr, err := batch.GetStream(ctx)
-			if err != nil {
-				return err
-			}
-			defer rdr.Close()
+	go func() {

Review Comment:
   You're right, we need to add the close for `ch` in flightsql looking at it now. 
   
   The shift to put the loop into a goroutine here is to prepare for adding a `SetLimit` to allow users to control how many goroutines get created. Right now we're going to create one goroutine for *every* batch returned by snowflake. I didn't realize just how many batches it could possibly be, so it could end up being hundreds of goroutines or more. In theory it's not a huge deal since goroutines are fairly lightweight and this will be more I/O bound than CPU bound, but it would probably still be good to allow a user to set a limit to the parallelism. 
   
   I'll update this PR for the FlightSQL impl to add the proper close for the channel. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on code in PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828#discussion_r1237200515


##########
go/adbc/driver/snowflake/driver.go:
##########
@@ -162,7 +162,11 @@ func errToAdbcErr(code adbc.Status, err error) error {
 	if errors.As(err, &sferr) {
 		var sqlstate [5]byte
 		if len(sferr.SQLState) > 0 {
-			copy(sqlstate[:], sferr.SQLState[:5])
+			if len(sferr.SQLState) <= 5 {
+				copy(sqlstate[:], sferr.SQLState)
+			} else {
+				copy(sqlstate[:], sferr.SQLState[:5])

Review Comment:
   Into the error message, not into the sqlstate field.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on code in PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828#discussion_r1237155048


##########
go/adbc/driver/snowflake/record_reader.go:
##########
@@ -297,39 +298,41 @@ func newRecordReader(ctx context.Context, alloc memory.Allocator, ld gosnowflake
 	}
 
 	lastChannelIndex := len(chs) - 1
-	for i, b := range batches[1:] {
-		batch, batchIdx := b, i+1
-		chs[batchIdx] = make(chan arrow.Record, bufferSize)
-		group.Go(func() error {
-			// close channels (except the last) so that Next can move on to the next channel properly
-			if batchIdx != lastChannelIndex {
-				defer close(chs[batchIdx])
-			}
-
-			rdr, err := batch.GetStream(ctx)
-			if err != nil {
-				return err
-			}
-			defer rdr.Close()
+	go func() {

Review Comment:
   Do we need to do this for Flight SQL too?



##########
go/adbc/driver/snowflake/driver.go:
##########
@@ -162,7 +162,11 @@ func errToAdbcErr(code adbc.Status, err error) error {
 	if errors.As(err, &sferr) {
 		var sqlstate [5]byte
 		if len(sferr.SQLState) > 0 {
-			copy(sqlstate[:], sferr.SQLState[:5])
+			if len(sferr.SQLState) <= 5 {
+				copy(sqlstate[:], sferr.SQLState)
+			} else {
+				copy(sqlstate[:], sferr.SQLState[:5])

Review Comment:
   Is there any penalty to always just slicing?
   
   Should we embed SQLState into the error message always/if it's longer than 5?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828#discussion_r1237191815


##########
go/adbc/driver/snowflake/driver.go:
##########
@@ -162,7 +162,11 @@ func errToAdbcErr(code adbc.Status, err error) error {
 	if errors.As(err, &sferr) {
 		var sqlstate [5]byte
 		if len(sferr.SQLState) > 0 {
-			copy(sqlstate[:], sferr.SQLState[:5])
+			if len(sferr.SQLState) <= 5 {
+				copy(sqlstate[:], sferr.SQLState)
+			} else {
+				copy(sqlstate[:], sferr.SQLState[:5])

Review Comment:
   I found this because if `len(sferr.SQLState) < 5` the slicing will actually panic. 
   
   Currently and we can't just embed the sqlstate if the error message is longer than 5 because the `AdbcError` struct defines `sqlstate` to be a `char[5]` in the header file. That said, now that I think about it, the semantics of `copy` should only copy as much as the destination will allow, so we can probably do this without the slicing here and `copy` will automatically truncate for us since the destination is a `[5]byte`. I'll update this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #828: fix(go/adbc/driver/snowflake): fix potential deadlock and error handling

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on code in PR #828:
URL: https://github.com/apache/arrow-adbc/pull/828#discussion_r1237252549


##########
go/adbc/driver/snowflake/driver.go:
##########
@@ -162,7 +162,11 @@ func errToAdbcErr(code adbc.Status, err error) error {
 	if errors.As(err, &sferr) {
 		var sqlstate [5]byte
 		if len(sferr.SQLState) > 0 {
-			copy(sqlstate[:], sferr.SQLState[:5])
+			if len(sferr.SQLState) <= 5 {
+				copy(sqlstate[:], sferr.SQLState)
+			} else {
+				copy(sqlstate[:], sferr.SQLState[:5])

Review Comment:
   Ah gotcha. So SQLState will get added to the string representation of the error right now, so i think letting it truncate at 5 characters is fine with the copy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org