Posted to issues@arrow.apache.org by "mslevine (via GitHub)" <gi...@apache.org> on 2023/05/23 04:04:35 UTC

[GitHub] [arrow] mslevine opened a new issue, #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls

mslevine opened a new issue, #35718:
URL: https://github.com/apache/arrow/issues/35718

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The writer panics if DeltaBinaryPacked encoding is requested for a column that is all null. The following small change to an existing test case demonstrates the issue. 
   
   In case it matters, I'm using `go version go1.20.4 darwin/amd64`.
   I ran into this issue using arrow v12.0.0, and the modified test case also fails on the current head of main.
   
   ```
   diff --git a/go/parquet/pqarrow/encode_arrow_test.go b/go/parquet/pqarrow/encode_arrow_test.go
   index 877f584f2..0b83ec696 100644
   --- a/go/parquet/pqarrow/encode_arrow_test.go
   +++ b/go/parquet/pqarrow/encode_arrow_test.go
   @@ -380,6 +380,8 @@ func TestWriteEmptyLists(t *testing.T) {
    
           props := parquet.NewWriterProperties(
                   parquet.WithVersion(parquet.V1_0),
   +               parquet.WithDictionaryDefault(false),
   +               parquet.WithEncoding(parquet.Encodings.DeltaBinaryPacked),
           )
           arrprops := pqarrow.DefaultWriterProps()
           var buf bytes.Buffer
   ```
   
   ### Component(s)
   
   Go


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] zeroshade closed issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade closed issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls
URL: https://github.com/apache/arrow/issues/35718




[GitHub] [arrow] zeroshade commented on issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on issue #35718:
URL: https://github.com/apache/arrow/issues/35718#issuecomment-1559665091

   @mslevine Thanks for filing this! Would you be interested in filing a PR with a new test case and your proposed fix? :smile: It should automatically tag me as a codeowner to review it if you do.
   
   If you're not interested, I'll try to get to this when I have some time.




[GitHub] [arrow] abhimanyusinghgaur commented on issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls

Posted by "abhimanyusinghgaur (via GitHub)" <gi...@apache.org>.
abhimanyusinghgaur commented on issue #35718:
URL: https://github.com/apache/arrow/issues/35718#issuecomment-1728507868

   Hi, this issue isn't fixed by #37112. Can you please reopen it?




[GitHub] [arrow] zeroshade commented on issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade commented on issue #35718:
URL: https://github.com/apache/arrow/issues/35718#issuecomment-1729716055

   @abhimanyusinghgaur I can reopen this issue. Once it is reopened, could you file a PR with your proposed fix and a unit test case that reproduces the crash without the fix?




[GitHub] [arrow] mslevine commented on issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls

Posted by "mslevine (via GitHub)" <gi...@apache.org>.
mslevine commented on issue #35718:
URL: https://github.com/apache/arrow/issues/35718#issuecomment-1558504595

   Reading more code, it seems possible the fix could be as simple as:
   ```
   diff --git a/go/parquet/internal/encoding/delta_bit_packing.go b/go/parquet/internal/encoding/delta_bit_packing.go
   index 2ebe6ad98..d68044a07 100644
   --- a/go/parquet/internal/encoding/delta_bit_packing.go
   +++ b/go/parquet/internal/encoding/delta_bit_packing.go
   @@ -458,7 +458,11 @@ func (enc *deltaBitPackEncoder) FlushValues() (Buffer, error) {
    
    // EstimatedDataEncodedSize returns the current amount of data actually flushed out and written
    func (enc *deltaBitPackEncoder) EstimatedDataEncodedSize() int64 {
   -       return int64(enc.bitWriter.Written())
   +       if enc.bitWriter != nil {
   +               return int64(enc.bitWriter.Written())
   +       } else {
   +               return 0
   +       }
    }
    
    // DeltaBitPackInt32Encoder is an encoder for the delta bitpacking encoding for int32 data.
   ```
   
   (although I could easily believe that the correct response when nil is some constant other than 0)




Re: [I] [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls [arrow]

Posted by "zeroshade (via GitHub)" <gi...@apache.org>.
zeroshade closed issue #35718: [Go] [Parquet] panic writing with DeltaBinaryPacked when column only has nulls
URL: https://github.com/apache/arrow/issues/35718

