Posted to issues@arrow.apache.org by "daniel-shields (via GitHub)" <gi...@apache.org> on 2023/06/29 14:56:22 UTC

[GitHub] [arrow] daniel-shields opened a new issue, #36386: pyarrow.parquet.Statistics.min_value incorrectly returns negative zero

daniel-shields opened a new issue, #36386:
URL: https://github.com/apache/arrow/issues/36386

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Parquet statistics min value is showing negative zero even when that value is not in the array.
   
   ```
   import os
   import sys
   import pyarrow as pa
   import pyarrow.parquet as pq
   print(f"{sys.version}")
   print(f"{pa.__version__=}")
   
   array = pa.array([0.0, 1.0], pa.float64())
   table = pa.table([array], names=["x"])
   path = os.path.expanduser("~/test.parquet")
   pq.write_table(table, path)
   with pq.ParquetFile(path) as file:
       statistics = file.metadata.row_group(0).column(0).statistics
       has_min_max = statistics.has_min_max
       min_value = statistics.min
       max_value = statistics.max
       
       print(f"{has_min_max=}, {min_value=}, {max_value=}")
   ```
   
   output:
   ```
   3.9.12 (main, Jan  1 2020, 00:00:00) 
   [GCC 8.3.0]
   pa.__version__='12.0.0'
   has_min_max=True, min_value=-0.0, max_value=1.0
   ```
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] daniel-shields closed issue #36386: pyarrow.parquet.Statistics.min_value incorrectly returns negative zero

Posted by "daniel-shields (via GitHub)" <gi...@apache.org>.
daniel-shields closed issue #36386: pyarrow.parquet.Statistics.min_value incorrectly returns negative zero
URL: https://github.com/apache/arrow/issues/36386




[GitHub] [arrow] daniel-shields commented on issue #36386: pyarrow.parquet.Statistics.min_value incorrectly returns negative zero

Posted by "daniel-shields (via GitHub)" <gi...@apache.org>.
daniel-shields commented on issue #36386:
URL: https://github.com/apache/arrow/issues/36386#issuecomment-1613371158

   Looks like it is by design.




[GitHub] [arrow] mapleFU commented on issue #36386: pyarrow.parquet.Statistics.min_value incorrectly returns negative zero

Posted by "mapleFU (via GitHub)" <gi...@apache.org>.
mapleFU commented on issue #36386:
URL: https://github.com/apache/arrow/issues/36386#issuecomment-1613342268

   I guess it's by design. See the Parquet standard: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L912
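
For readers following the link: the Parquet format's notes on floating-point column order say that when a writer computes a min of zero (either sign) it should store `-0.0`, and when it computes a max of zero it should store `+0.0`, so that readers which distinguish the two zeros never skip a row group by mistake. A minimal sketch of that rule in plain Python follows. The helper names `normalize_float_stats` and `is_negative_zero` are hypothetical and are not part of pyarrow; this only illustrates the rule and is not Arrow's actual implementation.

```python
import math


def normalize_float_stats(min_value, max_value):
    """Hypothetical helper: apply the zero-sign rule for float min/max statistics."""
    # -0.0 == 0.0 is True in IEEE 754, so this catches a zero of either sign.
    if min_value == 0.0:
        min_value = -0.0  # a zero min is stored as negative zero
    if max_value == 0.0:
        max_value = 0.0   # a zero max is stored as positive zero
    return min_value, max_value


def is_negative_zero(x):
    """Return True only for -0.0, which equality alone cannot detect."""
    return x == 0.0 and math.copysign(1.0, x) < 0


values = [0.0, 1.0]  # same data as in the report
stat_min, stat_max = normalize_float_stats(min(values), max(values))
print(stat_min, stat_max)          # -0.0 1.0
print(stat_min == 0.0)             # True: the stored min still compares equal to 0.0
print(is_negative_zero(stat_min))  # True
```

Because `-0.0 == 0.0`, range filters that compare against the stored min keep working; only code that prints the value or inspects its sign bit sees the difference.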

