Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/18 14:31:51 UTC

[GitHub] [arrow-rs] jacwellington opened a new issue, #1586: Arrow decimals with precision 12 can't be written to parquet

jacwellington opened a new issue, #1586:
URL: https://github.com/apache/arrow-rs/issues/1586

   **Describe the bug**
   Creating an `ArrowWriter` for a schema that contains an Arrow Decimal with precision 12 and scale 0 fails with an error.
   
   **To Reproduce**
   Here is a small program which will reproduce the error: 
   
   src/main.rs
   ```rust
   use arrow::{
       record_batch::RecordBatch,
       datatypes::{Field, Schema, DataType},
       array::DecimalBuilder,
   };
   use std::fs::File;
   use parquet::arrow::arrow_writer::ArrowWriter;
   use parquet::file::properties::WriterProperties;
   use std::sync::Arc;
   
   fn main() {
       let precision = 12;
       let scale = 0;
       let decimal_field = Field::new("a", DataType::Decimal(precision, scale), false);
       let schema = Schema::new(vec![decimal_field]);
       let mut builder = DecimalBuilder::new(1, precision, scale);
       builder.append_value(10).unwrap();
       builder.append_null().unwrap();
   
   
       let batch =
           RecordBatch::try_new(Arc::new(schema), vec![Arc::new(builder.finish())])
           .unwrap();
       let props = WriterProperties::builder().build();
       let file = File::create("tmp.parquet").unwrap();
       let writer = ArrowWriter::try_new(file, batch.schema(), Some(props)).unwrap();
   }
   ```
   
   Cargo.toml:
   ```toml
   [package]
   name = "arrow-test"
   version = "0.1.0"
   edition = "2021"
   
   # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
   
   [dependencies]
   arrow = "11.1.0"
   parquet = "11.1.0"
   ```
   
   Output:
   ```
   thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: General("Cannot represent FIXED_LEN_BYTE_ARRAY as DECIMAL with length 5 and precision 12. The max precision can only be 11")', src/main.rs:26:74
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   ```
   
   **Expected behavior**
   The writer should be created without error.
   
   **Additional context**
   There seems to be a conflict between how `decimal_length_from_precision` sets the length here: https://docs.rs/parquet/latest/src/parquet/arrow/schema.rs.html#312
   and how that length is checked against the precision here: https://docs.rs/parquet/latest/src/parquet/schema/types.rs.html#498
   
   I don't have quite enough confidence in my understanding of the conversion to know which one is more accurate.
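
   As a cross-check of the numbers in the error message, here is a minimal sketch that derives them from first principles. It assumes a Parquet DECIMAL stored in a FIXED_LEN_BYTE_ARRAY is a two's-complement signed integer. The helper names `max_precision_for_bytes` and `min_bytes_for_precision` are hypothetical and are not the parquet crate's functions.

   ```rust
   /// Largest decimal precision whose every value fits in `n` bytes of
   /// two's-complement storage. Hypothetical helper, not the parquet crate's code.
   fn max_precision_for_bytes(n: u32) -> u32 {
       assert!((1..=16).contains(&n), "sketch supports 1 to 16 bytes");
       // Largest positive value of an n-byte signed integer: 2^(8n-1) - 1.
       let max_value: u128 = (1u128 << (8 * n - 1)) - 1;
       // floor(log10(max_value)): a precision p fits only if 10^p - 1 <= max_value.
       let mut v = max_value;
       let mut precision = 0;
       while v >= 10 {
           v /= 10;
           precision += 1;
       }
       precision
   }

   /// Smallest byte length that can hold every value of the given precision.
   /// Hypothetical helper; assumes precision is at most 38.
   fn min_bytes_for_precision(precision: u32) -> u32 {
       assert!(precision <= 38, "sketch supports precision up to 38");
       let mut n = 1;
       while max_precision_for_bytes(n) < precision {
           n += 1;
       }
       n
   }

   fn main() {
       // 2^39 - 1 = 549_755_813_887 has 12 digits, so only 11 full digits fit.
       println!("max precision for 5 bytes: {}", max_precision_for_bytes(5));
       // 999_999_999_999 exceeds 2^39 - 1, so precision 12 needs one more byte.
       println!("min bytes for precision 12: {}", min_bytes_for_precision(12));
   }
   ```

   Under that assumption, 5 bytes hold at most 11 decimal digits and precision 12 needs 6 bytes, which agrees with the check that raised the error rather than with the computed length of 5.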


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #1586: Arrow decimals with precision 12 can't be written to parquet

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1586:
URL: https://github.com/apache/arrow-rs/issues/1586#issuecomment-1294337654

   I think this relates to #508




[GitHub] [arrow-rs] tustvold closed issue #1586: Arrow decimals with precision 12 can't be written to parquet

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1586: Arrow decimals with precision 12 can't be written to parquet
URL: https://github.com/apache/arrow-rs/issues/1586

