You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/13 08:43:44 UTC

[GitHub] [arrow-rs] tustvold opened a new issue, #1554: Improve ergonomics of `parquet::basic::LogicalType`

tustvold opened a new issue, #1554:
URL: https://github.com/apache/arrow-rs/issues/1554

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   `LogicalType` is defined as
   
   ```
   pub enum LogicalType {
       STRING(StringType),
       MAP(MapType),
       LIST(ListType),
       ENUM(EnumType),
       DECIMAL(DecimalType),
       DATE(DateType),
       TIME(TimeType),
       TIMESTAMP(TimestampType),
       INTEGER(IntType),
       UNKNOWN(NullType),
       JSON(JsonType),
       BSON(BsonType),
       UUID(UUIDType),
   }
   ```
   
   Where the inner structs are actually defined by `parquet_format`. This not only requires an additional crate in scope, but also a number of these are actually empty structs resulting in constructions like
   
   ```
   LogicalType::String(Default::default)
   ```
   
   **Describe the solution you'd like**
   
   I would like to change the signature to be instead
   
   ```
   pub enum LogicalType {
       STRING,
       MAP,
       LIST,
       ENUM,
       DECIMAL {
           scale: i32,
           precision: i32,
       },
       DATE,
       TIME {
           is_adjusted_to_u_t_c: bool,
           unit: TimeUnit,
       },
       TIMESTAMP {
           is_adjusted_to_u_t_c: bool,
           unit: TimeUnit,
       },
       INTEGER {
           bit_width: i8,
           is_signed: bool,
       },
       UNKNOWN,
       JSON,
       BSON,
       UUID,
   }
   ```
   
   **Describe alternatives you've considered**
   
   We could leave the enumeration unchanged
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #1554: Improve ergonomics of `parquet::basic::LogicalType`

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1554:
URL: https://github.com/apache/arrow-rs/issues/1554#issuecomment-1101439028

   We could also change the variants to be upper camel case, as is more idiomatic in Rust code 🤔


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold closed issue #1554: Improve ergonomics of `parquet::basic::LogicalType`

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1554: Improve ergonomics of `parquet::basic::LogicalType` 
URL: https://github.com/apache/arrow-rs/issues/1554


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] sunchao commented on issue #1554: Improve ergonomics of `parquet::basic::LogicalType`

Posted by GitBox <gi...@apache.org>.
sunchao commented on issue #1554:
URL: https://github.com/apache/arrow-rs/issues/1554#issuecomment-1105824993

   +1. I think the new proposal looks cleaner. Also +1 on changing to CamelCase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #1554: Improve ergonomics of `parquet::basic::LogicalType`

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1554:
URL: https://github.com/apache/arrow-rs/issues/1554#issuecomment-1107533204

   The convention for expressing parquet schema appears to be SHOUTY CASE, I would therefore avoid making changes to the parser to not accept this. I suppose you might be able to make it case-insensitive, but I'm not sure if this is spec-compliant, we'd need to check 🤔
   
   As for error messages, I think whatever is easiest. I personally prefer camel case as I don't like my error messages screaming at me, but the important thing is the meaning is clear 👍


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tfeda commented on issue #1554: Improve ergonomics of `parquet::basic::LogicalType`

Posted by GitBox <gi...@apache.org>.
tfeda commented on issue #1554:
URL: https://github.com/apache/arrow-rs/issues/1554#issuecomment-1107530472

   Hi all, I've started implementing this. Would we want the string parser to also use CamelCase?
   for example: 
   ```
   let message = " 
   message test_schema {
       REQUIRED INT32   uint8 (INTEGER(8,false));
       REQUIRED INT32   uint16 (INTEGER(16,false));
   } 
   "
   ```
   would be changed to
   
   ```
   let message = " 
   message test_schema {
       REQUIRED INT32   uint8 (Integer(8,false));
       REQUIRED INT32   uint16 (Integer(16,false));
   } 
   "
   ```
   I originally came across this when starting to edit error messages, and wasn't sure if printing the enum's format or the string's format is more instructive to the user.
   
   Ex:
   `general_err!("Failed to parse timeunit for Timestamp type")`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on issue #1554: Improve ergonomics of `parquet::basic::LogicalType`

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1554:
URL: https://github.com/apache/arrow-rs/issues/1554#issuecomment-1100872326

   cc @nevi-me 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org