Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/20 01:41:42 UTC

[GitHub] [arrow-rs] corneliusroemer opened a new issue #1064: Docs: Improve clarity by rewriting 'Any -> All unless overwritten'

corneliusroemer opened a new issue #1064:
URL: https://github.com/apache/arrow-rs/issues/1064


   **Describe the bug**
   In the parquet documentation, the following word choice (using *any*) is common:
   > Sets max statistics for **any** column
   
   https://github.com/apache/arrow-rs/blob/99b7d01103495607932343146c973b6fba0eb8d5/parquet/src/file/properties.rs#L347
   
   I don't think _any_ is the clearest, least ambiguous word to use here.
   
   I think that the meaning intended is: "Sets max statistics for _all_ columns (or _every_ column)"
   
   _Any_ is imprecise: it can be read as "some" (one or more) and does not convey "all", even though the setting is guaranteed to apply to all columns.
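   
   To make the intended reading concrete, here is a minimal sketch of "all columns unless overwritten". `SketchProps` and its methods are hypothetical names made up for illustration; this is not the parquet crate's type or implementation, only a model of the semantics the docs should describe.
   
   ```rust
   use std::collections::HashMap;
   
   /// Hypothetical stand-in for writer properties; not the parquet crate's type.
   #[derive(Default)]
   struct SketchProps {
       /// Value that applies to all columns.
       default_max_stats: Option<usize>,
       /// Column-specific values that overwrite the all-columns value.
       column_max_stats: HashMap<String, usize>,
   }
   
   impl SketchProps {
       /// Sets the value for all columns.
       fn set_max_stats(mut self, value: usize) -> Self {
           self.default_max_stats = Some(value);
           self
       }
   
       /// Overwrites the value for one named column only.
       fn set_column_max_stats(mut self, column: &str, value: usize) -> Self {
           self.column_max_stats.insert(column.to_string(), value);
           self
       }
   
       /// A column-specific value wins; otherwise the all-columns value applies.
       fn max_stats(&self, column: &str) -> Option<usize> {
           self.column_max_stats
               .get(column)
               .copied()
               .or(self.default_max_stats)
       }
   }
   ```
   
   With `SketchProps::default().set_max_stats(4096).set_column_max_stats("id", 64)`, the column `id` resolves to 64 and every other column resolves to 4096, which is what "all columns, unless overwritten" should communicate.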
   
   This usage of _any_ appears a couple of times in parquet. I think every occurrence should be edited.
   
   https://github.com/apache/arrow-rs/blob/99b7d01103495607932343146c973b6fba0eb8d5/parquet/src/file/properties.rs#L311-L320
   
   https://github.com/apache/arrow-rs/blob/99b7d01103495607932343146c973b6fba0eb8d5/parquet/src/file/properties.rs#L326
   
   https://github.com/apache/arrow-rs/blob/99b7d01103495607932343146c973b6fba0eb8d5/parquet/src/file/properties.rs#L332
   
   https://github.com/apache/arrow-rs/blob/99b7d01103495607932343146c973b6fba0eb8d5/parquet/src/file/properties.rs#L341
   
   https://github.com/apache/arrow-rs/blob/99b7d01103495607932343146c973b6fba0eb8d5/parquet/src/file/properties.rs#L347
   
   I noticed this confusing wording in the docs for csv2parquet CLI and opened an issue there: https://github.com/domoritz/csv2parquet/issues/42
   
   @domoritz then pointed me here as he just copied from upstream.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb closed issue #1064: Docs: Improve clarity by rewriting 'Any' -> 'All unless overwritten'

Posted by GitBox <gi...@apache.org>.
alamb closed issue #1064:
URL: https://github.com/apache/arrow-rs/issues/1064


   





[GitHub] [arrow-rs] domoritz commented on issue #1064: Docs: Improve clarity by rewriting 'Any' -> 'All unless overwritten'

Posted by GitBox <gi...@apache.org>.
domoritz commented on issue #1064:
URL: https://github.com/apache/arrow-rs/issues/1064#issuecomment-1005038465


   Fixed in https://github.com/apache/arrow-rs/pull/1126





[GitHub] [arrow-rs] alamb commented on issue #1064: Docs: Improve clarity by rewriting 'Any' -> 'All unless overwritten'

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1064:
URL: https://github.com/apache/arrow-rs/issues/1064#issuecomment-1003551660


   Thanks @corneliusroemer  -- can you submit a PR to improve the docs in arrow-rs?





[GitHub] [arrow-rs] domoritz commented on issue #1064: Docs: Improve clarity by rewriting 'Any' -> 'All unless overwritten'

Posted by GitBox <gi...@apache.org>.
domoritz commented on issue #1064:
URL: https://github.com/apache/arrow-rs/issues/1064#issuecomment-1003600138


   I can make the pull request. I don't understand this comment, though:
   
   ```
       /// If dictionary is not enabled, this is treated as a primary encoding for all
       /// columns. In case when dictionary is enabled for any column, this value is
       /// considered to be a fallback encoding for that column.
   ```
   
   What does `this` or `this value` refer to? The whole comment probably should be rewritten to be clearer.
   
   For now, I assume it meant to say:
   
   ```
       /// If dictionary is not enabled, the provided value is treated as a primary
       /// encoding for all columns. When dictionary is enabled for a column, this
       /// value is considered to be a fallback encoding for that column.
   ``` 
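   
   As a rough sketch of that assumed reading (not verified against the writer's actual behavior), the provided value would be resolved per column like this. `SketchEncoding` and `column_encodings` are hypothetical names used only for illustration; they are not part of the parquet crate.
   
   ```rust
   /// Hypothetical encodings for illustration; not the parquet crate's enum.
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum SketchEncoding {
       Plain,
       Dictionary,
       DeltaBinaryPacked,
   }
   
   /// Returns (primary, fallback) for one column under the assumed reading.
   fn column_encodings(
       dictionary_enabled: bool,
       provided: SketchEncoding,
   ) -> (SketchEncoding, Option<SketchEncoding>) {
       if dictionary_enabled {
           // Dictionary encoding is primary; the provided value is the fallback.
           (SketchEncoding::Dictionary, Some(provided))
       } else {
           // No dictionary: the provided value is the primary encoding.
           (provided, None)
       }
   }
   ```
   
   Under this reading, `column_encodings(false, SketchEncoding::DeltaBinaryPacked)` yields the provided value as the primary encoding with no fallback, while passing `true` makes the provided value the fallback.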

