Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/22 02:23:08 UTC

[GitHub] [arrow-rs] liukun4515 opened a new issue, #2126: Support filter for parquet data type

liukun4515 opened a new issue, #2126:
URL: https://github.com/apache/arrow-rs/issues/2126

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   We want to filter a column in a Parquet file whose logical data type is decimal or string.
   
   In Parquet, decimal and string data are stored as `binary` and `fixed_len_binary`.
   
   parquet-rs doesn't have a `filter` system or a comparison system for these types.
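
To make the comparison problem concrete, here is a minimal sketch, assuming the decimal is stored the way the Parquet format describes for fixed-length binary values: the unscaled integer in big-endian two's complement. The function name `decode_decimal_be` is hypothetical and is not part of the parquet crate. It only shows that the raw bytes have to be decoded before a numeric comparison makes sense.

```rust
/// Decode a big-endian two's-complement byte slice (1 to 16 bytes)
/// into an unscaled i128 decimal value.
/// Hypothetical helper for illustration; not a parquet-rs API.
fn decode_decimal_be(bytes: &[u8]) -> Option<i128> {
    if bytes.is_empty() || bytes.len() > 16 {
        return None;
    }
    // Sign-extend: pad with 0xFF when the top bit is set, otherwise 0x00.
    let fill = if bytes[0] & 0x80 != 0 { 0xFF } else { 0x00 };
    let mut buf = [fill; 16];
    buf[16 - bytes.len()..].copy_from_slice(bytes);
    Some(i128::from_be_bytes(buf))
}

fn main() {
    // 123.45 with scale 2 has the unscaled value 12345 = 0x3039.
    let stored = [0x00u8, 0x00, 0x30, 0x39];
    let value = decode_decimal_be(&stored).unwrap();
    // The filter "column > 100.00" compares unscaled values at the same scale.
    let threshold: i128 = 10_000;
    println!("{} > {}: {}", value, threshold, value > threshold);
    // prints "12345 > 10000: true"
}
```

Comparing the raw bytes lexicographically would give the wrong order for negative values, which is why the sketch decodes first.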
   
   **Describe the solution you'd like**
   
   Implement a `filter` system and a comparison system in parquet-rs.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] liukun4515 commented on issue #2126: Support filter for parquet data type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #2126:
URL: https://github.com/apache/arrow-rs/issues/2126#issuecomment-1192112831

   Do you have any opinions and suggestions?
   @tustvold @Ted-Jiang @alamb 




[GitHub] [arrow-rs] alamb commented on issue #2126: Support filter for parquet data type

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2126:
URL: https://github.com/apache/arrow-rs/issues/2126#issuecomment-1192719943

   Are you thinking about something like "apply some predicate on a parquet binary column and then only decode pages from other columns that might have matching positions" 🤔 
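
The idea in the question above can be sketched roughly as follows. This is an illustration only, using in-memory slices and hypothetical function names (`matching_positions`, `pages_to_decode`), and it assumes fixed-size pages; it is not the parquet crate's API. A predicate is evaluated on the binary column first, and only pages of other columns that cover a matching row position need decoding.

```rust
/// Return the row positions whose value satisfies `predicate`.
/// Hypothetical helper for illustration; not a parquet-rs API.
fn matching_positions<F>(column: &[&[u8]], predicate: F) -> Vec<usize>
where
    F: Fn(&[u8]) -> bool,
{
    let mut positions = Vec::new();
    for (i, &value) in column.iter().enumerate() {
        if predicate(value) {
            positions.push(i);
        }
    }
    positions
}

/// Map ascending row positions to the distinct page indices that contain
/// them, assuming every page holds exactly `page_size` rows.
fn pages_to_decode(positions: &[usize], page_size: usize) -> Vec<usize> {
    assert!(page_size > 0, "page_size must be non-zero");
    let mut pages: Vec<usize> = positions.iter().map(|p| p / page_size).collect();
    // Positions are ascending, so duplicate page indices are adjacent.
    pages.dedup();
    pages
}

fn main() {
    let names: [&[u8]; 6] = [b"apple", b"avocado", b"banana", b"cherry", b"apricot", b"date"];
    let positions = matching_positions(&names, |v| v.starts_with(b"a"));
    let pages = pages_to_decode(&positions, 2);
    println!("rows {:?} -> pages {:?}", positions, pages);
    // prints "rows [0, 1, 4] -> pages [0, 2]"
}
```

In this example page 1 of every other column would never be decoded, since none of its rows matched.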




[GitHub] [arrow-rs] tustvold commented on issue #2126: Support filter for parquet data type

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #2126:
URL: https://github.com/apache/arrow-rs/issues/2126#issuecomment-1192571772

   Could you perhaps expand a bit on what API you are expecting for this? I'm not entirely sure what you mean by supporting a `filter` or comparison system within parquet-rs.




[GitHub] [arrow-rs] liukun4515 commented on issue #2126: Support filter for parquet data type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #2126:
URL: https://github.com/apache/arrow-rs/issues/2126#issuecomment-1196237933

   I have found that this is already resolved in DataFusion.
   We convert the Parquet statistics to Arrow arrays / Arrow data and apply the filter/predicate to them.
   
   Thanks @alamb @tustvold 
   I will close this issue
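
A rough sketch of the statistics-based approach described in this comment, using plain `Vec`s in place of Arrow arrays. The `RowGroupStats` struct and `prune_eq` function are hypothetical names for illustration and are not DataFusion's or parquet-rs's actual types. The per-row-group min/max values are gathered into columnar arrays, and the predicate `column = value` is rewritten as a check on those arrays.

```rust
/// Hypothetical per-row-group statistics for one decimal column,
/// holding unscaled values. Not a DataFusion or parquet-rs type.
struct RowGroupStats {
    min: i128,
    max: i128,
}

/// For the predicate `column = value`, a row group can contain a match
/// only if `min <= value <= max`. Returns one keep/skip flag per row group.
fn prune_eq(stats: &[RowGroupStats], value: i128) -> Vec<bool> {
    // Gather the statistics into columnar arrays, one entry per row group.
    let mins: Vec<i128> = stats.iter().map(|s| s.min).collect();
    let maxes: Vec<i128> = stats.iter().map(|s| s.max).collect();
    // Evaluate the rewritten predicate element-wise over the arrays.
    mins.iter()
        .zip(maxes.iter())
        .map(|(&min, &max)| min <= value && value <= max)
        .collect()
}

fn main() {
    let stats = [
        RowGroupStats { min: 0, max: 100 },
        RowGroupStats { min: 150, max: 300 },
        RowGroupStats { min: 250, max: 400 },
    ];
    // Only the second row group can contain the value 200.
    println!("{:?}", prune_eq(&stats, 200));
    // prints "[false, true, false]"
}
```

A `true` flag means only that the row group might contain a match; the rows still have to be read and filtered.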




[GitHub] [arrow-rs] liukun4515 closed issue #2126: Support filter for parquet data type

Posted by GitBox <gi...@apache.org>.
liukun4515 closed issue #2126: Support filter for parquet data type
URL: https://github.com/apache/arrow-rs/issues/2126

