Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/24 21:58:35 UTC

[GitHub] [arrow-rs] chadbrewbaker edited a comment on issue #1053: Parquet Fuzz Tests

chadbrewbaker edited a comment on issue #1053:
URL: https://github.com/apache/arrow-rs/issues/1053#issuecomment-997457311


   After thinking about this for a week, I'm inclined to start by driving the tests from the [Arrow Python/Hypothesis strategies](https://github.com/apache/arrow/blob/master/python/pyarrow/tests/strategies.py) and the [Python Parquet tests](https://github.com/apache/arrow/tree/master/python/pyarrow/tests/parquet), then gradually add Proptest. AWS Labs has the [best proptest examples](https://github.com/search?q=org%3Aawslabs+proptest).
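
A minimal sketch of the round-trip property such tests would check (decode what you encoded and expect the same column back). Everything here is an assumption for illustration: `encode_column` and `decode_column` are hypothetical toy stand-ins for a Parquet writer and reader, and a seeded `random.Random` stands in for the Hypothesis strategies linked above so the example runs with the standard library only.

```python
import random
import struct


def encode_column(values):
    """Toy stand-in for a Parquet writer (hypothetical, not the real format).

    Layout: uint32 row count, one validity byte per row, then the
    non-null values packed as little-endian int64.
    """
    validity = bytes(0 if v is None else 1 for v in values)
    present = [v for v in values if v is not None]
    header = struct.pack("<I", len(values))
    return header + validity + struct.pack(f"<{len(present)}q", *present)


def decode_column(blob):
    """Toy stand-in for a Parquet reader; inverse of encode_column."""
    (n,) = struct.unpack("<I", blob[:4])
    validity = blob[4:4 + n]
    count = sum(validity)
    present = iter(struct.unpack(f"<{count}q", blob[4 + n:]))
    return [next(present) if flag else None for flag in validity]


def random_column(rng, max_len=50):
    """Generate a nullable int64 column, biased toward edge values."""
    edge_values = [0, 1, -1, -2**63, 2**63 - 1]

    def value():
        roll = rng.random()
        if roll < 0.2:
            return None
        if roll < 0.4:
            return rng.choice(edge_values)
        return rng.randint(-2**63, 2**63 - 1)

    return [value() for _ in range(rng.randint(0, max_len))]


def check_round_trip(trials=500, seed=0):
    """Assert decode(encode(column)) == column for many random columns."""
    rng = random.Random(seed)
    for _ in range(trials):
        column = random_column(rng)
        assert decode_column(encode_column(column)) == column, column
    return trials
```

With Hypothesis or Proptest the generator would be replaced by a strategy, which also buys shrinking of failing inputs; the property itself stays the same.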
   
   Zooming out a bit more, DataFusion needs to be integrated into the [Squirrel](https://github.com/s3team/Squirrel) and [SQLancer](https://github.com/sqlancer/sqlancer) cross-SQL-engine tests. We can use [sqlsmith](https://github.com/anse1/sqlsmith) for reductions of large queries.
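
To make the cross-engine idea concrete, here is a rough sketch of differential testing: run the same randomly generated filter against two implementations and flag any disagreement. It does not use the Squirrel, SQLancer, or sqlsmith APIs. As assumptions for illustration, Python's built-in `sqlite3` stands in for the engine under test (DataFusion in practice), and a small hand-written Python filter plays the second engine; the table `t` and the helper names are hypothetical.

```python
import operator
import random
import sqlite3

# Fixed table of SQL comparison operators and their Python equivalents.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "<>": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
}


def reference_filter(rows, op, constant):
    """Python oracle: comparing with NULL is unknown, so the row is dropped."""
    return sorted(
        row for row in rows
        if row[1] is not None and OPS[op](row[1], constant)
    )


def engine_filter(conn, op, constant):
    """Run the same predicate in the SQL engine under test."""
    # op only ever comes from the fixed OPS table, so formatting it is safe.
    sql = f"SELECT id, x FROM t WHERE x {op} ? ORDER BY id"
    return conn.execute(sql, (constant,)).fetchall()


def run_differential(trials=200, seed=0):
    """Compare engine and oracle on random tables and predicates."""
    rng = random.Random(seed)
    for _ in range(trials):
        rows = [
            (i, rng.choice([None, rng.randint(-5, 5)]))
            for i in range(rng.randint(0, 20))
        ]
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, x INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        op = rng.choice(sorted(OPS))
        constant = rng.randint(-5, 5)
        got = engine_filter(conn, op, constant)
        want = reference_filter(rows, op, constant)
        conn.close()
        assert got == want, (rows, op, constant, got, want)
    return trials
```

The real tools generate far richer queries and compare several engines at once; the point here is only the shape of the check, where any mismatch is a candidate bug in one of the two sides.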
   
   We also want to be like AWS Redshift, where you write a query in Python/SQL and it emits Rust code that gets compiled and sent to worker nodes.
   
   It seems we might need thin LTO even on dev builds to reduce false positives: https://github.com/awslabs/rust-smt-ir/blob/551565ea5e97f502269d74d189e2e2c1e6b52f40/Cargo.toml#L11
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org