Posted to github@arrow.apache.org by "mustafasrepo (via GitHub)" <gi...@apache.org> on 2023/02/07 11:35:09 UTC

[GitHub] [arrow-datafusion] mustafasrepo opened a new issue, #5212: Do not coerce types during union schema creation.

mustafasrepo opened a new issue, #5212:
URL: https://github.com/apache/arrow-datafusion/issues/5212

   **Describe the bug**
   When I run the query below on PostgreSQL
   ```sql
   SELECT c1, c9 FROM aggregate_test_100 
   UNION ALL 
   SELECT c1, c3 FROM aggregate_test_100
   ```
   where c9 has type `bigint` and c3 has type `smallint`, it produces a valid result. However, when I run the same query on DataFusion, where c9 has type `UInt32` and c3 has type `Int8`,
   it fails with the error `ArrowError(CastError("Can't cast value 1491205016 to type Int8"))`.
   The physical plan of the query above in DataFusion is as follows:
   ```text
   "UnionExec",
   "  ProjectionExec: expr=[c1@0 as c1, c3@1 as c3]",
   "    CsvExec: files={1 group: [[Users/akurmustafa/projects/synnada/arrow-datafusion-tmp/testing/data/csv/aggregate_test_100.csv]]}, has_header=true, limit=None, projection=[c1, c3]",
   "  ProjectionExec: expr=[c1@0 as c1, CAST(c9@1 AS Int8) as c3]",
   "    CsvExec: files={1 group: [[Users/akurmustafa/projects/synnada/arrow-datafusion-tmp/testing/data/csv/aggregate_test_100.csv]]}, has_header=true, limit=None, projection=[c1, c9]",
   ```
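
   The `CAST(c9@1 AS Int8)` in the second projection is where the error comes from: an `Int8` column holds values from -128 to 127, so the reported value cannot be represented. A minimal Rust sketch of the same range check, using only the standard library (the helper name `narrow_to_i8` is hypothetical and is not part of DataFusion or Arrow):

   ```rust
   /// Hypothetical helper: checked narrowing of a `u32` to an `i8`.
   /// Returns an error message when the value is outside the `i8` range.
   fn narrow_to_i8(value: u32) -> Result<i8, String> {
       // `i8::try_from` fails for any value above `i8::MAX` (127).
       i8::try_from(value).map_err(|_| format!("Can't cast value {} to type Int8", value))
   }
   ```

   Calling `narrow_to_i8(1491205016)` returns an `Err`, while a small value such as `narrow_to_i8(100)` succeeds.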
   
   DataFusion coerces the types `UInt32` and `Int8` to `Int8`, which is too narrow for the `UInt32` values. Instead, it could upcast to a wider type; for instance, `Int32` in this specific case.
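
   One way to read this proposal: rather than adopting the type of one input, the union schema picks a type wide enough for both inputs. The sketch below illustrates that idea on a small hypothetical `IntType` enum. It is not DataFusion's coercion code, and the names `IntType` and `union_supertype` are assumptions made for illustration. Note that covering the full ranges of both `UInt32` and `Int8` takes a signed 64-bit type.

   ```rust
   /// Hypothetical model of a few integer column types (not Arrow's `DataType`).
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   enum IntType {
       Int8,
       Int16,
       Int32,
       Int64,
       UInt8,
       UInt16,
       UInt32,
   }

   impl IntType {
       /// Inclusive value range of the type, widened to `i128` so every bound fits.
       fn range(self) -> (i128, i128) {
           match self {
               IntType::Int8 => (i8::MIN as i128, i8::MAX as i128),
               IntType::Int16 => (i16::MIN as i128, i16::MAX as i128),
               IntType::Int32 => (i32::MIN as i128, i32::MAX as i128),
               IntType::Int64 => (i64::MIN as i128, i64::MAX as i128),
               IntType::UInt8 => (0, u8::MAX as i128),
               IntType::UInt16 => (0, u16::MAX as i128),
               IntType::UInt32 => (0, u32::MAX as i128),
           }
       }
   }

   /// Hypothetical rule: pick the narrowest modeled type whose range
   /// contains the ranges of both union inputs.
   fn union_supertype(a: IntType, b: IntType) -> IntType {
       let (a_lo, a_hi) = a.range();
       let (b_lo, b_hi) = b.range();
       let (lo, hi) = (a_lo.min(b_lo), a_hi.max(b_hi));

       // Candidates ordered by bit width, so the first match is the narrowest.
       const CANDIDATES: [IntType; 7] = [
           IntType::UInt8,
           IntType::Int8,
           IntType::UInt16,
           IntType::Int16,
           IntType::UInt32,
           IntType::Int32,
           IntType::Int64,
       ];

       CANDIDATES
           .iter()
           .copied()
           .find(|t| {
               let (t_lo, t_hi) = t.range();
               t_lo <= lo && hi <= t_hi
           })
           // `Int64` covers every type modeled here, so this fallback is never hit.
           .unwrap_or(IntType::Int64)
   }
   ```

   Under this rule, `union_supertype(IntType::UInt32, IntType::Int8)` yields `IntType::Int64`, so neither input would need a narrowing cast.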
   
   **To Reproduce**
   Run the query above in DataFusion against the `aggregate_test_100` table.
   
   **Expected behavior**
   The query above should run successfully and return a result, as it does on PostgreSQL.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #5212: Upcast types during union schema creation.

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #5212: Upcast types during union schema creation.
URL: https://github.com/apache/arrow-datafusion/issues/5212

