Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/27 06:36:21 UTC

[GitHub] [arrow-datafusion] liukun4515 opened a new issue, #2799: coercion rule about `eq` and InList between string type and numeric type

liukun4515 opened a new issue, #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799

   I have some concerns about the coercion rule between string and numeric types.
   I checked how Spark handles one such case:
   ```
   spark-sql> desc t3;
   c1                      int
   
   spark-sql> explain extended select * from t3 where c1 = cast(123.123 as string);
   == Parsed Logical Plan ==
   'Project [*]
   +- 'Filter ('c1 = cast(123.123 as string))
      +- 'UnresolvedRelation [t3], [], false
   
   == Analyzed Logical Plan ==
   c1: int
   Project [c1#186]
   +- Filter (c1#186 = cast(cast(123.123 as string) as int))
      +- SubqueryAlias spark_catalog.default.t3
         +- HiveTableRelation [`default`.`t3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [c1#186], Partition Cols: []]
   
   == Optimized Logical Plan ==
   Filter (isnotnull(c1#186) AND (c1#186 = 123))
   +- HiveTableRelation [`default`.`t3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [c1#186], Partition Cols: []]
   
   == Physical Plan ==
   *(1) Filter (isnotnull(c1#186) AND (c1#186 = 123))
   +- Scan hive default.t3 [c1#186], HiveTableRelation [`default`.`t3`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [c1#186], Partition Cols: []]
   ```
   In the case above, the result of coercion is `Int`: Spark casts the string to `int` rather than casting the column to a string.
   I think we need to create an issue to track this.
   @viirya  @alamb
   
   _Originally posted by @liukun4515 in https://github.com/apache/arrow-datafusion/pull/2794#discussion_r907011527_
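
As a rough illustration of the Spark plan above, the following Python sketch models a string-to-int coercion for `=`. The helper names `cast_string_to_int` and `int_eq_string` are hypothetical, and returning `None` (SQL NULL) for unparseable strings is an assumption about Spark's default, non-ANSI cast behavior. It is not Spark or DataFusion code.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def cast_string_to_int(text: str) -> Optional[int]:
    """Hypothetical model of `cast(<string> as int)`.

    Parses the string as a decimal number and truncates the fractional
    part, so "123.123" becomes 123 as in the optimized plan above.
    Unparseable input yields None (assumed to stand in for SQL NULL).
    """
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return None
    if not value.is_finite():  # reject "NaN" and "Infinity"
        return None
    return int(value)  # int() on a Decimal truncates toward zero


def int_eq_string(column_value: Optional[int], literal: str) -> Optional[bool]:
    """Evaluate `int_column = string_literal` by casting the string to int."""
    rhs = cast_string_to_int(literal)
    if column_value is None or rhs is None:
        return None  # comparison with NULL is NULL
    return column_value == rhs


if __name__ == "__main__":
    print(cast_string_to_int("123.123"))
    print(int_eq_string(123, "123.123"))
    print(int_eq_string(5, "abc"))
```

Under this model a row with `c1 = 123` matches the filter, which agrees with the `c1#186 = 123` predicate in the optimized plan.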


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1167008598

   I'm going to implement casting between UTF8 and decimal; we can also consider these cases together.
   


[GitHub] [arrow-datafusion] liukun4515 commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1168189491

   > I wonder if the question here is "should we automatically try and coerce numbers to strings (which is more general but slower) or coerce strings to numbers (which is less general but faster)"?
   
   Yes, we should try some other database systems and discuss the best behavior for these cases.


[GitHub] [arrow-datafusion] liukun4515 commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1173812564

   > Here is what postgres does (appears to me to cast based on the type of the first element in the IN list):
   > 
   > ```
   > alamb=# select 5 in (1, 2, 'fff');
   > ERROR:  invalid input syntax for type integer: "fff"
   > LINE 1: select 5 in (1, 2, 'fff');
   >                            ^
   > alamb=# select 'foo' in (1, 2, 'fff');
   > ERROR:  invalid input syntax for type integer: "fff"
   > LINE 1: select 'foo' in (1, 2, 'fff');
   >                                ^
   > alamb=# select 'foo' in ('1', '2', 'fff');
   >  ?column? 
   > ----------
   >  f
   > (1 row)
   > ```
   
   Did you try this?
   ```
   select 1 in (12,1,3,'123');
   ```


[GitHub] [arrow-datafusion] liukun4515 commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1166937215

   cc @viirya 


[GitHub] [arrow-datafusion] alamb commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1175285349

   > Did you try this?
   
   
   
   ```sql
   psql (14.3)
   Type "help" for help.
   
   alamb=# select 1 in (12,1,3,'123');
    ?column? 
   ----------
    t
   (1 row)
   ```


[GitHub] [arrow-datafusion] viirya commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1169142319

   > Here is what postgres does (appears to me to cast based on the type of the first element in the IN list)
   
   This sounds reasonable.
   
   


[GitHub] [arrow-datafusion] alamb commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1167659973

   I wonder if the question here is "should we automatically try and coerce numbers to strings (which is more general but slower) or coerce strings to numbers (which is less general but faster)"?
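
To make the trade-off in this question concrete, here is a small Python sketch of the two coercion directions for `number = string`. The function names `eq_via_string` and `eq_via_number` are hypothetical, and the sketch only models semantics, not performance or any engine's actual implementation.

```python
from typing import Optional


def eq_via_string(number: int, text: str) -> bool:
    """Coerce the number to a string, then compare the strings.

    Always defined for any string, but compares spelling, not value.
    """
    return str(number) == text


def eq_via_number(number: int, text: str) -> Optional[bool]:
    """Coerce the string to a number, then compare the numbers.

    Compares by value, but is undefined (None here, an assumed stand-in
    for NULL or an error) when the string is not a valid integer.
    """
    try:
        return number == int(text)
    except ValueError:
        return None


if __name__ == "__main__":
    for text in ["1", "01", "abc"]:
        print(text, eq_via_string(1, text), eq_via_number(1, text))
```

The two directions disagree on inputs such as `'01'`: the string comparison says it differs from `1`, the numeric comparison says it is equal. Only the numeric direction has to decide what to do with `'abc'`.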


[GitHub] [arrow-datafusion] alamb commented on issue #2799: coercion rule about `eq` and InList between string type and numeric type

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2799:
URL: https://github.com/apache/arrow-datafusion/issues/2799#issuecomment-1169103995

   Here is what postgres does (appears to me to cast based on the type of the first element in the IN list):
   
   ```
   alamb=# select 5 in (1, 2, 'fff');
   ERROR:  invalid input syntax for type integer: "fff"
   LINE 1: select 5 in (1, 2, 'fff');
                              ^
   alamb=# select 'foo' in (1, 2, 'fff');
   ERROR:  invalid input syntax for type integer: "fff"
   LINE 1: select 'foo' in (1, 2, 'fff');
                                  ^
   alamb=# select 'foo' in ('1', '2', 'fff');
    ?column? 
   ----------
    f
   (1 row)
   ```
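
The psql observations in this thread can be summarized in a small Python model. This is a sketch under an assumed rule: if any operand is an integer literal, every quoted literal is parsed as an integer, otherwise everything is compared as text. The rule is one reading that fits the four queries shown in the thread, not a statement of PostgreSQL's actual type resolution algorithm, and the names `to_int` and `in_list` are hypothetical.

```python
from typing import List, Union

# int models an integer literal, str models a quoted literal such as 'fff'.
Literal = Union[int, str]


def to_int(value: Literal) -> int:
    """Parse a quoted literal as an integer, mirroring the psql error text."""
    if isinstance(value, int):
        return value
    try:
        return int(value)
    except ValueError:
        raise ValueError(
            f'invalid input syntax for type integer: "{value}"'
        ) from None


def in_list(probe: Literal, items: List[Literal]) -> bool:
    """Evaluate `probe IN (items...)` under the assumed coercion rule."""
    operands = [probe, *items]
    if any(isinstance(v, int) for v in operands):
        # Assumed rule: one integer operand makes the comparison integer-typed.
        ints = [to_int(v) for v in items]
        return to_int(probe) in ints
    # All operands are quoted literals: compare as text.
    return probe in items


if __name__ == "__main__":
    print(in_list("foo", ["1", "2", "fff"]))
    print(in_list(1, [12, 1, 3, "123"]))
```

With this model, `in_list(5, [1, 2, "fff"])` and `in_list("foo", [1, 2, "fff"])` both raise `ValueError`, matching the two errors above. `in_list(1, [12, 1, 3, "123"])` succeeds because `'123'` parses as an integer.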

