Posted to user@spark.apache.org by Artur Sukhenko <ar...@gmail.com> on 2019/02/06 15:31:32 UTC

3 equalTo "3.15" = true

Hello guys,
I am migrating from Spark 1.6 to 2.2 and have this issue:
I am casting a string to short and comparing the two with equalTo.
Original code is:
... when(col(fieldName).equalTo(castedValueCol), castedValueCol).
  otherwise(defaultErrorValueCol)

Reproduce (version 2.3.0.cloudera4):
scala> val df = Seq("3.15").toDF("tier_id")
df: org.apache.spark.sql.DataFrame = [tier_id: string]

scala> val colShort = col("tier_id").cast(ShortType)
colShort: org.apache.spark.sql.Column = CAST(tier_id AS SMALLINT)

scala> val colString = col("tier_id")
colString: org.apache.spark.sql.Column = tier_id

scala> res4.select(colString, colShort, colShort.equalTo(colString)).show
+-------+-------+-------------------------------------+
|tier_id|tier_id|(CAST(tier_id AS SMALLINT) = tier_id)|
+-------+-------+-------------------------------------+
|   3.15|      3|                                 true|
+-------+-------+-------------------------------------+
scala>

Why is this?
-- 
--
Artur Sukhenko

RE : 3 equalTo "3.15" = true

Posted by Denis DEBARBIEUX <dd...@norsys.fr>.
I am confused, since the two columns have the same name.


________________________________________
From: Artur Sukhenko [artur.sukhenko@gmail.com]
Sent: Wednesday, February 6, 2019 17:32
To: Russell Spitzer
Cc: user@spark.apache.org
Subject: Re: 3 equalTo "3.15" = true

scala> df.select(colString, colShort, colShort.equalTo(colString)).explain
== Physical Plan ==
LocalTableScan [tier_id#3, tier_id#56, (CAST(tier_id AS SMALLINT) = tier_id)#50]



Re: 3 equalTo "3.15" = true

Posted by Artur Sukhenko <ar...@gmail.com>.
It is probably wrong to compare StringType and ShortType directly.
I'll use something like this:
df.select(colString, colShort,
  colShort.equalTo(colString.cast(DecimalType(38,15)))).show
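
For readers following the reasoning, here is a minimal plain-Scala sketch (no Spark involved) of why the two comparisons can disagree. It is only a model built on the assumption discussed in this thread, namely that the string side is coerced to SMALLINT before the comparison; it is not Spark's implementation. The object name CoercionModel and its helper functions are hypothetical.

```scala
object CoercionModel {
  // Models CAST(tier_id AS SMALLINT) for the value in this thread: the
  // fractional part is dropped, as in the 3.15 -> 3 row of the original post.
  def castToShort(s: String): Short = BigDecimal(s).toShort

  // Comparison when the string side is also coerced to SMALLINT
  // (the assumption suggested in the thread, not confirmed Spark behaviour).
  def equalAsShort(short: Short, s: String): Boolean =
    short == castToShort(s)

  // Comparison when the string side is cast to a decimal instead,
  // which is the idea behind the DecimalType(38,15) workaround.
  def equalAsDecimal(short: Short, s: String): Boolean =
    BigDecimal(short.toInt) == BigDecimal(s)

  def main(args: Array[String]): Unit = {
    val raw = "3.15"
    val short = castToShort(raw)
    println(equalAsShort(short, raw))   // true: 3 == 3
    println(equalAsDecimal(short, raw)) // false: 3 != 3.15
  }
}
```

Under that assumption, casting the string side to a decimal keeps the fractional digits, so a value such as "3.15" no longer matches its truncated short.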

-- 
--
Artur Sukhenko

Re: 3 equalTo "3.15" = true

Posted by Artur Sukhenko <ar...@gmail.com>.
scala> df.select(colString, colShort, colShort.equalTo(colString)).explain
== Physical Plan ==
LocalTableScan [tier_id#3, tier_id#56, (CAST(tier_id AS SMALLINT) = tier_id)#50]
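
The physical plan above collapses to a single LocalTableScan, so it does not show whether a cast was added around tier_id for the comparison. A minimal sketch, assuming a local SparkSession (in spark-shell the existing `spark` would be used instead), of how the extended form of explain could be used to print the logical plans as well; the object name ExplainImplicitCast is hypothetical, and the exact plan text depends on the Spark version and configuration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.ShortType

object ExplainImplicitCast {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; spark-shell already provides `spark`.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explain-implicit-cast")
      .getOrCreate()
    import spark.implicits._

    // Same setup as in the original post.
    val df = Seq("3.15").toDF("tier_id")
    val colString = col("tier_id")
    val colShort = col("tier_id").cast(ShortType)

    // extended = true prints the parsed, analyzed and optimized logical plans
    // in addition to the physical plan, so any cast inserted by the analyzer
    // should be visible in the analyzed plan.
    df.select(colString, colShort, colShort.equalTo(colString)).explain(true)

    spark.stop()
  }
}
```

If the analyzed plan shows a cast wrapped around the string column inside the equality, that would support the guess that both sides are being compared as SMALLINT.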


--
Artur Sukhenko

Re: 3 equalTo "3.15" = true

Posted by Russell Spitzer <ru...@gmail.com>.
Run an "explain" instead of show; I'm betting it's casting tier_id to a
SMALLINT to do the comparison.
