Posted to user@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2017/03/27 08:58:47 UTC

Why selectExpr changes schema (to include id column)?

Hi,

While toying with selectExpr, I noticed that the schema changes to
include an id column (before selectExpr the column is named value and is
nullable). I can't explain it. Anyone?

scala> spark.range(1).printSchema
root
 |-- value: long (nullable = true)

scala> spark.range(1).selectExpr("*").printSchema
root
 |-- id: long (nullable = false)

p.s. http://stackoverflow.com/q/43041975/1305344

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org

Re: Why selectExpr changes schema (to include id column)?

Posted by Hyukjin Kwon <gu...@gmail.com>.

Thanks for your confirmation.

On 28 Mar 2017 5:02 a.m., "Jacek Laskowski" <ja...@japila.pl> wrote:

Hi Hyukjin,

It was a false alarm as I had a local change to `def schema` in
`Dataset` that caused the issue.

I apologize for the noise, and thanks a lot for the prompt response. I
appreciate it.

Regards,
Jacek Laskowski



Re: Why selectExpr changes schema (to include id column)?

Posted by Jacek Laskowski <ja...@japila.pl>.

Hi Hyukjin,

It was a false alarm as I had a local change to `def schema` in
`Dataset` that caused the issue.

I apologize for the noise, and thanks a lot for the prompt response. I
appreciate it.
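
For anyone who hits a similar surprise, a quick check of which Spark build
the session is actually running can help rule out a locally modified one.
The following is only a sketch: the object name WhichSparkBuild, the helper
datasetClassLocation, and the local[1] master are assumptions made for
illustration and are not from this thread.

```scala
import java.net.URL

import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical helper (not from the thread): reports the Spark version and
// the location the Dataset class was loaded from.
object WhichSparkBuild {

  // Location (jar or classes directory) that provided the Dataset class,
  // if the JVM exposes it.
  def datasetClassLocation: Option[URL] =
    Option(classOf[Dataset[_]].getProtectionDomain.getCodeSource)
      .flatMap(codeSource => Option(codeSource.getLocation))

  def main(args: Array[String]): Unit = {
    // local[1] is an assumption; any master works for this check.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("which-spark-build")
      .getOrCreate()
    try {
      println(s"Spark version: ${spark.version}")
      println(s"Dataset loaded from: ${datasetClassLocation.getOrElse("unknown")}")
    } finally {
      spark.stop()
    }
  }
}
```

If the printed location points at a local build directory rather than a
released jar, local source changes are a likely suspect.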

Regards,
Jacek Laskowski


On Mon, Mar 27, 2017 at 2:43 PM, Hyukjin Kwon <gu...@gmail.com> wrote:
> I just tried to build against the current master to help check -
> https://github.com/apache/spark/commit/3fbf0a5f9297f438bc92db11f106d4a0ae568613
>
> It seems I can't reproduce this as below:
>
>
> scala> spark.range(1).printSchema
> root
>  |-- id: long (nullable = false)
>
>
> scala> spark.range(1).selectExpr("*").printSchema
> root
>  |-- id: long (nullable = false)
>
>
> scala> spark.version
> res2: String = 2.2.0-SNAPSHOT


Re: Why selectExpr changes schema (to include id column)?

Posted by Hyukjin Kwon <gu...@gmail.com>.

I just tried to build against the current master to help check -
https://github.com/apache/spark/commit/3fbf0a5f9297f438bc92db11f106d4a0ae568613

I can't reproduce this, as shown below:


scala> spark.range(1).printSchema
root
 |-- id: long (nullable = false)


scala> spark.range(1).selectExpr("*").printSchema
root
 |-- id: long (nullable = false)


scala> spark.version
res2: String = 2.2.0-SNAPSHOT
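
The same comparison can also be done programmatically instead of reading the
printSchema output by eye. This is a minimal sketch under stated
assumptions: the object name SelectExprSchemaCheck, the helper
schemasMatch, and the local[1] master are made up for illustration; only
spark.range, selectExpr, schema, and spark.version come from the session
above.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone check (names are illustrative, not from the thread).
object SelectExprSchemaCheck {

  // True when selectExpr("*") leaves the schema of spark.range(1) unchanged.
  def schemasMatch(spark: SparkSession): Boolean = {
    val base = spark.range(1)
    val projected = base.selectExpr("*")
    // StructType equality compares field names, types, nullability and metadata.
    base.schema == projected.schema
  }

  def main(args: Array[String]): Unit = {
    // local[1] is an assumption; use whatever master the build under test needs.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("selectExpr-schema-check")
      .getOrCreate()
    try {
      println(s"Spark version: ${spark.version}")
      println(spark.range(1).schema.treeString)
      println(spark.range(1).selectExpr("*").schema.treeString)
      assert(schemasMatch(spark), "selectExpr(\"*\") changed the schema")
    } finally {
      spark.stop()
    }
  }
}
```

On a build that behaves like the session above, the assertion should pass;
a failure would point at the build rather than at selectExpr itself.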



