Posted to dev@spark.apache.org by 163 <he...@163.com> on 2017/11/23 03:09:52 UTC
SparkSQL does not support CharType
Hi,
When I use a DataFrame with a table schema, it goes wrong:
val test_schema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", CharType(1), false),
  StructField("time", DateType, false)))

val df = spark.read.format("com.databricks.spark.csv")
  .schema(test_schema)
  .option("header", "false")
  .option("inferSchema", "false")
  .option("delimiter", ",")
  .load("file:///Users/name/b")
The log is below:
Exception in thread "main" scala.MatchError: CharType(1) (of class org.apache.spark.sql.types.CharType)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.org$apache$spark$sql$catalyst$encoders$RowEncoder$$serializerFor(RowEncoder.scala:73)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:158)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:157)
Why? Is this a bug?
But I found that Spark translates the char type to string when using the create table command:
create table test(flag char(1));
desc test: flag string;
Regards
Wendy He
Re: SparkSQL does not support CharType
Posted by Jörn Franke <jo...@gmail.com>.
Or ByteType, depending on the use case.
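A minimal sketch of that alternative, assuming the "flag" column actually holds small numeric codes rather than characters (the schema mirrors the one from the original post):

```scala
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, ByteType, DateType}

// If "flag" is really a one-byte numeric code (e.g. 0/1), ByteType is a
// compact alternative to StringType for this column.
val testSchema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", ByteType, false),
  StructField("time", DateType, false)))
```

Whether this fits depends on the data: a true character flag ('Y'/'N') still belongs in a StringType column.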
> On 23. Nov 2017, at 10:18, Herman van Hövell tot Westerflier <hv...@databricks.com> wrote:
>
> You need to use a StringType. The CharType and VarCharType are there to ensure compatibility with Hive and ORC; they should not be used anywhere else.
>
>> On Thu, Nov 23, 2017 at 4:09 AM, 163 <he...@163.com> wrote:
>> Hi,
>> when I use Dataframe with table schema, It goes wrong:
>>
>> val test_schema = StructType(Array(
>> StructField("id", IntegerType, false),
>> StructField("flag", CharType(1), false),
>> StructField("time", DateType, false)));
>>
>> val df = spark.read.format("com.databricks.spark.csv")
>> .schema(test_schema)
>> .option("header", "false")
>> .option("inferSchema", "false")
>> .option("delimiter", ",")
>> .load("file:///Users/name/b")
>>
>> The log is below:
>> Exception in thread "main" scala.MatchError: CharType(1) (of class org.apache.spark.sql.types.CharType)
>> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.org$apache$spark$sql$catalyst$encoders$RowEncoder$$serializerFor(RowEncoder.scala:73)
>> at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:158)
>> at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:157)
>>
>> Why? Is this a bug?
>>
>> But I found spark will translate char type to string when using create table command:
>>
>> create table test(flag char(1));
>> desc test: flag string;
>>
>>
>>
>>
>> Regards
>> Wendy He
>
>
>
> --
> Herman van Hövell
> Software Engineer
> Databricks Inc.
> hvanhovell@databricks.com
> +31 6 420 590 27
> databricks.com
>
>
>
>
Re: SparkSQL does not support CharType
Posted by Herman van Hövell tot Westerflier <hv...@databricks.com>.
You need to use a StringType. The CharType and VarCharType are there to
ensure compatibility with Hive and ORC; they should not be used anywhere
else.
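A minimal sketch of the corrected schema under that advice (the file path is the illustrative one from the original post; the SparkSession setup is assumed):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, StringType, DateType}

val spark = SparkSession.builder().appName("char-to-string").getOrCreate()

// Use StringType instead of CharType(1); RowEncoder has no case for
// CharType, which is what raised the scala.MatchError. Enforce the
// single-character constraint in application logic if needed.
val testSchema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", StringType, false),
  StructField("time", DateType, false)))

// Spark 2.x ships a native CSV source, so "csv" can replace the external
// "com.databricks.spark.csv" package.
val df = spark.read
  .schema(testSchema)
  .option("header", "false")
  .option("delimiter", ",")
  .csv("file:///Users/name/b") // illustrative path
```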
On Thu, Nov 23, 2017 at 4:09 AM, 163 <he...@163.com> wrote:
> Hi,
> when I use Dataframe with table schema, It goes wrong:
>
> val test_schema = StructType(Array(
>
> StructField("id", IntegerType, false),
> StructField("flag", CharType(1), false),
> StructField("time", DateType, false)));
>
> val df = spark.read.format("com.databricks.spark.csv")
> .schema(test_schema)
> .option("header", "false")
> .option("inferSchema", "false")
> .option("delimiter", ",")
> .load("file:///Users/name/b")
>
>
> The log is below:
> Exception in thread "main" scala.MatchError: CharType(1) (of class org.apache.spark.sql.types.CharType)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.org$apache$spark$sql$catalyst$encoders$RowEncoder$$serializerFor(RowEncoder.scala:73)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:158)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$$anonfun$2.apply(RowEncoder.scala:157)
>
> Why? Is this a bug?
>
> But I found spark will translate char type to string when using create
> table command:
>
> create table test(flag char(1));
> desc test: flag string;
>
>
>
>
> Regards
> Wendy He
>
--
Herman van Hövell
Software Engineer
Databricks Inc.
hvanhovell@databricks.com
+31 6 420 590 27
databricks.com