Posted to issues@hive.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/05/30 17:09:00 UTC

[jira] [Commented] (HIVE-19580) Hive 2.3.2 with ORC files stored on S3 are case sensitive

    [ https://issues.apache.org/jira/browse/HIVE-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495447#comment-16495447 ] 

Steve Loughran commented on HIVE-19580:
---------------------------------------

I don't see why this should be S3-related.
* Can you replicate it on a normal Hadoop FS?
* If not, given that s3:// is Amazon's closed-source connector, can you replicate it with the ASF's own s3a connector?
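
A sketch of what those two checks could look like from spark-shell, reusing the CSV from the report. The hdfs:// and s3a:// output paths are made-up placeholders, and the s3a run assumes the hadoop-aws module and credentials are already configured on the cluster:

```scala
// Assumes a spark-shell session, where `spark` is already defined.
// Same input as in the report: a CSV with upper-case headers COL1,COL2.
val df = spark.read.option("header", "true").csv("/user/hadoop/file")

// Check 1: write the ORC data to a normal Hadoop filesystem (HDFS here).
// The output path is a placeholder, not taken from the report.
df.write.orc("hdfs:///tmp/hive19580_orc")

// Check 2: write through the ASF s3a connector instead of s3://.
// Bucket and prefix are placeholders; use a fresh prefix so the
// results are not mixed with the files written earlier via s3://.
df.write.orc("s3a://bucket/prefix-s3a")
```

Then point an external Hive table at each location, with the same upper-case column names as in the report, and see whether SELECT * still returns NULLs. If HDFS shows the same behaviour, the object store is not the cause.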

> Hive 2.3.2 with ORC files stored on S3 are case sensitive
> ---------------------------------------------------------
>
>                 Key: HIVE-19580
>                 URL: https://issues.apache.org/jira/browse/HIVE-19580
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.3.2
>         Environment: AWS S3 to store files
> Spark 2.3 but also true for lower versions
> Hive 2.3.2
>            Reporter: Arthur Baudry
>            Priority: Major
>             Fix For: 2.3.2
>
>
> Original file is csv:
> COL1,COL2
>  1,2
> ORC files are created with Spark 2.3:
> scala> val df = spark.read.option("header","true").csv("/user/hadoop/file")
> scala> df.printSchema
>  root
>  |-- COL1: string (nullable = true)
>  |-- COL2: string (nullable = true)
> scala> df.write.orc("s3://bucket/prefix")
> In Hive:
> hive> CREATE EXTERNAL TABLE test_orc(COL1 STRING, COL2 STRING) STORED AS ORC LOCATION 's3://bucket/prefix';
> hive> SELECT * FROM test_orc;
>  OK
>  NULL NULL
> *Every field is null. However, if the fields are written in lower case in the Spark schema, everything works.*
> The reason I'm raising this bug is that we have customers using Hive 2.3.2 to read files we generate through Spark, and our entire code base addresses fields in upper case, which is incompatible with their Hive instance.


