
Posted to issues@spark.apache.org by "Nick Orka (JIRA)" <ji...@apache.org> on 2018/09/28 15:19:00 UTC

[jira] [Commented] (SPARK-21514) Hive has updated with new support for S3 and InsertIntoHiveTable.scala should update also

    [ https://issues.apache.org/jira/browse/SPARK-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632003#comment-16632003 ] 

Nick Orka commented on SPARK-21514:
-----------------------------------

S3 recently increased its request rate, so eventual consistency has become a huge problem for data lakes based on S3. This approach can fix the issue, because this is the exact spot where all Spark jobs fail. Can you change the priority of the ticket?

This is a real stopper for many data pipelines.

> Hive has updated with new support for S3 and InsertIntoHiveTable.scala should update also
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-21514
>                 URL: https://issues.apache.org/jira/browse/SPARK-21514
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Javier Ros
>            Priority: Major
>
> Hive has been updated with new parameters to optimize the usage of S3. You can now avoid using S3 as the stagingdir by setting the parameters hive.blobstore.supported.schemes and hive.blobstore.optimizations.enabled.
> The InsertIntoHiveTable.scala file should be updated with the same improvement to match the behavior of Hive.
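
The requested behavior can be illustrated with a small sketch. It is not Hive's or Spark's actual code. It only models the decision that the two Hive parameters named above are described as controlling: when the optimization is enabled and the table location uses one of the listed schemes, staging data is kept off the blobstore. The function names, the fallback directory, the default scheme list, and the ".hive-staging" suffix are assumptions made for illustration.

```python
from urllib.parse import urlparse


def is_blobstore(location: str, supported_schemes: str) -> bool:
    """Return True if the location's URI scheme is in the comma-separated list.

    `supported_schemes` stands in for the value of
    hive.blobstore.supported.schemes (assumed format: "s3,s3a,s3n").
    """
    schemes = {s.strip().lower() for s in supported_schemes.split(",") if s.strip()}
    return urlparse(location).scheme.lower() in schemes


def choose_staging_dir(
    table_location: str,
    supported_schemes: str = "s3,s3a,s3n",          # assumed value
    optimizations_enabled: bool = True,              # stands in for hive.blobstore.optimizations.enabled
    fallback_dir: str = "hdfs:///tmp/hive-staging",  # hypothetical non-S3 staging path
) -> str:
    """Pick a staging directory for an insert into the table at `table_location`."""
    if optimizations_enabled and is_blobstore(table_location, supported_schemes):
        # Stage on a consistent filesystem instead of inside the S3 table path.
        return fallback_dir
    # Otherwise stage under the table location itself.
    return table_location.rstrip("/") + "/.hive-staging"
```

For example, under these assumptions `choose_staging_dir("s3a://bucket/warehouse/t")` selects the fallback directory, while a table stored on HDFS keeps its staging directory under its own location.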



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
