Posted to issues@spark.apache.org by "Nick Orka (JIRA)" <ji...@apache.org> on 2018/09/28 15:19:00 UTC
[jira] [Commented] (SPARK-21514) Hive has updated with new support
for S3 and InsertIntoHiveTable.scala should update also
[ https://issues.apache.org/jira/browse/SPARK-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632003#comment-16632003 ]
Nick Orka commented on SPARK-21514:
-----------------------------------
S3 recently increased its request rate, so eventual consistency has become a huge problem for data lakes based on S3. This approach could fix the issue, because this is the exact spot where all Spark jobs fail. Can you raise the priority of this ticket?
This is a real stopper for many data pipelines.
> Hive has updated with new support for S3 and InsertIntoHiveTable.scala should update also
> -----------------------------------------------------------------------------------------
>
> Key: SPARK-21514
> URL: https://issues.apache.org/jira/browse/SPARK-21514
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Javier Ros
> Priority: Major
>
> Hive has been updated with new parameters that optimize the usage of S3. You can now avoid using S3 as the stagingdir by setting the parameters hive.blobstore.supported.schemes and hive.blobstore.optimizations.enabled.
> The InsertIntoHiveTable.scala file should be updated with the same improvement to match the behavior of Hive.
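
For illustration only, here is a minimal sketch of the decision the description asks for: if the table location uses a scheme listed in hive.blobstore.supported.schemes and hive.blobstore.optimizations.enabled is true, stage somewhere other than the blob store. The function names, the example configuration values, the scratch directory, and the ".hive-staging" suffix are assumptions made for this sketch; it is neither Hive's nor Spark's actual implementation.

```python
from urllib.parse import urlparse

# Example values only (assumed for this sketch); check your Hive
# version's documentation for the real defaults.
EXAMPLE_CONF = {
    "hive.blobstore.supported.schemes": "s3,s3a,s3n",
    "hive.blobstore.optimizations.enabled": "true",
}


def is_blobstore_path(path, conf):
    """Return True if the path's URI scheme is listed as a blob store scheme."""
    raw = conf.get("hive.blobstore.supported.schemes", "")
    schemes = {s.strip().lower() for s in raw.split(",") if s.strip()}
    return urlparse(path).scheme.lower() in schemes


def choose_staging_dir(table_location, conf,
                       scratch_dir="hdfs:///tmp/hive-staging"):
    """Pick a staging directory for an insert (hypothetical helper).

    When optimizations are enabled and the table lives on a blob store,
    stage in scratch_dir (an assumed HDFS location) instead of under the
    table location; otherwise stage next to the table.
    """
    enabled = conf.get("hive.blobstore.optimizations.enabled", "false")
    if enabled.strip().lower() == "true" and is_blobstore_path(table_location, conf):
        return scratch_dir
    return table_location.rstrip("/") + "/.hive-staging"


if __name__ == "__main__":
    print(choose_staging_dir("s3a://bucket/warehouse/t", EXAMPLE_CONF))
    print(choose_staging_dir("hdfs:///warehouse/t", EXAMPLE_CONF))
```

With the example configuration, a table on s3a:// is staged in the scratch directory, while a table on hdfs:// keeps its staging directory under the table location.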