Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/02/21 12:57:00 UTC

[jira] [Commented] (HADOOP-16875) S3Guard: add support for other MetadataStores

    [ https://issues.apache.org/jira/browse/HADOOP-16875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041835#comment-17041835 ] 

Steve Loughran commented on HADOOP-16875:
-----------------------------------------

It's expensive if you buy provisioned IO capacity, but DDB has a pay-as-you-go option which costs a lot less, especially when idle.

Can you check out hadoop trunk and benchmark your costs there?

* we support pay-as-you-go tables at creation time
* we do a lot of tracking of which paths exist during the bulk operations
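
For anyone wanting to try the pay-as-you-go option, here is a rough sketch of the settings involved, rendered as core-site.xml properties. It assumes (please check against the S3Guard docs for the build you test) that a read and write capacity of 0 asks for a per-request table when S3Guard creates it. The table name and the two helper functions are made up for illustration; only the `fs.s3a.*` keys and the store class name come from S3Guard.

```python
# Sketch only: render S3Guard settings as core-site.xml <property> entries.
# The helper names and the table name are hypothetical.
import xml.etree.ElementTree as ET


def s3guard_properties(table_name):
    """Return S3Guard settings for a DynamoDB table created on first use."""
    return {
        "fs.s3a.metadatastore.impl":
            "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
        "fs.s3a.s3guard.ddb.table": table_name,
        "fs.s3a.s3guard.ddb.table.create": "true",
        # Assumption: 0 for both capacities requests pay-per-request billing
        # at table creation time, rather than provisioned capacity.
        "fs.s3a.s3guard.ddb.table.capacity.read": "0",
        "fs.s3a.s3guard.ddb.table.capacity.write": "0",
    }


def to_core_site_xml(props):
    """Serialize a dict of settings as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_core_site_xml(s3guard_properties("example-s3guard-table")))
```

The capacity values only matter when the table is created; an existing provisioned table would keep its billing mode.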

The Hive team still complains a lot that there is too much IO... I'd argue the Hive code needs to look at what it is doing just as much as we need to look at ours.

Finally, yes, there's support for different back ends if someone wanted to implement one. We already have a local one for testing alongside the DDB one. 

> S3Guard: add support for other MetadataStores
> ---------------------------------------------
>
>                 Key: HADOOP-16875
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16875
>             Project: Hadoop Common
>          Issue Type: Wish
>    Affects Versions: 3.2.1
>            Reporter: Rafael Acevedo
>            Priority: Major
>
> Hi all,
>  
> Are there any plans to add other MetadataStore implementations for S3Guard? DynamoDB costs are too high when the read capacity/write capacity are high.
>  
> Maybe a Postgres/MySQL implementation is simple enough to implement and offer strong consistency.
> Another idea is to implement a Cassandra/Scylla MetadataStore(for better write scalability), but we should pay attention to consistency.
>  
> Any thoughts?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org