Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/21 01:00:22 UTC

[GitHub] [iceberg] yyanyy commented on a change in pull request #2831: Doc: add documentation for JDBC and DynamoDB catalogs

yyanyy commented on a change in pull request #2831:
URL: https://github.com/apache/iceberg/pull/2831#discussion_r673585442



##########
File path: site/docs/aws.md
##########
@@ -238,6 +242,59 @@ LOCATION 's3://my-special-table-bucket'
 PARTITIONED BY (category);
 ```
 
+### DynamoDB Catalog
+
+Iceberg supports using a [DynamoDB](https://aws.amazon.com/dynamodb) table to record and manage database and table information.
+
+#### Configurations
+
+The DynamoDB catalog supports the following configurations:
+
+| Property                          | Default                                            | Description                                            |
+| --------------------------------- | -------------------------------------------------- | ------------------------------------------------------ |
+| dynamodb.table-name               | iceberg                                            | name of the DynamoDB table used by DynamoDbCatalog     |
+
+
+#### Internal Table Design
+
+The DynamoDB table is designed with the following columns:
+
+| Column            | Key             | Type        | Description                                                          |
+| ----------------- | --------------- | ----------- |--------------------------------------------------------------------- |
+| identifier        | partition key   | string      | table identifier such as `db1.table1`, or `NAMESPACE` for namespaces |
+| namespace         | sort key        | string      | namespace name. A [global secondary index (GSI)](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html) is created with namespace as partition key, identifier as sort key, no other projected columns |
+| v                 |                 | string      | row version, used for optimistic locking |
+| updated_at        |                 | number      | timestamp (millis) of the last update | 
+| created_at        |                 | number      | timestamp (millis) of the table creation |
+| p.<property_key\> |                 | string      | Iceberg-defined table properties including `table_type`, `metadata_location` and `previous_metadata_location` or namespace properties
+
+This design has the following benefits:
+
+1. table name is used directly as partition key to avoid any potential [hot partition issue](https://aws.amazon.com/premiumsupport/knowledge-center/dynamodb-table-throttled/), comparing to use namespace as partition key and table name as sort key

Review comment:
       If my understanding from the comment above is correct, I think "comparing to use namespace as partition key and table name as sort key" here may imply that the identifier always includes the table name. What about something like "avoid a potential hot partition issue if there is heavy write traffic to tables within the same namespace, since users can configure the key to be on table level"?
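
       To make the hot partition point concrete, here is a minimal sketch in plain Python. It makes no DynamoDB calls, the namespace and table names are made up, and it only counts items per partition key value (physical partitioning in DynamoDB is more involved), so treat it as an illustration rather than a description of how DynamoDbCatalog behaves:

```python
from collections import Counter

# Hypothetical catalog entries as (namespace, table name); the names are made up.
TABLES = [("db1", "table1"), ("db1", "table2"), ("db1", "table3"), ("db2", "table1")]


def partition_keys_by_identifier(tables):
    """Layout from the doc: partition key = full identifier, e.g. 'db1.table1'."""
    return Counter(f"{ns}.{name}" for ns, name in tables)


def partition_keys_by_namespace(tables):
    """Alternative layout: partition key = namespace, sort key = table name."""
    return Counter(ns for ns, _ in tables)


by_identifier = partition_keys_by_identifier(TABLES)
by_namespace = partition_keys_by_namespace(TABLES)

# Largest number of items that share a single partition key value in each layout.
max_per_identifier_key = max(by_identifier.values())
max_per_namespace_key = max(by_namespace.values())
print(max_per_identifier_key, max_per_namespace_key)
```

       With the namespace as partition key, every table in `db1` lands on the same key value, so heavy write traffic to that namespace concentrates there; with the identifier as partition key, each table gets its own key value.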

##########
File path: site/docs/aws.md
##########
@@ -238,6 +242,59 @@ LOCATION 's3://my-special-table-bucket'
 PARTITIONED BY (category);
 ```
 
+### DynamoDB Catalog
+
+Iceberg supports using a [DynamoDB](https://aws.amazon.com/dynamodb) table to record and manage database and table information.
+
+#### Configurations
+
+The DynamoDB catalog supports the following configurations:
+
+| Property                          | Default                                            | Description                                            |
+| --------------------------------- | -------------------------------------------------- | ------------------------------------------------------ |
+| dynamodb.table-name               | iceberg                                            | name of the DynamoDB table used by DynamoDbCatalog     |
+
+
+#### Internal Table Design
+
+The DynamoDB table is designed with the following columns:
+
+| Column            | Key             | Type        | Description                                                          |
+| ----------------- | --------------- | ----------- |--------------------------------------------------------------------- |
+| identifier        | partition key   | string      | table identifier such as `db1.table1`, or `NAMESPACE` for namespaces |

Review comment:
       Sorry, what do you mean by "or `NAMESPACE` for namespaces"? Can this `identifier` be just the namespace, for sharing a single table with all tables within the namespace? And for the namespace case, is it always this exact "NAMESPACE" string, or is this just an example?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


