Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2023/01/10 13:44:31 UTC

[GitHub] [airflow] hyangminj opened a new issue, #28830: DynamoDB

hyangminj opened a new issue, #28830:
URL: https://github.com/apache/airflow/issues/28830

   ### Description
   
   Airflow provides the Amazon DynamoDB to Amazon S3 transfer operator documented below.  
   https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/operators/transfer/dynamodb_to_s3.html
   
   Most data engineers build their "export DDB data to S3" pipelines by exporting from within the point-in-time recovery (PITR) window.  
   https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.export_table_to_point_in_time
   
   I would appreciate it if Airflow supported this natively.  
   
   ### Use case/motivation
   
   My daily batch job exports its data with the PITR option. All of its tasks are written with apache-airflow-providers-amazon except the "export_table_to_point_in_time" task, 
   which uses only the Python operator. I would like to implement that task with the apache-airflow-providers-amazon library as well. 
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] utkarsharma2 commented on issue #28830: Export DynamoDB table to S3 with PITR

Posted by "utkarsharma2 (via GitHub)" <gi...@apache.org>.
utkarsharma2 commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1498441140

   @eladkal @Taragolis I would like to work on this.




[GitHub] [airflow] hyangminj commented on issue #28830: Export DynamoDB table to S3 with PITR

Posted by GitBox <gi...@apache.org>.
hyangminj commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1380375817

   Sure, I will create separate PRs for operator and sensors. 




[GitHub] [airflow] potiuk closed issue #28830: Export DynamoDB table to S3 with PITR

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #28830: Export DynamoDB table to S3 with PITR
URL: https://github.com/apache/airflow/issues/28830




[GitHub] [airflow] hyangminj commented on issue #28830: Export DynamoDB table to S3 with PITR

Posted by GitBox <gi...@apache.org>.
hyangminj commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1378813813

   Your code and explanation would be best practice.  




[GitHub] [airflow] boring-cyborg[bot] commented on issue #28830: DynamoDB

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1377307061

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] Taragolis commented on issue #28830: export DDB table to s3 with pitr

Posted by GitBox <gi...@apache.org>.
Taragolis commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1377370315

   Feel free to make a PR. 




[GitHub] [airflow] potiuk commented on issue #28830: Export DynamoDB table to S3 with PITR

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1498673419

   Feel free




[GitHub] [airflow] hyangminj commented on issue #28830: Export DynamoDB table to S3 with PITR

Posted by GitBox <gi...@apache.org>.
hyangminj commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1378766119

   Thank you for your information. 
   
   However, the export job is not the final task in some data pipelines. It might be the first step of a DAG that analyzes the data. 
   So a sensor stage that waits for the export to finish might be needed. 
   For example, if you create an EMR job using [Create the Job Flow](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/operators/emr.html#create-the-job-flow), you need the "Wait on an Amazon EMR job flow state" or "Wait on an Amazon EMR step state" sensors. 
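
To make the waiting step concrete, here is a minimal sketch of what such a sensor could poll. It assumes a boto3 DynamoDB client and its `describe_export` call; the function name `wait_for_export` and the `poke_interval`, `timeout`, and `sleep` parameters are hypothetical names chosen for illustration, not part of the Amazon provider.

```python
import time


def wait_for_export(client, export_arn, poke_interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll describe_export until the export completes, fails, or times out.

    `client` is assumed to be a boto3 DynamoDB client. The function name and
    the polling parameters are hypothetical, for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        description = client.describe_export(ExportArn=export_arn)["ExportDescription"]
        status = description["ExportStatus"]  # IN_PROGRESS, COMPLETED, or FAILED
        if status == "COMPLETED":
            return description
        if status == "FAILED":
            raise RuntimeError(
                f"Export {export_arn} failed: {description.get('FailureMessage')}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Export {export_arn} still {status} after {timeout}s")
        # Wait before asking again, like a sensor's poke interval.
        sleep(poke_interval)
```

The export ARN would come from `ExportDescription.ExportArn` in the response of `export_table_to_point_in_time`. A real Airflow sensor would likely move the status check into a `poke` method rather than loop itself.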
   
   Anyway, I will prepare a PR for this feature and ask questions if needed. 




[GitHub] [airflow] Taragolis commented on issue #28830: Export DynamoDB table to S3 with PITR

Posted by GitBox <gi...@apache.org>.
Taragolis commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1379058219

   You could create multiple PRs one by one, each implementing a separate part: one for the operator, one for the sensors. That is handy, especially if this is your first contribution: less time for review, less time for change requests.
    




[GitHub] [airflow] Taragolis commented on issue #28830: export DDB table to s3 with pitr

Posted by GitBox <gi...@apache.org>.
Taragolis commented on issue #28830:
URL: https://github.com/apache/airflow/issues/28830#issuecomment-1377388529

   BTW, the hooks provided by the Amazon provider are basically just wrappers around `boto3` clients / resources.
   So you can always access all `boto3` client methods through the hook.
   
   ```python
   from datetime import datetime

   from airflow.providers.amazon.aws.hooks.dynamodb import DynamoDBHook

   hook = DynamoDBHook(aws_conn_id="awesome-connection-id", region_name="us-east-1")
   # DynamoDBHook creates a `resource` as its high-level client,
   # but you can reach the regular client through `meta.client`.
   # This could potentially become uniform in the future, see: https://github.com/apache/airflow/discussions/28560
   client = hook.conn.meta.client
   # The 'string' values below are placeholders; replace them with real values.
   client.export_table_to_point_in_time(
       TableArn='string',
       ExportTime=datetime(2015, 1, 1),
       ClientToken='string',
       S3Bucket='string',
       S3BucketOwner='string',
       S3Prefix='string',
       S3SseAlgorithm='AES256',  # or 'KMS'
       S3SseKmsKeyId='string',  # KMS key ID, if applicable
       ExportFormat='DYNAMODB_JSON',  # or 'ION'
   )
   ```

