You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "trueleo (via GitHub)" <gi...@apache.org> on 2023/03/16 09:52:07 UTC

[GitHub] [arrow-rs] trueleo opened a new pull request, #3873: Add support for checksum algorithms in AWS

trueleo opened a new pull request, #3873:
URL: https://github.com/apache/arrow-rs/pull/3873

   # Which issue does this PR close?
   Closes #3725 
   
   # Rationale for this change
   
   
   # What changes are included in this PR?
   
   * Add with_checksum_algorithm option in AmazonS3Builder
   * Add Checksum enum 
   * Request Signer can take optional payload sha256 checksum to reuse for content-sha256.
   
   # Are there any user-facing changes?
   
   * Added with_checksum_algorithm option in AmazonS3Builder.
   * Add Checksum enum 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] trueleo commented on a diff in pull request #3873: Add support for checksum algorithms in AWS

Posted by "trueleo (via GitHub)" <gi...@apache.org>.
trueleo commented on code in PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#discussion_r1143191113


##########
object_store/src/aws/client.rs:
##########
@@ -281,10 +286,22 @@ impl S3Client {
     ) -> Result<Response> {
         let credential = self.get_credential().await?;
         let url = self.config.path_url(path);
-
         let mut builder = self.client.request(Method::PUT, url);
+
+        let mut payload_checksum = None;
+
         if let Some(bytes) = bytes {
-            builder = builder.body(bytes)
+            payload_checksum = match self.config().checksum {
+                Some(checksum) => {
+                    let digest = checksum.digest(&bytes);

Review Comment:
   Yeah will fix it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3873: Add support for checksum algorithms in AWS

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#discussion_r1143190371


##########
object_store/src/aws/client.rs:
##########
@@ -281,10 +286,22 @@ impl S3Client {
     ) -> Result<Response> {
         let credential = self.get_credential().await?;
         let url = self.config.path_url(path);
-
         let mut builder = self.client.request(Method::PUT, url);
+
+        let mut payload_checksum = None;
+
         if let Some(bytes) = bytes {
-            builder = builder.body(bytes)
+            payload_checksum = match self.config().checksum {
+                Some(checksum) => {
+                    let digest = checksum.digest(&bytes);

Review Comment:
   I think this is only correct if the checksum is a sha256 digest?



##########
object_store/src/aws/client.rs:
##########
@@ -281,10 +286,22 @@ impl S3Client {
     ) -> Result<Response> {
         let credential = self.get_credential().await?;
         let url = self.config.path_url(path);
-
         let mut builder = self.client.request(Method::PUT, url);
+
+        let mut payload_checksum = None;
+
         if let Some(bytes) = bytes {
-            builder = builder.body(bytes)
+            payload_checksum = match self.config().checksum {
+                Some(checksum) => {
+                    let digest = checksum.digest(&bytes);

Review Comment:
   I think passing this to the request signing is only correct if the checksum is a sha256 digest?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] trueleo commented on pull request #3873: Add support for checksum algorithms in AWS

Posted by "trueleo (via GitHub)" <gi...@apache.org>.
trueleo commented on PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#issuecomment-1477612411

   @tustvold Sorry, I couldn't find time to get back to it. I need to add optional dependencies for crc32 and crc32c. Any suggestions 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on pull request #3873: Add support for checksum algorithms in AWS

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#issuecomment-1477953417

   Thank you


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on pull request #3873: Add support for checksum algorithms in AWS

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#issuecomment-1477619407

   > I need to add optional dependencies for crc32 and crc32c.
   
   Perhaps we could reduce the scope initially and only support sha256 digests, as these already need to be computed for request signing anyway? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] trueleo commented on pull request #3873: Add support for checksum algorithms in AWS

Posted by "trueleo (via GitHub)" <gi...@apache.org>.
trueleo commented on PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#issuecomment-1477623009

   > Perhaps we could reduce the scope initially and only support sha256 digests
   
   @tustvold Fine by me. I'll change the Checksum enum to only have Sha1 and Sha256 as those are already provided by ring
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on pull request #3873: Add support for checksum algorithms in AWS

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#issuecomment-1477610026

   What is this waiting on?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] trueleo commented on pull request #3873: Add support for checksum algorithms in AWS

Posted by "trueleo (via GitHub)" <gi...@apache.org>.
trueleo commented on PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873#issuecomment-1477854688

   > To be honest I wonder if we could just always include them, given they have been computed
   
   @tustvold ? Like setting `x-amz-checksum-sha256` when unsigned payload option is not set ?
   I don't know if that's a good idea. I too find existence of x-amz-content-sha256 and x-amz-checksum-sha256 weird but it's better to keep behaviour similar if we are going to support other algorithms.
   
   >  I'll change the Checksum enum to only have Sha1 and Sha256 
   
   Removed SHA1 as well for now. It's deprecated. ref https://github.com/briansmith/ring/issues/754 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold merged pull request #3873: Add support for checksum algorithms in AWS

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold merged PR #3873:
URL: https://github.com/apache/arrow-rs/pull/3873


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org