You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "wjones127 (via GitHub)" <gi...@apache.org> on 2023/12/11 21:14:44 UTC

[PR] feat(object_store): use http1 by default [arrow-rs]

wjones127 opened a new pull request, #5204:
URL: https://github.com/apache/arrow-rs/pull/5204

   # Which issue does this PR close?
   
   Closes #5194.
   
   # Rationale for this change
    
   This makes object store performance less surprising, by making the defaults the more performant options.
   
   # What changes are included in this PR?
   
   Makes `HTTP1_ONLY` the default mode.
   
   # Are there any user-facing changes?
   
   This is a breaking change to defaults. However, user code generally will not need to change.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat(object_store): use http1 by default [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #5204:
URL: https://github.com/apache/arrow-rs/pull/5204#discussion_r1423290834


##########
object_store/src/client/mod.rs:
##########
@@ -350,17 +353,28 @@ impl ClientOptions {
     }
 
     /// Only use http1 connections
+    ///
+    /// This is on by default, since http2 is known to be significantly slower than http1.
     pub fn with_http1_only(mut self) -> Self {
+        self.http2_only = false.into();
         self.http1_only = true.into();
         self
     }
 
     /// Only use http2 connections
     pub fn with_http2_only(mut self) -> Self {
+        self.http1_only = false.into();
         self.http2_only = true.into();
         self
     }
 
+    /// Use http2 if supported, otherwise use http1.
+    pub fn with_either_http1_http2(mut self) -> Self {

Review Comment:
   ```suggestion
       pub fn with_http2(mut self) -> Self {
   ```
   Perhaps?
   
   Would also be good to add an option for this



##########
object_store/src/gcp/mod.rs:
##########
@@ -29,6 +29,13 @@
 //! to abort the upload and drop those unneeded parts. In addition, you may wish to
 //! consider implementing automatic clean up of unused parts that are older than one
 //! week.
+//!
+//! ## Using HTTP/2
+//!
+//! Google Cloud Storage supports both HTTP/2 and HTTP/1. HTTP/1 is used by default
+//! because it allows much higher throughput in our benchmarks (see
+//! [#5194](https://github.com/apache/arrow-rs/issues/5194)). HTTP/2 can be
+//! enabled by setting [crate::ClientConfigKey::Http2Only] to true.

Review Comment:
   Perhaps could add a key for HTTP2 to allow auto-upgrade if supported?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat(object_store): use http1 by default [arrow-rs]

Posted by "wjones127 (via GitHub)" <gi...@apache.org>.
wjones127 merged PR #5204:
URL: https://github.com/apache/arrow-rs/pull/5204


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] feat(object_store): use http1 by default [arrow-rs]

Posted by "wjones127 (via GitHub)" <gi...@apache.org>.
wjones127 commented on code in PR #5204:
URL: https://github.com/apache/arrow-rs/pull/5204#discussion_r1424287651


##########
object_store/src/gcp/mod.rs:
##########
@@ -29,6 +29,13 @@
 //! to abort the upload and drop those unneeded parts. In addition, you may wish to
 //! consider implementing automatic clean up of unused parts that are older than one
 //! week.
+//!
+//! ## Using HTTP/2
+//!
+//! Google Cloud Storage supports both HTTP/2 and HTTP/1. HTTP/1 is used by default
+//! because it allows much higher throughput in our benchmarks (see
+//! [#5194](https://github.com/apache/arrow-rs/issues/5194)). HTTP/2 can be
+//! enabled by setting [crate::ClientConfigKey::Http2Only] to true.

Review Comment:
   I realized the easier advice is just to set `Http1Only` to false.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org