Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/04/23 16:12:42 UTC

[GitHub] [arrow] lidavidm commented on a change in pull request #7020: ARROW-8562: [C++] Parameterize I/O coalescing using s3 storage metrics

lidavidm commented on a change in pull request #7020:
URL: https://github.com/apache/arrow/pull/7020#discussion_r413923773



##########
File path: cpp/src/arrow/io/caching.h
##########
@@ -27,6 +27,44 @@
 
 namespace arrow {
 namespace io {
+
+struct ARROW_EXPORT CacheOptions {
+  static constexpr double kDefaultIdealBandwidthUtilizationFrac = 0.9;
+  static constexpr int64_t kDefaultMaxIdealRequestSizeMib = 64;
+
+  /// \brief The maximum distance in bytes between two consecutive
+  ///   ranges; beyond this value, ranges are not combined
+  int64_t hole_size_limit;
+  /// \brief The maximum size in bytes of a combined range; if
+  ///   combining two consecutive ranges would produce a range of a
+  ///   size greater than this, they are not combined
+  int64_t range_size_limit;
+
+  bool operator==(const CacheOptions& other) const {
+    return hole_size_limit == other.hole_size_limit &&
+           range_size_limit == other.range_size_limit;
+  }
+
+  /// \brief Construct CacheOptions from S3 storage metrics.
+  ///
+  /// \param[in] time_to_first_byte_millis Seek time or Time-To-First-Byte (TTFB) in
+  ///   milliseconds, i.e. the call setup latency of a new S3 request.
+  ///   The value is a positive integer.
+  /// \param[in] transfer_bandwidth_mib_per_sec Data transfer Bandwidth (BW) in MiB/sec.
+  ///   The value is a positive integer.
+  /// \param[in] ideal_bandwidth_utilization_frac Transfer bandwidth utilization fraction
+  ///   (per connection) to maximize the net data load.
+  ///   The value is a positive double-precision number less than or equal to 1.
+  /// \param[in] max_ideal_request_size_mib The maximum single data request size (in MiB)
+  ///   to maximize the net data load.
+  ///   The value is a positive integer.
+  /// \return A new instance of CacheOptions.
+  static CacheOptions MakeFromS3Metrics(

Review comment:
       This might be better named in terms of network metrics rather than S3 specifically; it'd be useful for the GCS filesystem that's in progress, for instance.
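For illustration, a minimal sketch of what the suggested rename could look like. The name `MakeFromNetworkMetrics` is hypothetical, and the default arguments are an assumption inferred from the `kDefault*` constants above; neither is part of the quoted diff:

```cpp
/// \brief Construct CacheOptions from metrics of any network-attached
///   storage (S3, GCS, ...); parameters are the same as MakeFromS3Metrics.
static CacheOptions MakeFromNetworkMetrics(
    int64_t time_to_first_byte_millis, int64_t transfer_bandwidth_mib_per_sec,
    double ideal_bandwidth_utilization_frac = kDefaultIdealBandwidthUtilizationFrac,
    int64_t max_ideal_request_size_mib = kDefaultMaxIdealRequestSizeMib);
```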

##########
File path: cpp/src/arrow/io/caching.cc
##########
@@ -27,6 +28,91 @@
 
 namespace arrow {
 namespace io {
+
+CacheOptions CacheOptions::MakeFromS3Metrics(
+    const int64_t time_to_first_byte_millis, const int64_t transfer_bandwidth_mib_per_sec,
+    const double ideal_bandwidth_utilization_frac,
+    const int64_t max_ideal_request_size_mib) {
+  //
+  // The I/O coalescing algorithm uses two parameters:
+  //   1. hole_size_limit (a.k.a. max_io_gap): maximum I/O gap/hole size in bytes
+  //   2. range_size_limit (a.k.a. ideal_request_size): ideal I/O request size in bytes
+  //
+  // These parameters can be derived from S3 metrics as described below:
+  //
+  // In an S3 compatible storage, there are two main metrics:
+  //   1. Seek time or Time-To-First-Byte (TTFB) in seconds: the call setup latency
+  //      of a new S3 request
+  //   2. Transfer Bandwidth (BW) for data in bytes/sec
+  //
+  // 1. Computing hole_size_limit:
+  //
+  //   hole_size_limit = TTFB * BW
+  //
+  //   This is also known as the bandwidth-delay product (BDP).
+  //   Two byte ranges that have a gap can still be mapped to the same read
+  //   if the gap is less than the bandwidth-delay product [TTFB * TransferBandwidth],
+  //   i.e. if the Time-To-First-Byte (the call setup latency of a new S3 request)
+  //   is expected to be greater than the time to just read and discard the extra
+  //   bytes on an existing HTTP request.
+  //
+  // 2. Computing range_size_limit:
+  //
+  //   We want high bandwidth utilization per S3 connection,
+  //   i.e. to transfer large amounts of data to amortize the seek overhead.
+  //   But we also want to leverage parallelism by slicing very large I/O chunks.
+  //   We define two more config parameters, with suggested default values, to
+  //   control the slice size and balance these two competing effects, with the
+  //   goal of maximizing net data load performance.
+  //
+  //   BW_util_frac (ideal bandwidth utilization): Transfer bandwidth utilization fraction
+  //     (per connection) to maximize the net data load. 90% is a good default number for
+  //     an effective transfer bandwidth.
+  //
+  //   MAX_IDEAL_REQUEST_SIZE: The maximum single data request size (in MiB) to maximize
+  //     the net data load. 64 MiB is a good default number for the ideal request size.
+  //
+  //   The amount of data that needs to be transferred in a single S3 get_object
+  //   request to achieve effective bandwidth eff_BW = BW_util_frac * BW is as follows:
+  //     eff_BW = range_size_limit / (TTFB + range_size_limit / BW)
+  //
+  //   Substituting TTFB = hole_size_limit / BW and eff_BW = BW_util_frac * BW gives
+  //     BW_util_frac * BW = range_size_limit * BW / (hole_size_limit + range_size_limit)
+  //   and solving for range_size_limit yields:
+  //     range_size_limit = hole_size_limit * BW_util_frac / (1 - BW_util_frac)
+  //
+  //   Applying the MAX_IDEAL_REQUEST_SIZE, we get the following:
+  //     range_size_limit = min(MAX_IDEAL_REQUEST_SIZE,
+  //                            hole_size_limit * BW_util_frac / (1 - BW_util_frac))
+  //
+  DCHECK_GT(time_to_first_byte_millis, 0) << "TTFB must be > 0";
+  DCHECK_GT(transfer_bandwidth_mib_per_sec, 0) << "Transfer bandwidth must be > 0";
+  DCHECK_GT(ideal_bandwidth_utilization_frac, 0)
+      << "Ideal bandwidth utilization fraction must be > 0";
+  DCHECK_LE(ideal_bandwidth_utilization_frac, 1.0)
+      << "Ideal bandwidth utilization fraction must be <= 1";
+  DCHECK_GT(max_ideal_request_size_mib, 0) << "Max ideal request size must be > 0";
+
+  const double time_to_first_byte_sec = time_to_first_byte_millis / 1000.0;
+  const int64_t transfer_bandwidth_bytes_per_sec =
+      transfer_bandwidth_mib_per_sec * 1024 * 1024;
+  const int64_t max_ideal_request_size_bytes = max_ideal_request_size_mib * 1024 * 1024;
+
+  // hole_size_limit = TTFB * BW
+  const int64_t hole_size_limit = static_cast<int64_t>(
+      std::round(time_to_first_byte_sec * transfer_bandwidth_bytes_per_sec));
+  DCHECK_GT(hole_size_limit, 0) << "Computed hole_size_limit must be > 0";

Review comment:
       Given these checks, it might be good to have the method return `Result<CacheOptions>`, and convert the DCHECKs into returning `Status::Invalid`. That way it won't crash (or silently pass) if the user passes in invalid options.
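A minimal sketch of that conversion, assuming Arrow's `arrow::Result<T>` and the variadic `Status::Invalid(...)` factory (both part of Arrow's C++ API); the limit computations follow the formulas in the quoted diff, with a hypothetical guard added for a utilization fraction of exactly 1:

```cpp
#include <algorithm>
#include <cmath>

#include "arrow/result.h"
#include "arrow/status.h"

namespace arrow {
namespace io {

Result<CacheOptions> CacheOptions::MakeFromS3Metrics(
    const int64_t time_to_first_byte_millis,
    const int64_t transfer_bandwidth_mib_per_sec,
    const double ideal_bandwidth_utilization_frac,
    const int64_t max_ideal_request_size_mib) {
  // Reject invalid input with a Status instead of crashing in debug builds
  // (DCHECK) or silently passing in release builds.
  if (time_to_first_byte_millis <= 0) {
    return Status::Invalid("TTFB must be > 0, got ", time_to_first_byte_millis);
  }
  if (transfer_bandwidth_mib_per_sec <= 0) {
    return Status::Invalid("Transfer bandwidth must be > 0, got ",
                           transfer_bandwidth_mib_per_sec);
  }
  if (ideal_bandwidth_utilization_frac <= 0 ||
      ideal_bandwidth_utilization_frac > 1.0) {
    return Status::Invalid(
        "Ideal bandwidth utilization fraction must be in (0, 1], got ",
        ideal_bandwidth_utilization_frac);
  }
  if (max_ideal_request_size_mib <= 0) {
    return Status::Invalid("Max ideal request size must be > 0, got ",
                           max_ideal_request_size_mib);
  }

  // hole_size_limit = TTFB * BW (the bandwidth-delay product), as in the diff.
  const double time_to_first_byte_sec = time_to_first_byte_millis / 1000.0;
  const int64_t transfer_bandwidth_bytes_per_sec =
      transfer_bandwidth_mib_per_sec * 1024 * 1024;
  const int64_t hole_size_limit = static_cast<int64_t>(
      std::round(time_to_first_byte_sec * transfer_bandwidth_bytes_per_sec));
  if (hole_size_limit <= 0) {
    return Status::Invalid("Computed hole_size_limit must be > 0, got ",
                           hole_size_limit);
  }

  // range_size_limit = min(MAX_IDEAL_REQUEST_SIZE,
  //                        hole_size_limit * frac / (1 - frac));
  // guard against frac == 1, where the uncapped value would be unbounded.
  const int64_t max_ideal_request_size_bytes =
      max_ideal_request_size_mib * 1024 * 1024;
  int64_t range_size_limit = max_ideal_request_size_bytes;
  if (ideal_bandwidth_utilization_frac < 1.0) {
    range_size_limit = std::min(
        max_ideal_request_size_bytes,
        static_cast<int64_t>(std::round(
            hole_size_limit * ideal_bandwidth_utilization_frac /
            (1.0 - ideal_bandwidth_utilization_frac))));
  }

  return CacheOptions{hole_size_limit, range_size_limit};
}

}  // namespace io
}  // namespace arrow
```

Callers would then unwrap the result with Arrow's usual idiom, e.g. `ARROW_ASSIGN_OR_RAISE(auto options, CacheOptions::MakeFromS3Metrics(...))`.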


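As a worked example of the two formulas above, using the signature from the quoted diff and purely illustrative metric values:

```cpp
// Hypothetical metrics: TTFB = 100 ms, BW = 100 MiB/s,
// BW_util_frac = 0.9, MAX_IDEAL_REQUEST_SIZE = 64 MiB.
auto options = arrow::io::CacheOptions::MakeFromS3Metrics(100, 100, 0.9, 64);
// hole_size_limit  = TTFB * BW = 0.1 s * 100 MiB/s = 10 MiB (10485760 bytes)
// range_size_limit = min(64 MiB, 10 MiB * 0.9 / (1 - 0.9)) = min(64 MiB, 90 MiB)
//                  = 64 MiB (67108864 bytes)
```

Note how the utilization fraction dominates: at 0.9 the uncapped ideal request is nine times the bandwidth-delay product, so the 64 MiB cap is what actually limits the request size here.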


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org