Posted to commits@hudi.apache.org by "Prashant Wason (Jira)" <ji...@apache.org> on 2021/01/26 23:24:00 UTC
[jira] [Created] (HUDI-1553) Add configs for TimelineServer to configure Jetty
Prashant Wason created HUDI-1553:
------------------------------------
Summary: Add configs for TimelineServer to configure Jetty
Key: HUDI-1553
URL: https://issues.apache.org/jira/browse/HUDI-1553
Project: Apache Hudi
Issue Type: Improvement
Reporter: Prashant Wason
Assignee: Prashant Wason
TimelineServer uses Javalin, which is built on Jetty.
By default, Jetty:
* Has 200 threads
* Compresses output with gzip
* Handles each request sequentially
On a large-scale Hudi dataset (2000 partitions), operations slow down when TimelineServer is enabled, for the following reasons:
# The driver process usually has only a few cores. 200 Jetty threads cause heavy contention when hundreds of executors connect to the server in parallel.
# To handle a large number of requests in parallel, it is better to handle each HTTP request asynchronously using Futures, which Javalin supports.
# The compute overhead of gzip compression may be unnecessary when the executors and the driver are in the same rack or the same datacenter.
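As a rough illustration of the first two points, the sketch below uses only java.util.concurrent. It is not TimelineServer, Javalin, or Jetty code. The class name, the boundedThreads sizing rule, and the handle method are all hypothetical, chosen only to show a pool sized relative to the driver's cores and requests completed through futures. It does not cover the gzip setting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; none of these names come from Hudi, Javalin, or Jetty.
class TimelineServerSketch {

    // Assumed sizing rule: a small multiple of the driver's cores,
    // never above a configured maximum and never below one thread.
    static int boundedThreads(int cores, int threadsPerCore, int maxThreads) {
        return Math.max(1, Math.min(cores * threadsPerCore, maxThreads));
    }

    // Stand-in for the work done to answer one request.
    static String handle(String partition) {
        return "file-slices:" + partition;
    }

    // Submits every request as a future on a bounded pool, then collects
    // the results in submission order.
    static List<String> serveAll(List<String> partitions, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String partition : partitions) {
                futures.add(CompletableFuture.supplyAsync(() -> handle(partition), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                results.add(future.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = boundedThreads(cores, 2, 200);
        System.out.println(serveAll(Arrays.asList("2021/01/25", "2021/01/26"), threads));
    }
}
```

In a real server the futures would be handed back to the HTTP framework rather than joined in a loop. The join here only keeps the example self-contained.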
--
This message was sent by Atlassian Jira
(v8.3.4#803005)