Posted to issues@kudu.apache.org by "Juan Yu (JIRA)" <ji...@apache.org> on 2017/05/25 18:40:04 UTC

[jira] [Created] (KUDU-2025) Upsert throughput is 10~20% slower than insert

Juan Yu created KUDU-2025:
-----------------------------

             Summary: Upsert throughput is 10~20% slower than insert
                 Key: KUDU-2025
                 URL: https://issues.apache.org/jira/browse/KUDU-2025
             Project: Kudu
          Issue Type: Bug
            Reporter: Juan Yu


According to the Kudu design, upsert should be faster than insert.
I ran some tests to compare upsert and insert performance.
I picked a few of the larger tables from TPC-DS (such as store_sales and catalog_sales). Each table is hash partitioned by its first 3 columns. The data is generated (so it shouldn't contain duplicate keys) and is in the 100 GB ~ 1 TB range. Each time, the data is ingested into a newly created table.
In general, the upsert throughput is 10~20% slower than insert according to CM metrics.
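
For clarity on how the percentage is meant, here is a minimal sketch of the comparison, assuming "slower" is measured relative to insert throughput. The function name and the numbers in the example call are hypothetical placeholders, not figures from the CM metrics or from these test runs.

```python
def relative_slowdown(insert_rows_per_sec: float, upsert_rows_per_sec: float) -> float:
    """Return how much slower upsert is than insert, as a fraction of insert throughput.

    A positive result means upsert is slower; a negative result means it is faster.
    """
    if insert_rows_per_sec <= 0:
        raise ValueError("insert throughput must be positive")
    return (insert_rows_per_sec - upsert_rows_per_sec) / insert_rows_per_sec


# Placeholder throughput values for illustration only (not measured data).
example = relative_slowdown(100_000.0, 85_000.0)
print(f"upsert is {example:.0%} slower than insert")
```

Under this definition, a value between 0.10 and 0.20 corresponds to the 10~20% range reported above.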



