Posted to issues@kudu.apache.org by "Juan Yu (JIRA)" <ji...@apache.org> on 2017/05/25 18:40:04 UTC
[jira] [Created] (KUDU-2025) Upsert throughput is 10~20% slower than insert
Juan Yu created KUDU-2025:
-----------------------------
Summary: Upsert throughput is 10~20% slower than insert
Key: KUDU-2025
URL: https://issues.apache.org/jira/browse/KUDU-2025
Project: Kudu
Issue Type: Bug
Reporter: Juan Yu
According to the Kudu design, upsert should be faster than insert.
I ran some tests to compare upsert and insert performance.
I picked a few of the larger tables from TPC-DS (such as store_sales and catalog_sales). Each table is hash partitioned on its first 3 columns. The data is generated (so it shouldn't contain duplicate keys) and is in the 100 GB ~ 1 TB range. Each time, the data is ingested into a newly created table.
In general, upsert throughput is 10~20% slower than insert according to CM metrics.
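For reference, a relative slowdown like the one quoted above can be derived from two throughput readings roughly as in the following minimal Python sketch. The function name and the sample figures are hypothetical, chosen only to illustrate the arithmetic. They are not values measured in these tests.

```python
def relative_slowdown(insert_rows_per_sec, upsert_rows_per_sec):
    """Return how much slower upsert is than insert, as a fraction of insert throughput."""
    if insert_rows_per_sec <= 0:
        raise ValueError("insert throughput must be positive")
    return (insert_rows_per_sec - upsert_rows_per_sec) / insert_rows_per_sec


# Hypothetical figures for illustration only (not measured values).
example = relative_slowdown(100_000.0, 85_000.0)
print(f"upsert is {example:.0%} slower than insert")
```

A negative result would mean upsert was the faster operation, which is what the design leads one to expect.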
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)