You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@trafodion.apache.org by "Liu, Yuan (Yuan)" <yu...@esgyn.cn> on 2018/06/06 07:32:30 UTC

Upsert on part of columns get traf_merge which cause the performance not good

Hi Trafodioneers,

In many cases, we find that "UPDATE" is not that efficient, so sometimes we rewrite the "UPDATE" as "UPSERT".

However, if we only write part of columns in upsert, the query plan will get "TRAF_MERGE", which is still not efficient.

Only if we write all the columns in upsert, the query plan can get good query plan, such as VSBB_UPSERT.

If table has hundrends of columns, this is very complex for users.

It is possible that upsert choose "TRAF_REPLACE" by default, at the same time, we keep the values which are not in upsert?


Best regards,
Yuan


RE: Upsert on part of columns get traf_merge which cause the performance not good

Posted by Selva Govindarajan <se...@esgyn.com>.
The row that is being updated could have the non-default column value for the default column.  The update may not choose this column. In that case, TRAF_REPLACE would blindly replace these columns with the default value. It becomes worse if the default column is a timestamp with current_timestamp as default.

Update can implicitly select and update the column value. This could be beneficial in case of non-wide row in aligned format. In case of wide row, hbase format tables are better for update as Ming suggested.

Selva

From: Liu, Ming (Ming) <mi...@esgyn.cn>
Sent: Wednesday, June 6, 2018 5:40 AM
To: user@trafodion.apache.org; dev@trafodion.apache.org
Subject: RE: Upsert on part of columns get traf_merge which cause the performance not good

hi, Yuan,

You can check a discussion on this at https://lists.apache.org/list.html?user@trafodion.apache.org:2016-03, search for 'upsert semantics' topic.
Default mode should be MERGE as discussed, it was decided that. Correct behavior is more important than peformance.
I think you can create Non-Aligned format table, if the table is very frequently updated, it will be slower for read queries though.

Others may have better answer here

thanks,
Ming

From: Liu, Yuan (Yuan) <yu...@esgyn.cn>>
Sent: Wednesday, June 06, 2018 3:33 PM
To: dev@trafodion.apache.org<ma...@trafodion.apache.org>; user@trafodion.apache.org<ma...@trafodion.apache.org>
Subject: Upsert on part of columns get traf_merge which cause the performance not good

Hi Trafodioneers,

In many cases, we find that "UPDATE" is not that efficient, so sometimes we rewrite the "UPDATE" as "UPSERT".

However, if we only write part of columns in upsert, the query plan will get "TRAF_MERGE", which is still not efficient.

Only if we write all the columns in upsert, the query plan can get good query plan, such as VSBB_UPSERT.

If table has hundrends of columns, this is very complex for users.

It is possible that upsert choose "TRAF_REPLACE" by default, at the same time, we keep the values which are not in upsert?


Best regards,
Yuan


RE: Upsert on part of columns get traf_merge which cause the performance not good

Posted by Selva Govindarajan <se...@esgyn.com>.
The row that is being updated could have the non-default column value for the default column.  The update may not choose this column. In that case, TRAF_REPLACE would blindly replace these columns with the default value. It becomes worse if the default column is a timestamp with current_timestamp as default.

Update can implicitly select and update the column value. This could be beneficial in case of non-wide row in aligned format. In case of wide row, hbase format tables are better for update as Ming suggested.

Selva

From: Liu, Ming (Ming) <mi...@esgyn.cn>
Sent: Wednesday, June 6, 2018 5:40 AM
To: user@trafodion.apache.org; dev@trafodion.apache.org
Subject: RE: Upsert on part of columns get traf_merge which cause the performance not good

hi, Yuan,

You can check a discussion on this at https://lists.apache.org/list.html?user@trafodion.apache.org:2016-03, search for 'upsert semantics' topic.
Default mode should be MERGE as discussed, it was decided that. Correct behavior is more important than peformance.
I think you can create Non-Aligned format table, if the table is very frequently updated, it will be slower for read queries though.

Others may have better answer here

thanks,
Ming

From: Liu, Yuan (Yuan) <yu...@esgyn.cn>>
Sent: Wednesday, June 06, 2018 3:33 PM
To: dev@trafodion.apache.org<ma...@trafodion.apache.org>; user@trafodion.apache.org<ma...@trafodion.apache.org>
Subject: Upsert on part of columns get traf_merge which cause the performance not good

Hi Trafodioneers,

In many cases, we find that "UPDATE" is not that efficient, so sometimes we rewrite the "UPDATE" as "UPSERT".

However, if we only write part of columns in upsert, the query plan will get "TRAF_MERGE", which is still not efficient.

Only if we write all the columns in upsert, the query plan can get good query plan, such as VSBB_UPSERT.

If table has hundrends of columns, this is very complex for users.

It is possible that upsert choose "TRAF_REPLACE" by default, at the same time, we keep the values which are not in upsert?


Best regards,
Yuan


RE: Upsert on part of columns get traf_merge which cause the performance not good

Posted by "Liu, Ming (Ming)" <mi...@esgyn.cn>.
hi, Yuan,

You can check a discussion on this at https://lists.apache.org/list.html?user@trafodion.apache.org:2016-03, search for 'upsert semantics' topic.
Default mode should be MERGE as discussed, it was decided that. Correct behavior is more important than peformance.
I think you can create Non-Aligned format table, if the table is very frequently updated, it will be slower for read queries though.

Others may have better answer here

thanks,
Ming

From: Liu, Yuan (Yuan) <yu...@esgyn.cn>
Sent: Wednesday, June 06, 2018 3:33 PM
To: dev@trafodion.apache.org; user@trafodion.apache.org
Subject: Upsert on part of columns get traf_merge which cause the performance not good

Hi Trafodioneers,

In many cases, we find that "UPDATE" is not that efficient, so sometimes we rewrite the "UPDATE" as "UPSERT".

However, if we only write part of columns in upsert, the query plan will get "TRAF_MERGE", which is still not efficient.

Only if we write all the columns in upsert, the query plan can get good query plan, such as VSBB_UPSERT.

If table has hundrends of columns, this is very complex for users.

It is possible that upsert choose "TRAF_REPLACE" by default, at the same time, we keep the values which are not in upsert?


Best regards,
Yuan


RE: Upsert on part of columns get traf_merge which cause the performance not good

Posted by "Liu, Ming (Ming)" <mi...@esgyn.cn>.
hi, Yuan,

You can check a discussion on this at https://lists.apache.org/list.html?user@trafodion.apache.org:2016-03, search for 'upsert semantics' topic.
Default mode should be MERGE as discussed, it was decided that. Correct behavior is more important than peformance.
I think you can create Non-Aligned format table, if the table is very frequently updated, it will be slower for read queries though.

Others may have better answer here

thanks,
Ming

From: Liu, Yuan (Yuan) <yu...@esgyn.cn>
Sent: Wednesday, June 06, 2018 3:33 PM
To: dev@trafodion.apache.org; user@trafodion.apache.org
Subject: Upsert on part of columns get traf_merge which cause the performance not good

Hi Trafodioneers,

In many cases, we find that "UPDATE" is not that efficient, so sometimes we rewrite the "UPDATE" as "UPSERT".

However, if we only write part of columns in upsert, the query plan will get "TRAF_MERGE", which is still not efficient.

Only if we write all the columns in upsert, the query plan can get good query plan, such as VSBB_UPSERT.

If table has hundrends of columns, this is very complex for users.

It is possible that upsert choose "TRAF_REPLACE" by default, at the same time, we keep the values which are not in upsert?


Best regards,
Yuan