You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by selvaganesang <gi...@git.apache.org> on 2016/03/12 19:04:22 UTC

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

GitHub user selvaganesang opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/380

    [TRAFODION-1887] Upsert into aligned format table with no primary key…

    …s fails with error 3241 after the patch from PR 340
    
    Also, added new test cases to test this sceanrio.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/selvaganesang/incubator-trafodion trafodion-1887

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/380.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #380
    
----
commit 6aed4d3db95a726d57a29f39a0383c2b1154e9d8
Author: selvaganesang <se...@esgyn.com>
Date:   2016-03-12T16:35:24Z

    [TRAFODION-1887] Upsert into aligned format table with no primary keys fails with error 3241 after the patch from PR 340
    
    Also, added new test cases to test this sceanrio.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56245070
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
    --- End diff --
    
    In fact, we always generate assignment for all columns at the time of upsert even if the column is omitted in both formats. We had to do this because these columns need to flow to IM operators. In case of non-aligned format, these columns are removed from put into hbase at the executor. But these columns would exist in the raw index table in hbase if we happen to have an index on the omitted column. See PR #340
    
     For the first change request: If we always assign the current timestamp, what happens if the row exists. Won't we replace it the current timestamp even when some other column is updated or upserted.
    
    For the second change request: When the CQD TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS  (aligned format) is set to ON, we just don't change it merge if there are no secondary indexes. 
    
    For non-aligned format, I need to recheck if the same CQD can be used. Though I had concluded we need to populate the default values for the non-aligned format earlier, there might be a way out of this. We can file a JIRA for it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56226976
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
     */
    -NABoolean Insert::isUpsertThatNeedsMerge() const
    +NABoolean Insert::isUpsertThatNeedsMerge(NABoolean isAlignedRowFormat, NABoolean omittedDefaultCols,
    +                                   NABoolean omittedCurrentDefaultClassCols) const
     {
    -  if (!isUpsert() || getIsTrafLoadPrep() || 
    -      (getTableDesc()->isIdentityColumnGeneratedAlways() && 
    -       getTableDesc()->hasIdentityColumnInClusteringKey()) ||
    -      getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey() || 
    -      !(getTableDesc()->hasSecondaryIndexes()))
    -    return FALSE;
    -
    -  return TRUE;
    +  if (isUpsert() && 
    +      (NOT getIsTrafLoadPrep()) && 
    +      (NOT (getTableDesc()->isIdentityColumnGeneratedAlways() && getTableDesc()->hasIdentityColumnInClusteringKey())) && 
    +      (NOT (getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey())) && 
    +       ((getTableDesc()->hasSecondaryIndexes()) ||
    --- End diff --
    
    The indentation isn't correct on this line, it should start at the same space as the previous line. I find this if condition very hard to read. Here is a formatted version, the only change is indentation and comments:
    
        if ( // problem only happens for upsert
             isUpsert() &&
             ( // load prep avoids the problem by having its only index update logic
               NOT getIsTrafLoadPrep()) &&
             ( // with identity columns in the clustering key we never update existing rows
               NOT (getTableDesc()->isIdentityColumnGeneratedAlways() && getTableDesc()->hasIdentityColumnInClusteringKey())) &&
             ( // same with SYSKEY, we never update existing rows
               NOT (getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey())) &&
             ( // problem only occurs if we have secondary indexes...
               getTableDesc()->hasSecondaryIndexes() ||
               ( // ...or we omit "current" default columns in a non-aligned format
                 ( NOT isAlignedRowFormat) && omittedCurrentDefaultClassCols) ||
               ( // ...or we omit any default column in aligned format and the override CQD is not set
                 (isAlignedRowFormat && omittedDefaultCols
                  && (CmpCommon::getDefault(TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS) == DF_OFF)))
             )
           )



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56245688
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
    --- End diff --
    
    Thanks, that's good news. So, my proposal would be to stop removing those columns in the base table insert (unless an override CQD is set).
    
    Yes, if we always set the timestamp that would overwrite an existing row's timestamp, and I believe that is the right thing to do. I'll send out an email to the user list to ask for others' opinions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56240661
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
    --- End diff --
    
    Can I suggest two changes - not sure if they are feasible, but let me give it a try and ask for them.
    
    First, if we have a column with default current: Couldn't we just add an assignment of the column to current, without transforming the upsert to a merge?
    
    Second, I would find it more clear if we would define UPSERT as something that either inserts a new row or replaces an entire existing row, but does not merge old and new rows. If a user wants a merge then that's what the MERGE command is for.
    
    What would it mean to do that? As far as I can tell, several things:
    
    * Automatically generate an assignment for columns with DEFAULT CURRENT, no need to transform to MERGE because of that. This applies to both aligned and non-aligned format.
    * No need to transform to MERGE just because we have omitted columns in aligned format.
    * No need for the CQD for aligned format.
    * For non-aligned format, do one of two things:
      * Always generate all (default) values, even those not specified in the upsert command.
      * Have a CQD (could re-use the one now used for aligned) to suppress generation of defaults and take a chance to see values from previous rows. This trades space savings for unclear semantics.
    
    It's a bit late for this kind of design discussion, just want to bring it up here. If you feel we should go with what you coded that's ok with me, it's a step up from the current behavior. I can also file a JIRA for want I suggest and we can do that later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56233621
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
    --- End diff --
    
    For upsert, in case of non-aligned format, if any column other than the columns with default current is omitted, the default value is resolved at the time of select. If the current default column is omitted it will be always changed to merge. User can't force a different behavior for non-aligned format in this specific case. Rationale here: If we allow user to control it, then all omitted default columns should always be populated at the time of upsert for the merge to work correctly.
    
    For non-aligned format, user control it using the above cqd. Default behavior is change it to merge if the default columns are omitted in upsert.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56224510
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    --- End diff --
    
    The comment above mentions LP 1460771. I think that's TRAFODION-14 now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56228136
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
    --- End diff --
    
    Can you explain this again, I would have thought it's the other way round: With non-aligned format, if we omit any column then we run a chance of seeing the value of an older row. Is that a "feature"? If so and if TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to ON, will that cause a difference between the behavior with aligned and non-aligned format? My apologies, I think you explained this to me before.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by zellerh <gi...@git.apache.org>.
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56237956
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
     */
    -NABoolean Insert::isUpsertThatNeedsMerge() const
    +NABoolean Insert::isUpsertThatNeedsMerge(NABoolean isAlignedRowFormat, NABoolean omittedDefaultCols,
    +                                   NABoolean omittedCurrentDefaultClassCols) const
     {
    -  if (!isUpsert() || getIsTrafLoadPrep() || 
    -      (getTableDesc()->isIdentityColumnGeneratedAlways() && 
    -       getTableDesc()->hasIdentityColumnInClusteringKey()) ||
    -      getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey() || 
    -      !(getTableDesc()->hasSecondaryIndexes()))
    -    return FALSE;
    -
    -  return TRUE;
    +  if (isUpsert() && 
    +      (NOT getIsTrafLoadPrep()) && 
    +      (NOT (getTableDesc()->isIdentityColumnGeneratedAlways() && getTableDesc()->hasIdentityColumnInClusteringKey())) && 
    +      (NOT (getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey())) && 
    +       ((getTableDesc()->hasSecondaryIndexes()) ||
    --- End diff --
    
    Only if you find it easier to read, otherwise please ignore.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/380#discussion_r56231592
  
    --- Diff: core/sql/optimizer/BindRelExpr.cpp ---
    @@ -10171,17 +10164,26 @@ matched clause of merge). If the upsert caused a row to be updated in the
     base table then the old version of the row will have to be deleted from 
     indexes, and a new version inserted. Upsert is being transformed to merge
     so that we can delete the old version of an updated row from the index.
    +
    +Upsert is also converted into merge when there are omitted cols with default values and 
    +TRAF_UPSERT_WITH_INSERT_DEFAULT_SEMANTICS is set to  OFF in case of aligned format table or 
    +omitted current timestamp cols in case of non-aligned row format
     */
    -NABoolean Insert::isUpsertThatNeedsMerge() const
    +NABoolean Insert::isUpsertThatNeedsMerge(NABoolean isAlignedRowFormat, NABoolean omittedDefaultCols,
    +                                   NABoolean omittedCurrentDefaultClassCols) const
     {
    -  if (!isUpsert() || getIsTrafLoadPrep() || 
    -      (getTableDesc()->isIdentityColumnGeneratedAlways() && 
    -       getTableDesc()->hasIdentityColumnInClusteringKey()) ||
    -      getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey() || 
    -      !(getTableDesc()->hasSecondaryIndexes()))
    -    return FALSE;
    -
    -  return TRUE;
    +  if (isUpsert() && 
    +      (NOT getIsTrafLoadPrep()) && 
    +      (NOT (getTableDesc()->isIdentityColumnGeneratedAlways() && getTableDesc()->hasIdentityColumnInClusteringKey())) && 
    +      (NOT (getTableDesc()->getClusteringIndex()->getNAFileSet()->hasSyskey())) && 
    +       ((getTableDesc()->hasSecondaryIndexes()) ||
    --- End diff --
    
    Yes. You got it correct. I can modify the lines as above, if you want me to do that


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: [TRAFODION-1887] Upsert into ali...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/380


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---