You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by nonstop-qfchen <gi...@git.apache.org> on 2015/09/25 17:41:28 UTC

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

GitHub user nonstop-qfchen opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/92

    fix JIRA-1499 and 1500

    Please kindly review the fix to both JIRAs.  Thanks --Qifan

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nonstop-qfchen/incubator-trafodion hivePartFunc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/92.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #92
    
----
commit 48d8a724bc97c93efbc7a21b938bddf766978a25
Author: Qifan Chen <qf...@dev02.trafodion.org>
Date:   2015-09-25T15:36:03Z

    fix JIRA-1499 and 1500

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by sureshsubbiah <gi...@git.apache.org>.
Github user sureshsubbiah commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/92#discussion_r40453410
  
    --- Diff: core/sql/optimizer/NodeMap.cpp ---
    @@ -1311,6 +1311,135 @@ NodeMap::getPopularNodeNumber(CollIndex beginPos, CollIndex endPos) const
       return popularNodeNumber;
     } // NodeMap::getNodeNum
     
    +// Smooth the node map to reassign entries with identical node Id of higher
    +// frequency than the rest to some other nodes. Assume there are m total entries in the map, and 
    +// n nodes in the cluster. Assume further there are s entries in the map (out of m) that refer 
    +// to very few nodes.  These s entrres are the subject of smoothing operation.  We do so by
    +// 1. allocate an array nodeUsageMap[] and the ith entry in it contains all indexes k in the map
    +//    that points at i (i.e., nodeMap.getNodeId[k] = i)
    +// 2. Find out s by frequency counting
    +// 3. Find out m - s
    +// 4. Find out how many nodes in each of the s entries that should be moved
    +// 
    +// The nodels accepting the reasssignment will be from the set of the nodes contained in the map.
    +//
    +// Example 1.  Assume the original node map with 6 entries as follows:
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//
    +// The nodes accepting the reasssignment will be { 0, 3, 5 }
    +//
    +// The subset of entries referring to a few nodes with high frequency: s = { 1, 2, 4 } 
    +// Since f(0)=f(3)=f(5)=1, the average frequency of nodes not in s: (1+1+1)/3 = 1.
    +// We will allow one (1) assigment in s to remain in node 1 since it is the average frequency, and 
    +// re-assign the rest (marked X) to different nodes via round-robin starting the 1st node in the map. 
    +// The node (1) is excluded from the re-assignment. 
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//                           X      X
    +//                           ^      ^
    +//                           |      |
    +//                          to 0   to 3
    +//
    +// The final smoothed node map:
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   0   3  3  5
    +//
    +NABoolean NodeMap::smooth(Lng32 numNodes) 
    +{
    +  NABoolean smoothed = FALSE;
    +
    +  typedef ClusteredBitmap* ClusteredBitmapPtr;
    +
    +  ClusteredBitmap** nodeUsageMap = new (heap_) ClusteredBitmapPtr[numNodes];
    +
    +  for (Lng32 index = 0; index < numNodes; index++) {
    +     nodeUsageMap[index] = NULL;
    +  }
    +
    +  ClusteredBitmap includedNodes(heap_);
    +
    +  Lng32 highestFreq = 0;
    +  for (Lng32 index = 0; index < getNumEntries(); index++) {
    +    Lng32 currNodeNum = getNodeNumber(index);
    +
    +    if ( currNodeNum != ANY_NODE ) {
    +
    +      if ( nodeUsageMap[currNodeNum] == NULL ) 
    +         nodeUsageMap[currNodeNum] = new (heap_)ClusteredBitmap(heap_);
    +
    +      nodeUsageMap[currNodeNum]->insert(index);
    +      includedNodes.insert(currNodeNum);
    +
    +      Lng32 entries = nodeUsageMap[currNodeNum]->entries();
    +      if ( highestFreq < entries ) {
    +         highestFreq = entries;
    +      } 
    +    } 
    +  }
    +
    +  // Find how many entries wth the highest frequency, and compute the number of entris with
    --- End diff --
    
    typo on "entries"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/92#discussion_r40445243
  
    --- Diff: core/sql/regress/hive/EXPECTED009 ---
    @@ -22,10 +22,20 @@
     >>-- make sure no external schemas exist in Trafodion
    --- End diff --
    
    Probably should change the comment now that external schemas exist


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/92#discussion_r40456991
  
    --- Diff: core/sql/optimizer/NodeMap.cpp ---
    @@ -1311,6 +1311,135 @@ NodeMap::getPopularNodeNumber(CollIndex beginPos, CollIndex endPos) const
       return popularNodeNumber;
     } // NodeMap::getNodeNum
     
    +// Smooth the node map to reassign entries with identical node Id of higher
    +// frequency than the rest to some other nodes. Assume there are m total entries in the map, and 
    +// n nodes in the cluster. Assume further there are s entries in the map (out of m) that refer 
    +// to very few nodes.  These s entrres are the subject of smoothing operation.  We do so by
    +// 1. allocate an array nodeUsageMap[] and the ith entry in it contains all indexes k in the map
    +//    that points at i (i.e., nodeMap.getNodeId[k] = i)
    +// 2. Find out s by frequency counting
    +// 3. Find out m - s
    +// 4. Find out how many nodes in each of the s entries that should be moved
    +// 
    +// The nodels accepting the reasssignment will be from the set of the nodes contained in the map.
    +//
    +// Example 1.  Assume the original node map with 6 entries as follows:
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//
    +// The nodes accepting the reasssignment will be { 0, 3, 5 }
    +//
    +// The subset of entries referring to a few nodes with high frequency: s = { 1, 2, 4 } 
    +// Since f(0)=f(3)=f(5)=1, the average frequency of nodes not in s: (1+1+1)/3 = 1.
    +// We will allow one (1) assigment in s to remain in node 1 since it is the average frequency, and 
    +// re-assign the rest (marked X) to different nodes via round-robin starting the 1st node in the map. 
    +// The node (1) is excluded from the re-assignment. 
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//                           X      X
    +//                           ^      ^
    +//                           |      |
    +//                          to 0   to 3
    +//
    +// The final smoothed node map:
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   0   3  3  5
    +//
    +NABoolean NodeMap::smooth(Lng32 numNodes) 
    +{
    +  NABoolean smoothed = FALSE;
    +
    +  typedef ClusteredBitmap* ClusteredBitmapPtr;
    +
    +  ClusteredBitmap** nodeUsageMap = new (heap_) ClusteredBitmapPtr[numNodes];
    +
    +  for (Lng32 index = 0; index < numNodes; index++) {
    +     nodeUsageMap[index] = NULL;
    +  }
    +
    +  ClusteredBitmap includedNodes(heap_);
    +
    +  Lng32 highestFreq = 0;
    +  for (Lng32 index = 0; index < getNumEntries(); index++) {
    +    Lng32 currNodeNum = getNodeNumber(index);
    +
    +    if ( currNodeNum != ANY_NODE ) {
    +
    +      if ( nodeUsageMap[currNodeNum] == NULL ) 
    +         nodeUsageMap[currNodeNum] = new (heap_)ClusteredBitmap(heap_);
    +
    +      nodeUsageMap[currNodeNum]->insert(index);
    +      includedNodes.insert(currNodeNum);
    +
    +      Lng32 entries = nodeUsageMap[currNodeNum]->entries();
    +      if ( highestFreq < entries ) {
    +         highestFreq = entries;
    +      } 
    +    } 
    +  }
    +
    +  // Find how many entries wth the highest frequency, and compute the number of entris with
    --- End diff --
    
    fixed.
    
    On Fri, Sep 25, 2015 at 12:20 PM, Suresh Subbiah <no...@github.com>
    wrote:
    
    > In core/sql/optimizer/NodeMap.cpp
    > <https://github.com/apache/incubator-trafodion/pull/92#discussion_r40453410>
    > :
    >
    > > +    if ( currNodeNum != ANY_NODE ) {
    > > +
    > > +      if ( nodeUsageMap[currNodeNum] == NULL )
    > > +         nodeUsageMap[currNodeNum] = new (heap_)ClusteredBitmap(heap_);
    > > +
    > > +      nodeUsageMap[currNodeNum]->insert(index);
    > > +      includedNodes.insert(currNodeNum);
    > > +
    > > +      Lng32 entries = nodeUsageMap[currNodeNum]->entries();
    > > +      if ( highestFreq < entries ) {
    > > +         highestFreq = entries;
    > > +      }
    > > +    }
    > > +  }
    > > +
    > > +  // Find how many entries wth the highest frequency, and compute the number of entris with
    >
    > typo on "entries"
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/92/files#r40453410>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by robertamarton <gi...@git.apache.org>.
Github user robertamarton commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/92#discussion_r40444601
  
    --- Diff: core/sql/common/ComMisc.cpp ---
    @@ -171,3 +171,61 @@ NAString ComConvertNativeNameToTrafName (
       return convertedName;
     }
     
    +// ----------------------------------------------------------------------------
    +// function: ComConvertTrafNameToNativeName
    +//
    +// this function converts the Trafodion external table name 
    +// into its native name format. Both names are in external format.
    +//
    +// Example:  TRAFODION._HV_HIVE_.abc becomes HIVE.HIVE.abc
    +//
    --- End diff --
    
    Probably should have double quotes around _HV_HIVE_ since it is in external format.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/92


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by sureshsubbiah <gi...@git.apache.org>.
Github user sureshsubbiah commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/92#discussion_r40453224
  
    --- Diff: core/sql/optimizer/NodeMap.cpp ---
    @@ -1311,6 +1311,135 @@ NodeMap::getPopularNodeNumber(CollIndex beginPos, CollIndex endPos) const
       return popularNodeNumber;
     } // NodeMap::getNodeNum
     
    +// Smooth the node map to reassign entries with identical node Id of higher
    +// frequency than the rest to some other nodes. Assume there are m total entries in the map, and 
    +// n nodes in the cluster. Assume further there are s entries in the map (out of m) that refer 
    +// to very few nodes.  These s entrres are the subject of smoothing operation.  We do so by
    +// 1. allocate an array nodeUsageMap[] and the ith entry in it contains all indexes k in the map
    +//    that points at i (i.e., nodeMap.getNodeId[k] = i)
    +// 2. Find out s by frequency counting
    +// 3. Find out m - s
    +// 4. Find out how many nodes in each of the s entries that should be moved
    +// 
    +// The nodels accepting the reasssignment will be from the set of the nodes contained in the map.
    +//
    +// Example 1.  Assume the original node map with 6 entries as follows:
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//
    +// The nodes accepting the reasssignment will be { 0, 3, 5 }
    +//
    +// The subset of entries referring to a few nodes with high frequency: s = { 1, 2, 4 } 
    +// Since f(0)=f(3)=f(5)=1, the average frequency of nodes not in s: (1+1+1)/3 = 1.
    +// We will allow one (1) assigment in s to remain in node 1 since it is the average frequency, and 
    +// re-assign the rest (marked X) to different nodes via round-robin starting the 1st node in the map. 
    +// The node (1) is excluded from the re-assignment. 
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//                           X      X
    +//                           ^      ^
    +//                           |      |
    +//                          to 0   to 3
    --- End diff --
    
    Should node map entry = 2 be included?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-trafodion pull request: fix JIRA-1499 and 1500

Posted by nonstop-qfchen <gi...@git.apache.org>.
Github user nonstop-qfchen commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/92#discussion_r40456945
  
    --- Diff: core/sql/optimizer/NodeMap.cpp ---
    @@ -1311,6 +1311,135 @@ NodeMap::getPopularNodeNumber(CollIndex beginPos, CollIndex endPos) const
       return popularNodeNumber;
     } // NodeMap::getNodeNum
     
    +// Smooth the node map to reassign entries with identical node Id of higher
    +// frequency than the rest to some other nodes. Assume there are m total entries in the map, and 
    +// n nodes in the cluster. Assume further there are s entries in the map (out of m) that refer 
    +// to very few nodes.  These s entrres are the subject of smoothing operation.  We do so by
    +// 1. allocate an array nodeUsageMap[] and the ith entry in it contains all indexes k in the map
    +//    that points at i (i.e., nodeMap.getNodeId[k] = i)
    +// 2. Find out s by frequency counting
    +// 3. Find out m - s
    +// 4. Find out how many nodes in each of the s entries that should be moved
    +// 
    +// The nodels accepting the reasssignment will be from the set of the nodes contained in the map.
    +//
    +// Example 1.  Assume the original node map with 6 entries as follows:
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//
    +// The nodes accepting the reasssignment will be { 0, 3, 5 }
    +//
    +// The subset of entries referring to a few nodes with high frequency: s = { 1, 2, 4 } 
    +// Since f(0)=f(3)=f(5)=1, the average frequency of nodes not in s: (1+1+1)/3 = 1.
    +// We will allow one (1) assigment in s to remain in node 1 since it is the average frequency, and 
    +// re-assign the rest (marked X) to different nodes via round-robin starting the 1st node in the map. 
    +// The node (1) is excluded from the re-assignment. 
    +//
    +//    NodeMapIndex   0   1   2   3  4  5 
    +//    NodeMapEntry   0   1   1   3  1  5
    +//                           X      X
    +//                           ^      ^
    +//                           |      |
    +//                          to 0   to 3
    --- End diff --
    
    It is included for number of times up to "base frequencies". Beyond that,
    other nodes are used.
    
    On Fri, Sep 25, 2015 at 12:18 PM, Suresh Subbiah <no...@github.com>
    wrote:
    
    > In core/sql/optimizer/NodeMap.cpp
    > <https://github.com/apache/incubator-trafodion/pull/92#discussion_r40453224>
    > :
    >
    > > +//    NodeMapEntry   0   1   1   3  1  5
    > > +//
    > > +// The nodes accepting the reasssignment will be { 0, 3, 5 }
    > > +//
    > > +// The subset of entries referring to a few nodes with high frequency: s = { 1, 2, 4 }
    > > +// Since f(0)=f(3)=f(5)=1, the average frequency of nodes not in s: (1+1+1)/3 = 1.
    > > +// We will allow one (1) assigment in s to remain in node 1 since it is the average frequency, and
    > > +// re-assign the rest (marked X) to different nodes via round-robin starting the 1st node in the map.
    > > +// The node (1) is excluded from the re-assignment.
    > > +//
    > > +//    NodeMapIndex   0   1   2   3  4  5
    > > +//    NodeMapEntry   0   1   1   3  1  5
    > > +//                           X      X
    > > +//                           ^      ^
    > > +//                           |      |
    > > +//                          to 0   to 3
    >
    > Should node map entry = 2 be included?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/incubator-trafodion/pull/92/files#r40453224>.
    >
    
    
    
    -- 
    Regards, --Qifan



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---