Posted to issues@trafodion.apache.org by "Suresh Subbiah (JIRA)" <ji...@apache.org> on 2015/10/05 17:01:26 UTC
[jira] [Assigned] (TRAFODION-1271) LP Bug: 1464306 - Compiler:ESP colocation with Hbase Regions

     [ https://issues.apache.org/jira/browse/TRAFODION-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Subbiah reassigned TRAFODION-1271:
-----------------------------------------

    Assignee: Suresh Subbiah  (was: Ravisha Neelakanthappa)

> LP Bug: 1464306 - Compiler:ESP colocation with Hbase Regions
> ------------------------------------------------------------
>
>                 Key: TRAFODION-1271
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1271
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Ravisha Neelakanthappa
>            Assignee: Suresh Subbiah
>            Priority: Critical
>
> There is scope for a performance improvement if ESPs are colocated with the HBase regions they access, leveraging the data locality of HBase RegionServers and Hadoop DataNodes.
> Currently ESPs are assigned to any random node as shown in the code below:
>        // Get the node map for this ESP fragment.
>        NodeMap *nodeMap =
>           (NodeMap *)fragmentDir_->getPartitioningFunction(i)->getNodeMap();
>        for (CollIndex j=0; j<nodeMap->getNumEntries(); j++) {
>          nodeMap->setNodeNumber(j, ANY_NODE);
>          nodeMap->setClusterNumber(j, 0);
>        }
> Because of this assignment, communication between ESPs and RegionServers can cross node boundaries, which slows performance.
> Here is the algorithm used for ESP colocation:
> 1. During startup create a Hashdictionary of NodeNames(Key):NodeNumber(value)
> 2. During NATable creation make a JNI call to get Node(Host) Names of Table's regions
> 3. Get the NodeNumber of each NodeName using the Hashdictionary
> 4. Populate NodeMap with NodeNumber from step 3 above
> 5. During HbaseScan synthesis, new NodeMap gets created for each context being optimized. 
>    Copy NodeNumbers from NodeMap stored in table's partFunc. 
>    5a. If there is 1:1 mapping, do a direct copy
>    5b. If there is an M:N mapping (where M < N), use the most popular (most frequent) NodeNumber of each partition grouping
> 6. In the generator, assign ANY_NODE only if ESP colocation logic is OFF
>     
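The colocation steps above can be sketched in isolation as follows. This is only an illustration, not the Trafodion implementation: `regionNodes`, `colocateEsps`, and `kAnyNode` are hypothetical names, plain standard containers stand in for the Hashdictionary and NodeMap, and the sketch assumes that M is the number of ESPs, N is the number of regions, and that regions are grouped contiguously. The issue text does not state any of these.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the ANY_NODE constant used in the generator.
const int kAnyNode = -1;

// Steps 1-4: translate each region's host name into a node number.
// Hosts missing from the dictionary map to kAnyNode (an assumption).
std::vector<int> regionNodes(
    const std::unordered_map<std::string, int>& nodeNumberByName,
    const std::vector<std::string>& regionHosts) {
  std::vector<int> nodes;
  nodes.reserve(regionHosts.size());
  for (const std::string& host : regionHosts) {
    auto it = nodeNumberByName.find(host);
    nodes.push_back(it == nodeNumberByName.end() ? kAnyNode : it->second);
  }
  return nodes;
}

// Step 5: derive one node number per ESP from the per-region node numbers.
// Regions are split into numEsps contiguous groups (an assumption); each
// group takes its most frequent known node, ties going to the lowest number.
std::vector<int> colocateEsps(const std::vector<int>& regionNodeNumbers,
                              std::size_t numEsps) {
  const std::size_t n = regionNodeNumbers.size();
  if (numEsps == 0 || numEsps > n) return std::vector<int>(numEsps, kAnyNode);
  if (numEsps == n) return regionNodeNumbers;  // 5a: 1:1 mapping, direct copy

  std::vector<int> espNodes;
  espNodes.reserve(numEsps);
  for (std::size_t e = 0; e < numEsps; ++e) {
    // 5b: regions [begin, end) form the grouping served by ESP e.
    const std::size_t begin = e * n / numEsps;
    const std::size_t end = (e + 1) * n / numEsps;
    std::map<int, int> counts;  // node number -> regions hosted in this group
    for (std::size_t r = begin; r < end; ++r) {
      if (regionNodeNumbers[r] != kAnyNode) ++counts[regionNodeNumbers[r]];
    }
    int best = kAnyNode;
    int bestCount = 0;
    for (const auto& kv : counts) {
      if (kv.second > bestCount) {
        best = kv.first;
        bestCount = kv.second;
      }
    }
    espNodes.push_back(best);
  }
  return espNodes;
}
```

With six regions hosted on nodes 0, 0, 1, 2, 2 and one unknown host, two ESPs would be placed on nodes 0 and 2 under these assumptions. An ESP whose grouping contains no known node stays at `kAnyNode`.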
> Data locality:
> When data is written to HDFS, one copy is written locally, a second is written to a node in a different rack (if possible), and a third is written to a different node in the same rack as the second. For all practical purposes the two extra copies are written to random nodes in the cluster.
> In typical HBase setups a RegionServer is co-located with an HDFS DataNode on the same physical machine. Thus every write is written locally and then to the two other nodes as mentioned above. As long as the regions are not moved between RegionServers there is good data locality: a RegionServer can serve most reads just from the local disk (and cache), provided short-circuit reads are enabled.
> When regions are re-assigned, data locality is lost and the RegionServers in question need to request the data over the network from remote DataNodes until the data is rewritten locally (at major compaction time).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)