You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by chrajeshbabu <gi...@git.apache.org> on 2015/11/25 17:38:31 UTC

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

GitHub user chrajeshbabu opened a pull request:

    https://github.com/apache/phoenix/pull/135

    PHOENIX-1734 Local index improvements

    Patch supports storing local indexing data in the same data table.
    1) Removed code used HBase internals in balancer, split and merge.
    2) Create index create column families suffix with L#  for data column families.
    3) Changes in read and write path to use column families prefixed with L# for local indexes.
    4) Done changes to tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chrajeshbabu/phoenix master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/phoenix/pull/135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #135
    
----
commit 4e663a2479adbf3e41826f40c1b2ed6bb69d7634
Author: Rajeshbabu Chintaguntla <ra...@apache.org>
Date:   2015-11-25T16:33:33Z

    PHOENIX-1734 Local index improvements(Rajeshbabu)

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46111265
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java ---
    @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
                 }
             }
             ImmutableBytesPtr ptr = new ImmutableBytesPtr();
    -        table.newKey(ptr, pkValues);
    +        if(table.getIndexType()==IndexType.LOCAL) {
    --- End diff --
    
    bq.  so I don't think we should include this code.
    Will remove in the next patch.
    
    bq. We've talked about having a kind of "scrutiny" process that scans the data table and ensures that all the corresponding indexes have the correct rows (I filed PHOENIX-2460 for this). 
    +1 on this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096442
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java ---
    @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
                 }
             }
             ImmutableBytesPtr ptr = new ImmutableBytesPtr();
    -        table.newKey(ptr, pkValues);
    +        if(table.getIndexType()==IndexType.LOCAL) {
    --- End diff --
    
    We are preparing index updates server side only. But I have added this to prepare proper local index mutations when we run upsert into index table directly in any case if required. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46085765
  
    --- Diff: phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexHalfStoreFileReaderGenerator.java ---
    @@ -80,6 +81,9 @@ public Reader preStoreFileReaderOpen(ObserverContext<RegionCoprocessorEnvironmen
             HRegionInfo childRegion = region.getRegionInfo();
             byte[] splitKey = null;
             if (reader == null && r != null) {
    +            if(!p.toString().contains(QueryConstants.LOCAL_INDEX_COLUMN_FAMILY_PREFIX)) {
    +                return super.preStoreFileReaderOpen(ctx, fs, p, in, size, cacheConf, r, reader);
    +            }
    --- End diff --
    
    Does the usage of this IndexHalfStoreFileReader go through limited private HBase APIs or are we still using private APIs here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46085683
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/DeleteIT.java ---
    @@ -186,7 +186,9 @@ private void testDeleteRange(boolean autoCommit, boolean createIndex, boolean lo
             PreparedStatement stmt;
             conn.setAutoCommit(autoCommit);
             deleteStmt = "DELETE FROM IntIntKeyTest WHERE i >= ? and i < ?";
    -        assertIndexUsed(conn, deleteStmt, Arrays.<Object>asList(5,10), indexName, false);
    +        if(!local) {
    --- End diff --
    
    Why isn't the local index used here and below anymore?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096413
  
    --- Diff: phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexHalfStoreFileReaderGenerator.java ---
    @@ -80,6 +81,9 @@ public Reader preStoreFileReaderOpen(ObserverContext<RegionCoprocessorEnvironmen
             HRegionInfo childRegion = region.getRegionInfo();
             byte[] splitKey = null;
             if (reader == null && r != null) {
    +            if(!p.toString().contains(QueryConstants.LOCAL_INDEX_COLUMN_FAMILY_PREFIX)) {
    +                return super.preStoreFileReaderOpen(ctx, fs, p, in, size, cacheConf, r, reader);
    +            }
    --- End diff --
    
    Still using private APIs here will raise an improvement action for this and try to make it to use Scanner APIs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the pull request:

    https://github.com/apache/phoenix/pull/135#issuecomment-160852616
  
    Couple of other comments:
    - You'll need to add code during installation of 4.7.0 (in ConnectionQueryServicesImpl.init()) that takes care of removing old coprocessors, adding new coprocessors, disabling local indexes and potentially automatically kicking off a new index build
    - We'll likely want to add a check that disallows column family names prefixed with "C#" to prevent inadvertently treating these as local index shadow column families. If we can get rid of the checks that use this prefix, then we might not need to do this.
    - Also, there's an edge case of an existing column family using the "C#" prefix. Unlikely, but not sure how we'd want to handle this. Maybe error on an attempt to create a local index on such a table?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096704
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java ---
    @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
                 }
             }
             ImmutableBytesPtr ptr = new ImmutableBytesPtr();
    -        table.newKey(ptr, pkValues);
    +        if(table.getIndexType()==IndexType.LOCAL) {
    --- End diff --
    
    I don't think we should support updating index tables directly, so I don't think we should include this code. We've talked about having a kind of "scrutiny" process that scans the data table and ensures that all the corresponding indexes have the correct rows (I filed PHOENIX-2460 for this). I suspect this would be done through our coprocessers, though, not through this client API. WDYT?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096617
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java ---
    @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
                 throws IOException {
             s = super.preScannerOpen(e, scan, s);
             if (ScanUtil.isAnalyzeTable(scan)) {
    +            if (!ScanUtil.isLocalIndex(scan)) {
    +                scan.getFamilyMap().clear();
    +            }
    +
    --- End diff --
    
    That's a good idea, but how about we drive this from the client side instead? There's code in MetaDataClient.updateStatistics() that was adding the local index physical table (if stats on indexes are being updated). How about changing that to add the local index shadow column families?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096275
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/DeleteIT.java ---
    @@ -186,7 +186,9 @@ private void testDeleteRange(boolean autoCommit, boolean createIndex, boolean lo
             PreparedStatement stmt;
             conn.setAutoCommit(autoCommit);
             deleteStmt = "DELETE FROM IntIntKeyTest WHERE i >= ? and i < ?";
    -        assertIndexUsed(conn, deleteStmt, Arrays.<Object>asList(5,10), indexName, false);
    +        if(!local) {
    --- End diff --
    
    It will be used. In case of local indexes explain plan always have "SCAN OVER DATATABLENAME" so it's failing so not checking assertion for local indexes case. Will modify it make proper assertion in case of local indexes like it should have index id as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46085866
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java ---
    @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
                 throws IOException {
             s = super.preScannerOpen(e, scan, s);
             if (ScanUtil.isAnalyzeTable(scan)) {
    +            if (!ScanUtil.isLocalIndex(scan)) {
    +                scan.getFamilyMap().clear();
    +            }
    +
    --- End diff --
    
    Why this change?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46085722
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java ---
    @@ -873,7 +873,7 @@ public void initTable() throws Exception {
                     "    SERVER AGGREGATE INTO DISTINCT ROWS BY [\"I.0:NAME\"]\n" +
                     "CLIENT MERGE SORT\n" +
                     "    PARALLEL LEFT-JOIN TABLE 0\n" +
    -                "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + "" + JOIN_ITEM_TABLE_DISPLAY_NAME +" [-32768]\n" +
    +                "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " +JOIN_ITEM_TABLE_DISPLAY_NAME +" [-32768]\n" +
    --- End diff --
    
    Is local index still being used? How will the user easily know this? One way is through the indexId in the scan, but is this not obvious enough?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on the pull request:

    https://github.com/apache/phoenix/pull/135#issuecomment-160454376
  
    Thanks for review @JamesRTaylor. Please find my answers to the comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096509
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java ---
    @@ -873,7 +873,7 @@ public void initTable() throws Exception {
                     "    SERVER AGGREGATE INTO DISTINCT ROWS BY [\"I.0:NAME\"]\n" +
                     "CLIENT MERGE SORT\n" +
                     "    PARALLEL LEFT-JOIN TABLE 0\n" +
    -                "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + "" + JOIN_ITEM_TABLE_DISPLAY_NAME +" [-32768]\n" +
    +                "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " +JOIN_ITEM_TABLE_DISPLAY_NAME +" [-32768]\n" +
    --- End diff --
    
    I think using the physical name makes sense, as we're trying to show more-or-less what's going on at the HBase level. How about adding "LOCAL INDEX" like this:
        OVER my_table LOCAL INDEX


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096498
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java ---
    @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
                 throws IOException {
             s = super.preScannerOpen(e, scan, s);
             if (ScanUtil.isAnalyzeTable(scan)) {
    +            if (!ScanUtil.isLocalIndex(scan)) {
    +                scan.getFamilyMap().clear();
    +            }
    +
    --- End diff --
    
    when we run stats on data table we select only data column families this change helps to choose all the column families to stats preparation and prepare stats at a time for data and local index tables.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46111214
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java ---
    @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
                 throws IOException {
             s = super.preScannerOpen(e, scan, s);
             if (ScanUtil.isAnalyzeTable(scan)) {
    +            if (!ScanUtil.isLocalIndex(scan)) {
    +                scan.getFamilyMap().clear();
    +            }
    +
    --- End diff --
    
    bq. How about changing that to add the local index shadow column families?
    That's better James. Will do in the next patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46085787
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java ---
    @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
                 }
             }
             ImmutableBytesPtr ptr = new ImmutableBytesPtr();
    -        table.newKey(ptr, pkValues);
    +        if(table.getIndexType()==IndexType.LOCAL) {
    --- End diff --
    
    Why this new code? Are we still generating local index updates on the server side?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] phoenix pull request: PHOENIX-1734 Local index improvements

Posted by chrajeshbabu <gi...@git.apache.org>.
Github user chrajeshbabu commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/135#discussion_r46096395
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java ---
    @@ -873,7 +873,7 @@ public void initTable() throws Exception {
                     "    SERVER AGGREGATE INTO DISTINCT ROWS BY [\"I.0:NAME\"]\n" +
                     "CLIENT MERGE SORT\n" +
                     "    PARALLEL LEFT-JOIN TABLE 0\n" +
    -                "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + "" + JOIN_ITEM_TABLE_DISPLAY_NAME +" [-32768]\n" +
    +                "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " +JOIN_ITEM_TABLE_DISPLAY_NAME +" [-32768]\n" +
    --- End diff --
    
    Yes it's using local indexes but it's confusing because we are having same table name. Only difference here is index id.  
    What do you think of changing it to logical name in ExplainTable?
            buf.append("OVER " + tableRef.getTable().getPhysicalName().getString());


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---