You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2021/06/28 04:24:47 UTC

[GitHub] [phoenix] kadirozde opened a new pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

kadirozde opened a new pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] stoty commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

stoty commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-886131258


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   5m  0s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 2 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  22m 12s |  master passed  |
   | +0 |  hbaserecompile  |  28m 37s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  2s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   2m 26s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  master passed  |
   | +0 :ok: |  spotbugs  |   3m  5s |  phoenix-core in master has 964 extant spotbugs warnings.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  13m 29s |  the patch passed  |
   | +0 |  hbaserecompile  |  23m 35s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  6s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  6s |  the patch passed  |
   | -1 :x: |  checkstyle  |   2m 36s |  phoenix-core: The patch generated 129 new + 4748 unchanged - 113 fixed = 4877 total (was 4861)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  the patch passed  |
   | -1 :x: |  spotbugs  |   3m 21s |  phoenix-core generated 1 new + 962 unchanged - 2 fixed = 963 total (was 964)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 115m 19s |  phoenix-core in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  The patch does not generate ASF License warnings.  |
   |  |   | 188m 42s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:phoenix-core |
   |  |  org.apache.phoenix.schema.IndexDataColumnRef doesn't override ColumnRef.equals(Object)  At IndexDataColumnRef.java:At IndexDataColumnRef.java:[line 1] |
   | Failed junit tests | phoenix.end2end.ViewIT |
   |   | phoenix.end2end.DistinctPrefixFilterIT |
   |   | phoenix.end2end.ExplainPlanWithStatsEnabledIT |
   |   | phoenix.end2end.index.GlobalIndexOptimizationIT |
   |   | phoenix.end2end.RowValueConstructorOffsetIT |
   |   | phoenix.end2end.DefaultColumnValueIT |
   |   | phoenix.end2end.index.IndexUsageIT |
   |   | phoenix.end2end.IndexToolIT |
   |   | phoenix.end2end.DerivedTableIT |
   |   | phoenix.end2end.UserDefinedFunctionsIT |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/4/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/phoenix/pull/1256 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs hbaserebuild hbaseanti checkstyle compile |
   | uname | Linux 34763af58f8a 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev/phoenix-personality.sh |
   | git revision | master / 6842c1d |
   | Default Java | Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08 |
   | checkstyle | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/4/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt |
   | spotbugs | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/4/artifact/yetus-general-check/output/new-spotbugs-phoenix-core.html |
   | unit | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/4/artifact/yetus-general-check/output/patch-unit-phoenix-core.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/4/testReport/ |
   | Max. process+thread count | 10804 (vs. ulimit of 30000) |
   | modules | C: phoenix-core U: phoenix-core |
   | Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/4/console |
   | versions | git=2.7.4 maven=3.3.9 spotbugs=4.1.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r676046294



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/compile/TupleProjectionCompiler.java
##########
@@ -156,7 +155,7 @@ public static PTable createProjectedTable(SelectStatement select, StatementConte
         }
         // add LocalIndexDataColumnRef

Review comment:
       Ok
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r813164218



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/iterate/RegionScannerFactory.java
##########
@@ -215,7 +223,9 @@ public boolean nextRaw(List<Cell> result) throws IOException {
           if (result.size() == 0) {
             return next;
           }
-          if (ScanUtil.isLocalIndex(scan) && !ScanUtil.isAnalyzeTable(scan)) {
+          if ((ScanUtil.isLocalIndex(scan)
+                  || ScanUtil.isUncoveredGlobalIndex(scan))

Review comment:
       I will add it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r813185144



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -576,20 +597,27 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
-                Table table = null;
-                try {
-                    table = environment.getConnection().getTable(dataTable);
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                            TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(
+                                    environment.getRegion().getTableDescriptor().getTableName().
+                                            getNameAsString()));
+                    try (Table table = environment.getConnection().getTable(dataTable)) {
+                        joinResult = table.get(get);
+                    }
+                }
+            } else if (ScanUtil.isUncoveredGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);

Review comment:
       I see. I think this concern is outside the context of this PR as this PR does not change the existing index hint implementation or how we pass the data table physical name via a scan attribute.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] gokceni commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

gokceni commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r813178601



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -576,20 +597,27 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
-                Table table = null;
-                try {
-                    table = environment.getConnection().getTable(dataTable);
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                            TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(
+                                    environment.getRegion().getTableDescriptor().getTableName().
+                                            getNameAsString()));
+                    try (Table table = environment.getConnection().getTable(dataTable)) {
+                        joinResult = table.get(get);
+                    }
+                }
+            } else if (ScanUtil.isUncoveredGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);

Review comment:
       This code reminded me a test case where we have separate physical and logical table names for (case1. data table and case2. index table). I was just asking to make sure that we are not breaking anything.
   Something like this: 
   You have an index hint for a global index which has a different physical table name (logical name appears in the hint but its PHYSICAL_TABLE_NAME in syscat points to different hbase table). I think your code doesn't break this case but I wanted to make sure. Does this make sense?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r659498881



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       This is doing a remote get for each row that matches the index. The index would have to be very selective for this to be an improvement.
   
   I was hoping we'd come up with some smart batching. But that's tricky to do here. :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] stoty commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

stoty commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-869516369


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   5m 37s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  24m 35s |  master passed  |
   | +0 |  hbaserecompile  |  36m 10s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m 25s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   2m  9s |  master passed  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  master passed  |
   | +0 :ok: |  spotbugs  |   4m 17s |  phoenix-core in master has 965 extant spotbugs warnings.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  18m 52s |  the patch passed  |
   | +0 |  hbaserecompile  |  34m 14s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m 20s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 20s |  the patch passed  |
   | -1 :x: |  checkstyle  |   2m 14s |  phoenix-core: The patch generated 128 new + 4749 unchanged - 112 fixed = 4877 total (was 4861)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  the patch passed  |
   | -1 :x: |  spotbugs  |   4m 13s |  phoenix-core generated 1 new + 963 unchanged - 2 fixed = 964 total (was 965)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   1m 57s |  phoenix-core in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 16s |  The patch does not generate ASF License warnings.  |
   |  |   |  93m 33s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:phoenix-core |
   |  |  org.apache.phoenix.schema.IndexDataColumnRef doesn't override ColumnRef.equals(Object)  At IndexDataColumnRef.java:At IndexDataColumnRef.java:[line 1] |
   | Failed junit tests | phoenix.compile.TenantSpecificViewIndexCompileTest |
   |   | phoenix.compile.QueryOptimizerTest |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/phoenix/pull/1256 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs hbaserebuild hbaseanti checkstyle compile |
   | uname | Linux 15c712d84ee3 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev/phoenix-personality.sh |
   | git revision | master / fcdf5bc |
   | Default Java | Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08 |
   | checkstyle | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/2/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt |
   | spotbugs | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/2/artifact/yetus-general-check/output/new-spotbugs-phoenix-core.html |
   | unit | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/2/artifact/yetus-general-check/output/patch-unit-phoenix-core.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/2/testReport/ |
   | Max. process+thread count | 516 (vs. ulimit of 30000) |
   | modules | C: phoenix-core U: phoenix-core |
   | Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/2/console |
   | versions | git=2.7.4 maven=3.3.9 spotbugs=4.1.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] stoty commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

stoty commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870437873


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   6m 26s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 2 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  24m 27s |  master passed  |
   | +0 |  hbaserecompile  |  31m 28s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  7s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 50s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  master passed  |
   | +0 :ok: |  spotbugs  |   3m 21s |  phoenix-core in master has 965 extant spotbugs warnings.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  16m 54s |  the patch passed  |
   | +0 |  hbaserecompile  |  27m 15s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  6s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  6s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 50s |  phoenix-core: The patch generated 63 new + 4814 unchanged - 47 fixed = 4877 total (was 4861)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  the patch passed  |
   | -1 :x: |  spotbugs  |   3m 31s |  phoenix-core generated 1 new + 963 unchanged - 2 fixed = 964 total (was 965)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 124m 10s |  phoenix-core in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  The patch does not generate ASF License warnings.  |
   |  |   | 204m 29s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:phoenix-core |
   |  |  org.apache.phoenix.schema.IndexDataColumnRef doesn't override ColumnRef.equals(Object)  At IndexDataColumnRef.java:At IndexDataColumnRef.java:[line 1] |
   | Failed junit tests | phoenix.end2end.index.GlobalIndexOptimizationIT |
   |   | phoenix.end2end.index.IndexUsageIT |
   |   | phoenix.end2end.DerivedTableIT |
   |   | phoenix.end2end.UserDefinedFunctionsIT |
   |   | phoenix.end2end.IndexToolIT |
   |   | phoenix.end2end.RowValueConstructorOffsetIT |
   |   | phoenix.end2end.ViewIT |
   |   | phoenix.end2end.DistinctPrefixFilterIT |
   |   | phoenix.end2end.ExplainPlanWithStatsEnabledIT |
   |   | phoenix.end2end.DefaultColumnValueIT |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/phoenix/pull/1256 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs hbaserebuild hbaseanti checkstyle compile |
   | uname | Linux 90ed47f9bc26 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev/phoenix-personality.sh |
   | git revision | master / fcdf5bc |
   | Default Java | Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08 |
   | checkstyle | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt |
   | spotbugs | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/new-spotbugs-phoenix-core.html |
   | unit | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/patch-unit-phoenix-core.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/testReport/ |
   | Max. process+thread count | 12127 (vs. ulimit of 30000) |
   | modules | C: phoenix-core U: phoenix-core |
   | Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/console |
   | versions | git=2.7.4 maven=3.3.9 spotbugs=4.1.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r812566460



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/execute/BaseQueryPlan.java
##########
@@ -353,11 +357,12 @@ public final ResultIterator iterator(final Map<ImmutableBytesPtr,ServerCache> ca
                 KeyValueSchema schema = ProjectedColumnExpression.buildSchema(dataColumns);
                 // Set key value schema of the data columns.
                 serializeSchemaIntoScan(scan, schema);
-                
-                // Set index maintainer of the local index.
-                serializeIndexMaintainerIntoScan(scan, dataTable);
-                // Set view constants if exists.
-                serializeViewConstantsIntoScan(scan, dataTable);
+                if (table.getIndexType() == IndexType.LOCAL) {

Review comment:
       No. We populate scan attributes for global indexes at the constructor of the TableResultIterator using ScanUtil#setScanAttributesForClient in general when the table to be scanned is a global index table. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-871044039

@kadirozde ，thank you for reply.

> However, based on my experience, it will perform better in most of the cases in practice. This is because the index PK is designed by the user who knows the use case (the type and shape of queries) and the user wants that the index should be used if the index row key prefix length is greater than the data row key prefix length for a given query in general.

If there is just one index, your said may be right, but if there is lots of index, it is hard to make sure what is the user's intention, may be the user just leave out some columns in the global index.

> I understand your concern here and please help me out on how to proceed here. I can add a config param to use uncovered indexes without a specific hint. This mean that we will preserve the existing behavior if the config param is not specified. Would that address your concern?

In my opinion, your implemention now is better than the existing implemention which rewrite the sql with the columns that are not covered by the global index as `InSubquery` , you could remove the existing implemention and replace with your new implemention which also repsect the Index Hint and at the same time avoid to give user two different chocies to achieve the same purpose.

In short , I think if we could not make sure the index performs better, we would better be conservative and let the user to decide rather than making decisions for the user.

In any case, you may also let the user know you scaning the gobal index and retrieving the corresponding rows from the data table when they execute explain sql.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r660057634



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       I think that's perfectly fine to do in a separate PR.
   
   And, yes, doing buffering will require quite some refactoring, since you (a) have to do it at a place where you still know the rows involved, and (b) a level higher than here, so that you can buffer.
   
   The only concern I have that in case using an index can be significantly slower than a full scan (this is true even for local indexes). For example a query of the form SELECT colA FROM table WHERE colB > 0; assuming colB is not selective will take longer. Note that FAST_DIFF (unfortunately Phoenix' default) is particularly slow for Get operations. My guess is that with defaults if the WHERE is not 99% - 99.9% selective, the query might be slower.
   
   In any case :) Another PR is cool.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r659498881



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       This is doing a remote get for each row that matches the index. The index would have to be very selective for this to be an improvement.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r676042652



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/BaseScannerRegionObserver.java
##########
@@ -80,9 +81,15 @@
         "_IndexRebuildDisableLoggingVerifyType";
     public static final String INDEX_REBUILD_DISABLE_LOGGING_BEYOND_MAXLOOKBACK_AGE =
         "_IndexRebuildDisableLoggingBeyondMaxLookbackAge";
+    @Deprecated
     public static final String LOCAL_INDEX_FILTER = "_LocalIndexFilter";
+    @Deprecated
     public static final String LOCAL_INDEX_LIMIT = "_LocalIndexLimit";
+    @Deprecated
     public static final String LOCAL_INDEX_FILTER_STR = "_LocalIndexFilterStr";

Review comment:
       I believe so too.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r813154184



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/iterate/NonAggregateRegionScannerFactory.java
##########
@@ -128,21 +128,15 @@ public RegionScanner getRegionScanner(final Scan scan, final RegionScanner s) th
     PhoenixTransactionContext tx = null;
     ColumnReference[] dataColumns = IndexUtil.deserializeDataTableColumnsToJoin(scan);
     if (dataColumns != null) {
-      tupleProjector = IndexUtil.getTupleProjector(scan, dataColumns);
-      dataRegion = env.getRegion();
-      boolean useProto = false;
-      byte[] localIndexBytes = scan.getAttribute(BaseScannerRegionObserver.LOCAL_INDEX_BUILD_PROTO);
-      useProto = localIndexBytes != null;
-      if (localIndexBytes == null) {
-        localIndexBytes = scan.getAttribute(BaseScannerRegionObserver.LOCAL_INDEX_BUILD);
-      }
-      int clientVersion = ScanUtil.getClientVersion(scan);
-      List<IndexMaintainer> indexMaintainers =
-              IndexMaintainer.deserialize(localIndexBytes, useProto);
-      indexMaintainer = indexMaintainers.get(0);
-      viewConstants = IndexUtil.deserializeViewConstantsFromScan(scan);
-      byte[] txState = scan.getAttribute(BaseScannerRegionObserver.TX_STATE);
-      tx = TransactionFactory.getTransactionContext(txState, clientVersion);
+        tupleProjector = IndexUtil.getTupleProjector(scan, dataColumns);
+        dataRegion = env.getRegion();
+        int clientVersion = ScanUtil.getClientVersion(scan);
+        List<IndexMaintainer> indexMaintainers =
+                IndexUtil.deSerializeIndexMaintainersFromScan(scan);
+        indexMaintainer = indexMaintainers.get(0);

Review comment:
       These methods can serialize/deserialize more than one index maintainers. I do not know if there is a case where we do that actually. As far as know, we only need to do this only for one index maintainer.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r813158222



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -576,20 +597,27 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
-                Table table = null;
-                try {
-                    table = environment.getConnection().getTable(dataTable);
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                            TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(
+                                    environment.getRegion().getTableDescriptor().getTableName().
+                                            getNameAsString()));
+                    try (Table table = environment.getConnection().getTable(dataTable)) {
+                        joinResult = table.get(get);
+                    }
+                }
+            } else if (ScanUtil.isUncoveredGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);

Review comment:
       Not sure if I understood this comment. Data and index tables have different physical tables always.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940


   @kadirozde @lhofhansl  FYI.
   
   1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, if user specify a Index  Hint and there exists where clause, the sql would be rewritten as
   "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T  WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX"  WHERE "0:V1" = 'bar')) AND V2 = 'foo') " , you may consider compatibility with exising code.
   
   2.Whether or not scaning the gobal index and  retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to  consider such as Network cost, random disk access cost , data distribution , column selective  etc.  you said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is very insufficient.  Lack of a CBO framework in Phoenix, it is sensible to be conservative, I think it is better to left the choice to user by users specifying the Index Hint just as the existing code.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r659498491



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =

Review comment:
       I wonder if this part is even still needed for local indexes. (but that's a separate issue)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r660057634



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       I think that's perfectly fine to do in a separate PR.
   
   And, yes, doing buffering will require quite some refactoring, since you (a) have to do it at a place where you still know the rows involved, and (b) a level higher than here, so that you can buffer.
   
   The only concern I have that in case using an index can be significantly slower than a full scan (this is true even for local indexes). For example a query of the form SELECT colA FROM table WHERE colB > 0; assuming colB is not selective will take longer. Note that FAST_DIFF (unfortunately Phoenix' default) is particularly slow for Get operations. My guess is that with defaults if the WHERE is not 99% - 99.9% selective, the query might be slower.
   
   In any case :) Another PR is cool.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940


   @kadirozde @lhofhansl  FYI.
   
   1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, if user specify a Index  Hint and there exists where clause, the sql would be rewritten as
   "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T  WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX"  WHERE "0:V1" = 'bar')) AND V2 = 'foo') " , you may consider compatibility with exising code.
   
   2.Whether or not scaning the gobal index and  retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to  consider such as Network cost, random disk access cost , data distribution , column selective  etc.  you said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient.  Lack of a CBO framework in Phoenix, it is sensible to be conservative, I think it is better to left the choice to user by users specifying the Index Hint just as the existing code.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875280447


   If you do SELECT count(uncovered_column) FROM T WHERE covered_column = xyz, the global uncovered index is not used even when you hint it as expected (I just verified that current 5.x. Phoenix).
   
   I found that uncovered local indexes (that's what I tested) are sometimes much slower than doing to a full table scan. That happens when there is a WHERE clause that an index could be used for, but the WHERE restriction is not selective.
   
   (As noted above, FAST_DIFF (Phoenix' default) is actually the worst choice since SEEKs are slow with it. ROW_INDEX_V1 with ZSTD compression are far better. I blogged about this here: https://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html a while ago: With FAST_DIFF the WHERE clause needed to be **0.5% (return 1/200 of the data)** to be effective. With ROW_INDEX_V1 + ZSTD that was 10%.)
   
   This is at best as good as uncovered local indexing, and probably worse since we need to go remote for each row, unless we do batching. And the batches would still be requiring a SKIP_SCAN, which in the general case is still very slow for FAST_DIFF.
   So I expect with defaults the WHERE clause would need to be somewhere between 1/1000 and 1/300 hundred selective for this to be improvement.
   
   Anyway... I think we should check this in. Presumable folks would create uncovered global indexes only when they know what they are doing.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875379295


   @lhofhansl ，thank you for reply,  
   Yes , I agree that we should check this in because this implemention is better than existing code which rewrite the sql as `InSubquery`, but I think we should merge the two different implemention in order to not give user two different chocies to achieve the same purpose. One way is remove the existing code and switch to this new implemention, and in the same time , because we could not make sure the index performs better, we should provide an index hint (or just use the existing index hint) or config param  to disable it.
   And furthermore, we should illustratethat we using index and lookupback in the `explain`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-886114583


   > @lhofhansl ，thank you for reply,
   > Yes , I agree that we should check this in because this implemention is better than existing code which rewrite the sql as `InSubquery`, but I think we should merge the two different implemention in order to not give user two different chocies to achieve the same purpose. One way is remove the existing code and switch to this new implemention, and in the same time , because we could not make sure the index performs better, we should provide an index hint (or just use the existing index hint) or config param to disable it.
   > And furthermore, we should illustratethat we using index and lookupback in the `explain`
   
   @Lars already enhanced the explain plan to indicate the server side merge for uncovered columns in PHOENIX-6409. Regarding to merging this PR and the existing subquery implementation without creating compatibility issues, it is a bit tricky for me now. I need to spend time to figure that out. Would you please create a jira for that? If you like to implement that jira, you would be more than welcome, @comnetwork.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-1048466917


   > 
   
   Yes, the index hint is required only for global indexes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] gokceni commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

gokceni commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r812468025



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -576,20 +597,27 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
-                Table table = null;
-                try {
-                    table = environment.getConnection().getTable(dataTable);
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                            TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(
+                                    environment.getRegion().getTableDescriptor().getTableName().
+                                            getNameAsString()));
+                    try (Table table = environment.getConnection().getTable(dataTable)) {
+                        joinResult = table.get(get);
+                    }
+                }
+            } else if (ScanUtil.isUncoveredGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);

Review comment:
       Can we have a test where the index table has a different Physical Table name and data table has a different physical table name?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/iterate/NonAggregateRegionScannerFactory.java
##########
@@ -128,21 +128,15 @@ public RegionScanner getRegionScanner(final Scan scan, final RegionScanner s) th
     PhoenixTransactionContext tx = null;
     ColumnReference[] dataColumns = IndexUtil.deserializeDataTableColumnsToJoin(scan);
     if (dataColumns != null) {
-      tupleProjector = IndexUtil.getTupleProjector(scan, dataColumns);
-      dataRegion = env.getRegion();
-      boolean useProto = false;
-      byte[] localIndexBytes = scan.getAttribute(BaseScannerRegionObserver.LOCAL_INDEX_BUILD_PROTO);
-      useProto = localIndexBytes != null;
-      if (localIndexBytes == null) {
-        localIndexBytes = scan.getAttribute(BaseScannerRegionObserver.LOCAL_INDEX_BUILD);
-      }
-      int clientVersion = ScanUtil.getClientVersion(scan);
-      List<IndexMaintainer> indexMaintainers =
-              IndexMaintainer.deserialize(localIndexBytes, useProto);
-      indexMaintainer = indexMaintainers.get(0);
-      viewConstants = IndexUtil.deserializeViewConstantsFromScan(scan);
-      byte[] txState = scan.getAttribute(BaseScannerRegionObserver.TX_STATE);
-      tx = TransactionFactory.getTransactionContext(txState, clientVersion);
+        tupleProjector = IndexUtil.getTupleProjector(scan, dataColumns);
+        dataRegion = env.getRegion();
+        int clientVersion = ScanUtil.getClientVersion(scan);
+        List<IndexMaintainer> indexMaintainers =
+                IndexUtil.deSerializeIndexMaintainersFromScan(scan);
+        indexMaintainer = indexMaintainers.get(0);

Review comment:
       Why are we getting the first one? I see that we used to do this before and it is not new but I am confused why how we sort these

##########
File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/index/GlobalIndexOptimizationIT.java
##########
@@ -345,7 +335,7 @@ private void testOptimizationTenantSpecific(String dataTableName, String indexTa
                             "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + indexTableName + " \\['tid1','a'\\]\n" +
                             "            SERVER FILTER BY FIRST KEY ONLY\n" +
                             "    DYNAMIC SERVER FILTER BY \\(\"" + dataTableName + ".K1\", \"" + dataTableName + ".K2\"\\) IN \\(\\(\\$\\d+.\\$\\d+, \\$\\d+.\\$\\d+\\)\\)";
-            assertTrue("Expected:\n" + expected + "\ndid not match\n" + actual, Pattern.matches(expected, actual));
+            //assertTrue("Expected:\n" + expected + "\ndid not match\n" + actual, Pattern.matches(expected, actual));

Review comment:
       nit: commented code. Forgotten?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/iterate/RegionScannerFactory.java
##########
@@ -215,7 +223,9 @@ public boolean nextRaw(List<Cell> result) throws IOException {
           if (result.size() == 0) {
             return next;
           }
-          if (ScanUtil.isLocalIndex(scan) && !ScanUtil.isAnalyzeTable(scan)) {
+          if ((ScanUtil.isLocalIndex(scan)
+                  || ScanUtil.isUncoveredGlobalIndex(scan))

Review comment:
       You seem to have this in Line 137 and in GroupedAggregateRegion... as well. I think it is better to have a ScanUtil.isLocalOrUncoveredGlobalIndex

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/execute/BaseQueryPlan.java
##########
@@ -353,11 +357,12 @@ public final ResultIterator iterator(final Map<ImmutableBytesPtr,ServerCache> ca
                 KeyValueSchema schema = ProjectedColumnExpression.buildSchema(dataColumns);
                 // Set key value schema of the data columns.
                 serializeSchemaIntoScan(scan, schema);
-                
-                // Set index maintainer of the local index.
-                serializeIndexMaintainerIntoScan(scan, dataTable);
-                // Set view constants if exists.
-                serializeViewConstantsIntoScan(scan, dataTable);
+                if (table.getIndexType() == IndexType.LOCAL) {

Review comment:
       Don't we need this for global uncovered ones too?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r812569784



##########
File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/index/GlobalIndexOptimizationIT.java
##########
@@ -345,7 +335,7 @@ private void testOptimizationTenantSpecific(String dataTableName, String indexTa
                             "        CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + indexTableName + " \\['tid1','a'\\]\n" +
                             "            SERVER FILTER BY FIRST KEY ONLY\n" +
                             "    DYNAMIC SERVER FILTER BY \\(\"" + dataTableName + ".K1\", \"" + dataTableName + ".K2\"\\) IN \\(\\(\\$\\d+.\\$\\d+, \\$\\d+.\\$\\d+\\)\\)";
-            assertTrue("Expected:\n" + expected + "\ndid not match\n" + actual, Pattern.matches(expected, actual));
+            //assertTrue("Expected:\n" + expected + "\ndid not match\n" + actual, Pattern.matches(expected, actual));

Review comment:
       Yes, it was. I will fix it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r676046412



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/compile/FromCompiler.java
##########
@@ -1168,8 +1168,8 @@ public ColumnRef resolveColumn(String schemaName, String tableName, String colNa
                 colRef = super.resolveColumn(schemaName, tableName, colName);
             } catch (ColumnNotFoundException e) {
                 // This could be a ColumnRef for local index data column.

Review comment:
       ok




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r660011956



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       I was not sure if I do it in this PR or open a separate jira for that. We can buffer lots of data row keys in memory and then use a skip scan filter and even multiple threads to issue a separate scan for each data table region. Essentially, this is what we do for reverse index verification. However, it requires quite a bit code refactoring. Please note this PR updates both client and server code. If we leave the improvement to the second PR, that will update only the server side. What do you think?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       I have created https://issues.apache.org/jira/browse/PHOENIX-6501 for this improvement




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r676046035



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.

Review comment:
       Ok




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] stoty commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

stoty commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870437873


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   6m 26s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 2 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  24m 27s |  master passed  |
   | +0 |  hbaserecompile  |  31m 28s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  7s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 50s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  master passed  |
   | +0 :ok: |  spotbugs  |   3m 21s |  phoenix-core in master has 965 extant spotbugs warnings.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  16m 54s |  the patch passed  |
   | +0 |  hbaserecompile  |  27m 15s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  6s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  6s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 50s |  phoenix-core: The patch generated 63 new + 4814 unchanged - 47 fixed = 4877 total (was 4861)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  the patch passed  |
   | -1 :x: |  spotbugs  |   3m 31s |  phoenix-core generated 1 new + 963 unchanged - 2 fixed = 964 total (was 965)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 124m 10s |  phoenix-core in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  The patch does not generate ASF License warnings.  |
   |  |   | 204m 29s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:phoenix-core |
   |  |  org.apache.phoenix.schema.IndexDataColumnRef doesn't override ColumnRef.equals(Object)  At IndexDataColumnRef.java:At IndexDataColumnRef.java:[line 1] |
   | Failed junit tests | phoenix.end2end.index.GlobalIndexOptimizationIT |
   |   | phoenix.end2end.index.IndexUsageIT |
   |   | phoenix.end2end.DerivedTableIT |
   |   | phoenix.end2end.UserDefinedFunctionsIT |
   |   | phoenix.end2end.IndexToolIT |
   |   | phoenix.end2end.RowValueConstructorOffsetIT |
   |   | phoenix.end2end.ViewIT |
   |   | phoenix.end2end.DistinctPrefixFilterIT |
   |   | phoenix.end2end.ExplainPlanWithStatsEnabledIT |
   |   | phoenix.end2end.DefaultColumnValueIT |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/phoenix/pull/1256 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs hbaserebuild hbaseanti checkstyle compile |
   | uname | Linux 90ed47f9bc26 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev/phoenix-personality.sh |
   | git revision | master / fcdf5bc |
   | Default Java | Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08 |
   | checkstyle | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt |
   | spotbugs | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/new-spotbugs-phoenix-core.html |
   | unit | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/artifact/yetus-general-check/output/patch-unit-phoenix-core.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/testReport/ |
   | Max. process+thread count | 12127 (vs. ulimit of 30000) |
   | modules | C: phoenix-core U: phoenix-core |
   | Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/3/console |
   | versions | git=2.7.4 maven=3.3.9 spotbugs=4.1.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r659548098



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/iterate/ExplainTable.java
##########
@@ -302,7 +306,8 @@ private void appendPKColumnValue(StringBuilder buf, byte[] range, Boolean isNull
     private Long getViewIndexValue(PDataType type, byte[] range) {
         boolean useLongViewIndex = MetaDataUtil.getViewIndexIdDataType().equals(type);
         Object s = type.toObject(range);
-        return (useLongViewIndex ? (Long) s : (Short) s) + Short.MAX_VALUE + 2;
+        //return (useLongViewIndex ? (Long) s : (Short) s) + Short.MAX_VALUE + 2;
+        return (useLongViewIndex ? (Long) s : 0) + Short.MAX_VALUE + 2;

Review comment:
       Good catch! I forgot to undue the change there. I will fix it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940


   @kadirozde @lhofhansl  FYI.
   
   1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, for the sql with the columns that are not covered by the global index, if user specify a Index  Hint and there exists where clause, the sql would be rewritten as
   "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T  WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX"  WHERE "0:V1" = 'bar')) AND V2 = 'foo') " (k is pk of T , v1 is in IDX and v2 is not), you may consider compatibility with exising code.
   
   2.Whether or not scaning the gobal index and  retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to  consider such as Network cost, random disk access cost , data distribution , column selective  etc.  you said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient.  Lack of a CBO framework in Phoenix, it is sensible to be conservative, I think it is better to left the choice to user by users specifying the Index Hint just as the existing code.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940


   @kadirozde @lhofhansl  FYI.
   
   1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, if user specify a Index  Hint and there exists where clause, the sql would be rewritten as
   "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T  WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX"  WHERE "0:V1" = 'bar')) AND V2 = 'foo') " , you may consider compatibility with exising code.
   
   2.Whether or not scaning the gobal index and  retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to  consider such as Network cost, random disk access cost , data distribution , column selective  etc.  you said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is very insufficient.  Lack of a CBO framework in Phoenix, it is sensible to be conservative, I think it is better to left the choice to user by users specifying the Index Hint just as the existing code.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] virajjasani commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

virajjasani commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r667666618



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/compile/TupleProjectionCompiler.java
##########
@@ -156,7 +155,7 @@ public static PTable createProjectedTable(SelectStatement select, StatementConte
         }
         // add LocalIndexDataColumnRef

Review comment:
       nit: `IndexDataColumnRef` in place of `LocalIndexDataColumnRef`?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }

Review comment:
       Good to replace with `try-with-resources`?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/BaseScannerRegionObserver.java
##########
@@ -80,9 +81,15 @@
         "_IndexRebuildDisableLoggingVerifyType";
     public static final String INDEX_REBUILD_DISABLE_LOGGING_BEYOND_MAXLOOKBACK_AGE =
         "_IndexRebuildDisableLoggingBeyondMaxLookbackAge";
+    @Deprecated
     public static final String LOCAL_INDEX_FILTER = "_LocalIndexFilter";
+    @Deprecated
     public static final String LOCAL_INDEX_LIMIT = "_LocalIndexLimit";
+    @Deprecated
     public static final String LOCAL_INDEX_FILTER_STR = "_LocalIndexFilterStr";

Review comment:
       I believe these deprecated fields can be removed only after we come to a major release (e.g 6.x/7.x) where server running on that release can no longer be directly supported by client version <= 4.16/4.17, is that correct?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/compile/FromCompiler.java
##########
@@ -1168,8 +1168,8 @@ public ColumnRef resolveColumn(String schemaName, String tableName, String colNa
                 colRef = super.resolveColumn(schemaName, tableName, colName);
             } catch (ColumnNotFoundException e) {
                 // This could be a ColumnRef for local index data column.

Review comment:
       nit: we can remove `local` reference here?

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.

Review comment:
       Same here reg `try-with-resources`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-914695124

@lhofhansl , @comnetwork, I have done some performance testing on a cluster with 15 region servers. I created a data table with 16 million rows. Each row is about 2500 bytes. The row key of this table is composed of four fields (VARCHAR, INTEGER, TIMESTAMP, VARCHAR). I run the same test without an index, with a covered index and with an uncovered index. The timestamp field is indexed. The query used in the test returned N rows that fall in to the a supplied timestamp range, where N is supplied as the limit parameter. The query returns four fields. The query times in ms are as follows:

limit covered uncovered no index
1 212 252 4404
10 215 256 5375
100 215 310 5169
1000 232 1125 4698
10000 433 7325 6440
100000 1588 67002 6789

It is clear that if the number of selected rows is large (in this case 10000 or more) the uncovered index starts to perform worse than the full table scan. No sure if these results are generalizable. Instead of using an uncovered index by default, I will add a logic to use an uncovered index only if it is given as a hint.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r659498128



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/iterate/ExplainTable.java
##########
@@ -302,7 +306,8 @@ private void appendPKColumnValue(StringBuilder buf, byte[] range, Boolean isNull
     private Long getViewIndexValue(PDataType type, byte[] range) {
         boolean useLongViewIndex = MetaDataUtil.getViewIndexIdDataType().equals(type);
         Object s = type.toObject(range);
-        return (useLongViewIndex ? (Long) s : (Short) s) + Short.MAX_VALUE + 2;
+        //return (useLongViewIndex ? (Long) s : (Short) s) + Short.MAX_VALUE + 2;
+        return (useLongViewIndex ? (Long) s : 0) + Short.MAX_VALUE + 2;

Review comment:
       I suspect this will cause some very tricky backwards compatibility issue.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-1050268861


   @gokceni, Thank you for approving the updated version. @lhofhansl and @comnetwork, thank you for reviewing the earlier versions and I have updated the PR based on your comments. I am going to merge this now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870799645


   > @kadirozde @lhofhansl FYI.
   > 
   > 1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, for the sql with the columns that are not covered by the global index, if user specify a Index Hint and there exists where clause, the sql would be rewritten as
   > "SELECT /*+ NO_INDEX _/ K,V1,V2 FROM T WHERE ("K" IN ((SELECT /_+ INDEX(T IDX) */ ":K" FROM "IDX" WHERE "0:V1" = 'bar')) AND V2 = 'foo') " (k is pk of T , v1 is in IDX and v2 is not), you may consider compatibility with exising code.
   > 
   
   What I meant is that by default the uncovered global index is not used. One can construct a query plan manually using hints as you pointed it out to use the uncovered global index. Please note that you can also construct a SQL join statement and achieve the same thing.  
   
   > 2.Whether or not scaning the gobal index and retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to consider such as Network cost, random disk access cost , data distribution , column selective etc. You said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient. Lack of a CBO framework in Phoenix, seems that it is sensible to be conservative, I think it is better to left whether or not select this strategy to user by user specifying the Index Hint just as the existing code.
   
   I agree that there is no guarantee that the index always performs better. However, based on my experience, it will perform better in most of the cases in practice.  This is because the index PK is designed by the user who knows the use case (the type and shape of queries) and the user wants that the index should be used if the index row key prefix length is greater than the data row key prefix length for a given query in general. I understand your concern here and please help me out on how to proceed here. I can add a config param to use uncovered indexes without a specific hint.  This mean that we will preserve the existing behavior if the config param is not specified. Would that address your concern?
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r660268736



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       I have created https://issues.apache.org/jira/browse/PHOENIX-6501 for this improvement




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940


   @kadirozde @lhofhansl  FYI.
   
   1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, for the sql with the columns that are not covered by the global index, if user specify a Index  Hint and there exists where clause, the sql would be rewritten as
   "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T  WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX"  WHERE "0:V1" = 'bar')) AND V2 = 'foo') " (k is pk of T , v1 is in IDX and v2 is not), you may consider compatibility with exising code.
   
   2.Whether or not scaning the gobal index and  retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to  consider such as Network cost, random disk access cost , data distribution , column selective  etc.  You said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient.  Lack of a CBO framework in Phoenix, seems that it is sensible to be conservative, I think it is better to left whether or not select this strategy to user by user specifying the Index Hint just as the existing code.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-871044039


   @kadirozde ， thank you for reply


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-1048306004


   Sorry for the late reply. I have not looked the updated PR, gating this on a hint seems fine. Let's make this hint is for global indexes only, and does not apply to local indexes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] lhofhansl commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

lhofhansl commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875280447


   If you do SELECT count(uncovered_column) FROM T WHERE covered_column = xyz, the global uncovered index is not used even when you hint it as expected (I just verified that current 5.x. Phoenix).
   
   I found that uncovered local indexes (that's what I tested) are something much slower than doing to a full table scan. That happens when there is a WHERE clause that an index could be used for, but the WHERE restriction is not selective.
   
   (As noted above, FAST_DIFF (Phoenix' default) is actually the worst choice since SEEKs are slow with it. ROW_INDEX_V1 with ZSTD compression are far better. I blogged about this here: https://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html a while ago: With FAST_DIFF the WHERE clause needed to be **0.5% (return 1/200 of the data)** to be effective. With ROW_INDEX_V1 + ZSTD that was 10%.)
   
   This is at best as good as uncovered local indexing, and probably worse since we need to go remote for each row, unless we do batching. And the batches would still be requiring a SKIP_SCAN, which in the general case is still very slow for FAST_DIFF.
   So I expect with defaults the WHERE clause would need to be somewhere between 1/1000 and 1/300 hundred selective for this to be improvement.
   
   Anyway... I think we should check this in. Presumable folks would create uncovered global indexes only when they know what they are doing.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] stoty commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

stoty commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-869380102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   6m 40s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  24m 22s |  master passed  |
   | +0 |  hbaserecompile  |  31m 27s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  4s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 46s |  master passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  master passed  |
   | +0 :ok: |  spotbugs  |   3m 20s |  phoenix-core in master has 965 extant spotbugs warnings.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  16m 38s |  the patch passed  |
   | +0 |  hbaserecompile  |  26m 58s |  HBase recompiled.  |
   | +1 :green_heart: |  compile  |   1m  4s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  4s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 46s |  phoenix-core: The patch generated 129 new + 4749 unchanged - 112 fixed = 4878 total (was 4861)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  the patch passed  |
   | -1 :x: |  spotbugs  |   3m 31s |  phoenix-core generated 1 new + 963 unchanged - 2 fixed = 964 total (was 965)  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   1m 45s |  phoenix-core in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 14s |  The patch does not generate ASF License warnings.  |
   |  |   |  79m 43s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:phoenix-core |
   |  |  org.apache.phoenix.schema.IndexDataColumnRef doesn't override ColumnRef.equals(Object)  At IndexDataColumnRef.java:At IndexDataColumnRef.java:[line 1] |
   | Failed junit tests | phoenix.compile.QueryOptimizerTest |
   |   | phoenix.compile.TenantSpecificViewIndexCompileTest |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/phoenix/pull/1256 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs hbaserebuild hbaseanti checkstyle compile |
   | uname | Linux f6d075d7d9a5 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev/phoenix-personality.sh |
   | git revision | master / fcdf5bc |
   | Default Java | Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08 |
   | checkstyle | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/1/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt |
   | spotbugs | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/1/artifact/yetus-general-check/output/new-spotbugs-phoenix-core.html |
   | unit | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/1/artifact/yetus-general-check/output/patch-unit-phoenix-core.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/1/testReport/ |
   | Max. process+thread count | 431 (vs. ulimit of 30000) |
   | modules | C: phoenix-core U: phoenix-core |
   | Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1256/1/console |
   | versions | git=2.7.4 maven=3.3.9 spotbugs=4.1.3 |
   | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-1023603041


   > @kadirozde @lhofhansl FYI.
   > 
   > 1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, for the sql with the columns that are not covered by the global index, if user specify a Index Hint and there exists where clause, the sql would be rewritten as "SELECT /*+ NO_INDEX _/ K,V1,V2 FROM T WHERE ("K" IN ((SELECT /_+ INDEX(T IDX) */ ":K" FROM "IDX" WHERE "0:V1" = 'bar')) AND V2 = 'foo') " (k is pk of T , v1 is in IDX and v2 is not), you may consider compatibility with exising code.
   > 
   > 2.Whether or not scaning the gobal index and retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to consider such as Network cost, random disk access cost , data distribution , column selective etc. You said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient. Lack of a CBO framework in Phoenix, seems that it is sensible to be conservative, I think it is better to left whether or not select this strategy to user by user specifying the Index Hint just as the existing code.
   
   @comnetwork @lhofhansl, I have updated the PR such that the uncovered global indexes will be used only when the index hint is provided as @comnetwork suggested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde merged pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde merged pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r676045892



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }

Review comment:
       Agree




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] kadirozde commented on a change in pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

kadirozde commented on a change in pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#discussion_r660011956



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
##########
@@ -574,20 +575,35 @@ public static void wrapResultUsingOffset(final RegionCoprocessorEnvironment envi
                 }
             }
             Result joinResult = null;
-            if (dataRegion != null) {
-                joinResult = dataRegion.get(get);
-            } else {
-                TableName dataTable =
-                    TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
-                        getTableDescriptor().getTableName().getNameAsString()));
+            if (ScanUtil.isLocalIndex(scan)) {
+                if (dataRegion != null) {
+                    joinResult = dataRegion.get(get);
+                } else {
+                    TableName dataTable =
+                        TableName.valueOf(MetaDataUtil.getLocalIndexUserTableName(environment.getRegion().
+                            getTableDescriptor().getTableName().getNameAsString()));
+                    Table table = null;
+                    try {
+                        table = environment.getConnection().getTable(dataTable);
+                        joinResult = table.get(get);
+                    } finally {
+                        if (table != null) table.close();
+                    }
+                }
+            } else if (ScanUtil.isGlobalIndex(scan)) {
+                byte[] dataTableName = scan.getAttribute(PHYSICAL_DATA_TABLE_NAME);
                 Table table = null;
                 try {
-                    table = environment.getConnection().getTable(dataTable);
+                    table = ServerUtil.ConnectionFactory.
+                            getConnection(ServerUtil.ConnectionType.INDEX_WRITER_CONNECTION, environment).
+                            getTable(TableName.valueOf(dataTableName));
                     joinResult = table.get(get);

Review comment:
       I was not sure if I do it in this PR or open a separate jira for that. We can buffer lots of data row keys in memory and then use a skip scan filter and even multiple threads to issue a separate scan for each data table region. Essentially, this is what we do for reverse index verification. However, it requires quite a bit code refactoring. Please note this PR updates both client and server code. If we leave the improvement to the second PR, that will update only the server side. What do you think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [phoenix] comnetwork edited a comment on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

Posted by GitBox <gi...@apache.org>.

comnetwork edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870392940


   @kadirozde @lhofhansl  FYI.
   
   1.You said "Phoenix client does not use a global index for the queries with the columns that are not covered by the global index" is not right , In QueryOptimizer.addPlan, if user specify a Index  Hint and there exists where clause, the sql would be rewritten as
   "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T  WHERE ("K" IN ((SELECT /*+ INDEX(T IDX) */ ":K" FROM "IDX"  WHERE "0:V1" = 'bar')) AND V2 = 'foo') " (k is pk of T , v1 is in IDX and v2 is not), you may consider compatibility with exising code.
   
   2.Whether or not scaning the gobal index and  retrieving the corresponding rows from the data table is better than just scaning the data table is a complex problem, because there are many factors we need to  consider such as Network cost, random disk access cost , data distribution , column selective  etc.  you said "It is expected that such performance improvement will happen when the index row key prefix length is greater than the data row key prefix length for a given query" is extremely insufficient.  Lack of a CBO framework in Phoenix, it is sensible to be conservative, I think it is better to left the choice to user by users specifying the Index Hint just as the existing code.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org