Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/01/25 10:53:20 UTC

[GitHub] [carbondata] ShreelekhyaG opened a new pull request #4080: [WIP] Filter query having invalid results when add segment to SI with Indexserver

ShreelekhyaG opened a new pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080


    ### Why is this PR needed?
    The rows added by the external segment are not visible on filter queries with the index server.
    
    ### What changes were proposed in this PR?
    Added the segment path to the segment so that it can be identified as an external segment in the filter resolver step.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No, tested in cluster.
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r566032652



##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/SegmentWrapperContainer.java
##########
@@ -31,6 +31,9 @@
 
   private SegmentWrapper[] segmentWrappers;
 
+  public SegmentWrapperContainer() {

Review comment:
       Without this constructor, `NoSuchMethodException` is thrown, because the default constructor is invoked from reflectionUtils.
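
The failure mode mentioned in this comment can be reproduced with plain Java reflection. The following is a minimal sketch, not the CarbonData code: `WithoutDefault`, `WithDefault`, and `canInstantiate` are hypothetical names, and the sketch only assumes that the reflection helper looks up a no-argument constructor.

```java
import java.lang.reflect.Constructor;

final class DefaultConstructorSketch {

  // Hypothetical stand-in for a container that only has a parameterized constructor.
  static class WithoutDefault {
    private final String[] items;

    WithoutDefault(String[] items) {
      this.items = items;
    }
  }

  // Same shape, plus the no-argument constructor that reflection-based creation needs.
  static class WithDefault {
    private String[] items;

    WithDefault() {
    }

    WithDefault(String[] items) {
      this.items = items;
    }
  }

  // Returns true if the type can be created through its no-argument constructor.
  static boolean canInstantiate(Class<?> type) {
    try {
      Constructor<?> constructor = type.getDeclaredConstructor();
      constructor.setAccessible(true);
      constructor.newInstance();
      return true;
    } catch (NoSuchMethodException e) {
      // No no-argument constructor is declared on the type.
      return false;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(canInstantiate(WithoutDefault.class));
    System.out.println(canInstantiate(WithDefault.class));
  }
}
```

Declaring any parameterized constructor suppresses the implicit default one, which is why the explicit empty constructor has to be added.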







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569345693



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
     sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
         s"'format'='carbon')")
     sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-    assert(sql("show segments for table maintable1_si").collect().length ==
-           sql("show segments for table maintable1").collect().length)
+    assert(sql("show segments for table maintable1_si").collect().length == 2)
+    assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
       We use externalSegment Resolver for pruning external segments on the main table, and for the remaining segments we create an implicit filter to prune using SI. So, there will not be any wrong results for the query.
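
The split described in this comment can be illustrated roughly as follows. This is a sketch under assumptions, not the actual resolver code: `SegmentInfo`, `prunedViaMainTable`, and `prunedViaSecondaryIndex` are hypothetical names, and a segment is treated as external when it carries a non-null path, mirroring the `oneLoad.getPath() == null` check in this PR.

```java
import java.util.ArrayList;
import java.util.List;

final class SegmentRoutingSketch {

  // Hypothetical minimal segment: a non-null path marks an external (added) segment.
  static final class SegmentInfo {
    final String id;
    final String path;

    SegmentInfo(String id, String path) {
      this.id = id;
      this.path = path;
    }

    boolean isExternal() {
      return path != null;
    }
  }

  // External segments are not loaded to SI, so they are pruned on the main table.
  static List<String> prunedViaMainTable(List<SegmentInfo> segments) {
    List<String> ids = new ArrayList<>();
    for (SegmentInfo segment : segments) {
      if (segment.isExternal()) {
        ids.add(segment.id);
      }
    }
    return ids;
  }

  // The remaining (transactional) segments are pruned using SI.
  static List<String> prunedViaSecondaryIndex(List<SegmentInfo> segments) {
    List<String> ids = new ArrayList<>();
    for (SegmentInfo segment : segments) {
      if (!segment.isExternal()) {
        ids.add(segment.id);
      }
    }
    return ids;
  }

  public static void main(String[] args) {
    List<SegmentInfo> segments = new ArrayList<>();
    segments.add(new SegmentInfo("0", null));
    segments.add(new SegmentInfo("1", null));
    segments.add(new SegmentInfo("2", "/tmp/external/segment"));
    System.out.println("SI: " + prunedViaSecondaryIndex(segments));
    System.out.println("main table: " + prunedViaMainTable(segments));
  }
}
```

With two loaded segments and one added segment, the main table sees three segments while SI covers two, which matches the counts asserted in the updated test.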







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772544028


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5420/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772714795


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5423/
   





[GitHub] [carbondata] asfgit closed pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080


   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-770913921


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3635/
   





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567904349



##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##########
@@ -221,7 +221,7 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
       indexUniqueId = in.readUTF();
     }
     String filePath = getPath();
-    if (filePath.startsWith(File.separator)) {
+    if (!FileFactory.isFileExist(filePath)) {

Review comment:
       As discussed, the FileExists check is now used only for local file paths.













[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772314934


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3653/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772551414


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3659/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-773101117


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5426/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-768159789


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3598/
   





[GitHub] [carbondata] Indhumathi27 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772547535


   @ShreelekhyaG please update PR description








[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [WIP] Filter query having invalid results when add segment to SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-766783288


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3592/
   





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567844159



##########
File path: integration/spark/src/main/java/org/apache/spark/sql/secondaryindex/load/CarbonInternalLoaderUtil.java
##########
@@ -51,9 +51,10 @@
   public static List<String> getListOfValidSlices(LoadMetadataDetails[] details) {
     List<String> activeSlices = new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
     for (LoadMetadataDetails oneLoad : details) {
-      if (SegmentStatus.SUCCESS.equals(oneLoad.getSegmentStatus())
+      // No need to consider external segments for SI load
+      if (oneLoad.getPath() == null && (SegmentStatus.SUCCESS.equals(oneLoad.getSegmentStatus())

Review comment:
       Please add code so that SI is not repaired when add load is executed.
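
The effect of the extra `getPath() == null` condition in the hunk above can be sketched as below. This is a simplified illustration, not the real `getListOfValidSlices`: `LoadDetail`, `Status`, and `validSlicesForSiLoad` are hypothetical names, and only a single success status is modelled because the rest of the original condition is not shown in the diff.

```java
import java.util.ArrayList;
import java.util.List;

final class ValidSlicesSketch {

  // Hypothetical, reduced set of load statuses for illustration only.
  enum Status { SUCCESS, MARKED_FOR_DELETE }

  // Hypothetical stand-in for one load entry; path is non-null only for external segments.
  static final class LoadDetail {
    final String loadName;
    final Status status;
    final String path;

    LoadDetail(String loadName, Status status, String path) {
      this.loadName = loadName;
      this.status = status;
      this.path = path;
    }
  }

  // Keeps successful loads and skips external segments, which are not loaded to SI.
  static List<String> validSlicesForSiLoad(List<LoadDetail> details) {
    List<String> activeSlices = new ArrayList<>();
    for (LoadDetail oneLoad : details) {
      if (oneLoad.path == null && oneLoad.status == Status.SUCCESS) {
        activeSlices.add(oneLoad.loadName);
      }
    }
    return activeSlices;
  }

  public static void main(String[] args) {
    List<LoadDetail> details = new ArrayList<>();
    details.add(new LoadDetail("0", Status.SUCCESS, null));
    details.add(new LoadDetail("1", Status.MARKED_FOR_DELETE, null));
    details.add(new LoadDetail("2", Status.SUCCESS, "/tmp/external/segment"));
    System.out.println(validSlicesForSiLoad(details));
  }
}
```

A repair path that builds its work list from such a filter would no longer pick up the added segment, which is the behaviour requested here.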







[GitHub] [carbondata] akashrn5 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
akashrn5 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-773182792


   LGTM





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567594628



##########
File path: core/src/main/java/org/apache/carbondata/core/index/IndexInputFormat.java
##########
@@ -159,6 +162,19 @@ public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptCont
           if (filterResolverIntf != null) {
             filter.setExpression(filterResolverIntf.getFilterExpression());
           }
+          for (Segment segment : segmentsToLoad) {

Review comment:
       I think that after [PR-3656](https://github.com/apache/carbondata/pull/3656) we no longer load external segments to SI; external segments are queried from the main table only. Currently, however, I think the SILoadEventListenerForFailedSegments listener loads external segments when SI repair is enabled. Please check and handle it.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-768940103


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3611/
   





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569967707



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -61,13 +61,16 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
   }
 
   test("test if the query hits SI after adding a segment to the main table") {
-    val d = sql("select * from maintable where c = 'm'")
-    assert(d.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
+    val extSegmentQuery = sql("select * from maintable where c = 'm'")
+    val loadedSegmentQuery = sql("select * from maintable where c = 'k'")

Review comment:
       Done







[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569160839



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
     sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
         s"'format'='carbon')")
     sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-    assert(sql("show segments for table maintable1_si").collect().length ==
-           sql("show segments for table maintable1").collect().length)
+    assert(sql("show segments for table maintable1_si").collect().length == 2)
+    assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
       Disabled the SI table after alter add load and added a check in the test cases to verify it.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-771878716


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5413/
   





[GitHub] [carbondata] Indhumathi27 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-773142182


   LGTM








[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-768161298


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5358/
   








[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567812890



##########
File path: core/src/main/java/org/apache/carbondata/core/index/IndexInputFormat.java
##########
@@ -159,6 +162,19 @@ public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptCont
           if (filterResolverIntf != null) {
             filter.setExpression(filterResolverIntf.getFilterExpression());
           }
+          for (Segment segment : segmentsToLoad) {

Review comment:
       Done. Added a check to skip external segments.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [WIP] Filter query having invalid results when add segment to SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-766782521


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5352/
   





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569951238



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -61,13 +61,16 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
   }
 
   test("test if the query hits SI after adding a segment to the main table") {
-    val d = sql("select * from maintable where c = 'm'")
-    assert(d.queryExecution.executedPlan.isInstanceOf[BroadCastSIFilterPushJoin])
+    val extSegmentQuery = sql("select * from maintable where c = 'm'")
+    val loadedSegmentQuery = sql("select * from maintable where c = 'k'")

Review comment:
       Please verify the result.







[GitHub] [carbondata] ydvpankaj99 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ydvpankaj99 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-768880913


   retest this please





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-770909939


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5395/
   





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567582402



##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##########
@@ -221,7 +221,7 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
       indexUniqueId = in.readUTF();
     }
     String filePath = getPath();
-    if (filePath.startsWith(File.separator)) {
+    if (!FileFactory.isFileExist(filePath)) {

Review comment:
       Please revert this change, as it would do an RPC call. Use FileFactory.getupdatedFilePath instead, as discussed.







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569246168



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
     sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
         s"'format'='carbon')")
     sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-    assert(sql("show segments for table maintable1_si").collect().length ==
-           sql("show segments for table maintable1").collect().length)
+    assert(sql("show segments for table maintable1_si").collect().length == 2)
+    assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
       @akashrn5 @ShreelekhyaG Why do we need to disable SI? I don't think we need to disable SI when an external segment is added. The query will use SI to prune transactional segments and the main table to prune external segments.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772714704


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3662/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-771004121


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3638/
   





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567903474



##########
File path: integration/spark/src/main/java/org/apache/spark/sql/secondaryindex/load/CarbonInternalLoaderUtil.java
##########
@@ -51,9 +51,10 @@
   public static List<String> getListOfValidSlices(LoadMetadataDetails[] details) {
     List<String> activeSlices = new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
     for (LoadMetadataDetails oneLoad : details) {
-      if (SegmentStatus.SUCCESS.equals(oneLoad.getSegmentStatus())
+      // No need to consider external segments for SI load
+      if (oneLoad.getPath() == null && (SegmentStatus.SUCCESS.equals(oneLoad.getSegmentStatus())

Review comment:
       Done
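
       A minimal sketch of the filter discussed in this hunk, assuming that an externally added segment is the one whose load entry carries a non-null path. `SegmentEntry` and `ValidSliceSketch` are hypothetical stand-ins and not CarbonData classes, and only the SUCCESS status is checked because the diff truncates the real status condition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for LoadMetadataDetails; only the fields this sketch needs.
final class SegmentEntry {
  final String name;
  final String status; // for example "SUCCESS" or "MARKED_FOR_DELETE"
  final String path;   // null for transactional segments, set for externally added ones

  SegmentEntry(String name, String status, String path) {
    this.name = name;
    this.status = status;
    this.path = path;
  }
}

final class ValidSliceSketch {
  // Returns the names of segments that an SI load should consider.
  static List<String> validSlices(List<SegmentEntry> details) {
    List<String> active = new ArrayList<>();
    for (SegmentEntry entry : details) {
      boolean external = entry.path != null;
      // Assumption: only SUCCESS is checked; the real condition is truncated in the diff.
      if (!external && "SUCCESS".equals(entry.status)) {
        active.add(entry.name);
      }
    }
    return active;
  }
}
```

       With two loaded segments and one external segment, the sketch returns only the two transactional ones, which is consistent with the updated test expecting 2 SI segments against 3 main-table segments.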







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
akashrn5 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569266756



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
     sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
         s"'format'='carbon')")
     sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-    assert(sql("show segments for table maintable1_si").collect().length ==
-           sql("show segments for table maintable1").collect().length)
+    assert(sql("show segments for table maintable1_si").collect().length == 2)
+    assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
       Since we don't load SI for externally added segments, this is a segment-mismatch scenario between the SI and the main table, which may lead to wrong results or failures. We don't yet support getting results from the SI for some segments and from the main table for others, right?
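
       To illustrate the mismatch described here, a minimal sketch that compares the segment lists of the main table and the SI table and reports which main-table segments have no SI counterpart. `SiSyncSketch`, `missingFromSi` and `canUseSi` are hypothetical names for illustration only; how CarbonData actually detects the mismatch or disables the SI is not shown in this thread.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class SiSyncSketch {
  // Segments present in the main table but absent from the SI table.
  static Set<String> missingFromSi(Collection<String> mainSegments,
                                   Collection<String> siSegments) {
    Set<String> missing = new TreeSet<>(mainSegments);
    missing.removeAll(siSegments);
    return missing;
  }

  // Assumption: the SI is only safe to use when it covers every main-table segment.
  static boolean canUseSi(Collection<String> mainSegments,
                          Collection<String> siSegments) {
    return missingFromSi(mainSegments, siSegments).isEmpty();
  }
}
```

       For the test scenario in this hunk (main table segments 0, 1, 2 and SI segments 0, 1), the sketch reports segment 2 as missing, so the SI would not be used.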







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-772313800


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5414/
   





[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r567844159



##########
File path: integration/spark/src/main/java/org/apache/spark/sql/secondaryindex/load/CarbonInternalLoaderUtil.java
##########
@@ -51,9 +51,10 @@
   public static List<String> getListOfValidSlices(LoadMetadataDetails[] details) {
     List<String> activeSlices = new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
     for (LoadMetadataDetails oneLoad : details) {
-      if (SegmentStatus.SUCCESS.equals(oneLoad.getSegmentStatus())
+      // No need to consider external segments for SI load
+      if (oneLoad.getPath() == null && (SegmentStatus.SUCCESS.equals(oneLoad.getSegmentStatus())

Review comment:
       Please add code to skip SI repair when an external segment is added.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-773101426


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3666/
   





[GitHub] [carbondata] akashrn5 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
akashrn5 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-773182792


   LGTM





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-768935266


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5369/
   





[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
vikramahuja1001 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r565949370



##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/SegmentWrapperContainer.java
##########
@@ -31,6 +31,9 @@
 
   private SegmentWrapper[] segmentWrappers;
 
+  public SegmentWrapperContainer() {

Review comment:
       why is this change required?







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
akashrn5 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r568509426



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
     sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
         s"'format'='carbon')")
     sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-    assert(sql("show segments for table maintable1_si").collect().length ==
-           sql("show segments for table maintable1").collect().length)
+    assert(sql("show segments for table maintable1_si").collect().length == 2)
+    assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
       Also add an assert checking that the SI table is disabled and that the query doesn't hit the SI.

##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##########
@@ -221,7 +223,13 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
       indexUniqueId = in.readUTF();
     }
     String filePath = getPath();
-    if (filePath.startsWith(File.separator)) {
+    boolean isLocalFile = FileFactory.getCarbonFile(filePath) instanceof LocalCarbonFile;
+    // If it is external segment path, table path need not be appended to filePath
+    // Example filepath: hdfs://hacluster/opt/newsegmentpath/
+    // filePath value would start with hdfs:// or s3:// . If it is local
+    // ubuntu storage, it starts with File separator, so check if given path exists or not.
+    if ((!isLocalFile && filePath.startsWith(File.separator)) || (isLocalFile && !FileFactory

Review comment:
       The comment is not clear; please rewrite it with a better example and scenarios.
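
       One possible reading of the scenarios in this hunk, as a minimal sketch. It assumes that a path relative to the table gets the table path prepended, while an external segment path is used as-is. `BlockletPathSketch.resolve` and the `exists` predicate are hypothetical; the predicate stands in for the file-existence check that the truncated condition appears to make, and `/` is used in place of `File.separator` to keep the example platform-independent.

```java
import java.util.function.Predicate;

final class BlockletPathSketch {
  // Returns the full path of a blocklet file after deserialization.
  static String resolve(String filePath, String tablePath,
                        boolean isLocalFile, Predicate<String> exists) {
    boolean relativeToTable;
    if (isLocalFile) {
      // Local storage: table-relative and external paths both start with "/",
      // so the only way to tell them apart is whether the path exists as given.
      relativeToTable = !exists.test(filePath);
    } else {
      // HDFS or S3: external paths start with a scheme such as hdfs:// or s3://,
      // while table-relative paths start with the separator.
      relativeToTable = filePath.startsWith("/");
    }
    return relativeToTable ? tablePath + filePath : filePath;
  }
}
```

       Under these assumptions, `hdfs://hacluster/opt/newsegmentpath/part-0.carbondata` is returned unchanged, while a table-relative path such as `/Fact/Part0/Segment_0/part-0.carbondata` is prefixed with the table path.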







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-770996367


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5398/
   





[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#issuecomment-771881699


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3652/
   





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
ShreelekhyaG commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569160528



##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
##########
@@ -221,7 +223,13 @@ public void deserializeFields(DataInput in, String[] locations, String tablePath
       indexUniqueId = in.readUTF();
     }
     String filePath = getPath();
-    if (filePath.startsWith(File.separator)) {
+    boolean isLocalFile = FileFactory.getCarbonFile(filePath) instanceof LocalCarbonFile;
+    // If it is external segment path, table path need not be appended to filePath
+    // Example filepath: hdfs://hacluster/opt/newsegmentpath/
+    // filePath value would start with hdfs:// or s3:// . If it is local
+    // ubuntu storage, it starts with File separator, so check if given path exists or not.
+    if ((!isLocalFile && filePath.startsWith(File.separator)) || (isLocalFile && !FileFactory

Review comment:
       ok done







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4080: [CARBONDATA-4111] Filter query having invalid results after add segment to table having SI with Indexserver

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4080:
URL: https://github.com/apache/carbondata/pull/4080#discussion_r569246168



##########
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithAddSegment.scala
##########
@@ -86,8 +86,8 @@ class TestSIWithAddSegment extends QueryTest with BeforeAndAfterAll {
     sql(s"alter table maintable1 add segment options('path'='${ newSegmentPath }', " +
         s"'format'='carbon')")
     sql("CREATE INDEX maintable1_si  on table maintable1 (c) as 'carbondata'")
-    assert(sql("show segments for table maintable1_si").collect().length ==
-           sql("show segments for table maintable1").collect().length)
+    assert(sql("show segments for table maintable1_si").collect().length == 2)
+    assert(sql("show segments for table maintable1").collect().length == 3)

Review comment:
       @akashrn5 Why do we need to disable SI?



