You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/08/23 14:33:49 UTC

[GitHub] [carbondata] maheshrajus opened a new pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

maheshrajus opened a new pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209


    ### Why is this PR needed?
    need to update
    
    ### What changes were proposed in this PR?
   need to update
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906366831


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/299/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906697802


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4156/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] asfgit closed pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] maheshrajus commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
maheshrajus commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906604592


   retest this please


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910045640


   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906693743


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5900/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903913169


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4136/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699978830



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       When cgIndexExprWrapper.prune is called (for SI/Bloom), in that case, it will have either BloomcoarseGainIndex or SIindex. So, we have to keep both cases




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903834730


   Build Failed  with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/280/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699976048



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699978830



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       When cgIndexExprWrapper.prune is called (for SI/Bloom), in that case, it will have either BloomcoarseGainIndex or SIindex. So, we have to keep both cases




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906689769


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/301/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906371382


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4154/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699953757



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {

Review comment:
       In line 213, `indexes.get(segment)` is called again. Avoid it by using the existing result from `segmentIndices`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699953757



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {

Review comment:
       In line 213, `indexes.get(segment)` is called again. Avoid it by using the existing result from `segmentIndices`.

##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       I mean line 211, when `indexes.get(segment)` is done, all indexes will be present in this list. Like blockindex, SI, bloom, lucence. 
   so, we just have to find blockindex (which is always be present even for SI table) and get the the segment properties ? 
   @Indhumathi27 : Is my understanding correct or we need both cases here ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910132712


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5907/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903900437


   Build Failed  with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/281/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910133774


   Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/309/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699975802



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       i checked in SI flow, it has only SI index info and does not have blockindex info. We can add separate checks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906382333


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5898/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903909007


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5879/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699965205



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       I mean line 211, when `indexes.get(segment)` is done, all indexes will be present in this list. Like blockindex, SI, bloom, lucence. 
   so, we just have to find blockindex (which is always be present even for SI table) and get the the segment properties ? 
   @Indhumathi27 : Is my understanding correct or we need both cases here ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r698263969



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       Need to loop here and check if any of the segmentIndices is BlockIndex instead of always checking the first index?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910137002


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4163/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910045640


   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699885987



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       @ajantha-bhat  @Indhumathi27 1) we are getting segment properties from one blockindex which is same for all. 
   2) In this flow we may receive SI and bloom indexes also which will prepare by SI , bloom index factory [List(SecondaryIndex) or List(BloomIndex)]. In this case we should use "segmentPropertiesFetcher.getSegmentProperties" as they are not instance of BlockIndex. inside getSegmentProperties method we are getting and preparing list of BlockIndex and getting segment propteries from block index.
   
   I will add scenario description in code and push.

##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       i checked in SI flow, it has only SI index info and does not have blockindex info. We can add separate checks

##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699885987



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       @ajantha-bhat  @Indhumathi27 1) we are getting segment properties from one blockindex which is same for all. 
   2) In this flow we may receive SI and bloom indexes also which will prepare by SI , bloom index factory [List(SecondaryIndex) or List(BloomIndex)]. In this case we should use "segmentPropertiesFetcher.getSegmentProperties" as they are not instance of BlockIndex. inside getSegmentProperties method we are getting and preparing list of BlockIndex and getting segment propteries from block index.
   
   I will add scenario description in code and push.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r698263969



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       Need to loop here and check if any of the `segmentIndices `is `BlockIndex `instead of always checking the first index?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r698265226



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
       Map<Segment, List<Index>> indexes) throws IOException {
     Set<String> missingSISegments = filter.getMissingSISegments();
     for (Segment segment : segments) {
+      List<Index> segmentIndices = indexes.get(segment);
       if (segment == null ||
-          indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+          indexes.get(segment) == null || segmentIndices.isEmpty()) {
         continue;
       }
       boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
           (missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
       List<Blocklet> pruneBlocklets = new ArrayList<>();
-      SegmentProperties segmentProperties =
-          segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+      SegmentProperties segmentProperties;
+      if (segmentIndices.get(0) instanceof BlockIndex) {

Review comment:
       Also CC: @Indhumathi27 , you can also review once. 
   May be we can remove   `segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations); `
   and always call `segmentPropertiesFetcher.getSegmentPropertiesFromIndex` as we have indexes already at this place.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org