You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/08/23 14:33:49 UTC
[GitHub] [carbondata] maheshrajus opened a new pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
maheshrajus opened a new pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209
### Why is this PR needed?
need to update
### What changes were proposed in this PR?
need to update
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
### Is any new testcase added?
- No
- Yes
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906366831
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/299/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906697802
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4156/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] asfgit closed pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] maheshrajus commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
maheshrajus commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906604592
retest this please
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910045640
LGTM
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906693743
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5900/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903913169
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4136/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699978830
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
When cgIndexExprWrapper.prune is called (for SI/Bloom), in that case, it will have either BloomcoarseGainIndex or SIindex. So, we have to keep both cases
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903834730
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/280/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699976048
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
Review comment:
done
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
Indhumathi27 commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699978830
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
When cgIndexExprWrapper.prune is called (for SI/Bloom), in that case, it will have either BloomcoarseGainIndex or SIindex. So, we have to keep both cases
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906689769
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/301/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906371382
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4154/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699953757
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
Review comment:
In line 213, `indexes.get(segment)` is called again. Avoid it by using the existing result from `segmentIndices`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699953757
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
Review comment:
In line 213, `indexes.get(segment)` is called again. Avoid it by using the existing result from `segmentIndices`.
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
I mean line 211, when `indexes.get(segment)` is done, all indexes will be present in this list. Like blockindex, SI, bloom, lucence.
so, we just have to find blockindex (which is always be present even for SI table) and get the the segment properties ?
@Indhumathi27 : Is my understanding correct or we need both cases here ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910132712
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5907/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903900437
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/281/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910133774
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/309/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699975802
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
i checked in SI flow, it has only SI index info and does not have blockindex info. We can add separate checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-906382333
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5898/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [WIP] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-903909007
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5879/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699965205
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
I mean line 211, when `indexes.get(segment)` is done, all indexes will be present in this list. Like blockindex, SI, bloom, lucence.
so, we just have to find blockindex (which is always be present even for SI table) and get the the segment properties ?
@Indhumathi27 : Is my understanding correct or we need both cases here ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r698263969
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
Need to loop here and check if any of the segmentIndices is BlockIndex instead of always checking the first index?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
CarbonDataQA2 commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910137002
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4163/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#issuecomment-910045640
LGTM
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699885987
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
@ajantha-bhat @Indhumathi27 1) we are getting segment properties from one blockindex which is same for all.
2) In this flow we may receive SI and bloom indexes also which will prepare by SI , bloom index factory [List(SecondaryIndex) or List(BloomIndex)]. In this case we should use "segmentPropertiesFetcher.getSegmentProperties" as they are not instance of BlockIndex. inside getSegmentProperties method we are getting and preparing list of BlockIndex and getting segment propteries from block index.
I will add scenario description in code and push.
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
i checked in SI flow, it has only SI index info and does not have blockindex info. We can add separate checks
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
Review comment:
done
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] maheshrajus commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
maheshrajus commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r699885987
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
@ajantha-bhat @Indhumathi27 1) we are getting segment properties from one blockindex which is same for all.
2) In this flow we may receive SI and bloom indexes also which will prepare by SI , bloom index factory [List(SecondaryIndex) or List(BloomIndex)]. In this case we should use "segmentPropertiesFetcher.getSegmentProperties" as they are not instance of BlockIndex. inside getSegmentProperties method we are getting and preparing list of BlockIndex and getting segment propteries from block index.
I will add scenario description in code and push.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r698263969
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
Need to loop here and check if any of the `segmentIndices `is `BlockIndex `instead of always checking the first index?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4209: [CARBONDATA-4278] Avoid refetching all indexes to get segment properties
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #4209:
URL: https://github.com/apache/carbondata/pull/4209#discussion_r698265226
##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -207,15 +208,22 @@ public CarbonTable getTable() {
Map<Segment, List<Index>> indexes) throws IOException {
Set<String> missingSISegments = filter.getMissingSISegments();
for (Segment segment : segments) {
+ List<Index> segmentIndices = indexes.get(segment);
if (segment == null ||
- indexes.get(segment) == null || indexes.get(segment).isEmpty()) {
+ indexes.get(segment) == null || segmentIndices.isEmpty()) {
continue;
}
boolean isExternalOrMissingSISegment = segment.isExternalSegment() ||
(missingSISegments != null && missingSISegments.contains(segment.getSegmentNo()));
List<Blocklet> pruneBlocklets = new ArrayList<>();
- SegmentProperties segmentProperties =
- segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations);
+ SegmentProperties segmentProperties;
+ if (segmentIndices.get(0) instanceof BlockIndex) {
Review comment:
Also CC: @Indhumathi27 , you can also review once.
May be we can remove `segmentPropertiesFetcher.getSegmentProperties(segment, partitionLocations); `
and always call `segmentPropertiesFetcher.getSegmentPropertiesFromIndex` as we have indexes already at this place.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscribe@carbondata.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org