Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2019/05/03 07:09:39 UTC

[GitHub] [carbondata] kumarvishal09 commented on a change in pull request #3177: [CARBONDATA-3337][CARBONDATA-3306] Distributed index server

URL: https://github.com/apache/carbondata/pull/3177#discussion_r280678335
 
 

 ##########
 File path: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 ##########
 @@ -572,26 +536,42 @@ public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
    For a NonTransactional table, one reason for a segment refresh is the scenario below:
    the SDK writes one set of files with a UUID and can write again with the same UUID,
    so the latest file content should be reflected in the new row count by refreshing the segment. */
-    List<Segment> toBeCleanedSegments = new ArrayList<>();
+    List<String> toBeCleanedSegments = new ArrayList<>();
     for (Segment eachSegment : filteredSegment) {
       boolean refreshNeeded = DataMapStoreManager.getInstance()
           .getTableSegmentRefresher(getOrCreateCarbonTable(job.getConfiguration()))
           .isRefreshNeeded(eachSegment,
               updateStatusManager.getInvalidTimestampRange(eachSegment.getSegmentNo()));
       if (refreshNeeded) {
-        toBeCleanedSegments.add(eachSegment);
+        toBeCleanedSegments.add(eachSegment.getSegmentNo());
       }
     }
-    // remove entry in the segment index if there are invalid segments
-    toBeCleanedSegments.addAll(allSegments.getInvalidSegments());
+    for (Segment segment : allSegments.getInvalidSegments()) {
+      // remove entry in the segment index if there are invalid segments
+      toBeCleanedSegments.add(segment.getSegmentNo());
+    }
     if (toBeCleanedSegments.size() > 0) {
       DataMapStoreManager.getInstance()
           .clearInvalidSegments(getOrCreateCarbonTable(job.getConfiguration()),
               toBeCleanedSegments);
     }
     if (isIUDTable || isUpdateFlow) {
-      Map<String, Long> blockletToRowCountMap =
-          defaultDataMap.getBlockRowCount(filteredSegment, partitions, defaultDataMap);
+      Map<String, Long> blockletToRowCountMap = new HashMap<>();
+      if (CarbonProperties.getInstance().isDistributedPruningEnabled(table.getDatabaseName(),
+          table.getTableName())) {
+        List<InputSplit> extendedBlocklets = CarbonTableInputFormat.convertToCarbonInputSplit(
+            getDistributedSplit(table, null, partitions, allSegments.getValidSegments(),
+                allSegments.getInvalidSegments(), toBeCleanedSegments));
+        for (InputSplit extendedBlocklet : extendedBlocklets) {
+          CarbonInputSplit blocklet = (CarbonInputSplit) extendedBlocklet;
+          blockletToRowCountMap.put(blocklet.getSegmentId() + "," + blocklet.getFilePath(),
+              Long.getLong(String.valueOf(blocklet.getDetailInfo().getRowCount())));
 
 Review comment:
   Long.getLong(String) reads a JVM system property with the given name instead of parsing the string, so this call will almost certainly return null rather than the row count. Please change

       blockletToRowCountMap.put(blocklet.getSegmentId() + "," + blocklet.getFilePath(),
           Long.getLong(String.valueOf(blocklet.getDetailInfo().getRowCount())));

   to

       blockletToRowCountMap.put(blocklet.getSegmentId() + "," + blocklet.getFilePath(),
           (long) blocklet.getDetailInfo().getRowCount());
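
   For context, here is a minimal, self-contained sketch (not part of the PR; rowCount below is a hypothetical stand-in for blocklet.getDetailInfo().getRowCount()) showing why the original call misbehaves:

       // Long.getLong(String) looks up a JVM *system property* whose name is
       // the given string; it does not parse the string as a number.
       public class LongGetLongPitfall {
         public static void main(String[] args) {
           int rowCount = 42; // hypothetical stand-in for the blocklet row count

           // Looks for a system property literally named "42"; none is
           // defined, so this returns null.
           Long viaGetLong = Long.getLong(String.valueOf(rowCount));
           System.out.println(viaGetLong); // prints: null

           // The suggested fix: a plain widening cast from int to long.
           long viaCast = (long) rowCount;
           System.out.println(viaCast); // prints: 42
         }
       }

   A null value stored in the map would only surface later, when the Long is unboxed or summed, which is why the explicit cast is the safer choice here.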

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services