You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@phoenix.apache.org by "stoty (via GitHub)" <gi...@apache.org> on 2023/05/17 12:51:37 UTC

[GitHub] [phoenix] stoty commented on a diff in pull request #1596: PHOENIX-6944 Randomize mapper task ordering for MR tools

stoty commented on code in PR #1596:
URL: https://github.com/apache/phoenix/pull/1596#discussion_r1196467171


##########
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixInputFormat.java:
##########
@@ -87,74 +88,106 @@ public List<InputSplit> getSplits(JobContext context) throws IOException, Interr
         return generateSplits(queryPlan, configuration);
     }
 
-    protected List<InputSplit> generateSplits(final QueryPlan qplan, Configuration config) throws IOException {
+    /**
+     * Randomise the length parameter of the splits to ensure random execution order.
+     * Yarn orders splits by size before execution.
+     *
+     * @param splits
+     */
+    protected void randomizeSplitLength(List<InputSplit> splits) {
+        LOGGER.info("Randomizing split size");
+        if (splits.size() == 0) {
+            return;
+        }
+        double defaultLength = 1000000d;
+        double totalLength = splits.stream().mapToDouble(s -> {
+            try {
+                return (double) s.getLength();
+            } catch (IOException | InterruptedException e1) {
+                return defaultLength;
+            }
+        }).sum();
+        long avgLength = (long) (totalLength / splits.size());
+        splits.stream().forEach(s -> ((PhoenixInputSplit) s)
+                .setLength(avgLength + ThreadLocalRandom.current().nextInt(10000)));

Review Comment:
   > I assume that it performs better because there aren't big differences in the split lengths, right?
   
   The diffrences in split length don't matter for performance, as long we make sure that the splits for the same RS are randomly distributed.
   I chose a relatively small random length so that we don't skew the reported completion performances too much, this way there will be less than 1% difference between the splits (as we originallly set the length to the size of the region in bytes)
   
   
   > The sum of the new split lengths will be bigger than the original, why this isn't it a problem?
   
   The absolute split lengths have no significance.
   Only the relative split lengths are used for 
   a.) determining the execution order
   b.) computing the displayed completion percent 
   
   We are already using wildly inaccurate values, as we use the region size instead of the guidepost size even when we split by guideposts, and it hasn't caused any problems.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@phoenix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org