You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/10/10 18:31:22 UTC

[GitHub] [druid] vogievetsky commented on a change in pull request #10500: Web console: targetRowsPerSegment for hashed partionin

vogievetsky commented on a change in pull request #10500:
URL: https://github.com/apache/druid/pull/10500#discussion_r502818534



##########
File path: web-console/src/utils/ingestion-spec.tsx
##########
@@ -2181,17 +2181,41 @@ export function getPartitionRelatedTuningSpecFormFields(
           info: <>Total number of rows in segments waiting for being pushed.</>,
         },
         // partitionsSpec type: hashed
+        {
+          name: 'partitionsSpec.targetRowsPerSegment',
+          label: 'Target rows per segment',
+          type: 'number',
+          defined: (t: TuningConfig) =>
+            deepGet(t, 'partitionsSpec.type') === 'hashed' &&
+            !deepGet(t, 'partitionsSpec.numShards'),
+          info: (
+            <p>
+              If you have skewed data, you may want to set this field to generate evenly sized
+              segments.
+              <br />
+              <br /> A target row count for each partition. Each partition will have a row count
+              close to the target assuming evenly distributed keys. Defaults to 5 million if
+              numShards is null.
+            </p>
+          ),
+        },
         {
           name: 'partitionsSpec.numShards',
           label: 'Num shards',
           type: 'number',
-          defined: (t: TuningConfig) => deepGet(t, 'partitionsSpec.type') === 'hashed',
+          defined: (t: TuningConfig) =>
+            deepGet(t, 'partitionsSpec.type') === 'hashed' &&
+            !deepGet(t, 'partitionsSpec.targetRowsPerSegment'),

Review comment:
       Does this mean that either `numShards` should be set or `targetRowsPerSegment`? What is their relationship exactly? Does one have to be set? What if both are set? I am asking because it looks like there might need some validation logic?

##########
File path: web-console/src/dialogs/compaction-dialog/compaction-dialog.tsx
##########
@@ -74,17 +74,39 @@ const COMPACTION_CONFIG_FIELDS: Field<CompactionConfig>[] = [
     info: <>Total number of rows in segments waiting for being pushed.</>,
   },
   // partitionsSpec type: hashed
+  {
+    name: 'tuningConfig.partitionsSpec.targetRowsPerSegment',
+    label: 'Target rows per segment',
+    type: 'number',
+    defined: (t: CompactionConfig) =>
+      deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed' &&
+      !deepGet(t, 'tuningConfig.partitionsSpec.numShards'),
+    info: (
+      <p>
+        If you have skewed data, you may want to set this field to generate evenly sized segments.
+        <br />
+        <br /> A target row count for each partition. Each partition will have a row count close to
+        the target assuming evenly distributed keys. Defaults to 5 million if numShards is null.
+      </p>
+    ),
+  },
   {
     name: 'tuningConfig.partitionsSpec.numShards',
     label: 'Num shards',
     type: 'number',
-    defined: (t: CompactionConfig) => deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed',
+    defined: (t: CompactionConfig) =>
+      deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed' &&
+      !deepGet(t, 'tuningConfig.partitionsSpec.targetRowsPerSegment'),
     info: (
-      <>
+      <p>
+        If you know the optimal number of shards and want to speed up the time it takes for
+        compaction to run, set this field.
+        <br />

Review comment:
       do not use `<br />` in these descriptions. What you want is paragraphs. You can return multiple paragraphs by wrapping them in a fragment `<>`. So you would want:
   
   ```
   <>
     <p>Paragraph 1</p>
     <p>Paragraph 2</p>
   </>
   ```
   
   See https://github.com/apache/druid/blob/cbe2b44e29f1dba0b1882a6afe771cca85083278/web-console/src/utils/ingestion-spec.tsx#L780 for example




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org