Posted to commits@druid.apache.org by "ektravel (via GitHub)" <gi...@apache.org> on 2023/05/30 15:03:18 UTC

[GitHub] [druid] ektravel opened a new pull request, #14354: Document storeCompactionState

ektravel opened a new pull request, #14354:
URL: https://github.com/apache/druid/pull/14354

   ### Document storeCompactionState context parameter
   
   This PR:
   
   - Rearranges the context parameters in alphabetical order.
   - Adds the `storeCompactionState` context parameter to the table.
   - Adds "ingestion/tasks" back to sidebars.json (mistakenly removed in [PR 14023](https://github.com/apache/druid/pull/14023)).
   
   This PR has:
   
   - [x] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] techdocsmith commented on a diff in pull request #14354: Document storeCompactionState

Posted by "techdocsmith (via GitHub)" <gi...@apache.org>.
techdocsmith commented on code in PR #14354:
URL: https://github.com/apache/druid/pull/14354#discussion_r1210803934


##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|true|
+|`priority`|Task priority|Depends on the task type. See [Priority](#priority) for more details.|
+|`storeCompactionState`|Determines whether the task's metadata stores the state of the segments created by that task. In most cases, you should not need to set this parameter as it is set automatically on compaction tasks. |True by default for compaction tasks. For all other tasks, defaults to false. |

Review Comment:
   We should have an official style for this. However, "whether" is not a great introductory word. Other considerations:
   > When `true`, Druid stores the compaction state...
   > Enables the task to store metadata about the compaction state of created segments.
   
   I think `true` and `false` as configuration options should probably be code terms. Except where case matters, I think they should be all lowercase.





[GitHub] [druid] kfaraz commented on a diff in pull request #14354: Document storeCompactionState

Posted by "kfaraz (via GitHub)" <gi...@apache.org>.
kfaraz commented on code in PR #14354:
URL: https://github.com/apache/druid/pull/14354#discussion_r1213855107


##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|true|
+|`priority`|Task priority|Depends on the task type. See [Priority](#priority) for more details.|
+|`storeCompactionState`|Determines whether the task's metadata stores the state of the segments created by that task. In most cases, you should not need to set this parameter as it is set automatically on compaction tasks. |True by default for compaction tasks. For all other tasks, defaults to false. |

Review Comment:
   Makes sense, @suneet-s . Thanks for the clarification!





[GitHub] [druid] ektravel commented on a diff in pull request #14354: Document storeCompactionState

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on code in PR #14354:
URL: https://github.com/apache/druid/pull/14354#discussion_r1210812501


##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|true|
+|`priority`|Task priority|Depends on the task type. See [Priority](#priority) for more details.|
+|`storeCompactionState`|Determines whether the task's metadata stores the state of the segments created by that task. In most cases, you should not need to set this parameter as it is set automatically on compaction tasks. |True by default for compaction tasks. For all other tasks, defaults to false. |

Review Comment:
   @suneet-s 
   A compaction task launches several sub-tasks too. Can you please clarify exactly which task types have this flag set to `true`?





[GitHub] [druid] kfaraz commented on a diff in pull request #14354: Document storeCompactionState

Posted by "kfaraz (via GitHub)" <gi...@apache.org>.
kfaraz commented on code in PR #14354:
URL: https://github.com/apache/druid/pull/14354#discussion_r1210754012


##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|true|

Review Comment:
   ```suggestion
   |`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to use time chunk lock. If set, this parameter overrides the overlord runtime property `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). If neither this parameter nor the runtime property is set, each task automatically chooses a lock type to use. See [Locking](#locking) for more details.|true|
   ```
   
   Suggestions:
   - Remove the `always` as this parameter just applies to the task at hand.
   - Clarify the override behaviour of the runtime property and the context parameter
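
The override behaviour described in the suggestion above can be sketched as a small precedence function. This is only a model of the suggested wording, not Druid's implementation: the function name `resolve_force_time_chunk_lock` and its arguments are hypothetical, and `None` stands in for "the task chooses a lock type automatically".

```python
from typing import Optional


def resolve_force_time_chunk_lock(
    task_context: dict,
    overlord_property: Optional[bool],
) -> Optional[bool]:
    """Return the effective setting, or None if the task picks a lock type itself.

    Hypothetical helper that mirrors the suggested documentation wording only.
    """
    # The context parameter, when present, overrides the runtime property.
    if "forceTimeChunkLock" in task_context:
        return bool(task_context["forceTimeChunkLock"])
    # Otherwise fall back to the Overlord runtime property, if it is configured.
    if overlord_property is not None:
        return overlord_property
    # Neither is set: per the suggested wording, the task chooses automatically.
    return None
```

Under this model, a task context of `{"forceTimeChunkLock": False}` takes effect even when the runtime property is `True`.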



##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|true|
+|`priority`|Task priority|Depends on the task type. See [Priority](#priority) for more details.|
+|`storeCompactionState`|Determines whether the task's metadata stores the state of the segments created by that task. In most cases, you should not need to set this parameter as it is set automatically on compaction tasks. |True by default for compaction tasks. For all other tasks, defaults to false. |

Review Comment:
   ```suggestion
   |`storeCompactionState`|Whether the task should store the compaction state of created segments in its metadata. The stored compaction state is used to determine ... In most cases, you should not need to set this parameter as it is set automatically on compaction tasks. |true for compaction tasks, false for other task types |
   ```
   
   Suggestions:
   - Remove the word `Determines` as I don't think we use it to describe most other boolean-type configs in the Druid docs. (Please keep it if you feel this is better)
   - Remove possessive case on task
   - Should we use `False/True` (capitalized) or `false/true`? Whichever we choose, we should stick to one throughout the docs.
   - Add a little more info about how the stored compaction state is used.
   - A compaction task launches several sub-tasks too. It would be nice to clarify which of the exact task types have this flag as `true`.
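
A minimal sketch of the default proposed in the suggestion above (true for compaction tasks, false for other task types), with an explicit context value taking precedence. The helper names are hypothetical, the task dictionaries are simplified stand-ins for real task specs, and the `"compact"` type string is an assumption made for illustration.

```python
def default_store_compaction_state(task_type: str) -> bool:
    # Assumption: "compact" identifies a compaction task in this sketch.
    return task_type == "compact"


def effective_store_compaction_state(task: dict) -> bool:
    """Return the value a task would use for storeCompactionState in this model."""
    context = task.get("context", {})
    # An explicit value in the task context wins over the per-type default.
    if "storeCompactionState" in context:
        return bool(context["storeCompactionState"])
    return default_store_compaction_state(task["type"])


compaction_task = {"type": "compact", "context": {}}
ingestion_task = {"type": "index_parallel", "context": {}}
explicit_task = {"type": "index_parallel", "context": {"storeCompactionState": True}}
```

Here `effective_store_compaction_state` returns the per-type default for the first two tasks and the explicit context value for the third.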





[GitHub] [druid] techdocsmith merged pull request #14354: Document storeCompactionState

Posted by "techdocsmith (via GitHub)" <gi...@apache.org>.
techdocsmith merged PR #14354:
URL: https://github.com/apache/druid/pull/14354




[GitHub] [druid] suneet-s commented on a diff in pull request #14354: Document storeCompactionState

Posted by "suneet-s (via GitHub)" <gi...@apache.org>.
suneet-s commented on code in PR #14354:
URL: https://github.com/apache/druid/pull/14354#discussion_r1213575090


##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|true|
+|`priority`|Task priority|Depends on the task type. See [Priority](#priority) for more details.|
+|`storeCompactionState`|Determines whether the task's metadata stores the state of the segments created by that task. In most cases, you should not need to set this parameter as it is set automatically on compaction tasks. |True by default for compaction tasks. For all other tasks, defaults to false. |

Review Comment:
   @kfaraz / @ektravel 
   
   > A compaction task launches several sub-tasks too. It would be nice to clarify which of the exact task types have this flag as true
   
   Setting this parameter in the task context also propagates it to all the sub-tasks spawned by the main task. I don't think we need to document this behavior, as it is the same for other task context properties. I can only imagine a user wanting to set this on a parent task like `index_parallel`.
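
The propagation described above can be illustrated roughly as follows. This is a simplified model, not Druid code: `spawn_sub_task` is a hypothetical helper, and `"example_sub_task"` is a placeholder type name rather than a real Druid task type.

```python
def spawn_sub_task(parent: dict, sub_task_type: str) -> dict:
    """Create a sub-task that inherits a copy of the parent's context."""
    # Copy the context so later changes to the sub-task do not affect the parent.
    return {"type": sub_task_type, "context": dict(parent.get("context", {}))}


# The user sets the parameter once, on the parent task.
parent = {"type": "index_parallel", "context": {"storeCompactionState": True}}

# The sub-task sees the same context values without any extra configuration.
child = spawn_sub_task(parent, "example_sub_task")
```

In this sketch the parameter is set only on the `index_parallel` parent, and the sub-task picks it up through the copied context.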







[GitHub] [druid] ektravel commented on pull request #14354: Document storeCompactionState

Posted by "ektravel (via GitHub)" <gi...@apache.org>.
ektravel commented on PR #14354:
URL: https://github.com/apache/druid/pull/14354#issuecomment-1568605677

   @suneet-s Please review the definition for `storeCompactionState` on [line 394](https://github.com/apache/druid/pull/14354/files#diff-6812a1fef2b8183b749a2cb629dd1e34bc8aeb7e520b7dee73e2a66b08caf894R394).




[GitHub] [druid] suneet-s commented on a diff in pull request #14354: Document storeCompactionState

Posted by "suneet-s (via GitHub)" <gi...@apache.org>.
suneet-s commented on code in PR #14354:
URL: https://github.com/apache/druid/pull/14354#discussion_r1213566606


##########
docs/ingestion/tasks.md:
##########
@@ -387,13 +387,14 @@ The settings get passed into the `context` field of the compaction tasks issued
 
 The following parameters apply to all task types.
 
-|property|default|description|
+|Property|Description|Default|
 |--------|-------|-----------|
-|`taskLockTimeout`|300000|Task lock timeout in milliseconds. For more details, see [Locking](#locking).<br/><br/>When a task acquires a lock, it sends a request via HTTP and awaits until it receives a response containing the lock acquisition result. As a result, an HTTP timeout error can occur if `taskLockTimeout` is greater than `druid.server.http.maxIdleTime` of Overlords.|
-|`forceTimeChunkLock`|true|_Setting this to false is still experimental_<br/> Force to always use time chunk lock. If not set, each task automatically chooses a lock type to use. If set, this parameter overwrites `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). See [Locking](#locking) for more details.|
-|`priority`|Different based on task types. See [Priority](#priority).|Task priority|
-|`useLineageBasedSegmentAllocation`|false in 0.21 or earlier, true in 0.22 or later|Enable the new lineage-based segment allocation protocol for the native Parallel task with dynamic partitioning. This option should be off during the replacing rolling upgrade from one of the Druid versions between 0.19 and 0.21 to Druid 0.22 or higher. Once the upgrade is done, it must be set to true to ensure data correctness.|
-|`storeEmptyColumns`|true|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](ingestion-spec.md#dimensionsspec). <br/><br/>If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest dummy data for empty columns or else not query on empty columns.<br/><br/>When set in the task context, `storeEmptyColumns` overrides the system property [`druid.indexer.task.storeEmptyColumns`](../configuration/index.md#additional-peon-configuration).|
+|`forceTimeChunkLock`|_Setting this to false is still experimental._<br/> Force to use time chunk lock. When `true`, this parameter overrides the overlord runtime property `druid.indexer.tasklock.forceTimeChunkLock` [configuration for the overlord](../configuration/index.md#overlord-operations). If neither this parameter nor the runtime property is `true`, each task automatically chooses a lock type to use. See [Locking](#locking) for more details.|`true`|
+|`priority`|Task priority|Depends on the task type. See [Priority](#priority) for more details.|
+|`storeCompactionState`|Enables the task to store the compaction state of created segments in its metadata. When `true`, the segments created by the task fill `lastCompactionState` in the task metadata. This parameter is set automatically on compaction tasks. |`true` for compaction tasks, `false` for other task types|

Review Comment:
   ```suggestion
   |`storeCompactionState`|Enables the task to store the compaction state of created segments in the metadata store. When `true`, the segments created by the task fill `lastCompactionState` in the segment metadata. This parameter is set automatically on compaction tasks. |`true` for compaction tasks, `false` for other task types|
   ```
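
   As a rough sketch of how these parameters reach a task, the fragment below sets them in the `context` field of a compaction task spec. The datasource name, interval, and parameter values are hypothetical and only illustrate placement; since `storeCompactionState` is set automatically on compaction tasks, spelling it out like this should rarely be necessary.

   ```json
   {
     "type": "compact",
     "dataSource": "example_datasource",
     "ioConfig": {
       "type": "compact",
       "inputSpec": {
         "type": "interval",
         "interval": "2023-01-01/2023-02-01"
       }
     },
     "context": {
       "forceTimeChunkLock": true,
       "priority": 25,
       "storeCompactionState": true
     }
   }
   ```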


