Posted to commits@druid.apache.org by "cryptoe (via GitHub)" <gi...@apache.org> on 2023/03/16 05:20:54 UTC

[GitHub] [druid] cryptoe opened a new pull request, #13941: [MSQ] Regression bug fix where ever LimitFrameProcessor's were used.

cryptoe opened a new pull request, #13941:
URL: https://github.com/apache/druid/pull/13941

An MSQ `INSERT` statement with `LIMIT` and `PARTITIONED BY ALL TIME` no longer works. It fails with a `Result partition information is not ready yet` error. This is a regression introduced by #13506.

This is reproducible with any source datasource. For example, take this 2-row, 1-column datasource `mytest`:
```
select * from mytest
__time c1
2022-01-01T00:00:00.000Z 1
2022-01-01T00:00:00.000Z 2
```
The insert statement that fails:
```
insert into new_ds
select __time, c1
from mytest
limit 1
partitioned by ALL TIME
clustered by c1
```
The error returned:
```
UnknownError: org.apache.druid.java.util.common.ISE: Result partition information is not ready yet (Stack trace)
Failed task ID: query-17a6a628-473d-4b54-a41a-64143b680e3a (on host: ip-10-201-4-98.ec2.internal:8100)
Debug: get query detail archive
```
The stack trace:
```
org.apache.druid.java.util.common.ISE: Result partition information is not ready yet
at org.apache.druid.msq.kernel.controller.ControllerStageTracker.getResultPartitionBoundaries(ControllerStageTracker.java:224)
at org.apache.druid.msq.kernel.controller.ControllerQueryKernel.getResultPartitionBoundariesForStage(ControllerQueryKernel.java:394)
at org.apache.druid.msq.exec.ControllerImpl$RunQueryUntilDone.startStages(ControllerImpl.java:2385)
at org.apache.druid.msq.exec.ControllerImpl$RunQueryUntilDone.run(ControllerImpl.java:2164)
at org.apache.druid.msq.exec.ControllerImpl$RunQueryUntilDone.access$000(ControllerImpl.java:2121)
at org.apache.druid.msq.exec.ControllerImpl.runTask(ControllerImpl.java:373)
at org.apache.druid.msq.exec.ControllerImpl.run(ControllerImpl.java:317)
at org.apache.druid.msq.indexing.MSQControllerTask.runTask(MSQControllerTask.java:179)
at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:169)
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:477)
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:449)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)

```

Thanks @weishiuntsai for catching this.

### The fix

The `LimitFrameProcessor`s are already fed shuffled data, so they do not need to add a shuffling step for the next stage.
Adjusted `ScanQueryKit` and `GroupByQueryKit` accordingly.

Added a case in `ControllerStageTracker#generateResultPartitionsAndBoundariesWithoutKeyStatistics` for `ShuffleKind.MIX`.
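To illustrate the failure mode and the added case, here is a minimal, self-contained sketch. It is not the Druid code: `ToyStageTracker`, its `handleMix` flag, and the two-value `ShuffleKind` enum are hypothetical stand-ins, and the sketch assumes that a `MIX` shuffle places all rows in a single partition with no key statistics. Under that assumption, a tracker with no `MIX` case never populates its result partitions, so a later read fails with the error shown in the stack trace.

```java
import java.util.List;

public class MixShuffleSketch {
    // Hypothetical, reduced stand-in for the shuffle kinds a stage can declare.
    enum ShuffleKind { MIX, OTHER }

    // Toy model of a controller-side stage tracker (not Druid's ControllerStageTracker).
    static final class ToyStageTracker {
        private final ShuffleKind kind;
        private final boolean handleMix;          // false simulates the missing MIX case
        private List<String> resultPartitions;    // stays null until generated

        ToyStageTracker(ShuffleKind kind, boolean handleMix) {
            this.kind = kind;
            this.handleMix = handleMix;
        }

        // Generates result partitions for shuffles that need no key statistics.
        void generateWithoutKeyStatistics() {
            if (kind == ShuffleKind.MIX) {
                if (handleMix) {
                    // Assumption: MIX sends every row to one unbounded partition.
                    resultPartitions = List.of("partition-0 (unbounded)");
                }
                // Without the MIX case nothing is set, which reproduces the bug.
                return;
            }
            throw new UnsupportedOperationException("Out of scope for this sketch: " + kind);
        }

        // Mirrors the guard that raised the error in the stack trace.
        List<String> getResultPartitionBoundaries() {
            if (resultPartitions == null) {
                throw new IllegalStateException("Result partition information is not ready yet");
            }
            return resultPartitions;
        }
    }

    public static void main(String[] args) {
        ToyStageTracker withCase = new ToyStageTracker(ShuffleKind.MIX, true);
        withCase.generateWithoutKeyStatistics();
        System.out.println("With MIX case: " + withCase.getResultPartitionBoundaries());

        ToyStageTracker withoutCase = new ToyStageTracker(ShuffleKind.MIX, false);
        withoutCase.generateWithoutKeyStatistics();
        try {
            withoutCase.getResultPartitionBoundaries();
        } catch (IllegalStateException e) {
            System.out.println("Without MIX case: " + e.getMessage());
        }
    }
}
```

The sketch only models the controller-side bookkeeping. The change to `ScanQueryKit` and `GroupByQueryKit`, which stops the limit stage from requesting another shuffle, is not represented here.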

#### Fixed the bug ...

#### Release note
A release note is not needed because #13506 has not gone out in a release.

<hr>

##### Key changed/added classes in this PR
* `ScanQueryKit`
* `GroupByQueryKit`
* `ControllerStageTracker`

<hr>

This PR has:

- [x] been self-reviewed.
- [x] a release note entry in the PR description.
- [x] added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [x] been tested in a test Druid cluster.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] gianm merged pull request #13941: [MSQ] Regression bug fix where ever LimitFrameProcessor's were used.

Posted by "gianm (via GitHub)" <gi...@apache.org>.

gianm merged PR #13941:
URL: https://github.com/apache/druid/pull/13941



[GitHub] [druid] gianm commented on pull request #13941: [MSQ] Regression bug fix where ever LimitFrameProcessor's were used.

Posted by "gianm (via GitHub)" <gi...@apache.org>.

gianm commented on PR #13941:
URL: https://github.com/apache/druid/pull/13941#issuecomment-1472286229

   Thanks for the fix! Looks good to me.

