Posted to commits@druid.apache.org by vo...@apache.org on 2022/09/17 17:48:06 UTC
[druid] branch master updated: Docs: Clarify the situation with SELECT. (#13109)

This is an automated email from the ASF dual-hosted git repository.

vogievetsky pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new d9b2968edb Docs: Clarify the situation with SELECT. (#13109)
d9b2968edb is described below

commit d9b2968edba9300464d3579efce9b67da72605dc
Author: Gian Merlino <gi...@gmail.com>
AuthorDate: Sat Sep 17 10:47:57 2022 -0700

    Docs: Clarify the situation with SELECT. (#13109)
---
 docs/multi-stage-query/api.md      | 28 ++++++++++++++--------------
 docs/multi-stage-query/concepts.md | 11 ++++++-----
 docs/multi-stage-query/index.md    |  9 +++++----
 3 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/docs/multi-stage-query/api.md b/docs/multi-stage-query/api.md
index 8532ea6817..96e30e0ea7 100644
--- a/docs/multi-stage-query/api.md
+++ b/docs/multi-stage-query/api.md
@@ -42,15 +42,15 @@ You submit queries to the MSQ task engine using the `POST /druid/v2/sql/task/` e
 
 #### Request
 
-Currently, the MSQ task engine ignores the provided values of `resultFormat`, `header`,
-`typesHeader`, and `sqlTypesHeader`. SQL SELECT queries write out their results into the task report (in the `multiStageQuery.payload.results.results` key) formatted as if `resultFormat` is an `array`.
+The SQL task endpoint accepts [SQL requests in the JSON-over-HTTP form](../querying/sql-api.md#request-body) using the
+`query`, `context`, and `parameters` fields, but ignoring the `resultFormat`, `header`, `typesHeader`, and
+`sqlTypesHeader` fields.
 
-For task queries similar to the [example queries](./examples.md), you need to escape characters such as quotation marks (") if you use something like `curl`. 
-You don't need to escape characters if you use a method that can parse JSON seamlessly, such as Python.
-The Python example in this topic escapes quotation marks although it's not required.
+This endpoint accepts [INSERT](reference.md#insert) and [REPLACE](reference.md#replace) statements.
 
-The following example is the same query that you submit when you complete [Convert a JSON ingestion
-spec](../tutorials/tutorial-msq-convert-spec.md) where you insert data into a table named `wikipedia`. 
+As an experimental feature, this endpoint also accepts SELECT queries. SELECT query results are collected from workers
+by the controller, and written into the [task report](#get-the-report-for-a-query-task) as an array of arrays. The
+behavior and result format of plain SELECT queries (without INSERT or REPLACE) is subject to change.
 
 <!--DOCUSAURUS_CODE_TABS-->
 
@@ -199,9 +199,12 @@ A report provides detailed information about a query task, including things like
 
 Keep the following in mind when using the task API to view reports:
 
-- For SELECT queries, the report includes the results. At this time, if you want to view results for SELECT queries, you need to retrieve them as a generic map from the report and extract the results.
-- The task report stores query details for controller tasks.
-- If you encounter `500 Server Error` or `404 Not Found` errors, the task may be in the process of starting up or shutting down.
+- The task report for an entire job is associated with the `query_controller` task. The `query_worker` tasks do not have
+  their own reports; their information is incorporated into the controller report.
+- The task report API may report `404 Not Found` temporarily while the task is in the process of starting up.
+- As an experimental feature, the SQL task engine supports running SELECT queries. SELECT query results are written into
+the `multiStageQuery.payload.results.results` task report key as an array of arrays. The behavior and result format of plain
+SELECT queries (without INSERT or REPLACE) is subject to change.
 
 For an explanation of the fields in a report, see [Report response fields](#report-response-fields).
 
@@ -230,11 +233,8 @@ import requests
 # Make sure you replace `username`, `password`, `your-instance`, `port`, and `taskId` with the values for your deployment.
 url = "https://<username>:<password>@<hostname>:<port>/druid/indexer/v1/task/<taskId>/reports"
 
-payload={}
 headers = {}
-
-response = requests.request("GET", url, headers=headers, data=payload)
-
+response = requests.request("GET", url, headers=headers)
 print(response.text)
 ```
 
diff --git a/docs/multi-stage-query/concepts.md b/docs/multi-stage-query/concepts.md
index ea65fd76de..5d12a9927b 100644
--- a/docs/multi-stage-query/concepts.md
+++ b/docs/multi-stage-query/concepts.md
@@ -29,14 +29,15 @@ sidebar_label: "Key concepts"
 
 ## SQL task engine
 
-The `druid-multi-stage-query` extension adds a multi-stage query (MSQ) task engine that executes SQL SELECT,
-[INSERT](reference.md#insert), and [REPLACE](reference.md#replace) statements as batch tasks in the indexing service,
-which execute on [Middle Managers](../design/architecture.md#druid-services). INSERT and REPLACE tasks publish
+The `druid-multi-stage-query` extension adds a multi-stage query (MSQ) task engine that executes SQL statements as batch
+tasks in the indexing service, which execute on [Middle Managers](../design/architecture.md#druid-services).
+[INSERT](reference.md#insert) and [REPLACE](reference.md#replace) tasks publish
 [segments](../design/architecture.md#datasources-and-segments) just like [all other forms of batch
 ingestion](../ingestion/index.md#batch). Each query occupies at least two task slots while running: one controller task,
-and at least one worker task.
+and at least one worker task. As an experimental feature, the MSQ task engine also supports running SELECT queries as
+batch tasks. The behavior and result format of plain SELECT (without INSERT or REPLACE) is subject to change.
 
-You can execute queries using the MSQ task engine through the **Query** view in the [web
+You can execute SQL statements using the MSQ task engine through the **Query** view in the [web
 console](../operations/web-console.md) or through the [`/druid/v2/sql/task` API](api.md).
 
 For more details on how SQL queries are executed using the MSQ task engine, see [multi-stage query
diff --git a/docs/multi-stage-query/index.md b/docs/multi-stage-query/index.md
index d97de6dd63..64130aa03c 100644
--- a/docs/multi-stage-query/index.md
+++ b/docs/multi-stage-query/index.md
@@ -30,11 +30,12 @@ description: Introduces multi-stage query architecture and its task engine
 
 Apache Druid supports SQL-based ingestion using the bundled [`druid-multi-stage-query` extension](#load-the-extension).
 This extension adds a [multi-stage query task engine for SQL](concepts.md#sql-task-engine) that allows running SQL
-[INSERT](concepts.md#insert) and [REPLACE](concepts.md#replace) statements as batch tasks.
+[INSERT](concepts.md#insert) and [REPLACE](concepts.md#replace) statements as batch tasks. As an experimental feature,
+the task engine also supports running SELECT queries as batch tasks.
 
-Nearly all SELECT capabilities are available for `INSERT ... SELECT` and `REPLACE ... SELECT` queries, with certain
-exceptions listed on the [Known issues](./known-issues.md#select) page. This allows great flexibility to apply
-transformations, filters, JOINs, aggregations, and so on while ingesting data. This also allows in-database
+Nearly all SELECT capabilities are available in the SQL task engine, with certain exceptions listed on the [Known
+issues](./known-issues.md#select) page. This allows great flexibility to apply transformations, filters, JOINs,
+aggregations, and so on as part of `INSERT ... SELECT` and `REPLACE ... SELECT` statements. This also allows in-database
 transformation: creating new tables based on queries of other tables.
 
 ## Vocabulary
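
Illustrative note, not part of the commit: the api.md change above says SELECT results are written into the `multiStageQuery.payload.results.results` task report key as an array of arrays. A minimal sketch of reading that key from an already-parsed report follows. The helper name `extract_select_results` and the sample report are hypothetical, the report is assumed to be a plain dict with that key path at its top level, and since the commit marks SELECT as experimental the path itself may change.

```python
from typing import Any, List

# Key path taken from the api.md text in this commit.
RESULTS_PATH = ("multiStageQuery", "payload", "results", "results")


def extract_select_results(report: dict) -> List[List[Any]]:
    """Return the SELECT rows from a parsed task report, or [] if the path is absent.

    Hypothetical helper: assumes `report` is the JSON report decoded into a dict
    whose top level contains the `multiStageQuery` key.
    """
    node: Any = report
    for key in RESULTS_PATH:
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    # The docs describe the value as an array of arrays; anything else is treated as absent.
    return node if isinstance(node, list) else []


# Made-up report fragment for illustration only, not real Druid output.
sample_report = {
    "multiStageQuery": {
        "payload": {"results": {"results": [["en", 10], ["fr", 3]]}}
    }
}
rows = extract_select_results(sample_report)
```

Returning an empty list when the path is missing keeps the caller simple for INSERT and REPLACE tasks, whose results this commit does not describe as being written to that key.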

