You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/16 10:33:59 UTC
[GitHub] [incubator-druid] bharathpatnaik edited a comment on issue #7488:
Unable to do a Semi Join
bharathpatnaik edited a comment on issue #7488: Unable to do a Semi Join
URL: https://github.com/apache/incubator-druid/issues/7488#issuecomment-483605408
Druid 0.14
Very large cluster (Don't know the exact size but can handle billions of telemetry rows with ease)
I get a cannot build a plan error for a query which looks like this:
**Query1:**
`
_SELECT a from table1
WHERE a IN
(SELECT a from table1 WHERE ...)
Exact Queries used in the process were:
SELECT "dimensions_did" FROM
(
SELECT "dimensions_did",
COUNT(DISTINCT month_of_year) AS months_present_in_year,
SUM(time_spent_in_month) AS time_spent_in_year FROM
(
SELECT
"dimensions_did",
month_of_year,
COUNT(DISTINCT week_of_year) AS weeks_present_in_month,
SUM(time_spent_in_week) AS time_spent_in_month FROM
(
SELECT
"dimensions_did", month_of_year, week_of_year,
COUNT(DISTINCT day_of_year) AS days_present_in_week,
SUM(time_spent_in_day) AS time_spent_in_week FROM
(
SELECT "dimensions_did",
TIME_EXTRACT(MILLIS_TO_TIMESTAMP("syncts"), 'MONTH') AS month_of_year,
TIME_EXTRACT(MILLIS_TO_TIMESTAMP("syncts"), 'WEEK') AS week_of_year,
TIME_EXTRACT(MILLIS_TO_TIMESTAMP("syncts"), 'DOY') AS day_of_year,
SUM("total_time_spent") AS time_spent_in_day FROM
"druid"."summary-events"
WHERE "syncts" >= 1546300800000 AND "syncts" <= 1553990400000
AND "total_time_spent" > 0 AND "content_object_type" = 'Content' AND "dimensions_mode" = 'play'
GROUP BY 1, 2, 3, 4
)
GROUP BY 1, 2, 3
)
WHERE days_present_in_week >= 2
GROUP BY 1, 2
)
WHERE
weeks_present_in_month >= 2
GROUP BY 1
)
WHERE months_present_in_year >= 2 AND time_spent_in_year > 1000
LIMIT 100_
`
**Query 2**
`
_SELECT "dimensions_did"
FROM "druid"."summary-events"
WHERE "syncts" >= 1546300800000 AND "syncts" <= 1553990400000 AND "total_time_spent" > 0 AND "content_object_type" = 'Content' AND "dimensions_mode" = 'play'
GROUP BY 1 HAVING COUNT(DISTINCT TIME_FORMAT(MILLIS_TO_TIMESTAMP("syncts"), 'yyyy/MM/dd')) > 36
_
`
**Actual Query Used:**
`
_SELECT "dimensions_did" FROM
(
SELECT "dimensions_did",
COUNT(DISTINCT month_of_year) AS months_present_in_year,
SUM(time_spent_in_month) AS time_spent_in_year FROM
(
SELECT
"dimensions_did",
month_of_year,
COUNT(DISTINCT week_of_year) AS weeks_present_in_month,
SUM(time_spent_in_week) AS time_spent_in_month FROM
(
SELECT
"dimensions_did", month_of_year, week_of_year,
COUNT(DISTINCT day_of_year) AS days_present_in_week,
SUM(time_spent_in_day) AS time_spent_in_week FROM
(
SELECT "dimensions_did",
TIME_EXTRACT(MILLIS_TO_TIMESTAMP("syncts"), 'MONTH') AS month_of_year,
TIME_EXTRACT(MILLIS_TO_TIMESTAMP("syncts"), 'WEEK') AS week_of_year,
TIME_EXTRACT(MILLIS_TO_TIMESTAMP("syncts"), 'DOY') AS day_of_year,
SUM("total_time_spent") AS time_spent_in_day FROM
"druid"."summary-events"
WHERE "syncts" >= 1546300800000 AND "syncts" <= 1553990400000
AND "total_time_spent" > 0 AND "content_object_type" = 'Content' AND "dimensions_mode" = 'play'
GROUP BY 1, 2, 3, 4
)
GROUP BY 1, 2, 3
)
WHERE days_present_in_week >= 2
GROUP BY 1, 2
)
WHERE
weeks_present_in_month >= 2
GROUP BY 1
)
WHERE months_present_in_year >= 2 AND time_spent_in_year > 1000
AND "dimensions_did" IN
(SELECT "dimensions_did"
FROM "druid"."summary-events"
WHERE "syncts" >= 1546300800000 AND "syncts" <= 1553990400000 AND "total_time_spent" > 0 AND "content_object_type" = 'Content' AND "dimensions_mode" = 'play'
GROUP BY 1 HAVING COUNT(DISTINCT TIME_FORMAT(MILLIS_TO_TIMESTAMP("syncts"), 'yyyy/MM/dd')) > 36
)
LIMIT 100_
`
Thanks!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org