Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/11/08 23:39:21 UTC

[GitHub] [druid] lokesh-lingarajan opened a new pull request #11894: Creating an index based on druid_segments(datasource, used, start, end)

lokesh-lingarajan opened a new pull request #11894:
URL: https://github.com/apache/druid/pull/11894


   This index speeds up the kill task's query that lists unused segments by interval. At our production load that query had become a bottleneck: the coordinator waited longer for metadata DB replies, which in turn increased ingestion lag. The new index has reduced the time these queries take and has made it possible for us to use the kill feature in production.
   
   
   This PR has:
   - [x] been self-reviewed.
   - [x] been tested in dev, stage, and prod Druid clusters.
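
For context, the kind of lookup this index targets can be sketched as follows. The statement shape is an assumption inferred from the description above (unused segments of one datasource inside an interval); it is not copied from the Druid source. The class name, method name, table name, and quote string are illustrative only.

```java
// Sketch only: the query shape is assumed from the PR description,
// not copied from Druid's metadata code.
final class KillQuerySketch {

    // Builds an interval-bounded lookup of unused segments.
    // %1$s is the table name; %2$s is the quote string wrapped around "end",
    // in the same way as the CREATE INDEX statements in this PR.
    static String unusedSegmentsQuery(String tableName, String quote) {
        return String.format(
            "SELECT payload FROM %1$s"
                + " WHERE dataSource = ? AND used = false"
                + " AND start >= ? AND %2$send%2$s <= ?",
            tableName,
            quote
        );
    }

    public static void main(String[] args) {
        // "druid_segments" and the double-quote string are example values.
        System.out.println(unusedSegmentsQuery("druid_segments", "\""));
    }
}
```

A query of this shape filters on dataSource, used, start, and end together, which is why an index covering all four columns can help it.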
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lokesh-lingarajan commented on a change in pull request #11894: Creating an index based on druid_segments(datasource, used, start, end)

Posted by GitBox <gi...@apache.org>.
lokesh-lingarajan commented on a change in pull request #11894:
URL: https://github.com/apache/druid/pull/11894#discussion_r749720322



##########
File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java
##########
@@ -283,6 +283,11 @@ public void createSegmentTable(final String tableName)
                 "CREATE INDEX idx_%1$s_datasource_used_end ON %1$s(dataSource, used, %2$send%2$s)",
                 tableName,
                 getQuoteString()
+            ),
+            StringUtils.format(
+                "CREATE INDEX idx_%1$s_datasource_used_start_end ON %1$s(dataSource, used, start, %2$send%2$s)",
+                tableName,
+                getQuoteString()

Review comment:
       We tried that in production two weeks ago, on the same reasoning, but ingestion lag quickly started climbing and we had to recreate the index on (datasource, used, end) to fix it. Something depends on that index; we did not investigate further.






[GitHub] [druid] xvrl merged pull request #11894: Modifying index from druid_segments(datasource, used, end) to druid_segments(datasource, used, end, start) to support kill task

Posted by GitBox <gi...@apache.org>.
xvrl merged pull request #11894:
URL: https://github.com/apache/druid/pull/11894


   




[GitHub] [druid] FrankChen021 commented on a change in pull request #11894: Creating an index based on druid_segments(datasource, used, start, end)

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #11894:
URL: https://github.com/apache/druid/pull/11894#discussion_r762435841



##########
File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java
##########
@@ -283,6 +283,11 @@ public void createSegmentTable(final String tableName)
                 "CREATE INDEX idx_%1$s_datasource_used_end ON %1$s(dataSource, used, %2$send%2$s)",
                 tableName,
                 getQuoteString()
+            ),
+            StringUtils.format(
+                "CREATE INDEX idx_%1$s_datasource_used_start_end ON %1$s(dataSource, used, start, %2$send%2$s)",
+                tableName,
+                getQuoteString()

Review comment:
       Sorry for the late reply. How about putting the 'start' field after the 'end' field, so that (datasource, used, end) is a prefix of the new index and queries on those columns can still use it? You could run EXPLAIN on the queries to see how the index is used.
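
The prefix argument can be sketched with a small check. This models only the "leftmost prefix" rule of thumb for composite B-tree indexes; real query planners are more nuanced, and the class and method names here are hypothetical.

```java
import java.util.List;

// Sketch only: models the "leftmost prefix" rule of thumb for composite
// B-tree indexes. It is not how any particular database plans queries.
final class IndexPrefixSketch {

    // True when queryColumns are exactly the first columns of indexColumns, in order.
    static boolean isLeftmostPrefix(List<String> queryColumns, List<String> indexColumns) {
        if (queryColumns.size() > indexColumns.size()) {
            return false;
        }
        return indexColumns.subList(0, queryColumns.size()).equals(queryColumns);
    }

    public static void main(String[] args) {
        List<String> query = List.of("dataSource", "used", "end");
        List<String> startBeforeEnd = List.of("dataSource", "used", "start", "end");
        List<String> startAfterEnd = List.of("dataSource", "used", "end", "start");

        // Column order proposed in the diff: "start" sits between "used" and "end".
        System.out.println(isLeftmostPrefix(query, startBeforeEnd));
        // Column order suggested in this comment: "start" comes last.
        System.out.println(isLeftmostPrefix(query, startAfterEnd));
    }
}
```

Under this model only the second ordering keeps (dataSource, used, end) as a prefix, so the program prints false and then true.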






[GitHub] [druid] lokesh-lingarajan commented on a change in pull request #11894: Creating an index based on druid_segments(datasource, used, start, end)

Posted by GitBox <gi...@apache.org>.
lokesh-lingarajan commented on a change in pull request #11894:
URL: https://github.com/apache/druid/pull/11894#discussion_r768863015



##########
File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java
##########
@@ -283,6 +283,11 @@ public void createSegmentTable(final String tableName)
                 "CREATE INDEX idx_%1$s_datasource_used_end ON %1$s(dataSource, used, %2$send%2$s)",
                 tableName,
                 getQuoteString()
+            ),
+            StringUtils.format(
+                "CREATE INDEX idx_%1$s_datasource_used_start_end ON %1$s(dataSource, used, start, %2$send%2$s)",
+                tableName,
+                getQuoteString()

Review comment:
       @FrankChen021 - Thanks for your recommendation. We tested the change of putting start at the end and deployed it in lab, stage, and prod. All the environments are working fine without issues. Please take a look at the change now :)
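
A sketch of what the reordered statement could look like, following the format-string pattern in the diff above. The index name idx_%1$s_datasource_used_end_start is an assumption, and java.lang.String.format stands in for Druid's StringUtils.format so the snippet is self-contained; the table name and quote string are example values.

```java
// Sketch only: the index name and the example arguments are assumptions.
final class SegmentIndexSketch {

    // %1$s is the table name; %2$s is the quote string wrapped around "end".
    static String createIndexStatement(String tableName, String quote) {
        return String.format(
            "CREATE INDEX idx_%1$s_datasource_used_end_start"
                + " ON %1$s(dataSource, used, %2$send%2$s, start)",
            tableName,
            quote
        );
    }

    public static void main(String[] args) {
        System.out.println(createIndexStatement("druid_segments", "\""));
    }
}
```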






[GitHub] [druid] FrankChen021 commented on a change in pull request #11894: Creating an index based on druid_segments(datasource, used, start, end)

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #11894:
URL: https://github.com/apache/druid/pull/11894#discussion_r748794899



##########
File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java
##########
@@ -283,6 +283,11 @@ public void createSegmentTable(final String tableName)
                 "CREATE INDEX idx_%1$s_datasource_used_end ON %1$s(dataSource, used, %2$send%2$s)",
                 tableName,
                 getQuoteString()
+            ),
+            StringUtils.format(
+                "CREATE INDEX idx_%1$s_datasource_used_start_end ON %1$s(dataSource, used, start, %2$send%2$s)",
+                tableName,
+                getQuoteString()

Review comment:
       Since this index contains all fields in the 2nd index above, can we delete the index above?






[GitHub] [druid] lokesh-lingarajan commented on a change in pull request #11894: Creating an index based on druid_segments(datasource, used, start, end)

Posted by GitBox <gi...@apache.org>.
lokesh-lingarajan commented on a change in pull request #11894:
URL: https://github.com/apache/druid/pull/11894#discussion_r752017835



##########
File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java
##########
@@ -283,6 +283,11 @@ public void createSegmentTable(final String tableName)
                 "CREATE INDEX idx_%1$s_datasource_used_end ON %1$s(dataSource, used, %2$send%2$s)",
                 tableName,
                 getQuoteString()
+            ),
+            StringUtils.format(
+                "CREATE INDEX idx_%1$s_datasource_used_start_end ON %1$s(dataSource, used, start, %2$send%2$s)",
+                tableName,
+                getQuoteString()

Review comment:
       @FrankChen021 - since the new index has "end" as the last column, maybe queries that rely only on the (datasource, used, end) fields became slow enough to induce ingestion lag in our environment. Since the symptoms appeared after we deleted the index on (datasource, used, end), I feel we should keep both indexes. wdyt?
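
The hypothesis above can be illustrated with a toy model of an ordered index scan. This is not a measurement and not how any specific database executes queries; the entries, the "end <= maxEnd" filter, and all names are made up for illustration. All entries are assumed to share one (dataSource, used) pair.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model: counts index entries visited for the filter "end <= maxEnd".
// Each entry is a long[]{start, end}.
final class IndexScanSketch {

    // Index ordered by (end, start): entries are sorted by end, so the scan
    // can stop at the first entry whose end exceeds maxEnd.
    static int visitedEndFirst(List<long[]> entries, long maxEnd) {
        List<long[]> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((long[] e) -> e[1])
            .thenComparingLong(e -> e[0]));
        int visited = 0;
        for (long[] e : sorted) {
            visited++;
            if (e[1] > maxEnd) {
                break;
            }
        }
        return visited;
    }

    // Index ordered by (start, end): with start unconstrained, end values are
    // not globally sorted, so the model has to check every entry.
    static int visitedStartFirst(List<long[]> entries, long maxEnd) {
        int visited = 0;
        for (long[] e : entries) {
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<long[]> entries = new ArrayList<>();
        for (long day = 0; day < 5; day++) {
            entries.add(new long[] {day, day + 1});
        }
        System.out.println(visitedEndFirst(entries, 2));
        System.out.println(visitedStartFirst(entries, 2));
    }
}
```

In this model the (end, start) ordering stops early while the (start, end) ordering walks every entry for the datasource, which is consistent with the slowdown described above but does not prove its cause.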






[GitHub] [druid] lokesh-lingarajan commented on a change in pull request #11894: Modifying index from druid_segments(datasource, used, end) to druid_segments(datasource, used, end, start) to support kill task

Posted by GitBox <gi...@apache.org>.
lokesh-lingarajan commented on a change in pull request #11894:
URL: https://github.com/apache/druid/pull/11894#discussion_r768863015



##########
File path: server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java
##########
@@ -283,6 +283,11 @@ public void createSegmentTable(final String tableName)
                 "CREATE INDEX idx_%1$s_datasource_used_end ON %1$s(dataSource, used, %2$send%2$s)",
                 tableName,
                 getQuoteString()
+            ),
+            StringUtils.format(
+                "CREATE INDEX idx_%1$s_datasource_used_start_end ON %1$s(dataSource, used, start, %2$send%2$s)",
+                tableName,
+                getQuoteString()

Review comment:
       @FrankChen021 - Thanks for your recommendation. We tested the change of putting start at the end of the index and it worked in all our clusters. The fix is working without any issues. Please take a look at the change now :)



