Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/10/13 22:33:20 UTC
[jira] [Commented] (DRILL-4826) Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases
[ https://issues.apache.org/jira/browse/DRILL-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15573419#comment-15573419 ]
ASF GitHub Bot commented on DRILL-4826:
---------------------------------------
Github user ppadma commented on a diff in the pull request:
https://github.com/apache/drill/pull/592#discussion_r83323951
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractSchema.java ---
@@ -231,4 +231,21 @@ public void dropTable(String tableName) {
}
return tables;
}
-}
\ No newline at end of file
+
+ public List<Pair<String, Schema.TableType>> getTableNamesAndTypes(boolean bulkLoad, int bulkSize) {
+ final List<String> tableNames = Lists.newArrayList(getTableNames());
+ final List<Pair<String, Schema.TableType>> tableNamesAndTypes = Lists.newArrayList();
+ final List<Pair<String, ? extends Table>> tables;
+ if (bulkLoad) {
+ tables = getTablesByNamesByBulkLoad(tableNames, bulkSize);
--- End diff --
Why do we even have the option to do bulkLoad or not? Why not just always do bulkLoad?
> Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases
> ---------------------------------------------------------------------------------
>
> Key: DRILL-4826
> URL: https://issues.apache.org/jira/browse/DRILL-4826
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Parth Chandra
> Assignee: Padma Penumarthy
>
> Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow down as the number of views increases.
> BI tools like Tableau issue a query like the following at connection time:
> {code}
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA' ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
> {code}
> The time to query the information schema tables degrades as the number of views increases. On a test system:
> || Views || Time(secs) ||
> |500 | 6 |
> |1000 | 19 |
> |1500 | 33 |
> This can result in a single connection taking more than a minute to establish.
> The problem occurs because we read the view file for every view, and this appears to take most of the time.
> Querying information_schema.tables does not, in fact, need to open the view files at all; it merely needs a listing of the view files. Eliminating the view file reads will speed up the query tremendously.
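
For illustration only, the idea described above (derive view names from a directory listing instead of opening each view file) might look roughly like the sketch below. It is not the code from the pull request: it uses plain java.nio on a local directory rather than Drill's file system abstraction, and the class name ViewListingSketch, the method listViewNames, and the sample file names are all hypothetical. It assumes views are stored as files ending in ".view.drill".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Drill's implementation.
public class ViewListingSketch {
    // Assumption: each view is persisted as "<name>.view.drill".
    static final String VIEW_SUFFIX = ".view.drill";

    /** Returns view names from the directory listing alone; no view file is opened. */
    static List<String> listViewNames(Path workspaceDir) throws IOException {
        List<String> names = new ArrayList<>();
        // The glob filters on file names only, so file contents are never read.
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(workspaceDir, "*" + VIEW_SUFFIX)) {
            for (Path p : stream) {
                String fileName = p.getFileName().toString();
                // Strip the suffix to recover the view name.
                names.add(fileName.substring(0, fileName.length() - VIEW_SUFFIX.length()));
            }
        }
        Collections.sort(names); // directory iteration order is unspecified
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("views");
        Files.createFile(dir.resolve("orders.view.drill"));
        Files.createFile(dir.resolve("customers.view.drill"));
        Files.createFile(dir.resolve("data.parquet")); // not a view, ignored
        System.out.println(listViewNames(dir));
    }
}
```

With this approach the work per view is a single directory-entry lookup, so TABLE_NAME and TABLE_TYPE can be filled in without parsing any view definition. Whether this matches the final patch is not stated in this message.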