Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/14 16:20:56 UTC

[GitHub] [druid] abhishekrb19 opened a new issue #10283: Druid SQL: Querying a new data source with no data returns a CalciteContextException.

abhishekrb19 opened a new issue #10283:
URL: https://github.com/apache/druid/issues/10283


   Querying a new data source that has no data yet (unpublished or no segments) returns a CalciteContextException.
   
   ### Affected Version
   
   0.18.1
   
   ### Description
   
   Bring up a supervisor with data source "ds". Issue a SQL query against the new data source once the supervisor is up and running:
   ```
   SELECT * FROM ds
   ```
   This returns the following exception:
   ```
   Unknown exception / org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 17: Object 'ds' not found / org.apache.calcite.tools.ValidationException
   ```
   
   Looking at the logs, the broker appears to be doing a SQL validation:
   ```
   Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Object 'ds' not found
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_162]
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_162]
   ```
   
   As a workaround, we'd have to resort to a batch inline ingestion that loads a single canned row into this data source, so that at least one segment is available/published and the data source becomes queryable. I think it would be nicer to return "no data", which would also be consistent with native JSON queries and other SQL-like engines.
   
   /cc: @gianm  @clintropolis  @jihoonson. Thanks!
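
Until the broker behaves this way, a client could approximate the "no data" behavior by recognizing this specific validation error. The following is a minimal sketch, not part of Druid: the helper names `missing_table` and `rows_or_empty` are hypothetical, the regular expression is based only on the single message quoted in this issue, and it assumes the error body is a JSON object that carries that message in an `errorMessage` field.

```python
import re

# Pattern taken from the message quoted in this issue
# ("... Object 'ds' not found ..."). Assumption: other Druid/Calcite
# versions may word the message differently.
_NOT_FOUND = re.compile(r"Object '([^']+)' not found")


def missing_table(error_body):
    """Return the table name if the error looks like 'object not found', else None."""
    # Assumption: the parsed JSON error body exposes the text as "errorMessage".
    message = error_body.get("errorMessage") or ""
    match = _NOT_FOUND.search(message)
    return match.group(1) if match else None


def rows_or_empty(status_code, body, expected_table):
    """Map a parsed SQL API response to rows, treating a missing table as no data."""
    if status_code == 200:
        return body  # successful responses are passed through unchanged
    if missing_table(body) == expected_table:
        return []  # the table is unknown to the SQL layer: report "no data"
    raise RuntimeError(f"SQL query failed: {body}")
```

Passing `expected_table` explicitly limits the fallback to the one data source the caller knows is new. It would still hide a genuine typo in that table's name, so this is only suitable where the name is known to be correct.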


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] gianm commented on issue #10283: Druid SQL: Querying a new data source with no data returns a CalciteContextException.

Posted by GitBox <gi...@apache.org>.

gianm commented on issue #10283:
URL: https://github.com/apache/druid/issues/10283#issuecomment-677859327

This one is interesting.

Our SQL metadata is based on doing segment metadata queries. (We look at all the active segments and build a table schema based on what columns are found.) Therefore, as far as the SQL API is concerned, datasources don't exist until they actually have some segments.
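
As a rough illustration of that behavior, here is a toy model in Python. It is not Druid's implementation: `build_catalog` and the dictionary layout of a segment are hypothetical, and column types are ignored. It only shows why a datasource with zero segments never appears as a table.

```python
def build_catalog(segments):
    """Derive {datasource: sorted column names} from the currently visible segments."""
    catalog = {}
    for segment in segments:
        # A table's schema is the union of the columns seen across its segments.
        columns = catalog.setdefault(segment["datasource"], set())
        columns.update(segment["columns"])
    return {name: sorted(columns) for name, columns in catalog.items()}


# Two segments for "wiki" with slightly different columns; none for "ds".
segments = [
    {"datasource": "wiki", "columns": ["__time", "page"]},
    {"datasource": "wiki", "columns": ["__time", "page", "added"]},
]
catalog = build_catalog(segments)
```

In this model `catalog` contains `wiki` with the merged column list, while `ds` is simply absent, which corresponds to the "Object 'ds' not found" validation error.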

You should be able to query the datasource `ds` once it has some data, i.e., once the supervisor launches tasks and they read something. It doesn't need to be published. Data that's currently only available from the realtime indexing system is still good enough to update the SQL metadata. Would that be good enough in your case?

Beyond that, it would be tough to do a more fundamental fix without making major changes to how SQL metadata works in Druid. You could imagine defining what tables _should be_ rather than having the SQL layer observe what they _are_. This would be in line with what a lot of RDBMSes do (CREATE TABLE, ALTER TABLE, etc). It's something that does have its uses.
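
To make the contrast concrete, here is a small hypothetical sketch of table resolution that prefers a declared schema and falls back to the observed one. Nothing like `resolve_table` or a `declared` catalog exists in Druid as described in this thread; both are assumptions for illustration only.

```python
def resolve_table(name, declared, observed):
    """Look up a table's columns, preferring declared schemas over observed ones."""
    if name in declared:
        return declared[name]  # what the table "should be"
    if name in observed:
        return observed[name]  # what the segments say the table "is"
    raise LookupError(f"Object '{name}' not found")


declared = {"ds": ["__time", "value"]}    # declared up front, no segments yet
observed = {"wiki": ["__time", "page"]}   # discovered from existing segments

ds_columns = resolve_table("ds", declared, observed)
```

With a declared entry, `ds` resolves even though it has no segments, so a query against it could return an empty result instead of failing validation.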

[GitHub] [druid] abhishekrb19 commented on issue #10283: Druid SQL: Querying a new data source with no data returns a CalciteContextException.

Posted by GitBox <gi...@apache.org>.

abhishekrb19 commented on issue #10283:
URL: https://github.com/apache/druid/issues/10283#issuecomment-677909367


   @gianm, thanks for the clarification. 
   
   > Data that's currently only available from the realtime indexing system is still good enough to update the SQL metadata. Would that be good enough in your case?
   
   Yes, I think this should be good enough for most cases. However, if the supervisor is stopped, terminated, or suspended before any data arrives, I'd expect the data source still won't be queryable from the Druid SQL API?

